Connecting to HIVE as a data source - Data360_DQ+ - 11.X

Data360 DQ+ Enterprise Installation

Product type: Software
Portfolio: Verify
Product family: Data360
Product: Data360 DQ+
Version: 11.X
Language: English
Copyright: 2024
First publish date: 2016
Last edition: 2024-06-06
Last publication: 2024-06-06T12:37:34.761477

After installing the fix pack and verifying the installation, complete the following steps to connect to and capture data from Hive using a Kerberos keytab.

  1. Download build.gradle into the /tmp folder.
  2. Copy build.gradle to the /opt/infogix/dqplus-<version>/scripts folder.
  3. Grant execute permission to the file. For example, for version 10.2:
    cp /tmp/build.gradle /opt/infogix/dqplus-10.2/scripts
    chmod 777 /opt/infogix/dqplus-10.2/scripts/build.gradle
  4. Copy the keytab file and Hive JAR files to /opt/infogix/dqplus-<version>/overrides/hive.
    Tip: Make sure you use distinct names for all keytab files, for example sagacity.hadoop.keytab, sagacity.kafka.keytab, and sagacity.hive.keytab.

    The following drivers have been tested and verified to connect to Hortonworks 2.6 / Hive 1.2 as a data source using Kerberos authentication. Note that other versions of drivers have not been validated and might not work.

    • curator-client-2.6.0.jar
    • curator-framework-2.6.0.jar
    • curator-recipes-2.6.0.jar
    • hive-beeline-1.2.1000.2.6.0.3-8.jar
    • hive-cli-1.2.1000.2.6.0.3-8.jar
    • hive-exec-1.2.1000.2.6.0.3-8-mod.jar
    • hive-jdbc-1.2.1000.2.6.0.3-8.jar
    • hive-metastore-1.2.1000.2.6.0.3-8.jar
    • hive-service-1.2.1000.2.6.0.3-8.jar
  5. For Cloudera 7.1.9 and Hive 2.1, copy the keytab file, cm-auto-global_truststore.jks, and the Hive JAR files to /opt/infogix/dqplus-<version>/overrides/hive.

    The following driver has been tested and verified to connect to Cloudera 7.1.9 / Hive 2.1 as a data source using Kerberos authentication. Note that other driver versions have not been validated and might not work.

    • hive-jdbc-2.1.1-cdh7.1.9-standalone.jar
  6. Execute the deploy command.
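
Steps 2 through 4 above can be sketched as a small shell function. This is a minimal sketch rather than a script shipped with the product: the function name stage_hive_files and its arguments are hypothetical, and creating the overrides/hive folder with mkdir -p is an assumption in case it does not exist yet. The deploy command in the final step is not shown because it depends on your installation.

```shell
#!/bin/sh
# Hypothetical helper (not part of Data360 DQ+): stages build.gradle,
# a keytab, and optional JAR files under an installation directory.
# Usage: stage_hive_files <install_dir> <build_gradle> <keytab> [jar ...]
stage_hive_files() {
    install_dir=$1    # for example /opt/infogix/dqplus-<version>
    build_gradle=$2   # for example /tmp/build.gradle
    keytab=$3         # for example sagacity.hive.keytab
    shift 3           # any remaining arguments are JAR or truststore files

    # Steps 2 and 3: copy build.gradle, then set the permissions from the example.
    cp "$build_gradle" "$install_dir/scripts/" || return 1
    chmod 777 "$install_dir/scripts/$(basename "$build_gradle")" || return 1

    # Step 4: copy the keytab and the remaining files to overrides/hive.
    # Assumption: the folder is created if it is missing.
    mkdir -p "$install_dir/overrides/hive" || return 1
    cp "$keytab" "$@" "$install_dir/overrides/hive/" || return 1
}
```

The function returns a non-zero status as soon as one copy fails, so a missing keytab or JAR file is noticed before the deploy command is run.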

Complete the following steps to connect to and capture data from Hive using a Kerberos keytab:

  1. Download build.gradle into the /tmp folder.
  2. Copy build.gradle to the /opt/infogix/dqplus-<version>/scripts folder.
  3. Grant execute permission to the file. For example, for version 10.2:
    cp /tmp/build.gradle /opt/infogix/dqplus-10.2/scripts
    chmod 777 /opt/infogix/dqplus-10.2/scripts/build.gradle
  4. Copy the keytab file to /opt/infogix/dqplus-<version>/runtime.
    Tip: Ensure that you use distinct names for all keytab files, for example sagacity.hadoop.keytab, sagacity.kafka.keytab, and sagacity.hive.keytab.
  5. Execute the deploy command.

Defining the JDBC URL for Cloudera 7.1.9 and Hive 2.1 in a Data Store

Complete the following steps to define the JDBC URL for Cloudera 7.1.9 and Hive 2.1.

In the JDBC URL, add the sslTrustStore property with the full path to the trust store: sslTrustStore=/opt/cafe/util/spark/jars-hive/cm-auto-global_truststore.jks.

For interactive mode, the full JDBC URL is as follows:

jdbc:hive2://<hostname>:10000/;principal=hive/<hostname>@INFOGIX.COM;ssl=true;sslTrustStore=/opt/cafe/util/spark/jars-hive/cm-auto-global_truststore.jks

After Test Connection, Generate Fields, Test Data Store, and Test Analysis all succeed in interactive mode, switch to the file name only for analysis execution: sslTrustStore=cm-auto-global_truststore.jks. The JDBC URL is then as follows:

jdbc:hive2://<hostname>:10000/;principal=hive/<hostname>@INFOGIX.COM;ssl=true;sslTrustStore=cm-auto-global_truststore.jks
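
The two URL variants differ only in the sslTrustStore value, which a short shell function can make explicit. This is a minimal sketch: the function name hive_jdbc_url and the mode names interactive and execution are hypothetical, while the port, realm, and truststore path are taken from the URLs above. It assumes the same host name is used for the server and for the Hive principal.

```shell
#!/bin/sh
# Hypothetical helper (not part of Data360 DQ+): prints the JDBC URL
# for interactive mode or for analysis execution.
# Usage: hive_jdbc_url <interactive|execution> <hostname>
hive_jdbc_url() {
    mode=$1
    host=$2
    truststore=cm-auto-global_truststore.jks

    case $mode in
        # Interactive mode uses the full path to the truststore.
        interactive) store_path=/opt/cafe/util/spark/jars-hive/$truststore ;;
        # Analysis execution uses the file name only.
        execution)   store_path=$truststore ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac

    printf 'jdbc:hive2://%s:10000/;principal=hive/%s@INFOGIX.COM;ssl=true;sslTrustStore=%s\n' \
        "$host" "$host" "$store_path"
}
```

Calling the function with a mode and a host name prints the corresponding URL, which can then be pasted into the data store definition. Replace the realm if your Kerberos realm is not INFOGIX.COM.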

Using a keytab in an analysis

To run an analysis by using a keytab for a Hive data store, set the following execution properties in the analysis:

Property                    Description
cafe.db.hive.keytab.user    The keytab principal name
cafe.db.hive.keytab.file    The /opt/cafe/util/spark/jars-hive/keytab file name
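
As an illustration, the two property values can be derived from a principal name and a keytab file name. This is a sketch under stated assumptions: the function name hive_keytab_properties is hypothetical, the key=value output is only a convenient way to display the pairs, and the file value is assumed to be the keytab file name appended to /opt/cafe/util/spark/jars-hive/, which is one reading of the description above. Confirm the expected value format in your environment before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Data360 DQ+): prints the two
# execution properties for a given principal and keytab file name.
# Usage: hive_keytab_properties <principal> <keytab_file_name>
hive_keytab_properties() {
    principal=$1
    keytab_name=$2

    # The keytab principal name.
    printf 'cafe.db.hive.keytab.user=%s\n' "$principal"
    # Assumption: the keytab file name is appended to the jars-hive folder.
    printf 'cafe.db.hive.keytab.file=/opt/cafe/util/spark/jars-hive/%s\n' "$keytab_name"
}
```

The printed values can then be entered as execution properties in the analysis.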