This example illustrates how to connect to an external database that uses drivers which are not shipped with Data360 Analyze.
See also Acquiring data from MS Access for a simple example of how to connect to a database that uses a driver which is shipped with the application.
- Download all required JDBC drivers for your Hadoop installation from the database vendor.
- Install the third-party drivers that you downloaded in the previous step. We recommend that you install the driver files in the following location, storing the drivers for each database in a separate sub-directory:
<Data360Analyze site configuration directory>/site-<port>/lib/java/db/<driverName>
For example:
<Data360Analyze site configuration directory>/site-<port>/lib/java/Cloudera/Cloudera_HiveJDBC42
- Obtain the keytab file and the krb5.conf file for your setup from your Kerberos administrator. Place these files on the Data360 Analyze server and note the location for the next step.
- Create a file named gss-jaas.conf. In this file, include the following information, replacing the principal and keyTab settings with actual values:
Hortonworks example
com.sun.security.jgss.initiate {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  useTicketCache=true
  principal="yourprincipaluser@yourcompany.com"
  keyTab="/location/to/your.keytab"
  debug=false;
};
Cloudera example
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  useTicketCache=true
  principal="yourprincipaluser@yourcompany.com"
  keyTab="/location/to/your.keytab"
  debug=false;
};
Tip: Check with your Kerberos administrator if you are unsure what the principal account is. The keytab location is the location where you placed the keytab file in the previous step.
- Open Data360 Analyze, then from the Directory select Create > Data Flow.
- In the Nodes panel, search for the JDBC Query or Database Metadata node and drag it onto the canvas.
- Select the node. From the Properties panel, expand the Advanced property group and configure the following properties:
- DbUrl - Enter the database connection URL. Your JDBC URL may have specific requirements, so consult your database administrator or Kerberos administrator to configure it appropriately. For example:
jdbc:hive2://XXXXXXXXXXX:10000/default;principal=hive/XXXXXX@COMPANY.COM
- DbDriver - Enter the following:
org.apache.hive.jdbc.HiveDriver
- DbDriverClasspath - Specify the full path to the directory that contains your Hive drivers, using the %ls.appDataDir% substitution for anything installed into your Data360 Analyze site configuration directory. For example, if you installed into:
C:\ProgramData\Data360Analyze\site\lib\java\Data3SixtyAnalyze_External
You should reference this using:
%ls.appDataDir%\lib\java\Data3SixtyAnalyze_External
If you want to reference multiple directories, separate the entries with semicolons, for example:
%ls.appDataDir%\lib\java\Hive_drivers1;%ls.appDataDir%\lib\java\Hive_drivers2
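The DbUrl and DbDriverClasspath values described above follow simple string patterns, so it can help to assemble them programmatically when you maintain several environments. The following Python sketch is illustrative only and is not part of Data360 Analyze: the helper names `build_hive_url` and `build_driver_classpath`, the host `hive.example.com`, and the principal `hive/hive.example.com@EXAMPLE.COM` are hypothetical. Only the URL shape, the `%ls.appDataDir%` substitution, and the semicolon separator are taken from the steps above.

```python
def build_hive_url(host, principal, port=10000, database="default"):
    """Return a HiveServer2 JDBC URL that carries a Kerberos service principal.

    `principal` is the Hive service principal supplied by your administrator;
    it is passed in as-is because it may not match the connection host.
    """
    return f"jdbc:hive2://{host}:{port}/{database};principal={principal}"


def build_driver_classpath(*subdirs, root="%ls.appDataDir%"):
    """Join driver directories under <root>\\lib\\java with semicolons."""
    return ";".join(root + "\\lib\\java\\" + name for name in subdirs)


# Hypothetical values for illustration only.
url = build_hive_url("hive.example.com", "hive/hive.example.com@EXAMPLE.COM")
classpath = build_driver_classpath("Hive_drivers1", "Hive_drivers2")

print(url)
print(classpath)
```

The printed strings can then be pasted into the DbUrl and DbDriverClasspath properties. Your environment may need additional URL parameters, so confirm the final value with your database or Kerberos administrator.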
- Select the node, then from the menu button in the top-right corner of the Properties panel, select Show Hidden Properties.
- In the JvmArguments property, add the following, replacing the locations of the gss-jaas.conf and krb5.conf files with actual locations:
-Djava.security.auth.login.config=/location/to/gss-jaas.conf
-Djava.security.krb5.conf=/location/to/krb5.conf
-Djavax.security.auth.useSubjectCredsOnly=false
Note: There are two similarly named properties. Ensure that you add the above information to the JvmArguments property and not to the _JvmArguments property. Each JVM argument must be on a separate line.
- Run the node to import data.
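As a rough illustration of the Kerberos configuration described in the steps above, the following Python sketch writes a gss-jaas.conf file and prints the three JVM arguments, one per line. It is a sketch under stated assumptions rather than a supported tool: the function names `render_jaas` and `jvm_arguments` are hypothetical, the file is written to a temporary directory, and the principal and keytab values are the placeholders used earlier, which you would replace with the values from your Kerberos administrator.

```python
from pathlib import Path
import tempfile

# Same options as the examples above, laid out one option per line.
JAAS_TEMPLATE = """{entry} {{
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  useTicketCache=true
  principal="{principal}"
  keyTab="{keytab}"
  debug=false;
}};
"""


def render_jaas(entry, principal, keytab):
    """Render one gss-jaas.conf entry; `entry` is the section name,
    for example "Client" or "com.sun.security.jgss.initiate"."""
    return JAAS_TEMPLATE.format(entry=entry, principal=principal, keytab=keytab)


def jvm_arguments(jaas_path, krb5_path):
    """Return the three JVM arguments, one list element per property line."""
    return [
        f"-Djava.security.auth.login.config={jaas_path}",
        f"-Djava.security.krb5.conf={krb5_path}",
        "-Djavax.security.auth.useSubjectCredsOnly=false",
    ]


# Placeholder values; substitute the ones for your own setup.
workdir = Path(tempfile.mkdtemp())
jaas_file = workdir / "gss-jaas.conf"
jaas_file.write_text(
    render_jaas("Client", "yourprincipaluser@yourcompany.com", "/location/to/your.keytab")
)

args = jvm_arguments(jaas_file.as_posix(), "/location/to/krb5.conf")
print("\n".join(args))
```

Each printed line corresponds to one line of the JvmArguments property. Pass `"com.sun.security.jgss.initiate"` as the entry name to produce the Hortonworks-style section instead of the Cloudera-style one.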