Define a Hive ODBC data source - Connect_ETL - 9.13

Connect ETL Installation Guide

Product type
Software
Portfolio
Integrate
Product family
Connect
Product
Connect > Connect (ETL, Sort, AppMod, Big Data)
Version
9.13
Language
English
Product name
Connect ETL
Title
Connect ETL Installation Guide
Copyright
2024
First publish date
2003
Last updated
2024-11-08
Published on
2024-11-08T16:36:35.232000

To identify a Hive ODBC data source, you create a data source name (DSN) and set options required to connect to the data source. 

Define a Hive data source on Windows

Start the ODBC Data Source Administrator by following the instructions for Windows systems at Define ODBC Data Sources.

In the Create New Data Source dialog, select your Hive ODBC vendor's driver from the list, and then click Finish.

Use the following settings in the ODBC Driver DSN Setup dialog for a Hive ODBC data source:

Data Source Name Enter a name to identify the Hive DSN
Host Enter the IP address or hostname of the Hive server.
Port Enter the listening port for the Hive service. The default is 10000.
Database Enter the name of the database schema you want to use. The default schema is default.
Hive Server Type Select Hive Server 2.
Authentication mechanism Most Hive installations use User Name authentication by default. The authentication mechanism for a Hive data source must match the mechanism in use on the Hive server or the connection fails. Check with your Hadoop system administrator.
Advanced Options Check Use Native Query and then click OK. Some ODBC Hive drivers work with both HiveQL and SQL query languages. This setting enables the use of native HiveQL instead of SQL.

Define a Hive data source on Linux

Find general instructions for UNIX systems at Define ODBC Data Sources.

The default location Connect for Big Data uses for the odbc.ini ODBC configuration file is <connect_install>/etc/odbc.ini. If you decide to use a different location and file, set the ODBCSYSINI environment variable to the full path and file name of your file.

In the odbc.ini file, add a new Hive data source entry to the [ODBC Data Sources] section. Use the format, <data-source-name>=<your-driver-description>:
[ODBC Data Sources] 
Sample Hive DSN 64=Hive ODBC Driver 64-bit
Configure the new Hive data source by adding a section similar to the following to the odbc.ini file. Note that sample values are shown. Consult your Hadoop system administrator for guidance on settings appropriate for your environment:
[Sample Hive DSN 64] 
Driver=/<vendor's-Hive-ODBC-driver-installation>/<vendor’s_Hive_ODBC_library_file> 
HiveServerType=2 
HOST=<hive-server> 
PORT=10000 
UseNativeQuery=1 
AuthMech=2

The following table lists valid values:

Odbc.ini options Description
Driver Set the location of the installed Hive ODBC Driver file. Find the driver file <vendor’s_Hive_ODBC_library_file>, for example, libhortonworkshiveodbc64.so, under your installed files /lib/ directory.
HiveServerType

Set the HiveServerType to 2, for HiveServer2. HiveServer2 is a newer version with improvements from that of HiveServer and with additional features.

1 (default) HiveServer

2 HiveServer2

HOST Set the IP address or hostname of the Hive server.
PORT Set the listening port for the service. The default port for Connect for Big Data Hive installation is 10000.
UseNativeQuery

Set the UseNativeQuery value to 1. Some Hive ODBC drivers work with both HiveQL and SQL query languages.

0 (default) enables the SQL Connector feature

1 enables the HiveQL query language and disables the SQL Connector feature

AuthMech

Set the AuthMech value to the number representing the same authentication mechanism as the Hive server. Most Hive installations use User Name authentication (value 2) by default.

0 no authentication

1 Kerberos

2 (default) User Name

User Name and Password

4 User Name and Password (SSL)

5 Windows Azure HDInsight Emulator

6 Windows Azure HDInsight Service

7 HTTP

8 HTTPS