Using a Hive ODBC or JDBC connection, Connect for Big Data can read supported Hive data types
from all supported Hive file formats, including Apache Avro, Apache Parquet, Optimized Row
Columnar (ORC), Record Columnar (RCFile), and text. JDBC is recommended over ODBC. For jobs
that run in the cluster, Connect for Big Data supports reading from Hive sources through JDBC
only.
Note: On an ETL server/edge node, reading from Hive sources via Hive ODBC/JDBC drivers
yields low throughput and is best reserved for no more than a few gigabytes of data, such as
pre-aggregated data for analytics.
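For example, a Hive source can be read over JDBC with the standard Apache Hive driver. The
following is a minimal sketch, not product-specific code: the host name, port, database,
credentials, and table name are placeholder assumptions, and the Hive JDBC driver jar must be
on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveJdbcReadSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical HiveServer2 endpoint; substitute your host, port,
            // database, and credentials.
            String url = "jdbc:hive2://hive-host:10000/default";

            try (Connection conn = DriverManager.getConnection(url, "etl_user", "");
                 Statement stmt = conn.createStatement();
                 // "sales" is a placeholder table; any supported file format
                 // (Avro, Parquet, ORC, RCFile, text) is read the same way over JDBC.
                 ResultSet rs = stmt.executeQuery("SELECT * FROM sales LIMIT 10")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }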
JDBC connectivity
When Connect for Big Data reads from a Hive table in the cluster via JDBC, the data is
temporarily staged, in compressed or uncompressed form, in a text-backed Hive table.
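The staging table itself is created and removed by the product; the sketch below only
illustrates what a text-backed staging table looks like in HiveQL terms. The endpoint, table
names, and compression setting are hypothetical assumptions, not the product's actual staging
DDL.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class HiveTextStagingSketch {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:hive2://hive-host:10000/default"; // hypothetical endpoint

            try (Connection conn = DriverManager.getConnection(url, "etl_user", "");
                 Statement stmt = conn.createStatement()) {
                // Compress the text files that the staging step writes
                // (optional; staging also works uncompressed).
                stmt.execute("SET hive.exec.compress.output=true");

                // A temporary, text-backed table analogous to the staging table:
                // rows from the source are copied into plain-text storage, and the
                // table disappears when the session ends.
                stmt.execute("CREATE TEMPORARY TABLE stage_sales STORED AS TEXTFILE "
                        + "AS SELECT * FROM sales");
            }
        }
    }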