In the Database Connection dialog, the general pattern to define a connection to an Impala database is as follows:
- At DBMS, select Impala.
- At Access Method, select JDBC.
- At Database, select a previously defined Impala JDBC database connection URL.
- At Authentication, select Auto-detect or Kerberos.
  Note: When Kerberos authentication is required, ensure that Kerberos is selected.
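For orientation, the sketch below assembles a connection URL of the kind selected at Database. It assumes the URL style of the Cloudera JDBC driver for Impala (`jdbc:impala://host:port/schema` followed by semicolon-separated properties, with `AuthMech=1` selecting Kerberos); the driver used in your installation may expect a different format. The function name, host name, and realm are hypothetical.

```python
def impala_jdbc_url(host, port=21050, schema="default", kerberos=None):
    """Return an Impala JDBC URL (Cloudera driver style, assumed).

    kerberos: optional dict with keys 'realm', 'host_fqdn', 'service_name'.
    """
    url = f"jdbc:impala://{host}:{port}/{schema}"
    if kerberos is not None:
        # AuthMech=1 requests Kerberos authentication in the assumed driver.
        url += (
            ";AuthMech=1"
            f";KrbRealm={kerberos['realm']}"
            f";KrbHostFQDN={kerberos['host_fqdn']}"
            f";KrbServiceName={kerberos['service_name']}"
        )
    return url


# Hypothetical host and realm, for illustration only.
print(impala_jdbc_url("impala.example.com"))
print(impala_jdbc_url(
    "impala.example.com",
    kerberos={
        "realm": "EXAMPLE.COM",
        "host_fqdn": "impala.example.com",
        "service_name": "impala",
    },
))
```

The first call produces a URL without authentication properties, which corresponds loosely to leaving Authentication on Auto-detect; the second carries the Kerberos properties explicitly.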
Defining Impala sources
For all Connect for Big Data ETL jobs, Connect for Big Data supports Impala database tables as sources and as lookup sources.
At the Source Database Table dialog or the Lookup Source Database Table dialog, define an Impala database table source or lookup source, respectively:
- At Connection, select a previously defined Impala source connection or select Add new... to add a new connection.
- On the Parameters tab, the following optional parameters are available for Impala database table sources and lookup sources:
  - Filter - specifies the condition under which records are extracted from an Impala source table. The filter text is equivalent to the text that follows the WHERE keyword in a SQL query.
    - For partitioned Impala database table sources and lookup sources, you can specify a partition predicate in the filter. The predicate enables partition pruning, which limits scanning to the partitions that satisfy it.
  - Work table directory - serves as the parent-level directory beneath which job-specific subdirectories are created for staging data.
  - Work table schema - the schema used to create the staging table.
  - Impala configuration properties - any Impala configuration property can be entered manually in the parameters grid.
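To illustrate how a Filter value relates to a SQL query, the sketch below appends the filter text after WHERE. It is an illustration only and does not describe how Connect for Big Data builds its queries internally; the function name, the `sales` table, and its partition columns `year` and `month` are hypothetical.

```python
def build_extract_query(table, filter_text=None):
    """Return a SELECT statement, adding WHERE only when a filter is given."""
    query = f"SELECT * FROM {table}"
    if filter_text and filter_text.strip():
        # The filter is exactly the text that would follow the WHERE keyword.
        query += f" WHERE {filter_text.strip()}"
    return query


# Hypothetical table partitioned by year and month.
# The first two conditions form a partition predicate; the third is a row filter.
filter_text = "year = 2023 AND month = 6 AND amount > 100"
print(build_extract_query("sales", filter_text))
print(build_extract_query("sales"))  # no filter: all records are extracted
```

Because `year` and `month` are partition columns in this hypothetical table, Impala can skip every partition other than the one for June 2023, and the `amount > 100` condition is evaluated only against rows in that partition.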