Sqoop In - Data360_DQ+ - Latest

Product type: Software
Portfolio: Verify
Product family: Data360
Product: Data360 DQ+
Version: Latest
Language: English
Title: Data360 DQ+ Help
Copyright: 2024
First publish date: 2016
Last edition: 2024-07-09
Last publication: 2024-07-09T15:09:58.774265

The Sqoop In node ingests data efficiently from an external relational database. This feature uses Apache Sqoop.

To use the node, set Arguments and their values in the node's property panel. You must also configure Fields to accept the data that the Sqoop In node returns.

Note: When you build an Analysis in the UI, this node does not execute on a sample of the Data Set.

Creating arguments

To use the Sqoop In node, you must define a set of Arguments and, where an Argument takes one, a value for that Argument.

Arguments should be valid Apache Sqoop arguments, each preceded by '--'.

For example, to connect to a database via a JDBC URL, you would create a '--connect' Argument. After creating the '--connect' Argument, you would then create a value for it, using the JDBC URL string, for example 'jdbc:mysql://172.17.30.216:3306/test'.

Within the UI, this argument and value would be specified and listed as follows:

--connect

jdbc:mysql://172.17.30.216:3306/test
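The naming rule above can be checked before the Arguments are entered in the property panel. The following is a minimal sketch only: the `ArgumentRow` type and the `validate_arguments` function are hypothetical names introduced for illustration and are not part of Data360 DQ+ or Apache Sqoop. It only checks that each Argument is a name preceded by '--' and contains no whitespace; it does not check that the name is a real Sqoop argument.

```python
from typing import List, Optional, Tuple

# Hypothetical model of one row in the Arguments panel:
# (argument, value); value is None for an Argument that takes no value.
ArgumentRow = Tuple[str, Optional[str]]


def validate_arguments(rows: List[ArgumentRow]) -> List[str]:
    """Return a list of problems; an empty list means every name looks well formed."""
    problems: List[str] = []
    for name, _value in rows:
        if not name.startswith("--") or len(name) == 2:
            # Covers names such as "connect" and a bare "--".
            problems.append(f"{name!r}: argument must be a name preceded by '--'")
        elif any(ch.isspace() for ch in name):
            # Covers accidental spaces such as "-- split-by".
            problems.append(f"{name!r}: argument must not contain whitespace")
    return problems


rows = [("--connect", "jdbc:mysql://172.17.30.216:3306/test")]
print(validate_arguments(rows))
```

A check like this catches a missing '--' prefix or a stray space before the node is run.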

Encrypting arguments

Sqoop In Arguments and their values may also be encrypted. This option should be used, for example, when creating the value for a --password argument.

Arguments used to connect to a database

Although a full listing of everything that can be used within Apache Sqoop is beyond the scope of this help, you can use the following example as a model for making a basic connection to an external relational database.

--connect

jdbc:mysql://172.17.30.216:3306/test

--username

userRoot123

--password

mypassword

--as-textfile

--compress

--null-string

""

--null-non-string

""

--table

customers

--columns

id, name, purchaseAmount

--split-by

id

In this example, the arguments --connect, --username, and --password and their respective values are common Apache Sqoop arguments. The arguments --as-textfile, --compress, --null-string, --null-non-string, --table, --columns, and --split-by and their respective values are Apache Sqoop import control arguments.

A full description of these and other Apache Sqoop arguments can be found in the Apache Sqoop User Guide.
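For readers who know the Sqoop command line, it may help to see the example Arguments laid out as a single command. The sketch below rests on assumptions: that the node's Arguments correspond to the arguments of a `sqoop import` invocation, and that the "" values shown for --null-string and --null-non-string denote empty strings. The `ROWS` list, the `to_argv` function, and the masking behaviour are hypothetical helpers for illustration, not how Data360 DQ+ itself runs Sqoop.

```python
import shlex
from typing import List, Optional, Tuple

# The example Arguments from this section; None marks an Argument with no value.
ROWS: List[Tuple[str, Optional[str]]] = [
    ("--connect", "jdbc:mysql://172.17.30.216:3306/test"),
    ("--username", "userRoot123"),
    ("--password", "mypassword"),
    ("--as-textfile", None),
    ("--compress", None),
    ("--null-string", ""),       # assumption: "" in the UI means an empty string
    ("--null-non-string", ""),   # same assumption
    ("--table", "customers"),
    ("--columns", "id, name, purchaseAmount"),
    ("--split-by", "id"),
]

SENSITIVE = {"--password"}


def to_argv(rows: List[Tuple[str, Optional[str]]], mask_sensitive: bool = False) -> List[str]:
    """Flatten (argument, value) rows into an argv-style list, optionally masking secrets."""
    argv = ["sqoop", "import"]  # assumption: the node corresponds to the import tool
    for name, value in rows:
        argv.append(name)
        if value is not None:
            argv.append("****" if mask_sensitive and name in SENSITIVE else value)
    return argv


# Print a shell-quoted rendering with the password masked.
print(shlex.join(to_argv(ROWS, mask_sensitive=True)))
```

Masking the --password value when displaying the command mirrors the reason for encrypting that value in the node.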

Fields

In addition to creating Arguments, you must also create Fields in the Sqoop In node to accept the ingested data and to hold the values you want to process within your Analysis. These Fields should match the columns listed as the value of the --columns argument in the Arguments tab.
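The relationship between the --columns value and the node's Fields can be sketched as a simple comparison. This is an illustration only: `parse_columns` and `check_fields` are hypothetical helpers, and treating a difference in order as a problem is an assumption, since this help only states that the Fields should match the columns.

```python
from typing import List


def parse_columns(columns_value: str) -> List[str]:
    """Split a --columns value such as 'id, name, purchaseAmount' into column names."""
    return [part.strip() for part in columns_value.split(",") if part.strip()]


def check_fields(columns_value: str, field_names: List[str]) -> List[str]:
    """Return a list of mismatches between the --columns value and the Field names."""
    expected = parse_columns(columns_value)
    problems: List[str] = []
    missing = [c for c in expected if c not in field_names]
    extra = [f for f in field_names if f not in expected]
    if missing:
        problems.append("no Field for column(s): " + ", ".join(missing))
    if extra:
        problems.append("Field(s) not listed in --columns: " + ", ".join(extra))
    if not missing and not extra and expected != field_names:
        # Assumption: Fields are expected in the same order as the columns.
        problems.append("Fields are in a different order from --columns")
    return problems


print(check_fields("id, name, purchaseAmount", ["id", "name"]))
```

With the example Arguments from this page, Fields named id, name, and purchaseAmount would pass this check.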