Snowflake requires a JDBC connection configuration with the driver name and location for all connections. Before attempting to connect to Snowflake, do the following:
- Install Connect server on an Amazon Elastic Compute Cloud (EC2) instance or your local machine.
- Specify JDBC and cluster parallelization parameters in the Connect JDBC configuration file.
The parameters outlined in the Connect JDBC configuration file, as defined by the DMX_JDBC_INI_FILE environment variable, provide Connect with the mandatory and optional values required to access an Amazon S3 bucket and to invoke a Snowflake COPY/MERGE query.
- If Connect runs inside EC2, attach an IAM role
to the EC2 instance with the following conditions:
- The attached IAM role must grant Connect read and write access to objects in the work bucket specified in the configuration file.
- Configure the IAM role for Snowflake.
- If the IAM role configured for Snowflake is not the same role attached to EC2, set the IAMROLE
parameter in the configuration file to the IAM role configured for snowflake. Note: When Connect cannot get temporary security credentials from an IAM role, Connect issues an error message and the Connect task stops.
- When Connect is runs outside of an EC2 instance, Connect accesses snowflake using keys-based authentication. If no access keys are provided, Connect issues a UNIAMCRE error message aborts the job.
The parameters outlined in a Connect Snowflake configuration file include the following:
- DriverName - Required JDBC driver name.
- DriverClassPath - Required JDBC class path.
- MAXPARALLELSTREAMS - Optional integer representing Maximum number of parallel streams that can be established for loading data to the staging data. By default, MAXPARALLELSTREAMS is set to the number of CPUs available in the client machine.
- WORKTABLEDIRECTORY - Required path to an s3 bucket or local directory. If the path is an s3 url, s3://<bucket>, Connect creates an external staging data. If the path is a local directory, file://<user/data>, Connect creates an internal staging data using the specified local directory.
- WORKTABLESCHEMA - Optional schema name to create the staging data . The default schema for the staging data is the same as the target data.
- WORKTABLENCRYPTION - Server side encryption algorithm for encrypting staging data in the S3 bucket. Valid values are AES256 and aws:kms.
- AWSACCESSKEYID - A 20-character, alphanumeric string that Amazon provides upon establishing an AWS account. If Connect runs in EC2, AWSACCESSKEYID is optional.
- AWSACCESSKEY - The 40-character string, which is also referred to as the secret access key, which Amazon provides upon establishing an AWS account. If Connect runs in EC2, AWSACCESSKEY is optional.
Connect requires the access key id and the secret access key to send requests to an Amazon S3 bucket.
- IAMROLE - Optional Amazon Resource Name (ARN) for an IAM role that Snowflake uses for authentication and authorization if the same role is not attached to EC2. If EC2 and Snowflake share the same role, this parameter is not required.
- LoadViaPut - Optional character. If WORKTABLEDIRECTORY is not set, Connect uses a PUT command to load data when LoadViaPut is set to "y". If the work table directory is not provided and the LoadViaPut parameter isn't set to "y", the Connect job aborts with an error message.
Defining Snowflake database connections
In the Database Connection dialog, define a connection to a Snowflake database as follows:
- At DBMS, select Snowflake.
- At Access Method, select JDBC.
- At Database, select a previously defined Snowflake JDBC database connection URL.
- At Authentication, select Auto-detect.