Before attempting to connect to Amazon Redshift, do the following:
- Configure the Connect server, which can be either an Amazon Elastic Compute Cloud (EC2) instance or your local machine, to accept SSH connections.
- Depending on the Connect server, consider the following:
  - EC2 instance: Set the size of the maximum transmission unit (MTU).
  - Local machine: Due to limited throughput on the wide area network (WAN), you may notice a performance lag at design time and at runtime.
    If the local machine is behind a firewall, you may need to configure a virtual private network (VPN) so that Amazon Redshift can connect to the local machine.
- Configure the Connect server to include the Amazon Redshift cluster public key and cluster node IP addresses:
  - Retrieve the Amazon Redshift cluster public key and cluster node IP addresses.
  - Add the Amazon Redshift cluster public key to the Connect host's authorized keys file.
  - Configure the Connect host to accept all of the Amazon Redshift cluster node IP addresses.
- Get the public key for the Connect host.
- Specify Amazon Redshift parameters in the Connect Redshift configuration file.
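The step of adding the cluster public key to the authorized keys file can be sketched as follows. This is a minimal illustration, not part of Connect: the function name add_authorized_key and the file location are hypothetical, and the sketch assumes the cluster public key has already been retrieved as a single line of text. It appends the key only if it is not already present, so it is safe to run more than once.

```python
import os
from pathlib import Path


def add_authorized_key(public_key: str, authorized_keys: Path) -> bool:
    """Append public_key to the authorized keys file unless it is already there.

    Returns True if the file was modified, False if the key was already present.
    Hypothetical helper for illustration only.
    """
    key = public_key.strip()
    if not key:
        raise ValueError("public key is empty")

    # Create the containing directory (typically ~/.ssh) if it is missing.
    authorized_keys.parent.mkdir(mode=0o700, parents=True, exist_ok=True)

    text = authorized_keys.read_text() if authorized_keys.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False

    with authorized_keys.open("a") as fh:
        # Keep one key per line even if the file lacks a trailing newline.
        if text and not text.endswith("\n"):
            fh.write("\n")
        fh.write(key + "\n")

    # sshd usually ignores key files that other users can write to.
    os.chmod(authorized_keys, 0o600)
    return True
```

On a typical Linux host the second argument would be something like Path.home() / ".ssh" / "authorized_keys" for the account that Amazon Redshift uses to log in; that path is an assumption about the host, not a Connect requirement.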
The parameters in the Connect Redshift configuration file, which is identified by the
DMX_REDSHIFT_INI_FILE environment variable, provide Connect with the values required to
access an Amazon S3 bucket and to invoke the Amazon Redshift COPY command.
Note: If DMX_REDSHIFT_INI_FILE is not set, Connect issues an error message upon task
initiation and the Connect task aborts.
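Because a missing DMX_REDSHIFT_INI_FILE setting stops the task at initiation, it can help to check the variable before launching a job. The sketch below is a hypothetical pre-flight check, not a Connect utility: the function name check_redshift_ini is invented for illustration, and the additional test that the value points to an existing file is an assumption rather than documented Connect behavior.

```python
import os
from pathlib import Path


def check_redshift_ini(env=os.environ) -> Path:
    """Return the path held in DMX_REDSHIFT_INI_FILE, or raise if it is unusable.

    Hypothetical helper; env is any mapping of environment variables.
    """
    value = env.get("DMX_REDSHIFT_INI_FILE", "").strip()
    if not value:
        raise RuntimeError("DMX_REDSHIFT_INI_FILE is not set")

    path = Path(value)
    # Assumption: the variable is expected to point to an existing file.
    if not path.is_file():
        raise FileNotFoundError(f"DMX_REDSHIFT_INI_FILE points to a missing file: {path}")
    return path
```

Calling check_redshift_ini() with no arguments reads the current process environment; passing a dictionary instead makes the check easy to exercise in a script or test.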