Configuring CloudFSUtil - Connect_ETL - 9.13

Connect ETL Installation Guide

Product type: Software
Portfolio: Integrate
Product family: Connect
Product: Connect > Connect (ETL, Sort, AppMod, Big Data)
Version: 9.13
Locale: en-US
Product name: Connect ETL
Copyright: 2025
First publish date: 2003
Last edition: 2025-01-24
Last publication: 2025-01-24T21:47:52.840000

Before you use CloudFSUtil to transfer files, create a configuration file on your host storage system and set the properties specific to your environment, including remote connection and authentication information.

Then specify the location of the configuration file in the following environment variable: DMX_REMOTEFILE_INI_FILE=config_file_path
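As a minimal sketch of setting this variable from a launcher script, the following Python builds a child-process environment in which DMX_REMOTEFILE_INI_FILE points at the configuration file. The helper name build_cloudfsutil_env and the file name cloudfs.ini are illustrative assumptions, not part of Connect ETL, and the CloudFSUtil invocation itself is deliberately not shown.

```python
import os


def build_cloudfsutil_env(config_file_path, base_env=None):
    """Return a copy of the environment with DMX_REMOTEFILE_INI_FILE set.

    build_cloudfsutil_env is a hypothetical helper, not a Connect ETL API.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Store an absolute path so the setting does not depend on the
    # working directory of the process that later reads it.
    env["DMX_REMOTEFILE_INI_FILE"] = os.path.abspath(
        os.path.expanduser(config_file_path)
    )
    return env


if __name__ == "__main__":
    # "cloudfs.ini" is an example file name only.
    env = build_cloudfsutil_env("cloudfs.ini")
    print(env["DMX_REMOTEFILE_INI_FILE"])
```

The returned dictionary can be passed as the env argument of subprocess.run when starting the utility, which keeps the setting scoped to that one process.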

Note: This file is optional when the utility runs inside a cloud VM.

The configuration file contains one section for each remote file system: Amazon S3, Databricks, and Azure Data Lake Storage (ADLS). The first line of each section is the remote system acronym, enclosed in brackets, followed by the key=value pairs specific to that remote system, one pair per line. For example:

[remote_system_acronym]
key1=value1
key2=value2
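The layout above can also be generated programmatically. The following sketch uses Python's standard configparser module to render sections in the bracketed key=value form. The function name render_config is hypothetical, and whether CloudFSUtil accepts everything configparser can emit is an assumption; the sketch only reproduces the layout shown in this topic.

```python
import configparser
import io


def render_config(sections):
    """Render {section: {key: value}} in the bracketed key=value layout."""
    # interpolation=None keeps "%" characters in tokens and secrets literal.
    parser = configparser.ConfigParser(interpolation=None)
    # Preserve key case; configparser lowercases keys by default.
    parser.optionxform = str
    for name, pairs in sections.items():
        parser[name] = pairs
    buffer = io.StringIO()
    # Emit "key=value" with no spaces around the delimiter.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values taken from the key lists in this topic.
    print(render_config({"dbfs": {"DBFSHOST": "host_url", "DBFSTOKEN": "token"}}))
```

Writing the result to the path named in DMX_REMOTEFILE_INI_FILE gives one file with a section per remote system.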

Amazon S3 file system

Specify one or more sets of key=value pairs in the configuration file to authenticate with Amazon S3. For access key authentication, specify the following:

[s3]

  • AWSACCESSKEYID=AWS_access_key_ID
  • AWSACCESSKEY=AWS_secret_access_key
  • AWSACCESSKEY_REPO=alias_to_access_key_stored_in_Connect_repository

Example:

AWSACCESSKEYID=BSTSBEGQ111JLOKC2C
AWSACCESSKEY=XXUhlABCnU5JmNo05GszQZpjxxxxxxxxxxxxxxxxxx

Note the following:

  • When multiple authentication types are specified, the AWS session token takes precedence. Access key authentication takes precedence over IAM role authentication.
  • If no authentication is specified, the process assumes that the utility is running on an EC2 instance and retrieves access key information from the metadata service.
  • If you do not want to provide sensitive information (AWSTOKEN or AWSACCESSKEY) in clear text, add the information to the Connect repository and specify the repository alias using the corresponding repository variable (AWSTOKEN_REPO or AWSACCESSKEY_REPO). For how to add sensitive strings to the Connect repository, see the Connect help topic “The DMExpress Repository Manager”.
  • AWS storage credentials are optional when CloudFSUtil runs on an EC2 instance.
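To illustrate the note about clear-text secrets, the following sketch scans the [s3] section of a configuration file and reports sensitive keys that could be replaced by their repository-alias counterparts. The function name find_clear_text_secrets is hypothetical, and parsing the file with Python's configparser is an assumption about the file format rather than a description of how CloudFSUtil reads it.

```python
import configparser

# Sensitive keys and their repository-alias counterparts, as named in the notes.
REPO_ALTERNATIVES = {
    "AWSTOKEN": "AWSTOKEN_REPO",
    "AWSACCESSKEY": "AWSACCESSKEY_REPO",
}


def find_clear_text_secrets(config_text):
    """Return (key, suggested_repo_key) pairs set in clear text under [s3]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(config_text)
    if not parser.has_section("s3"):
        return []
    return [
        (key, repo_key)
        for key, repo_key in REPO_ALTERNATIVES.items()
        if parser.get("s3", key, fallback="").strip()
    ]


if __name__ == "__main__":
    sample = "[s3]\nAWSACCESSKEYID=example_id\nAWSACCESSKEY=example_secret\n"
    for key, repo_key in find_clear_text_secrets(sample):
        print(f"{key} is in clear text; consider {repo_key} instead")
```

A check like this fits in a pre-deployment script, so that configuration files containing clear-text secrets are flagged before they reach a shared host.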

As an alternative to access key authentication, specify the following to authenticate with a temporary AWS session token:

  • AWSTOKEN=AWS_session_token
  • AWSACCESSKEYID=AWS_access_key_ID
  • AWSACCESSKEY=AWS_secret_access_key
  • AWSTOKEN_REPO=alias_to_temporary_token_stored_in_Connect_repository

For authentication with IAM roles, specify the following:

  • AWSSAMLIDPPLUGIN=AWS_SAML_Identity_Provider_Plugin
  • AWSIAMROLE=AWS_IAM_Role (optional)
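The precedence rules noted earlier (session token first, then access key, then IAM role, with the EC2 metadata service as the fallback) can be sketched as a small decision function. This is an illustration, not the utility's actual implementation: the function name select_s3_auth, the returned labels, and the choice of which key marks each authentication type as "specified" are all assumptions.

```python
def select_s3_auth(s3_settings):
    """Pick an S3 authentication type from a dict of [s3] key=value pairs."""

    def has(*keys):
        # A key counts as specified only if it has a non-empty value.
        return any(s3_settings.get(key, "").strip() for key in keys)

    # 1. A session token (clear text or repository alias) takes precedence.
    if has("AWSTOKEN", "AWSTOKEN_REPO"):
        return "session_token"
    # 2. Access key authentication takes precedence over IAM roles.
    if has("AWSACCESSKEY", "AWSACCESSKEY_REPO"):
        return "access_key"
    # 3. IAM role authentication through the SAML identity provider plugin.
    if has("AWSSAMLIDPPLUGIN"):
        return "iam_role"
    # 4. Nothing specified: credentials come from the EC2 metadata service.
    return "ec2_metadata"


if __name__ == "__main__":
    settings = {
        "AWSACCESSKEYID": "example_id",
        "AWSACCESSKEY": "example_secret",
        "AWSSAMLIDPPLUGIN": "example_plugin",
    }
    print(select_s3_auth(settings))
```

With both an access key and an IAM role plugin present, as in the usage example, the sketch selects access key authentication, matching the stated precedence.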

Databricks file system

[dbfs]

  • DBFSHOST=host_url
  • DBFSTOKEN=token

Example:

DBFSHOST=https://xxx-123456798xxxxxxxxxxx.net
DBFSTOKEN=123456789xxxxxxxxxxxxxx

Azure Data Lake Storage (ADLS)

[azure]

  • AZURECLIENTID=azure_client_id
  • AZURECLIENTSECRET=azure_client_secret
  • AZURETENANTID=azure_tenant_id

Example:

AZURECLIENTID="8437g845-.........-qpo024mn87we"
AZURECLIENTSECRET="secretAlias"
AZURETENANTID="p0ij458s-.........-w344ml09hnb5"