/DATADICTIONARY - Connect_ETL - 9.13

Connect ETL Data Transformation Language (DTL) Guide

Product type: Software
Portfolio: Integrate
Product family: Connect
Product: Connect > Connect (ETL, Sort, AppMod, Big Data)
Version: 9.13
Language: English
Product name: Connect ETL
Title: Connect ETL Data Transformation Language (DTL) Guide
Copyright: 2023
First publish date: 2003
Last updated: 2023-09-11
Published on: 2023-09-11T19:01:45.019000

Purpose

To specify the location of external metadata.

Format

/DATADICTIONARY 
file_name
{ACUCOBOL|MFCOBOL|DMEXPRESS|VSCOBOL|XML| 
APACHEAVRO|APACHEPARQUET} 
[ENCODING encoding|character_set] 
[NATIONAL-LE] 
[SERVERCONNECTION connection] 
[INCLUDE include_list [include_list…]]

where

include_list = {LAYOUTS {(layout_list)|ALL}}
               {VALUES {(named_value_list)|ALL}}
               {CONDITIONS {(condition_list)|ALL}}
               {COLLATINGSEQUENCES {(collatingsequence_list)|ALL}}
               {DATABASECONNECTIONS {(dbconnection_list)|ALL}}
               {HCATCONNECTIONS {(hcatconnection_list)|ALL}}
               {SERVERCONNECTIONS {(serverconnection_list)|ALL}}
               {SFDCCONNECTIONS {(sfdcconnection_list)|ALL}}
               {MQCONNECTIONS {(mqconnection_list)|ALL}}
layout_list = layout [, layout…]
named_value_list = named_value [, named_value…]
condition_list = condition [, condition…]
collatingsequence_list = collatingsequence [, collatingsequence…]
dbconnection_list = dbconnection [, dbconnection…]
hcatconnection_list = hcatconnection [, hcatconnection…]
serverconnection_list = serverconnection [, serverconnection…]
sfdcconnection_list = sfdcconnection [, sfdcconnection…]
mqconnection_list = mqconnection [, mqconnection…]

Arguments

file_name

The pathname of a file that contains one of the following:

  • A Connect ETL task.
  • At least one complete COBOL data description with a level 01 entry. A complete COBOL program is not accepted.
  • An XML Document Type Definition (DTD) or W3C XML Schema. An XML file with a schema and content is accepted, but the content is ignored.
  • An Apache Avro or Apache Parquet schema.

If the file is on a mainframe, the pathname must begin with a virtual base directory: /HFS for files located on the HFS (UNIX) partition, or /MVS for files located on the z/OS partition. The user’s catalog name must follow the virtual base directory. Any references to datasets in libraries must use a slash (/) to separate the library name from the dataset name in the path.

If the file is on a mainframe z/OS partition and is accessed via a Connect:Direct server connection, the path can use a slash (/) as the base directory in place of the /MVS virtual directory.

Note: If the file is on a mainframe and is accessed via an FTP server connection, the virtual directories /DMX_FTP_HFS and /DMX_FTP_MVS are supported for backward compatibility with older tasks.

For details on specifying a file path, see File Name and Syntax Requirements.

encoding

The encoding argument specifies the encoding of COBOL copybook metadata files as one of the following built-in character sets: ASCII, EBCDIC, Locale, UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, or UTF-32LE.

Note: When Locale is specified, the encoding is the one defined by the system locale of the machine where dmexpress runs.

This encoding value can be specified as a built-in character set keyword without quotes or as a string as outlined in Constants.

For additional information, see the Connect help topic, "External Metadata dialog."

character_set

The character_set argument specifies the encoding of COBOL copybook metadata files as one of the ICU library character sets.

This character set value must be specified as a string as outlined in File Name and Syntax Requirements.

For additional information, see the Connect help topic, "External Metadata dialog."

connection

A string or identifier that specifies the connection, or its alias, used to connect to the system where the file is located. The referenced connection should be specified before the /DATADICTIONARY option.

layout

Name of a layout defined in the file.

named_value

Value of the named field defined using /VALUE.

condition

Name of the condition defined using /CONDITION.

collatingsequence

Name of a standard collating sequence or a customized collating sequence defined using /COLLATINGSEQUENCE.

dbconnection

Name of the database connection defined using /DBCONNECTION.

hcatconnection

Name of the HCatalog connection defined using /HCATCONNECTION.

serverconnection

Name of the Connect ETL remote server connection defined using /SERVERCONNECTION.

sfdcconnection

Name of the Salesforce server connection defined using /SFDCCONNECTION.

mqconnection

Name of the message queue server connection defined using /MQCONNECTION.

Location

The option may appear anywhere in the task definition.

Notes

Connect ETL tasks

Use the DMEXPRESS keyword to specify a file that contains a valid Connect ETL task generated by the Connect ETL Task Editor (*.dxt), or a text file containing Connect ETL DTL command options.

The INCLUDE argument only links to metadata explicitly listed. Any other metadata in the external file is not available. For each category of metadata listed, ALL includes all items of that category that are available when the task is run. Explicitly listing the items in a category will avoid including any items that may be added to the linked file. If a category is listed without any argument, ALL is assumed.
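As an illustrative sketch of these rules, the option below links to a hypothetical task file, shared.dxt, and combines the three ways of listing a category. The file name and the layout name customer_record are assumptions made for this example; the keywords follow the Format section above.

```
/DATADICTIONARY shared.dxt DMEXPRESS INCLUDE LAYOUTS(customer_record) CONDITIONS ALL VALUES
```

Under the rules above, only the customer_record layout is linked, every condition present in shared.dxt when the task runs is linked, and VALUES, listed without an argument, is treated as VALUES ALL. Collating sequences and connections defined in shared.dxt are not available to the task because their categories are not listed.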

When an external metadata definition is duplicated and the local task references it, a warning is issued and processing continues. By default, the local metadata definition is used if one exists; otherwise, the first external definition is used.

The RUNTIMEVARIABLES ON clause in the /DTL option applies to the file in which it exists, not to the whole task. If an external metadata file uses environment variables, the external metadata file must explicitly include the RUNTIMEVARIABLES ON clause in the /DTL option.

COBOL copybooks

If a file you want to sort is also used by a COBOL program, it is likely that the records in the file are described through a COBOL data description. You can make the contents of the COBOL data description available to dmexpress through the /DATADICTIONARY option. Fields which are part of the data description can then be used in other dmexpress options without further specification. Details of the data description portion of a COBOL program can be found in the appropriate COBOL language documentation.

When linked metadata is stored in a COBOL copybook, specify one of the following COBOL copybook specifications:

  • ACUCOBOL – COBOL copybook data is stored in ACUCOBOL default format.
  • MFCOBOL – COBOL copybook data is stored in Micro Focus default format.
  • VSCOBOL – COBOL copybook data is stored in VS COBOL II Release 4 default format, which is a standard, mainframe copybook format.

An elementary item which is subordinate to group items can be referred to by its unqualified data name (such as product_code), if it identifies the field unambiguously, or by its data name qualified by the data names of its superior group items (such as invoice.line_item.product_code).

An elementary or group item which is defined as a table through the OCCURS clause can only be referred to by its data name followed by a subscript enclosed in brackets, for example, group[7], month.day[28], group_code.emp_id.hours_spent[5,4].

To refer to a complete table as a single field, use the elementary or group item without the subscript.

Connect supports level 88 specifications. You can use a condition name from an 88 level as a condition name in a dmexpress option.

dmexpress currently supports the OCCURS a TO b TIMES DEPENDING ON c clause only when it appears at the end of a data description.

The following clauses are accepted but ignored:

  • GLOBAL
  • BLANK WHEN ZERO
  • EXTERNAL

COBOL copybook encoding

Use the encoding and character_set arguments to specify the encoding of COBOL copybook metadata files.

Use the NATIONAL-LE keyword to specify that PIC N fields are treated as UTF-16LE. If you do not specify the NATIONAL-LE keyword, PIC N fields are treated as UTF-16BE by default.
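A minimal sketch of how these arguments combine, assuming a hypothetical Micro Focus copybook named customer.cpy that is saved as UTF-8:

```
/DATADICTIONARY customer.cpy MFCOBOL ENCODING UTF-8 NATIONAL-LE
```

Here UTF-8 is written as an unquoted built-in character set keyword, and NATIONAL-LE causes PIC N fields to be treated as UTF-16LE rather than the default UTF-16BE. An ICU character set would instead be supplied as a string in place of the keyword.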

Apache File Format Schema

When writing Apache Avro or Apache Parquet files, you can map the data to a Connect-generated schema based on the input, or you can specify an external schema file.

Avro schema files must:

  • Conform to the supported version (1.7.6) of the Avro specification
  • Be a text file with a size limit of 1 MB
  • Contain only one schema and no data, with the schema consisting only of a high level complex type "record", which in turn contains only supported Avro primitive types and unions, where a union consists of a null type and one of the other primitive types; see Conversion between Apache Avro data types and Connect data types in the Connect help for supported Avro types.

Parquet schema files must:

  • Conform to the supported version (1.6) of the Parquet specification
  • Be a text (schema only) or binary (data and schema) file, with a size limit of 1 MB
  • Contain only one schema, consisting of one high level "message", which in turn contains only supported Parquet primitive types, with no groups and no repeated elements; see Conversion between Apache Parquet data types and Connect data types in the Connect help for supported Parquet types.
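As a sketch, an external schema file that meets the requirements above is linked with the keyword for its format. The file names customer.avsc and customer.parquet are hypothetical and used only for illustration.

```
/DATADICTIONARY customer.avsc APACHEAVRO
```

```
/DATADICTIONARY customer.parquet APACHEPARQUET
```

The first option links an Avro schema file; the second links a Parquet schema file, which may be either a text file or a binary file.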

Example

/DATADICTIONARY company.cbl mfcobol

This option specifies that the file company.cbl, which is in ANSI standard format, defines the layout of the input records for the task.

/DATADICTIONARY employees.dxt dmexpress

This option specifies that the file employees.dxt, which is a valid Connect ETL task, contains metadata that can be used in the current task.

/datadictionary employees.dxt dmexpress include layouts(employee_records, employee_salaries) conditions(full-time, recent_hire, have_401K)

This option specifies that the file employees.dxt, which is a valid Connect ETL task, contains metadata that can be used in the current task. However, the usable metadata is limited to the record layouts and conditions listed.

/DATADICTIONARY transaction.dtd xml

This option specifies that the file transaction.dtd contains an XML DTD that defines a record layout.