/DBCOLUMNSEXTRACT - Connect_ETL - 9.13

Connect ETL Data Transformation Language (DTL) Guide

Product type
Software
Portfolio
Integrate
Product family
Connect
Product
Connect > Connect (ETL, Sort, AppMod, Big Data)
Version
9.13
Language
English
Product name
Connect ETL
Title
Connect ETL Data Transformation Language (DTL) Guide
Copyright
2023
First publish date
2003
Last updated
2023-09-11
Published on
2023-09-11T19:01:45.019000

Purpose

To define the format of the data extracted from columns in a database table.

Format

/DBCOLUMNSEXTRACT TABLE dbtable extracted_column

where

extracted_column = columnname [extract_option…]
extract_option = {EXTRACTAS dataspec [ENCODING encoding] [TREATAS ASCII]} {NULLABLE|NOTNULLABLE } {ALIAS alias }
dataspec = {datatype [length]} {length datatype}
datatype = {[CHARACTER] } {numeric_format } {DATETIME (datetime_pattern)} {BIT } {LOB [filename] }

Arguments

dbtable The name or alias of a source database table defined in a /DBINPUT option.
columnname

The name of a column to be extracted from the table. The name can be in the form of a string or identifier.

For a summary of valid naming and formatting conventions for identifiers and constants, see syntax reference in the Connect help.

encoding

The scheme used for encoding extracted CHARACTER and national character (such as NCHAR or NVARCHAR2) column data. Encoding specifies the character encoding in which the text data is extracted. If necessary, the text data is converted to the specified encoding.

See Connect data types/Text data type in the Connect help for a list of valid values. If EXTRACTAS encoding is not defined, the extracted character encoding is the encoding defined by /DBINPUT. For encoding information, see /DBINPUT Character encoding.

alias

A name you assign to the extracted database column, which you use to reference the column in other options.

For a summary of valid naming and formatting conventions for identifiers and constants, see syntax reference in the Connect help.

When you define an alias for a column, reference the column through the alias only; you can no longer reference the column name directly in other options.

length

The length of the extracted column. The length is in bytes for all supported data types and does not include the length of the null indicator. The acceptable values for the length of an extracted column with one of the recognized data types can be found in the appropriate topic under data types reference/dbms data types in the Connect help. The length can be omitted when the datatype is dfloat or float, as dfloat and float have default lengths. To extract a varchar column as a variable length field, omit the data type and length.

For a lob column, the length is the number of bytes in the lob file names. Dmexpress calculates the exact length required, so omit the length unless an explicit length is required. If the specified length is shorter than the required one, dmexpress automatically increases the value to the required length.

numeric_format The format of the numeric data extracted from the column. See the appropriate topic under data types reference/dbms data types in the Connect help.
datetime_pattern

A mask that describes the layout of an extracted date/time field.

For a summary of valid naming and formatting conventions for identifiers and constants, see syntax reference in the Connect help.

filename Prefix for the name of the file where the LOB data is to be stored. It must be a constant string. If specified, LOB data values are written to files named <filename>_<rownumber>. If omitted, a default name is generated, and the files are written to the workspace directory.

Location

The option may appear anywhere in the task definition.

Notes

The columns specified in the /DBINPUT option are extracted using the default formats. To ensure that the data in a column is extracted in a specific format, use the /DBCOLUMNSEXTRACT option to explicitly specify the extract format. The extracted column must be one of the columns read from the database table. The format assigned to an extracted column must be compatible with the data type of the database column, as defined in the appropriate topic under Data types reference/DBMS data types in the Connect help. Certain conversions could result in an overflow, which would be flagged with a message when the task is run.

You cannot specify multiple extraction formats for the same column in the same task.

Treating Extracted Text as if Encoded in ASCII

When TREATAS ASCII is specified for any CHARACTER or national character (such as NCHAR, NVARCHAR2) column, the text data is extracted in the specified or default encoding and is treated in all subsequent operations as if it were encoded in ASCII. No character encoding conversion takes place as a result of the TREATAS ASCII specification.

Compatibility of Default Database Client Character Set with Platform Locale

Correct character encoding conversion requires that the character encoding of the platform locale is compatible with the default character set of the database client or driver through which Connect ETL accesses the database. When the settings are not compatible, exchange the data with the database in one of the Unicode encodings.

Nullable columns

When data is extracted from a nullable database column, the null indicator is retrieved by default. If you specify the NOTNULLABLE keyword, the null indicator is not retrieved, and the extracted data is filled with spaces when the value in the database column is NULL.

When nullable data is output to a database table column that is also nullable, the null indicator is preserved. When nullable data is output to a file, buffer or procedure, the output value is empty. This is defined as a field of length 0 for a delimited field or is filled with spaces for a fixed length field. For a field of binary data type, each byte of the field is set to hex value "20" or an ASCII space.

When a NULL value is output to a database table column that is not nullable, the record is rejected from the output. You can define an explicit representation for NULL values by using a conditional /VALUE option and by referencing the named value definition instead of referencing the nullable column directly.

When a nullable column is referenced in a /REFORMAT option, the reference is made only to the data value but not to the null indicator.

A NULL value is sorted before all other values when sorted in ascending order and is sorted after all other values when sorted in descending order.

Other operations involving a NULL value, unless described otherwise, produce a NULL value. You can define the result using a conditional field. A NULL value is treated as an empty string when it is involved in string concatenation.

Referencing Database Table Columns

You can refer to a database column by the fully qualified name dbtablename.dbcolumnname or as dbcolumnextractalias. dbtablename should match the alias of a database table or should match the name of the table defined using /DBINPUT when an alias is not provided. dbcolumnname should match the name of the column. dbcolumnextractalias should match the alias assigned to the column name while specifying the extraction format. A partially qualified reference can be used as long as the reference can be uniquely identified.

LOB

LOB column data is extracted to a unique temporary folder in the workspace directory. The default filename is <alias>_<rownumber> when alias is defined and <columnname>_<rownumber> when alias is not defined. When NULLABLE is specified and the LOB value is null, no file is created. When NOTNULLABLE is specified, an error is issued if the column is null.

If LOB data is used for tasks distributed over a grid, make sure the workspace folder is accessible to all tasks.

Examples

/DBCOLUMNSEXTRACT table client_facts amount extractas int 8 alias int_amount
The option specifies that the database column amount in the client_facts table to be extracted as an integer of length 8 bytes. The column is also given an alias int_amount that can be used to reference the extracted column in other options.
/DBCOLUMNSEXTRACT table client_facts client_name extractas char 10 nullable
The option specifies that the database column client_name in table client_facts is to be extracted as a field of datatype character with a length of 10 bytes with the null indicator.
/DBCOLUMNSEXTRACT table client name extractas CHAR 10 encoding UTF-8 nullable
Extracted character data from the name column of the client table will be encoded in UTF-8.
/DBCOLUMNSEXTRACT table client name extractas CHAR 10 encoding UTF-8 treatas ASCII
Extracted character data from the name column of the client table is encoded in UTF-8 and is handled as ASCII in all subsequent operations.
/DBCOLUMNSEXTRACT table employee photo extractas LOB nullable

The option specifies that the nullable database column photo in table employee be extracted as a LOB.