Purpose
To define the format of the data extracted from columns in a database table.
Format
/DBCOLUMNSEXTRACT TABLE dbtable extracted_columnwhere
extracted_column | = | columnname [extract_option…] |
extract_option | = | {EXTRACTAS dataspec [ENCODING encoding] [TREATAS ASCII]} {NULLABLE|NOTNULLABLE } {ALIAS alias } |
dataspec | = | {datatype [length]} {length datatype} |
datatype | = | {[CHARACTER] } {numeric_format } {DATETIME (datetime_pattern)} {BIT } {LOB [filename] } |
Arguments
dbtable | The name or alias of a source database table defined in a /DBINPUT option. |
columnname |
The name of a column to be extracted from the table. The name can be in the form of a string or identifier. For a summary of valid naming and formatting conventions for identifiers and constants, see syntax reference in the Connect help. |
encoding |
The scheme used for encoding extracted CHARACTER and national character (such as NCHAR or NVARCHAR2) column data. Encoding specifies the character encoding in which the text data is extracted. If necessary, the text data is converted to the specified encoding. See Connect data types/Text data type in the Connect help for a list of valid values. If EXTRACTAS encoding is not defined, the extracted character encoding is the encoding defined by /DBINPUT. For encoding information, see /DBINPUT Character encoding. |
alias |
A name you assign to the extracted database column, which you use to reference the column in other options. For a summary of valid naming and formatting conventions for identifiers and constants, see syntax reference in the Connect help. When you define an alias for a column, reference the column through the alias only; you can no longer reference the column name directly in other options. |
length |
The length of the extracted column. The length is in bytes for all supported data types and does not include the length of the null indicator. The acceptable values for the length of an extracted column with one of the recognized data types can be found in the appropriate topic under data types reference/dbms data types in the Connect help. The length can be omitted when the datatype is dfloat or float, as dfloat and float have default lengths. To extract a varchar column as a variable length field, omit the data type and length. For a lob column, the length is the number of bytes in the lob file names. Dmexpress calculates the exact length required, so omit the length unless an explicit length is required. If the specified length is shorter than the required one, dmexpress automatically increases the value to the required length. |
numeric_format | The format of the numeric data extracted from the column. See the appropriate topic under data types reference/dbms data types in the Connect help. |
datetime_pattern |
A mask that describes the layout of an extracted date/time field. For a summary of valid naming and formatting conventions for identifiers and constants, see syntax reference in the Connect help. |
filename | Prefix for the name of the file where the LOB data is to be stored. It must be a constant string. If specified, LOB data values are written to files named <filename>_<rownumber>. If omitted, a default name is generated, and the files are written to the workspace directory. |
Location
The option may appear anywhere in the task definition.
Notes
The columns specified in the /DBINPUT option are extracted using the default formats. To ensure that the data in a column is extracted in a specific format, use the /DBCOLUMNSEXTRACT option to explicitly specify the extract format. The extracted column must be one of the columns read from the database table. The format assigned to an extracted column must be compatible with the data type of the database column, as defined in the appropriate topic under Data types reference/DBMS data types in the Connect help. Certain conversions could result in an overflow, which would be flagged with a message when the task is run.
You cannot specify multiple extraction formats for the same column in the same task.
Treating Extracted Text as if Encoded in ASCII
When TREATAS ASCII is specified for any CHARACTER or national character (such as NCHAR, NVARCHAR2) column, the text data is extracted in the specified or default encoding and is treated in all subsequent operations as if it were encoded in ASCII. No character encoding conversion takes place as a result of the TREATAS ASCII specification.
Compatibility of Default Database Client Character Set with Platform Locale
Correct character encoding conversion requires that the character encoding of the platform locale is compatible with the default character set of the database client or driver through which Connect ETL accesses the database. When the settings are not compatible, exchange the data with the database in one of the Unicode encodings.
Nullable columns
When data is extracted from a nullable database column, the null indicator is retrieved by default. If you specify the NOTNULLABLE keyword, the null indicator is not retrieved, and the extracted data is filled with spaces when the value in the database column is NULL.
When nullable data is output to a database table column that is also nullable, the null indicator is preserved. When nullable data is output to a file, buffer or procedure, the output value is empty. This is defined as a field of length 0 for a delimited field or is filled with spaces for a fixed length field. For a field of binary data type, each byte of the field is set to hex value "20" or an ASCII space.
When a NULL value is output to a database table column that is not nullable, the record is rejected from the output. You can define an explicit representation for NULL values by using a conditional /VALUE option and by referencing the named value definition instead of referencing the nullable column directly.
When a nullable column is referenced in a /REFORMAT option, the reference is made only to the data value but not to the null indicator.
A NULL value is sorted before all other values when sorted in ascending order and is sorted after all other values when sorted in descending order.
Other operations involving a NULL value, unless described otherwise, produce a NULL value. You can define the result using a conditional field. A NULL value is treated as an empty string when it is involved in string concatenation.
Referencing Database Table Columns
You can refer to a database column by the fully qualified name dbtablename.dbcolumnname or as dbcolumnextractalias. dbtablename should match the alias of a database table or should match the name of the table defined using /DBINPUT when an alias is not provided. dbcolumnname should match the name of the column. dbcolumnextractalias should match the alias assigned to the column name while specifying the extraction format. A partially qualified reference can be used as long as the reference can be uniquely identified.
LOB
LOB column data is extracted to a unique temporary folder in the workspace directory. The default filename is <alias>_<rownumber> when alias is defined and <columnname>_<rownumber> when alias is not defined. When NULLABLE is specified and the LOB value is null, no file is created. When NOTNULLABLE is specified, an error is issued if the column is null.
If LOB data is used for tasks distributed over a grid, make sure the workspace folder is accessible to all tasks.
Examples
/DBCOLUMNSEXTRACT table client_facts amount extractas int 8 alias int_amount
/DBCOLUMNSEXTRACT table client_facts client_name extractas char 10 nullable
/DBCOLUMNSEXTRACT table client name extractas CHAR 10 encoding UTF-8 nullable
/DBCOLUMNSEXTRACT table client name extractas CHAR 10 encoding UTF-8 treatas ASCII
/DBCOLUMNSEXTRACT table employee photo extractas LOB nullable
The option specifies that the nullable database column photo in table employee be extracted as a LOB.