/INPROCEDURE - Connect_ETL - 9.13

Connect ETL Data Transformation Language (DTL) Guide

Product type: Software
Portfolio: Integrate
Product family: Connect
Product: Connect > Connect (ETL, Sort, AppMod, Big Data)
Version: 9.13
Language: English
Product name: Connect ETL
Title: Connect ETL Data Transformation Language (DTL) Guide
Copyright: 2023
First publish date: 2003
Last updated: 2023-09-11
Published on: 2023-09-11T19:01:45.019000

Purpose

To specify a procedure within a program as the source of records.

Format

/INPROCEDURE [field_delimiters] [ENCODING encoding_name] [LAYOUT layout [ALIAS layout_alias]]

where

field_delimiters = FIELDSEPARATOR field_separator [ENCLOSEDBY leading_character [,trailing_character]]
field_separator = {separator | UNIXSORTDEFAULT | NONE}

Arguments

separator

A single byte or multibyte character string that separates adjacent fields in delimited text records.

You can specify the character string in single-quoted or double-quoted format, as outlined in Appendix B, Constants.

At runtime, field separators are treated as follows:

  • Character text constant field separators are treated as locale encoded.
  • Hex text constant field separators are treated as binary representations of the source or target encoding. Hex text constant field separators within the /inprocedure option are encoded in the same encoding as specified in the individual source records in memory.

leading_character

The character in the field-delimited text records that optionally precedes the data in each delimited field.

trailing_character

The character in the field-delimited text records that optionally follows the data in each delimited field.

encoding_name

The encoding of the character data in the individual records in memory. See dmexpress data types/text data type in the Connect help for a list of the valid values.

layout

The name of a record layout that defines the fields of the records in the individual records in memory. A record layout can be defined in a /delimitedrecordlayout or /recordlayout option or defined in an external metadata file using /datadictionary.

layout_alias

A name you assign to the layout and use, in place of the original layout name, to refer to the fields of records originating from this source in other dmexpress options. Once a layout alias is defined, you can no longer reference the fields through the original layout name in your task.

The name must adhere to the rules described for an identifier. For a summary of valid naming and formatting conventions for identifiers and constants, see Syntax reference in the Connect help.

Location

This option may appear anywhere in the task definition.

Defaults

Field Delimiters

When you provide neither a single byte or multibyte field separator nor an enclosing character, Connect ETL treats a sequence of one or more spaces and tabs (the UNIX default) as the delimiter in field-delimited text records. The separator is treated as part of the following field.

When you do not provide a field separator, but you do supply an enclosing character, Connect ETL assumes that the separator in field-delimited text records is the comma.
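The two default rules above can be summarized as a small decision function. This is only an illustrative sketch in Python, not part of Connect ETL: the function name effective_separator and the UNIX_DEFAULT label are hypothetical, and None stands for "not specified".

```python
from typing import Optional

# Label used only in this sketch for "a run of one or more spaces and tabs".
UNIX_DEFAULT = "UNIXSORTDEFAULT"


def effective_separator(separator: Optional[str], enclosed_by: Optional[str]) -> str:
    """Return the separator implied by the documented defaults (hypothetical helper)."""
    if separator is not None:
        return separator      # an explicit FIELDSEPARATOR is used as given
    if enclosed_by is not None:
        return ","            # enclosing character given, separator omitted
    return UNIX_DEFAULT       # nothing given: spaces and tabs delimit fields
```

For instance, effective_separator(None, '"') yields the comma, while effective_separator(None, None) yields the UNIX default.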

Encoding

The default encoding is ASCII.

Notes

You can initiate dmexpress through a program that defines an input procedure as a source of records to a sort task or copy task. An input procedure cannot be defined as a source of records to a Connect ETL merge task, join task or aggregate task.

The called program invokes the input procedure iteratively until all source records are accessed from memory by the Connect ETL put routine, dmx_put_record, and supplied to the sort or copy task.

Valid target types to which the sort or copy task outputs data consist of the following:

Task Type    Target Type
Sort         File, pipe, procedure
Copy         All target types except procedures

Field Delimiters

A field-delimited text record consists of consecutive fields, with adjacent fields separated by a single byte or multibyte field separator. When the field separator is the UNIX default, each blank in a sequence of blanks is part of the following field.

When the field separator is not the default, you must specify it through the FIELDSEPARATOR separator argument. The specified separator is assumed to be in the locale encoding and is converted to the encoding of the source file.

Each separator is considered to separate two adjacent fields; therefore, consecutive separators indicate an empty field (one whose length is zero). A non-default separator is not considered part of the following field.

When one or more blank characters (spaces and tabs) separate fields in a text record, and the blanks are part of the data, specify FIELDSEPARATOR UNIXSORTDEFAULT. For this type of separator (the UNIX default), each blank character in a sequence of blanks is taken as part of the following field. It is not possible to have an empty field with this type of separator.

When a file does not contain field-delimited text records, indicate this by specifying FIELDSEPARATOR NONE. This is necessary when there are two or more source files in the task and there is a mixture of delimited records and fixed-position records. If you do not indicate that a fixed-position file in such an application is not delimited, Connect ETL applies the default and assumes the file has delimited records with the default separator. With FIELDSEPARATOR NONE specified for a fixed-position file, when you reference a delimited field in a record from the file, the field either contains the complete record (the first field) or is empty (the second and subsequent fields).
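The three separator modes described above can be contrasted with a short Python sketch. It models only the splitting rules stated in this section and is not Connect ETL code. The function names are hypothetical, and the handling of blanks at the very end of a record under the UNIX default is not covered by the text, so the sketch simply drops them.

```python
import re
from typing import List


def split_explicit(record: str, separator: str) -> List[str]:
    """Explicit separator: each occurrence separates two adjacent fields,
    so consecutive separators produce an empty field."""
    return record.split(separator)


def split_unix_default(record: str) -> List[str]:
    """UNIX default: each field is a (possibly empty) run of blanks followed
    by non-blank characters, so blanks belong to the following field.
    Assumption: blanks at the end of the record are dropped in this sketch."""
    return re.findall(r"[ \t]*[^ \t]+", record)


def split_none(record: str, field_count: int) -> List[str]:
    """FIELDSEPARATOR NONE: the first delimited field holds the whole record
    and every subsequent field is empty. Assumes field_count >= 1."""
    return [record] + [""] * (field_count - 1)


explicit_fields = split_explicit("a,,c", ",")       # ['a', '', 'c']
unix_fields = split_unix_default("a  b\tc")         # ['a', '  b', '\tc']
fixed_fields = split_none("ABC123", 3)              # ['ABC123', '', '']
```

Note that the UNIX default mode can never return an empty field, because every match must contain at least one non-blank character.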

The combination of the UNIX default separator and enclosing characters is not supported.

Each field in a field-delimited text record may be enclosed by two enclosing characters. When fields are enclosed, specify the ENCLOSEDBY clause. Any leading and trailing enclosing characters are not part of the field. Two consecutive trailing enclosing characters within the field are treated as a single enclosing character. The leading enclosing character cannot be a blank (tab or space). The leading enclosing character and the trailing enclosing character cannot be a prefix of the field separator. When the leading and the trailing characters are the same, you do not need to specify the trailing character.

Connect ETL ignores any characters between an enclosing character terminating a field and the following field separator or the end of the record. When the first nonblank character following a field separator (or the first nonblank character in the record for the first field) is not a leading enclosing character, Connect ETL assumes that the field is not enclosed.
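The enclosing-character rules above can be sketched as a small Python parser. It is a simplified model rather than the product's implementation. It assumes a single-character separator and single-character enclosing characters, keeps blanks around unenclosed data as part of the field, and uses the hypothetical name split_enclosed.

```python
from typing import List


def split_enclosed(record: str, sep: str = ",", lead: str = '"', trail: str = "") -> List[str]:
    """Split a record whose fields may be enclosed (hypothetical helper)."""
    trail = trail or lead  # the trailing character defaults to the leading one
    fields: List[str] = []
    pos = 0
    while True:
        # Locate the first nonblank character of this field.
        probe = pos
        while (probe < len(record) and record[probe] in " \t"
               and not record.startswith(sep, probe)):
            probe += 1
        if probe < len(record) and record[probe] == lead:
            # Enclosed field: collect characters up to the terminating character.
            chars = []
            i = probe + 1
            while i < len(record):
                if record[i] == trail:
                    if record[i + 1:i + 2] == trail:
                        chars.append(trail)  # doubled trailing character: one literal
                        i += 2
                        continue
                    i += 1                   # terminating enclosing character
                    break
                chars.append(record[i])
                i += 1
            fields.append("".join(chars))
            # Anything between the terminator and the next separator is ignored.
            nxt = record.find(sep, i)
        else:
            # Not enclosed: the field runs up to the next separator.
            nxt = record.find(sep, pos)
            fields.append(record[pos:] if nxt == -1 else record[pos:nxt])
        if nxt == -1:
            return fields
        pos = nxt + len(sep)


parsed = split_enclosed('"a,b", c ,"say ""hi""" junk,d')
# parsed == ['a,b', ' c ', 'say "hi"', 'd']
```

In the sample record, the separator inside the first enclosed field is kept as data, the doubled quotation marks collapse to one, and the text junk after the closing quotation mark is discarded.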

For sources and targets with multi-byte locale encodings, the byte sequence of the converted field separator must not occur as part of another multi-byte character. For example, for Japanese locales, the set of valid separator characters is:

  • Japanese Shift-JIS: 0x00 – 0x3F
  • Japanese EUC: 0x00 – 0x7F
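The reason for these ranges can be seen by looking at the bytes of an actual multi-byte character. The following Python sketch uses the standard shift_jis and euc_jp codecs. The helper separator_is_safe is hypothetical and only checks a separator's bytes against a range such as the ones listed above.

```python
def separator_is_safe(separator: bytes, low: int, high: int) -> bool:
    """True when every byte of the converted separator lies in [low, high]."""
    return all(low <= byte <= high for byte in separator)


# Katakana SO (U+30BD) is the two-byte sequence 0x83 0x5C in Shift-JIS, so a
# single-byte separator 0x5C would also match the second byte of that character.
so_sjis = "\u30bd".encode("shift_jis")
so_euc = "\u30bd".encode("euc_jp")

sjis_contains_5c = b"\x5c" in so_sjis                          # True
euc_all_high = all(byte >= 0x80 for byte in so_euc)            # True

backslash_ok_sjis = separator_is_safe(b"\x5c", 0x00, 0x3F)     # False
comma_ok_sjis = separator_is_safe(b",", 0x00, 0x3F)            # True
backslash_ok_euc = separator_is_safe(b"\x5c", 0x00, 0x7F)      # True
```

A byte-level search for 0x5C in Shift-JIS data would therefore split inside the character, whereas the same byte never appears inside a multi-byte EUC character.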

Encoding

Use the ENCODING option to describe the character encoding of your procedure. The encoding is used to interpret stream record terminators as well as any field delimiters or enclosing characters specified. An encoding of LOCALE uses the system’s character set for this interpretation.

If the encoding is specified as UTF-16 or UTF-32, the byte order of the encoding is treated as big-endian.
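To illustrate what big-endian byte order means for a delimiter, the following Python sketch encodes a comma with the standard utf-16-be, utf-16-le, and utf-32-be codecs. It shows only the byte sequences involved. How a calling program produces its records is outside the scope of this example.

```python
separator = ","

utf16_be = separator.encode("utf-16-be")   # b"\x00,"  (high-order byte first)
utf16_le = separator.encode("utf-16-le")   # b",\x00"  (low-order byte first)
utf32_be = separator.encode("utf-32-be")   # b"\x00\x00\x00,"

# Hexadecimal view of the big-endian forms.
utf16_be_hex = utf16_be.hex()              # "002c"
utf32_be_hex = utf32_be.hex()              # "0000002c"
```

Records supplied in little-endian order would present the comma as 2C 00 rather than 00 2C, so a big-endian interpretation would presumably not recognize it as the separator.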

Compatible Encoding Types

In some cases, you may choose to specify an encoding that is different from the actual encoding of the data. For instance, ASCII will almost always provide better performance than any other encoding. In order to do so, you need to understand whether the actual encoding is compatible with the specified encoding.

An encoding is compatible with another if the first encoding can be used in place of the other without significant data loss or corruption. However, it may still corrupt data or cause data loss if used for certain ranges of characters. For example, if data is encoded in UTF-8, you may be able to use ASCII as a compatible encoding when all of your characters are expected to be in the 7-bit ASCII range (0x00 – 0x7F). However, because character representations differ between ASCII and UTF-8 beyond this range, undesirable results may occur if you have data above this range and interchange these encodings. An incompatible encoding has no overlapping character ranges and, if used, would give undesired results. ASCII, LOCALE, and UTF-8 may all be compatible with each other; the rest of the encodings are incompatible.
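One way to check whether UTF-8 data stays within the 7-bit range before declaring it as ASCII is sketched below in Python. The helper name safe_to_treat_as_ascii is hypothetical, and the check uses only the standard bytes.isascii method (Python 3.7 or later).

```python
def safe_to_treat_as_ascii(data: bytes) -> bool:
    """True when every byte is in the 7-bit range 0x00-0x7F."""
    return data.isascii()


plain = "id,name".encode("utf-8")          # only 7-bit characters
accented = "id,caf\u00e9".encode("utf-8")  # U+00E9 becomes the bytes 0xC3 0xA9

plain_ok = safe_to_treat_as_ascii(plain)         # True
accented_ok = safe_to_treat_as_ascii(accented)   # False
```

In the second sample, a single character occupies two bytes above 0x7F, which is exactly the situation in which interchanging ASCII and UTF-8 can give undesirable results.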

Record Layout

Use the LAYOUT clause to designate the record layout that defines the fields of the records in the in-memory procedure.

When you wish to include or omit records based on the source from which they originate, define aliases for some or all of the sources. You then code the alias as part of a comparison in a /CONDITION definition.

Record Length

You can define maximum or minimum record length specifications per task using the following options:

  • /INMAXRECORDLENGTH – defines the maximum source record length specification per task.
  • /SKIPSHORT – defines a short record filter specification per task.
  • /DATASIZE – defines the quantity of data released to Connect ETL after all input records are processed.