Purpose
Defines an in-memory table that is populated by a program as a source of records.
Format
/INBUFFER | [ALIAS source_alias ] [record_attribute … ] [record_limit ] [LAYOUT layout [ALIAS layout_alias]] |
where
record_attribute | = | {record_type } {field_delimiters} {ENCODING encoding_name} |
record_type | = | {STREAM CRLF } {STLF } {STCR } {FIXED record_length [record_alignment] } {VARIABLE [LITTLEENDIAN] [record_alignment]} {FORTRANUNFORMATTED } |
record_limit | = | {FDISCARDFIRST num_records} {FDISCARDAFTER num_records} {FDISCARDBLANK } {FDISCARDSHORT min_length } |
record_alignment | = | {ALIGNED2 } {ALIGNED4 } {UNALIGNED} |
field_delimiters | = | FIELDSEPARATOR field_separator [ENCLOSEDBY leading_character [,trailing_character]] |
field_separator | = | {separator } {UNIXSORTDEFAULT} {NONE } |
Arguments
source_alias | A name you assign to this source, and which you will use to refer to this source in other dmexpress options. The assigned name must adhere to the rules described for an identifier. For a summary of valid naming and formatting conventions for identifiers and constants, see Syntax reference in the Connect help. |
record_length | The length in bytes of the in-memory table record. |
num_records | The number of records from the beginning of the table to be bypassed, and not to be released to the task. |
min_length | The minimum length a record must have to avoid being discarded by the FDISCARDSHORT clause. |
separator |
A single byte or multibyte character string that separates adjacent fields in delimited text records. You can specify the character string in single-quoted or double-quoted format as outlined in Appendix B Constants. At runtime, field separators are treated as follows:
|
leading_character | The character in the field-delimited text records that optionally precedes the data in each delimited field. |
trailing_character | The character in the field-delimited text records that optionally follows the data in each delimited field. |
encoding_name | The encoding of the character data in the table of records in memory. See dmexpress data types/Text data type in the Connect help for a list of the valid values. |
layout | The name of a record layout that defines the fields of the records in the in-memory table. A record layout can be defined in a /DELIMITEDRECORDLAYOUT or /RECORDLAYOUT option or defined in an external metadata file using /DATADICTIONARY. |
layout_alias | A name you assign to the layout, and which you will use, in lieu of the original layout name, to refer to the fields of records originating from this source in other dmexpress options. When a layout alias is defined, you can no longer reference the fields using the original layout name directly in your task. The name must adhere to the rules described for an identifier. For a summary of valid naming and formatting conventions for identifiers and constants, see Syntax reference in the Connect help. |
Location
The /INBUFFER option can be listed anywhere in the task definition for the following task types: copy, sort, merge, and aggregate.
For a join task, the /INBUFFER option must be listed as follows:
- Left side – Before the /JOINKEYS option that defines the left side.
- Right side – After the /JOINKEYS option that defines the left side and before the /JOINKEYS option that defines the right side.
Defaults
Field Delimiters
When you provide neither a field separator (single-byte or multibyte) nor an enclosing character, Connect ETL treats a sequence of one or more spaces and tabs (the UNIX default) as the delimiter in field-delimited text records. The separator is treated as part of the following field.
When you do not provide a field separator, but you do supply an enclosing character, Connect ETL assumes that the separator in field-delimited text records is the comma.
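The two default rules above can be summarized as a small decision function. This is a Python sketch for illustration only; the function name and the returned labels are hypothetical and are not part of Connect ETL.

```python
from typing import Optional


def effective_separator(separator: Optional[str], enclosed_by: Optional[str]) -> str:
    """Return the separator assumed for a source under the documented defaults.

    separator   -- the FIELDSEPARATOR value, or None if not given
    enclosed_by -- the ENCLOSEDBY leading character, or None if not given
    """
    if separator is not None:
        return separator          # an explicit separator is used as given
    if enclosed_by is not None:
        return ","                # enclosing character only: comma is assumed
    return "UNIXSORTDEFAULT"      # neither given: runs of spaces and tabs
```

For example, `effective_separator(None, '"')` yields the comma, while `effective_separator(None, None)` yields the UNIX default.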
Encoding
The default encoding is ASCII.
Notes
Record Type
For information on supported record types, see Appendix D: Source and Target File Types and Records.
Record Length
You must provide a record length if fixed length records (FIXED) are stored in the buffer.
To specify a maximum record length for variable length records, use the /INMAXRECORDLENGTH option, which specifies the maximum source record length per task.
Record Alignment
If all records are aligned on 2-byte boundaries, you have to supply the ALIGNED2 keyword except for fixed format records of even length. If all records are aligned on 4-byte boundaries, you have to supply the ALIGNED4 keyword except for fixed format records whose length is a multiple of 4 bytes. The default is that records are not aligned.
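The exceptions for fixed format records can be read as a simple divisibility test. The following Python sketch encodes that reading; the function name and its parameters are hypothetical, and the alignment is expressed in bytes (1 meaning unaligned) purely for illustration.

```python
from typing import Optional


def alignment_keyword_required(record_type: str, alignment: int,
                               record_length: Optional[int] = None) -> bool:
    """Tell whether ALIGNED2 or ALIGNED4 has to be supplied.

    record_type   -- "FIXED" or any other record type name
    alignment     -- boundary the records are aligned on: 1, 2, or 4 bytes
    record_length -- required when record_type is "FIXED"
    """
    if alignment not in (2, 4):
        return False              # unaligned records are the default
    if record_type == "FIXED":
        if record_length is None:
            raise ValueError("FIXED records need a record length")
        # Even length covers the 2-byte case; a multiple of 4 covers the 4-byte case.
        return record_length % alignment != 0
    return True
```

Under this reading, 80-byte fixed records aligned on 2-byte boundaries need no keyword, while 82-byte fixed records aligned on 4-byte boundaries need ALIGNED4.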
Record Limit Option Arguments
- FDISCARDSHORT - All records shorter than the specified minimum length are discarded.
- FDISCARDFIRST - The specified number of records are discarded from the beginning of the buffer. If FDISCARDSHORT is specified, short records that are rejected as per the FDISCARDSHORT specification do not count towards that number.
- FDISCARDBLANK - All blank records are discarded. A record is considered to be blank if either the record length is 0 or all bytes in the record are space characters.
- FDISCARDAFTER - Only the specified number of records are retained for further processing.
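The following Python sketch models how the four record limit arguments might combine. Only one interaction is documented above (short records do not count towards FDISCARDFIRST); the order in which the blank and retain-count checks are applied here is an assumption, as are the function and parameter names. Space characters are taken to be the byte 0x20, which assumes an ASCII-compatible encoding.

```python
from typing import Iterable, List, Optional


def apply_record_limits(records: Iterable[bytes],
                        discard_short: Optional[int] = None,
                        discard_first: int = 0,
                        discard_blank: bool = False,
                        discard_after: Optional[int] = None) -> List[bytes]:
    """Return the records that survive the source-level limits."""
    kept: List[bytes] = []
    skipped = 0
    for rec in records:
        # FDISCARDSHORT: short records are dropped and never counted below.
        if discard_short is not None and len(rec) < discard_short:
            continue
        # FDISCARDFIRST: bypass the first N records that were not short.
        if skipped < discard_first:
            skipped += 1
            continue
        # FDISCARDBLANK: length 0, or every byte is a space.
        if discard_blank and rec.strip(b" ") == b"":
            continue
        # FDISCARDAFTER: retain only the specified number of records (assumed order).
        if discard_after is not None and len(kept) >= discard_after:
            break
        kept.append(rec)
    return kept
```

With `discard_short=2` and `discard_first=1`, the input `[b"a", b"bb", b"cc"]` keeps only `b"cc"`: the one-byte record is rejected as short, so `b"bb"` is the record counted and bypassed by FDISCARDFIRST.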
Field Delimiters
A field-delimited text record consists of consecutive fields, with adjacent fields separated by a single byte or multibyte field separator. When the field separator is the UNIX default, each blank in a sequence of blanks is part of the following field.
When the field separator is not the default, you must specify it through the FIELDSEPARATOR separator argument. The specified separator is assumed to be in the locale encoding and is converted to the encoding of the source.
Each separator is considered to separate two adjacent fields; therefore, consecutive separators indicate an empty field (one whose length is zero). A non-default separator is not considered part of the following field.
When one or more blank characters (spaces and tabs) separate fields in a text record, and the blanks are part of the data, specify FIELDSEPARATOR UNIXSORTDEFAULT. For this type of separator (the UNIX default), each blank character in a sequence of blanks is taken as part of the following field. It is not possible to have an empty field with this type of separator.
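The difference between a non-default separator and the UNIX default can be illustrated with a short Python sketch. It only models the splitting rules described in this section; the function names are hypothetical, and the handling of trailing blanks at the end of a record is not covered by the text, so the sketch leaves them out.

```python
import re
from typing import List


def split_on_separator(record: str, separator: str) -> List[str]:
    """Non-default separator: every separator divides two fields,
    so consecutive separators produce an empty field."""
    return record.split(separator)


def split_unix_default(record: str) -> List[str]:
    """UNIX default: each run of spaces and tabs belongs to the field
    that follows it, so a field can never be empty."""
    return re.findall(r"[ \t]*[^ \t]+", record)
```

For the record `a,,c` with a comma separator the result is three fields, the second of which is empty. For the record `a  b` under the UNIX default the result is two fields, `a` and `  b`, with both blanks kept at the start of the second field.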
When a source does not contain field-delimited text records, indicate this by specifying FIELDSEPARATOR NONE. This is necessary when there are two or more sources in the task and there is a mixture of delimited records and fixed-position records. If you do not indicate that a fixed-position source in such an application is not delimited, Connect ETL applies the default and assumes the source has delimited records with the default separator. With FIELDSEPARATOR NONE specified for a fixed-position source, when you reference a delimited field in a record from the source, the field either contains the complete record (the first field) or is empty (the second and subsequent fields).
The combination of the UNIX default separator and enclosing characters is not supported.
Each field in a field-delimited text record may be enclosed by a pair of enclosing characters: a leading character and a trailing character. When fields are enclosed, specify the ENCLOSEDBY clause. Any leading and trailing enclosing characters are not part of the field. Two consecutive trailing enclosing characters within the field are treated as a single enclosing character. The leading enclosing character cannot be a blank (tab or space). Neither the leading nor the trailing enclosing character can be a prefix of the field separator. When the leading and the trailing characters are the same, you do not need to specify the trailing character.
Connect ETL ignores any characters between an enclosing character terminating a field and the following field separator or the end of the record. When the first nonblank character following a field separator (or the first nonblank character in the record for the first field) is not a leading enclosing character, Connect ETL assumes that the field is not enclosed.
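The enclosing-character rules can be sketched as a small parser. This Python example is an illustration of the rules as described, not Connect ETL's implementation; the function name is hypothetical, and it assumes single-character enclosing characters and that blanks in an unenclosed field are kept as data.

```python
from typing import List, Optional


def split_enclosed(record: str, sep: str = ",", lead: str = '"',
                   trail: Optional[str] = None) -> List[str]:
    """Split a delimited record whose fields may be enclosed."""
    trail = lead if trail is None else trail   # same character when omitted
    fields: List[str] = []
    i, n = 0, len(record)
    while True:
        # Find the first nonblank character of the field.
        j = i
        while j < n and record[j] in " \t":
            j += 1
        if j < n and record[j] == lead:
            # Enclosed field: collect data up to the closing character.
            j += 1
            buf = []
            while j < n:
                if record[j] == trail:
                    if j + 1 < n and record[j + 1] == trail:
                        buf.append(trail)      # doubled trailing char = one literal
                        j += 2
                        continue
                    j += 1                     # closing character found
                    break
                buf.append(record[j])
                j += 1
            fields.append("".join(buf))
            # Anything between the closing character and the separator is ignored.
            k = record.find(sep, j)
        else:
            # Not enclosed: the field runs up to the next separator.
            k = record.find(sep, i)
            fields.append(record[i:] if k == -1 else record[i:k])
        if k == -1:
            return fields
        i = k + len(sep)
```

For instance, the record `"a,b",x` yields the two fields `a,b` and `x`, because the comma inside the enclosed field is data rather than a separator.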
For sources and targets with multibyte locale encodings, the byte sequence of the converted field separator must not occur as part of another multibyte character. For example, for Japanese locales, the set of valid separator characters is:
- Japanese Shift-JIS: 0x00 – 0x3F
- Japanese EUC: 0x00 – 0x7F
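A separator can be checked against these ranges before it is used. The Python sketch below hard-codes the two ranges listed above; the dictionary keys and the function name are hypothetical labels chosen for this example, and applying the range to every byte of a longer separator is an assumption.

```python
# Highest byte value that is valid in a separator, per the ranges listed above.
SEPARATOR_MAX_BYTE = {
    "shift_jis": 0x3F,   # Japanese Shift-JIS: 0x00 to 0x3F
    "euc_jp": 0x7F,      # Japanese EUC: 0x00 to 0x7F
}


def is_valid_separator(separator: bytes, encoding: str) -> bool:
    """True when every byte of the separator lies in the valid range."""
    limit = SEPARATOR_MAX_BYTE[encoding]
    return len(separator) > 0 and all(b <= limit for b in separator)
```

Under these ranges a comma (0x2C) is valid for both encodings, while a vertical bar (0x7C) is valid for Japanese EUC only.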
Record Layout
Use the LAYOUT clause to designate the record layout that defines the fields of the records in the in-memory table.
Source Alias
When you want to include or omit records based on which source they originate from, you must give aliases to some or all of the sources. You then code the alias as part of a comparison in a /CONDITION definition.
Encoding
Use the ENCODING option to describe the character encoding of your tables of records in memory. The encoding will be used to interpret stream record terminators as well as any field delimiters or enclosing characters specified. An encoding of LOCALE will use the system’s character set for this interpretation.
If a task has multiple input tables of records in memory, they must have the same encoding. If the encoding is specified as UTF-16 or UTF-32, the byte order of the encoding is treated as big-endian.
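Because UTF-16 and UTF-32 data is read as big-endian, a program that fills the buffer would need to write its bytes in that order. The Python sketch below shows one way to produce such bytes; the helper name is hypothetical, and nothing here is specific to the Connect ETL API.

```python
def encode_big_endian(text: str, bits: int) -> bytes:
    """Encode text as big-endian UTF-16 or UTF-32 without a byte order mark."""
    if bits not in (16, 32):
        raise ValueError("bits must be 16 or 32")
    # The explicit "-be" codecs fix the byte order and emit no byte order mark.
    return text.encode(f"utf-{bits}-be")
```

For example, `encode_big_endian("A,B", 16)` produces six bytes in which each character's high-order byte comes first.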
Compatible Encoding Types
In some cases, you may choose to specify an encoding that is different from the actual encoding of the data. For instance, ASCII will almost always provide better performance than any other encoding. In order to do so, you need to understand whether the actual encoding is compatible with the specified encoding.
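For the common case of declaring UTF-8 data as ASCII, one way to judge compatibility is to check that every byte of a representative sample falls in the 7-bit ASCII range (0x00 to 0x7F). The Python sketch below performs that check; the function name is hypothetical, and a sample can only suggest, not guarantee, that later data stays in range.

```python
def within_ascii_range(data: bytes) -> bool:
    """True when every byte is in the 7-bit ASCII range 0x00 to 0x7F."""
    return all(b <= 0x7F for b in data)


# UTF-8 encodes characters beyond the ASCII range as multiple bytes above 0x7F.
plain = "id,name".encode("utf-8")
accented = "caf\u00e9".encode("utf-8")
```

Here `within_ascii_range(plain)` is true, while `within_ascii_range(accented)` is false because the accented character is encoded as two bytes above 0x7F.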
An encoding is compatible with another if the first encoding can be used in place of the other without significant data loss or corruption. However, it may still corrupt data or cause data loss if used for certain ranges of characters. For example, if data is encoded in UTF-8, you may be able to use ASCII as a compatible encoding when all of your characters are expected to be in the 7-bit ASCII range (0x00 – 0x7f). However, since character representations differ between ASCII and UTF-8 beyond this range, undesirable results may occur if you have data above this range and interchange these encodings. An incompatible encoding has no overlapping character ranges and, if used, would definitely give undesired results. ASCII, LOCALE, and UTF-8 may all be compatible with each other; the rest of the encodings are incompatible.
Compatibility with Other dmexpress Options
Source-level record limiting criteria, which are specified through the FDISCARDSHORT, FDISCARDFIRST, FDISCARDBLANK, and FDISCARDAFTER option arguments in the /INBUFFER option, cannot be specified in the same task definition that specifies general bulk filtering for source records through the DISCARDSHORT, DISCARDFIRST, DISCARDBLANK, and DISCARDAFTER option arguments in the /FILTER option.