PARSE Parameter (Optional) - mfx - 3.1

Syncsort™ MFX Programmers Guide

Copyright
2024
First publish date
2010
Last edition
2024-08-27
Last publish date
2024-08-27T08:14:56.318001

Use PARSE to extract variable-position and variable-length fields from records. The extracted data is placed into fixed-length parsed fields. The fixed-length parsed fields are specified by %pp, where pp is an integer from 0 to 999. Therefore, up to 1000 fixed-length parsed fields may be defined for each PARSE application.

The criteria for extracting variable fields are specified using the PARSE subparameters. The resultant %pp fields may then be used to the same extent as fixed fields, which have a fixed position p and a fixed-length l, in the FIELDS, BUILD, or OVERLAY parameters associated with the statements.
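As a minimal sketch of how the pieces fit together, the following statement parses comma-delimited data in fixed-length records into three %pp fields and rebuilds each record from them. The use of INREC, the comma delimiter, the field lengths, and the sample data are assumptions chosen for illustration.

```
  INREC PARSE=(%00=(ENDBEFR=C',',FIXLEN=10),
               %01=(ENDBEFR=C',',FIXLEN=6),
               %02=(FIXLEN=12)),
        BUILD=(%00,%01,%02)
```

Under these assumptions, an input record beginning SMITH,A123,BOSTON followed by blanks would be rebuilt as a 28-byte record with SMITH in bytes 1 to 10, A123 in bytes 11 to 16, and BOSTON in bytes 17 to 28, each left-justified and padded with blanks.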

For use of PARSE with IFTHEN and in the case of missing fields, see PARSE with IFTHEN.

The syntax of PARSE is illustrated below:
Figure 1. PARSE and Subparameters

By default the first PARSE operation will begin at byte 1 for fixed-length records and byte 5 for variable-length records. This represents the initial position of the cursor within the record. The cursor can be repositioned to start the PARSE operation through the use of the ABSPOS, ADDPOS, SUBPOS, STARTAFT, or STARTAT subparameters. The PARSE operation to extract the field continues until the ENDBEFR or ENDAT conditions are satisfied or, in their absence, for the number of bytes specified in the FIXLEN subparameter. The cursor is advanced as the result of processing the above subparameters.

A subsequent PARSE operation, by default, will begin at the byte where the cursor was last positioned by the prior PARSE operation. This position can also be modified as described above. Refer to the descriptions of the subparameters for details on cursor position as a result of their operation.

The order in which the PARSE subparameters are processed is as follows:

  • ABSPOS or ADDPOS or SUBPOS

  • STARTAFT, STARTAT and PAIR

  • ENDBEFR, ENDAT and PAIR

  • FIXLEN
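The processing order above can be traced with a single parsed field that uses one subparameter from each step. This is a sketch under assumed values; the delimiters, lengths, and sample data are hypothetical.

```
  INREC PARSE=(%01=(ADDPOS=2,STARTAFT=C'=',ENDBEFR=C';',FIXLEN=8)),
        BUILD=(%01)
```

For a fixed-length record beginning ##KEY=VALUE;REST, the cursor starts at byte 1 and ADDPOS=2 moves it to byte 3. STARTAFT=C'=' then finds the equal sign at byte 6, so extraction starts at byte 7. ENDBEFR=C';' finds the semicolon at byte 12, so bytes 7 to 11 (VALUE) are extracted, and FIXLEN=8 pads the result with three blanks. The cursor for the next parsed field is left at byte 13.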

The following describes the PARSE subparameters:

%pp

Defines the fixed-length parsed field with a unique identifier pp, which is an integer from 0 to 999. A %pp field can be defined only once in all PARSE subparameters in an application. Therefore, up to 1000 unique %pp fields can be defined and can be used more than once in BUILD or OVERLAY. Note that the %pp fields defined for a specific control statement can only be used in the FIELDS, BUILD or OVERLAY parameter for that statement.

Variables defined as %n are equivalent to %0n or %00n (for example, %3 is equivalent to %03) and so cannot both be defined in the same application.

%

Specifies that the variable field will be ignored and not extracted. The start position of the cursor for the next parsed field is determined by the remaining subparameters.
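For example, to skip the first comma-delimited field and keep only the second, a % field can be placed ahead of the %pp field. The statement, delimiter, and length below are illustrative assumptions.

```
  INREC PARSE=(%=(ENDBEFR=C','),
               %01=(ENDBEFR=C',',FIXLEN=6)),
        BUILD=(%01)
```

For a record beginning SMITH,A123,BOSTON, the % field consumes SMITH and leaves the cursor after the first comma, so %01 would contain A123 padded with two blanks.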

ABSPOS=p

Optionally specifies the absolute starting cursor position p (in bytes) for the parsed field. You can set p from 1 to 32752. Use ABSPOS to override the starting cursor position set by the previous parsed field. If p is less than 5 for a variable-length record, the cursor position defaults to 5. (For the first parsed field, the default position is byte 1 for fixed-length records and byte 5 for variable-length records.)

ADDPOS=x

Optionally specifies that the start position of the cursor will be the current position plus x bytes. You can set x from 1 to 32752.

SUBPOS=y

Optionally specifies that the start position of the cursor will be the current position minus y bytes. You can set y from 1 to 32752. If the result is less than 1 for a fixed-length record, then the cursor position is set to 1. If the result is less than 5 for a variable-length record, then the cursor position is set to 5.
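The three positioning subparameters can be compared in one sketch. The positions and lengths below are hypothetical, and fixed-length records are assumed.

```
  INREC PARSE=(%01=(ABSPOS=11,FIXLEN=5),
               %02=(ADDPOS=3,FIXLEN=4),
               %03=(SUBPOS=9,FIXLEN=2)),
        BUILD=(%01,%02,%03)
```

Following the cursor rules described for FIXLEN, %01 would take bytes 11 to 15 and leave the cursor at byte 16. ADDPOS=3 then moves the cursor to byte 19, so %02 takes bytes 19 to 22 and leaves the cursor at byte 23. SUBPOS=9 moves the cursor back to byte 14, so %03 takes bytes 14 and 15.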

STARTAFT=string

Optionally specifies a string, which indicates the start of the parsed extraction of the variable field one byte after the string (for example, a comma). The start position of the cursor for the next parsed field is then set at the byte after the string. If the string is not present, then blank characters will be inserted into the current parsed field and all subsequent parsed fields.

You can specify the string as a character string constant (C'string') or hexadecimal string constant (X'hh...hh'). For example, a comma would be specified as STARTAFT=C','.

You can specify multiple instances and combinations of any STARTAFT and STARTAT subparameter for a single %pp parsed field. For example, PARSE=(%01=(STARTAFT=C'/',STARTAFT=C'<',STARTAT=C'*',FIXLEN=5)). From left to right, the first STARTAFT or STARTAT criterion to be satisfied is the one that is applied.

STARTAFT=alphanum

Optionally specifies a set of alphanumeric characters, which indicates the start of the parsed extraction of the variable field one byte after a character from the set has been found. The start position of the cursor for the next parsed field is then set at the byte after the character that was found. If no characters from the specified set are present, then blank characters will be inserted into the current parsed field and all subsequent parsed fields.

The choices for the alphanum character set include lowercase, uppercase and numeric characters:

LC   for lowercase characters a-z

UC   for uppercase characters A-Z

MC   for mixed case characters a-z and A-Z

LN   for lowercase characters and numerics a-z and 0-9

UN   for uppercase characters and numerics A-Z and 0-9

MN   for mixed case characters and numerics a-z, A-Z and 0-9

NUM  for numerics 0-9

STARTAFT=BLANKS

Optionally specifies the start of the parsed extraction of the variable field at the first nonblank character after one or more blanks. The start position of the cursor for the next parsed field is then set at the first nonblank character. If a blank is not present, then blank characters will be inserted into the current parsed field and all subsequent parsed fields.

You can specify multiple instances and combinations of any STARTAFT and STARTAT subparameter. See STARTAFT=string above for further description.

STARTAT=string

Optionally specifies a string, which indicates the start of the parsed extraction of the variable field at the position of, and including, the string. The start position of the cursor for the next parsed field is then set at the byte after the string. If the string is not present, then blank characters will be inserted into the current parsed field and all subsequent parsed fields.

You can specify the string as a character string constant (C'string') or hexadecimal string constant (X'hh...hh'). For example, a comma would be specified as STARTAT=C','.

You can specify multiple instances and combinations of any STARTAFT and STARTAT subparameter. See STARTAFT=string above for further description.

STARTAT=alphanum

Optionally specifies a set of alphanumeric characters, which indicates the start of the parsed extraction of the variable field at the position of, and including, the character from the set that was found. The start position of the cursor for the next parsed field is then set at the byte after the character that was found. If no characters from the specified set are present, then blank characters will be inserted into the current parsed field and all subsequent parsed fields.

See STARTAFT=alphanum for a description of the choices for the alphanum character set.
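As a sketch of an alphanum set in use, the following field starts at the first numeric character and ends before the next blank. The statement, length, and sample data are assumptions for illustration.

```
  INREC PARSE=(%01=(STARTAT=NUM,ENDBEFR=BLANKS,FIXLEN=6)),
        BUILD=(%01)
```

For a record beginning ORDER 48213 SHIPPED, the first numeric character is the 4 at byte 7 and the next blank is at byte 12, so %01 would contain 48213 padded with one blank.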

STARTAT=BLANKS

Optionally specifies the start of the parsed extraction of the variable field at the position of, and including, the first blank character. The start position of the cursor for the next parsed field is then set at the first nonblank character. If a blank is not present, then blank characters will be inserted into the current parsed field and all subsequent parsed fields.

You can specify multiple instances and combinations of any STARTAFT and STARTAT subparameter. See STARTAFT=string above for further description.

STARTAT=NONBLANK

Optionally specifies the start of the parsed extraction of the variable field at the position of, and including, the first nonblank character. The start position of the cursor for the next parsed field is then set at the first nonblank character. If a nonblank is not present, then blank characters will be inserted into the current parsed field and all subsequent parsed fields.

You can specify multiple instances and combinations of any STARTAFT and STARTAT subparameter. See STARTAFT=string above for further description.

ENDBEFR=string

Optionally specifies a string, which indicates the end of the parsed extraction of the variable field one byte before the string (for example, a comma). The start position of the cursor for the next parsed field is then set at the byte after the string.

If the string is not present, then data from the field will continue to be extracted up until the end of the record. Blank characters will be inserted into all subsequent parsed fields.

You can specify the string as a character string constant (C'string') or hexadecimal string constant (X'hh...hh'). For example, a comma would be specified as ENDBEFR=C','.

You can specify multiple instances and combinations of any ENDBEFR and ENDAT subparameter for a single %pp parsed field. For example, PARSE=(%01=(ENDBEFR=C'/',ENDBEFR=C'<',ENDAT=C'*',FIXLEN=5)). From left to right, the first ENDBEFR or ENDAT criterion to be satisfied is the one that is applied.

ENDBEFR=alphanum

Optionally specifies a set of alphanumeric characters, which indicates the end of the parsed extraction of the variable field one byte before a character from the set has been found. The start position of the cursor for the next parsed field is then set at the byte after the character that was found. If no characters from the specified set are present, then data from the field will continue to be extracted up until the end of the record. Blank characters will be inserted into all subsequent parsed fields.

See STARTAFT=alphanum for a description of the choices for the alphanum character set.

ENDBEFR=BLANKS

Optionally specifies the end of the parsed extraction of the variable field one byte before a blank character is encountered. The start position of the cursor for the next parsed field is then set at the first nonblank character after the blank (or group of blanks).

If a blank character is not present, then data from the field will continue to be extracted up until the end of the record. Blank characters will be inserted into all subsequent parsed fields.

You can specify multiple instances and combinations of any ENDBEFR and ENDAT subparameter. See ENDBEFR=string above for further description.

ENDAT=string

Optionally specifies the end of the parsed extraction of the variable field at the position of, and including, the last string character. The start position of the cursor for the next parsed field is then set at the byte after the string.

If the string is not present, then data from the field will continue to be extracted up until the end of the record. Blank characters will be inserted into all subsequent parsed fields.

You can specify the string as a character string constant (C'string') or hexadecimal string constant (X'hh...hh'). For example, a comma would be specified as ENDAT=C','.

You can specify multiple instances and combinations of any ENDBEFR and ENDAT subparameter. See ENDBEFR=string above for further description.

ENDAT=alphanum

Optionally specifies the end of the parsed extraction of the variable field at the position of, and including, a character from the specified alphanum set that was found. The start position of the cursor for the next parsed field is then set at the byte after the character that was found. If no characters from the specified set are present, then data from the field will continue to be extracted up until the end of the record. Blank characters will be inserted into all subsequent parsed fields.

See STARTAFT=alphanum for a description of the choices for the alphanum character set.

ENDAT=BLANKS

Optionally specifies the end of the parsed extraction of the variable field at the position of, and including, the last blank character. The start position of the cursor for the next parsed field is then set at the first nonblank character after the blank (or group of blanks).

If a blank character is not present, then data from the field will continue to be extracted up until the end of the record. Blank characters will be inserted into all subsequent parsed fields.

You can specify multiple instances and combinations of any ENDBEFR and ENDAT subparameter. See ENDBEFR=string above for further description.

PAIR=APOST

Optionally specifies that all characters between pairs of apostrophes ('characters') be ignored when searching for a string or blanks. If only one apostrophe is present, all characters to the right of the apostrophe will be ignored.

PAIR=QUOTE

Optionally specifies that all characters between pairs of quotes ("characters") be ignored when searching for a string or blanks. If only one quote is present, all characters to the right of the quote will be ignored.
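PAIR is useful when a delimiter can also appear inside quoted data. The following sketch assumes comma-delimited records whose first field may be enclosed in quotes; the statement, lengths, and sample data are hypothetical.

```
  INREC PARSE=(%01=(ENDBEFR=C',',PAIR=QUOTE,FIXLEN=16),
               %02=(ENDBEFR=C',',FIXLEN=6)),
        BUILD=(%01,%02)
```

For a record beginning "SMITH, JOHN",A123,BOSTON, the comma at byte 7 lies between the quotes and is ignored by the search, so the first field ends before the comma at byte 14. Because the description above says only that the quoted characters are ignored when searching, %01 would be expected to contain the 13 bytes "SMITH, JOHN" (quotes included) padded with three blanks, and %02 would contain A123 padded with two blanks.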

FIXLEN=l

Specifies the length l (1 to 32752) in bytes of the %pp field. FIXLEN is required when used with %pp, but optional when used with %. If ENDBEFR or ENDAT is not specified, then FIXLEN indicates the end of the parsed extraction of the variable field at the end of length l. Thus, the start position of the cursor for the next parsed field is set at the next byte following the length.

If the PARSE operation produces a field shorter than FIXLEN, the parsed field will be left-justified and padded on the right with blank characters to length l. If the length of the parsed field is greater than l, the data will be truncated after l bytes.
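To illustrate padding and truncation, consider one parsed field with a short FIXLEN. The delimiter, length, and sample data are assumptions.

```
  INREC PARSE=(%01=(ENDBEFR=C',',FIXLEN=4)),
        BUILD=(%01)
```

For a record beginning AB,REST, the extracted data is two bytes long, so %01 would contain AB followed by two blanks. For a record beginning ABCDEFG,REST, the extracted data is seven bytes long, so %01 would contain only ABCD. In both cases ENDBEFR still determines the cursor position for the next parsed field, which is the byte after the comma.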

REPEAT=m

Optionally provides a shorthand way to repeat the currently defined parsed field. m can be from 2 to 1000 and defines the total number of identical parsed fields. When used with a % field, m consecutive ignored fields are created. When used with a %nn field, consecutively numbered parsed fields from nn to nn+m-1 are defined. For instance, specifying %3=(your_subparms,REPEAT=4) creates 4 identically defined parsed fields:

%3=(your_subparms),%4=(your_subparms),%5=(your_subparms),%6=(your_subparms)

When defining a %nn field with REPEAT, be sure that you have not defined duplicate %nn fields elsewhere in your control statements. For the above %3 example, you may not define %4, %5 or %6 anywhere else. You must also ensure that the highest %nn number created by REPEAT does not exceed %999.
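A concrete version of the %3 example might look as follows. The delimiter, length, and sample data are illustrative assumptions.

```
  INREC PARSE=(%03=(ENDBEFR=C',',FIXLEN=5,REPEAT=4)),
        BUILD=(%03,%04,%05,%06)
```

For a record beginning A,BB,CCC,DDDD,REST, the four parsed fields would contain A, BB, CCC, and DDDD respectively, each padded with blanks to 5 bytes, giving a 20-byte output record.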

PARSE with IFTHEN

You can use %pp parsed fields in IFTHEN expressions. If the %pp field is defined in a WHEN=INIT, WHEN=(conditions), WHEN=ANY, or WHEN=NONE expression, it can be used in the IFTHEN BUILD or IFTHEN OVERLAY of that expression. Additionally, for WHEN=INIT, the %pp fields can be used in any subsequent IFTHEN expression. See IFTHEN Parameter (Optional) for further description of the IFTHEN parameter.

One use of PARSE with IFTHEN is to reset the parse cursor to its default position at the beginning of the record, as needed for variable records with missing fields. For each WHEN=INIT clause that specifies PARSE, the cursor position is set to byte 1 for fixed-length records and byte 5 for variable-length records. When PARSE is used without IFTHEN, a search for a missing field causes all subsequent fields to be overlooked and not properly parsed into %pp fields. With IFTHEN PARSE, each search starts again from the beginning of the record, so fields can be parsed into %pp fields independently of each other.
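The following sketch shows this technique for keyword-style data in which either keyword may be absent. It assumes 80-byte fixed-length records with data only in bytes 1 to 40; the keywords, lengths, and output columns are hypothetical, and the exact clause syntax should be checked against the IFTHEN Parameter description.

```
  INREC IFTHEN=(WHEN=INIT,
                PARSE=(%01=(STARTAFT=C'NAME=',ENDBEFR=C';',FIXLEN=10)),
                OVERLAY=(41:%01)),
        IFTHEN=(WHEN=INIT,
                PARSE=(%02=(STARTAFT=C'CITY=',ENDBEFR=C';',FIXLEN=10)),
                OVERLAY=(51:%02))
```

For a record beginning CITY=DENVER; with no NAME= keyword, the first clause would fill %01 with blanks. Because the second WHEN=INIT clause starts its search again at byte 1, %02 would still contain DENVER padded to 10 bytes and be placed at byte 51. If both fields were defined in a single PARSE outside IFTHEN, the missing NAME= string would cause %02 to be blank-filled as well.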