JavaScript Object Notation (JSON) is described in its specification at https://www.json.org/ as a lightweight data-interchange format. It has become popular because it is easy for humans to read and write and easy for machines to generate and parse. It is a language-independent text format that uses syntax conventions familiar to programmers of the C family of languages.
JSON formatted data consists of nested name/value pairs that describe both the hierarchical structure of the data and the elementary items in that structure, together with their values. The Apply Engine further simplifies the use of JSON by automatically generating it from existing data structures such as COBOL copybooks and relational DDL, including, in the case of IMS source data, Parent Key field names and values. The Apply Engine also recognizes CDC data sources and includes before and after image structures and content. While it is also possible to explicitly define Target JSON structures, generating them from source DATASTORE Descriptions both simplifies the process and ensures accuracy and completeness.
Examples of some of the Objects that will be generated by the Apply Engine include:
- IMS DBD Name - The 8 character DBD name used by IMS to define a database structure
- Segment Name - The 8 character IMS segment name from the DBD
- Table Name - The name of the table from which the data was captured
- Change Operation - A single character that identifies the operation that created the captured data (Insert, Update or Delete)
- Time Stamps - Depending on the source of the data, these may be z/OS Storeclock values or other timestamps generated by the database or by the Connect CDC SQData Capture agent.
- Parent Key - Object containing hierarchically ordered name/value pairs of the Keys from the parent segments of a captured segment. This data is essential when processing data captured from a hierarchical database structure and provides what is typically referred to as "foreign key data" in a relational database.
- After and Before Images - Objects containing the content of a source "record" before and after it was updated. Insert operations contain only an After image and Delete operations contain only a Before image.
- Group and Elementary items - COBOL source data only
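As a rough illustration of how these objects fit together, the following Python sketch assembles a change record as a plain dictionary and serializes it. The object names (table_name, change_op, before_image, after_image), the operation codes "I", "U" and "D", and the sample table and column names are all hypothetical and chosen only for this example; the names actually generated by the Apply Engine come from the source descriptions.

```python
import json

# All key names and operation codes below are assumptions made for
# illustration; they are not the names emitted by the Apply Engine.
def build_change_record(table, operation, before=None, after=None, timestamp=None):
    """Assemble an illustrative change record as a plain dict."""
    record = {"table_name": table, "change_op": operation}
    if timestamp is not None:
        record["timestamp"] = timestamp
    # Deletes and updates carry a before image.
    if operation in ("U", "D") and before is not None:
        record["before_image"] = before
    # Inserts and updates carry an after image.
    if operation in ("I", "U") and after is not None:
        record["after_image"] = after
    return record

update = build_change_record(
    "EMPLOYEE", "U",
    before={"EMP_NO": 1001, "LAST_NAME": "SMITH"},
    after={"EMP_NO": 1001, "LAST_NAME": "JONES"},
)
print(json.dumps(update, indent=2))
```

Calling the same helper with "I" or "D" yields a record with only the after image or only the before image, which mirrors the behavior described in the list above.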
Default data transformation per JSON specification:
- All JSON data object "names" are taken directly from Source Descriptions unless the Engine OPTION "USE AVRO COMPATIBLE NAMES" is specified, which instructs the Apply Engine to convert all such names to lower_snake_case, including all CAPS-HYPHENATED-COBOL-FIELD-NAMES. Although it is not the default, Precisely highly recommends this option.
- All JSON data will be automatically converted to UTF-8 from the source encoding scheme (CCSID) per JSON spec.
- Numeric fields, regardless of their internal representation on the source platform, are converted to integer or decimal numbers and will not be in quotes per JSON spec.
- Leading zeros in all Numeric fields will be dropped per JSON spec.
- Character fields (both fixed and variable length) will be in quotes per JSON spec.
- Empty (null value) source fields are dropped by default per JSON spec, but the name/value pair may optionally be included as name/'null'.
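The value-level defaults above can be sketched in a few lines of Python. This is not the Apply Engine's implementation, only a minimal model of the rules: the function names, the (name, value, is_numeric) tuple layout, the keep_nulls flag and the sample field names are all assumptions made for the example, and cp037 is used merely as one example of an EBCDIC source encoding. Name conversion is not shown.

```python
import json

def convert_value(value, is_numeric):
    """Apply the value-level defaults to one non-null source field."""
    if is_numeric:
        text = value.strip()
        # int() and float() discard leading zeros when parsing the text.
        # float is used for brevity; exact decimal handling is out of scope.
        return float(text) if "." in text else int(text)
    return value  # character data stays a string, so json.dumps quotes it

def to_json(fields, keep_nulls=False):
    """Serialize (name, value, is_numeric) tuples to UTF-8 encoded JSON."""
    record = {}
    for name, value, is_numeric in fields:
        if value is None:
            if keep_nulls:
                record[name] = None  # serialized as JSON null
            continue  # by default the pair is dropped entirely
        record[name] = convert_value(value, is_numeric)
    return json.dumps(record, ensure_ascii=False).encode("utf-8")

source = [
    ("CUST-ID", "00042", True),
    # Character data decoded from an EBCDIC code page before serialization.
    ("CUST-NAME", b"\xc1\xc3\xd4\xc5".decode("cp037"), False),
    ("BALANCE", "0012.50", True),
    ("MIDDLE-NAME", None, False),
]
print(to_json(source).decode("utf-8"))
print(to_json(source, keep_nulls=True).decode("utf-8"))
```

The first call omits MIDDLE-NAME altogether, while the second keeps it with a null value.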
Options for overriding default transformation:
Although Precisely does not recommend it, the default behavior of the automatic JSON generation can be altered by creating Target Datastore Descriptions. Because downstream applications that process JSON formatted data normally adhere to the JSON specification, consider the long-term consequences of any change that alters the formatting or content of the data. Purely structural changes, for example removing unneeded structures or fields or adding name/value pairs, can be accommodated as required.
Another factor that must be considered when overriding JSON defaults is the eventual migration to AVRO Type Datastores, which also depends on the JSON specification.
The following examples introduce exceptions that are therefore not recommended but are possible.
Quoted Numbers - Target "DDL" structures can define columns that receive integers as VARCHAR, forcing the source data to be transformed into a string. By default these strings will not contain leading zeros, but that can optionally be changed.
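The effect on the generated JSON can be pictured with the following Python sketch. The helper quoted_number, its width parameter and the ORDER_NO column are hypothetical and exist only to show the three possible renderings; they do not correspond to any Apply Engine syntax.

```python
import json

def quoted_number(value, width=None):
    """Render an integer as a string, optionally left-padded with zeros."""
    text = str(value)
    if width is not None:
        text = text.zfill(width)  # restores leading zeros up to the given width
    return text

default_json = json.dumps({"ORDER_NO": 42})
varchar_json = json.dumps({"ORDER_NO": quoted_number(42)})
padded_json = json.dumps({"ORDER_NO": quoted_number(42, width=6)})

print(default_json)  # {"ORDER_NO": 42}
print(varchar_json)  # {"ORDER_NO": "42"}
print(padded_json)   # {"ORDER_NO": "000042"}
```

A consumer that expects a JSON number will need to handle the quoted forms explicitly, which is one reason the default is preferred.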
Object Name formatting - Target "DDL" structures can specify object/field "names" in any "style", including snake_case, camelCase, (Proper) CamelCase and even kebab-case. Precisely, however, recommends using the same field/column names and style as the Source DESCRIPTION. Remember also that the Apply Engine will automatically change the style to lower_snake_case when generating AVRO formatted datastores.
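For readers unfamiliar with the naming styles mentioned above, this short Python sketch derives each of them from a single COBOL-style field name. The function restyle and the sample name CUST-LAST-NAME are assumptions for illustration only; the conversion performed by the Apply Engine itself may differ in its details.

```python
import re

def restyle(name, style):
    """Convert a hyphen- or underscore-separated name to the given style."""
    words = [w.lower() for w in re.split(r"[-_\s]+", name) if w]
    if style == "snake":
        return "_".join(words)
    if style == "kebab":
        return "-".join(words)
    if style == "camel":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    if style == "proper":
        return "".join(w.capitalize() for w in words)
    raise ValueError(f"unknown style: {style}")

for style in ("snake", "camel", "proper", "kebab"):
    print(style, restyle("CUST-LAST-NAME", style))
```

Only the snake style matches the lower_snake_case form used for AVRO formatted datastores.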