Determine Kafka payload format - connect_cdc_sqdata - Latest

Connect CDC (SQData) Kafka Quickstart

Product: Connect > Connect CDC (SQData)
Product family: Connect
Portfolio: Integrate
Product type: Software
Version: Latest
Language: English
Copyright: 2024
First publish date: 2000
Last publish date: 2024-07-30

Precisely supports a variety of target formats for Kafka topics, including JSON, AVRO and delimited. JSON is the preferred format for streaming changed data capture (CDC), particularly while testing, because it is self-describing. AVRO carries essentially the same information as JSON, except that the schema is maintained separately, so the Kafka messages have a significantly lighter payload.

With AVRO and JSON, you have the option of replicating an entire source record/row, including the before and after image for updates, without any mapping. This is performed via the REPLICATE function, and the examples provided later in this document use the source descriptions to generate the target Kafka payload.

You also have the option of customizing your output with explicit source-to-target column mapping, dynamically constructing JSON with a custom payload format using relational DDL-style syntax. While the custom target DDL can use any combination of upper and lower case, as well as hyphens or underscores, in Kafka object names, we strongly recommend avoiding hyphens and using all lower-case characters, because those are the de facto standards in the open-source Kafka and Java communities. Although you may be just getting started with Kafka, most large Kafka and Hadoop users eventually choose to separate the schema from the data payload by using AVRO. JSON is easy for a human to read but consumes significantly more storage; AVRO separates the schema from the data and imposes much stricter rules on object names and data types.

You can also create a custom message using the PRINTF function. Custom JSON is outside the scope of this document; visit Precisely support at https://www.precisely.com/support for assistance with this type of processing.

Note: Both JSON and AVRO formatted records are always in Unicode (UTF-8); the translation from the source codepage/CCSID (e.g. EBCDIC/1047) is performed automatically by the engine.
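The engine performs this codepage translation for you, so no consumer-side conversion is needed. For illustration only, the Python sketch below shows what an EBCDIC-to-UTF-8 conversion involves, using codepage cp037 as a stand-in because CCSID 1047 is not among Python's built-in codecs (an assumption for demonstration, not part of the product):

```python
# Illustrative only: Connect CDC performs this conversion inside the engine.
# cp037 is a stand-in EBCDIC codepage; Python has no built-in cp1047 codec.
ebcdic_bytes = "HAAS, SANDRA".encode("cp037")   # simulate an EBCDIC source field
utf8_text = ebcdic_bytes.decode("cp037")        # translate EBCDIC -> text
utf8_bytes = utf8_text.encode("utf-8")          # what lands in the Kafka message

print(ebcdic_bytes.hex())            # EBCDIC byte values differ from ASCII/UTF-8
print(utf8_bytes.decode("utf-8"))
```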
Syntax - JSON
DATASTORE kafka:///*                       -- specify dynamic topic
        OF JSON                           -- specify JSON format
        AS TARGET
        DESCRIBED BY GROUP DB2_SOURCE     -- use source for REPLICATE
Sample
{
  "object_name": "SQDATA.EMPLOYEE",
  "stck": "00d16918eb93309dc0",
  "timestamp": "2018-06-01-13.18.41.780264",
  "change_op": "U",
  "after_image":
  {
    "empno": "000010",
    "first_name": "SANDRA",
    "midinit": "I",
    "last_name": "HAAS",
    "workdept": "A00",
    "phone_no": "3978",
    "hire_date": "1978-09-15",
    "job": "PRES",
    "ed_level": 14,
    "sex": "f",
    "birth_date": "1954-03-31",
    "salary": 59250.00,
    "bonus": 6600.00,
    "comm": 2340.00
  },
  "before_image":
  {
    "empno": "000010",
    "first_name": "CHRISTINE",
    "midinit": "I",
    "last_name": "HAAS",
    "workdept": "A00",
    "phone_no": "3978",
    "hire_date": "1978-09-15",
    "job": "PRES",
    "ed_level": 14,
    "sex": "f",
    "birth_date": "1954-03-31",
    "salary": 59250.00,
    "bonus": 6600.00,
    "comm": 2340.00
  }
}
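On the consumer side, a JSON CDC message like the sample above can be inspected with nothing but the standard library. The sketch below uses the field names from the sample; the message variable is a hypothetical stand-in for a Kafka record value (the consumer itself is omitted), and the insert/delete operation codes are assumptions:

```python
import json

# Hypothetical stand-in for a Kafka record value; field names follow the
# sample payload above, trimmed to a few columns for brevity.
message = b'''{
  "object_name": "SQDATA.EMPLOYEE",
  "change_op": "U",
  "after_image":  {"empno": "000010", "first_name": "SANDRA",    "last_name": "HAAS"},
  "before_image": {"empno": "000010", "first_name": "CHRISTINE", "last_name": "HAAS"}
}'''

cdc = json.loads(message)       # JSON payloads are always UTF-8
if cdc["change_op"] == "U":     # "U" = update per the sample; other codes assumed
    before, after = cdc["before_image"], cdc["after_image"]
    # Columns whose value differs between the before and after images.
    changed = {k: (before[k], after[k]) for k in after if before.get(k) != after[k]}
    print(cdc["object_name"], changed)
```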
Syntax - AVRO
DATASTORE kafka:///*                       -- specify dynamic topic
        OF AVRO                            -- specify AVRO format
        AS TARGET
        DESCRIBED BY GROUP DB2_SOURCE      -- use source for REPLICATE
Sample - AVRO Schema
{"type": "record", "namespace": "sqdata", "name": "employee__replicate",
 "fields": [
   {"name": "change_op", "type": "string"},
   {"name": "object_name", "type": "string"},
   {"name": "alias", "type": "string"},
   {"name": "stck", "type": "bytes", "default": "\u0000"},
   {"name": "timestamp", "type": "string"},
   {"name": "seq", "type": "long"},
   {"name": "after_image", "type": ["null", {"type": "record", "name": "employee",
    "fields": [
      {"name": "empno", "type": "string"},
      {"name": "first_name", "type": "string"},
      {"name": "midinit", "type": "string"},
      {"name": "last_name", "type": "string"},
      {"name": "workdept", "type": "string"},
      {"name": "phone_no", "type": "string"},
      {"name": "hire_date", "type": "string"},
      {"name": "job", "type": ["null", "string"]},
      {"name": "ed_level", "type": ["null", "string"]},
      {"name": "sex", "type": ["null", "string"]},
      {"name": "birth_date", "type": ["null", "string"]},
      {"name": "salary", "type": ["null", "int"]},
      {"name": "bonus", "type": "string"},
      {"name": "comm", "type": ["null", "string"]}
    ]}]},
   {"name": "before_image", "type": ["null", "employee"]}]}
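Because an AVRO schema is itself JSON, it can be inspected with the standard library before it is registered. The sketch below parses a copy of the schema above (trimmed to a few fields for brevity) and reports which top-level fields are optional; in AVRO, a union type containing "null" marks an optional field:

```python
import json

# A trimmed copy of the schema above, for illustration only.
schema = json.loads('''{
  "type": "record", "namespace": "sqdata", "name": "employee__replicate",
  "fields": [
    {"name": "change_op", "type": "string"},
    {"name": "object_name", "type": "string"},
    {"name": "after_image", "type": ["null", {"type": "record", "name": "employee",
        "fields": [{"name": "empno", "type": "string"},
                   {"name": "job", "type": ["null", "string"]}]}]}
  ]
}''')

for field in schema["fields"]:
    # A union type that includes "null" means the field may be absent,
    # e.g. before_image/after_image on inserts and deletes.
    optional = isinstance(field["type"], list) and "null" in field["type"]
    print(field["name"], "optional" if optional else "required")
```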

Sample - AVRO data payload

No sample payload is included because, in addition to removing the schema information and adding a header containing the schema name, Registry ID number and date/time stamps, other formatting changes make the content all but unreadable. For this reason, Precisely suggests testing first with human-readable JSON formatted output.