While Connect CDC (SQData) commonly targets relational databases, Kafka is used when change data must be processed as events for downstream applications or processes. In many z/OS environments, IBM MQ has traditionally been used for distributed event communication. Apache Kafka provides an alternative publish‑and‑subscribe platform for building real‑time data pipelines.
Connect CDC (SQData) supports Kafka by bridging z/OS and open systems platforms and publishing streams of captured datastore changes directly to Kafka topics in real time.
Kafka is typically used when requirements go beyond simple replication to relational targets such as Db2/LUW, Oracle, or Microsoft SQL Server, including:
- Event‑driven processing
- Real‑time integration with downstream applications
- Populating large‑scale data repositories for analytics or exploratory workloads where future questions may not yet be defined
Kafka Architecture and SQData integration
- Producer API
  - Enables applications to publish streams of events to one or more Kafka topics.
- Connect API
  - Connects Kafka topics to external systems, such as relational databases, by capturing changes to source tables.
The Apply Engine uses the Producer API, treating Kafka as a supported target datastore. It writes messages to Kafka topics in JSON, AVRO, or another supported format, using data captured by SQData capture agents.
Connect CDC (SQData)’s capture, publish, and apply architecture provides a two‑platform solution for z/OS environments where the Kafka Connector API is not natively supported. High‑performance capture agents paired with an Apply Engine running on Linux enable direct, point‑to‑point transfer of captured data from source to target. When properly configured, captured data is written directly to Kafka topics without the use of intermediate staging or temporary storage.
Environmental Requirements
Connect CDC (SQData) supports Kafka targets for change data capture on both z/OS and open systems platforms. The following environmental requirements apply to the Apply Engine for Kafka:
- The Apply Engine for Kafka is supported only on Linux.
  Note: Although the Apply Engine has been implemented on IBM AIX, Linux is the supported platform.
- The Kafka external library librdkafka.so is required and must be version 0.8 or later. The library can be downloaded from GitHub.
- The Kafka library must be available in the system library path or specified using the environment variable SQDATA_KAFKA_LIBRARY, which points to the library location.
- The Kafka target datastore is identified by a Kafka broker URL, which includes the Kafka cluster host name and port, the fully qualified Kafka topic name, and an optional partition.
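The library lookup described in the list above can be checked before the Apply Engine is started. The following Python sketch is an illustration only and is not part of Connect CDC (SQData). It assumes that SQDATA_KAFKA_LIBRARY, when set, holds the full path to the shared library file (the requirement says only that the variable points to the library location), and it otherwise searches the system library path. The helper name find_librdkafka_version is hypothetical; rd_kafka_version_str is the version function exported by librdkafka.

```python
import ctypes
import ctypes.util
import os


def find_librdkafka_version(env=os.environ):
    """Return the librdkafka version string, or None if the library cannot be loaded."""
    # Assumption: SQDATA_KAFKA_LIBRARY, when set, is the path to librdkafka.so itself.
    candidate = env.get("SQDATA_KAFKA_LIBRARY") or ctypes.util.find_library("rdkafka")
    if not candidate:
        return None
    try:
        lib = ctypes.CDLL(candidate)
        # rd_kafka_version_str() returns a C string holding the library version.
        lib.rd_kafka_version_str.restype = ctypes.c_char_p
        return lib.rd_kafka_version_str().decode("ascii")
    except (OSError, AttributeError):
        # OSError: file missing or not loadable; AttributeError: not librdkafka.
        return None


if __name__ == "__main__":
    version = find_librdkafka_version()
    print(version or "librdkafka not found; check the library path or SQDATA_KAFKA_LIBRARY")
```

A result of None suggests that neither the system library path nor the environment variable leads to a loadable library, which is worth resolving before looking at the Apply Engine itself.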
Kafka Producer Configuration
Kafka producer behavior, including security and broker discovery, is
configured using the sqdata_kafka_producer.conf file.
This file is read by the Apply Engine at startup and supports all
configuration properties provided by librdkafka.
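To make the file layout concrete, the Python sketch below reads a producer configuration in the key=value form used by the examples in this section and lists the brokers it names. It is a hypothetical pre-flight check, not part of Connect CDC (SQData): the function names, the rule of skipping blank lines, and the sample host names are assumptions, and the Apply Engine's own parser may behave differently.

```python
def parse_producer_conf(text):
    """Parse key=value lines into a dict; blank lines are skipped (an assumption)."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        props[key.strip()] = value.strip()
    return props


def broker_list(props):
    """Split metadata.broker.list into individual host:port entries."""
    entries = props.get("metadata.broker.list", "").split(",")
    return [entry.strip() for entry in entries if entry.strip()]


# Placeholder values for illustration; real values depend on the Kafka cluster.
sample = """\
security.protocol=SSL
ssl.ca.location=/app/certificates/dev/abc_root_ca.cert
metadata.broker.list=broker01:9093,broker02:9093
"""
props = parse_producer_conf(sample)
print(props["security.protocol"], broker_list(props))
```

An empty broker list from such a check would indicate that the broker host and port have to be supplied some other way, for example in the datastore URL.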
The following examples illustrate common configuration patterns. Actual values depend on the Kafka cluster configuration.
security.protocol=SSL
ssl.ca.location=/app/certificates/dev/abc_root_ca.cert
ssl.certificate.location=/home/<kafka_app_user>/kafkassl/client.pem <-- Client's public key (PEM format) used for authentication
ssl.key.location=/home/<kafka_app_user>/kafkassl/client.key
ssl.key.password=test1234
metadata.broker.list=<broker_host_01>:<port>,<broker_host_02>:<port>,<broker_host_03>:<port>

security.protocol=SSL
ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
ssl.truststore.password=test1234
ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=test1234
ssl.key.password=test1234
metadata.broker.list=<broker_host_01>:<port>,<broker_host_02>:<port>,<broker_host_03>:<port>

security.protocol=SASL_SSL
sasl.kerberos.service.name=kafka
sasl.kerberos.principal=<kafka_app_user@domain>
sasl.kerberos.keytab=/app/kafkalib/<kafka_app_user>.keytab
metadata.broker.list=<broker_host_01>:<port>,<broker_host_02>:<port>,<broker_host_03>:<port>

Kafka Datastore Syntax
Use the DATASTORE command to define Kafka as a target datastore for the Apply Engine.
------------------------------------------------------------
-- DATASTORE SECTION
------------------------------------------------------------
-- SOURCE DATASTORE
DATASTORE cdc://server:port/capture/target
OF UTSCDC
AS CDCIN
DESCRIBED BY GROUP SOURCE_TABLES;
-- TARGET DATASTORE
DATASTORE kafka://[<hostname>[:<port_number>]] / [<kafka_topic_id>][/ | /<partition> | /key | /root_key]
OF AVRO FORMAT [CONFLUENT | CONFLUENT TOMBSTONE | CONTAINER | PLAIN]
AS TARGET
KEY IS DEPTNO, MGRNO
KEY SUBJECT <kafka_topic_id>-key
DESCRIBED BY GROUP SOURCE_TABLES;

| Keyword | Description |
|---|---|
| <hostname>:<port_number> | Optional. Specifies a Kafka broker host name and TCP/IP port. Kafka broker information is typically resolved dynamically using the Kafka producer configuration file (sqdata_kafka_producer.conf). Kafka security configuration is cluster‑specific. For background information, see Apache Kafka security and Confluent security. Configuration details should be defined in coordination with the Kafka cluster administrator. |
| <kafka_topic_id> or <prefix>_*_<suffix> | Optional. Specifies the Kafka topic name used as the target datastore. The topic may be defined explicitly using a fully qualified Kafka topic ID, or dynamically using a wildcard (*). When a wildcard is used, the topic name is resolved at runtime. Dynamic topic naming is useful when generating multiple topics or when topic names are long or programmatically derived. Whether Kafka topics must be created in advance depends on the Kafka cluster configuration. |
| /<partition>, /key, /root_key, or / | Specifies how Kafka determines the target partition for produced messages. If omitted, Kafka uses its default partitioning behavior, distributing messages across available partitions. A specific partition number may be specified; however, using explicit partition numbers is not recommended, as it introduces additional administrative overhead when topics are modified. |
| OF | Specifies the format used to write Kafka topic messages. Kafka topics can be written in JSON or AVRO format. When AVRO is selected, one of the following AVRO encoding formats must be specified: CONFLUENT, CONFLUENT TOMBSTONE, CONTAINER, or PLAIN. |
| AS | Assigns an alias to the Kafka target datastore. The alias is used to reference the target datastore within the Apply Engine script. |
| KEY IS, KEY SUBJECT | Defines the Kafka message key and the associated Schema Registry subject used to serialize that key. The key identifies Kafka records and is used for partitioning and record identity. The key subject specifies the Schema Registry subject under which the key schema is registered. Both are required when using CONFLUENT TOMBSTONE format, as delete events are represented by records with a non‑null key and a null value, and record identification relies entirely on the key. |
| DESCRIBED BY GROUP | Specifies the DESCRIPTION group that defines the metadata (such as table or segment structure, keys, topics, and schema details) used by the datastore. |
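The URL portion of the target DATASTORE definition combines several optional parts, which can be easier to follow when assembled step by step. The Python sketch below composes a URL from those parts, following the syntax line kafka://[<hostname>[:<port_number>]]/[<kafka_topic_id>][/ | /<partition> | /key | /root_key]. The helper kafka_datastore_url and the topic and host names are hypothetical, and writing the URL without spaces around the slashes is an assumption; the Apply Engine performs its own parsing.

```python
def kafka_datastore_url(topic="", host="", port=None, partitioning=""):
    """Compose a kafka:// datastore URL from its optional parts.

    partitioning may be "" (omitted), "/", "key", "root_key",
    or a non-negative integer partition number.
    """
    authority = host
    if port is not None:
        if not host:
            raise ValueError("a port requires a host name")
        authority = f"{host}:{port}"

    if partitioning == "":
        suffix = ""                      # let Kafka use its default partitioning
    elif partitioning == "/":
        suffix = "/"
    elif partitioning in ("key", "root_key"):
        suffix = f"/{partitioning}"
    elif isinstance(partitioning, int) and partitioning >= 0:
        suffix = f"/{partitioning}"      # explicit partition (not recommended)
    else:
        raise ValueError(f"unsupported partitioning option: {partitioning!r}")

    return f"kafka://{authority}/{topic}{suffix}"


# Broker resolved from the producer configuration file, wildcard topic, keyed partitioning.
print(kafka_datastore_url("hr_*_cdc", partitioning="key"))
# Explicit broker and partition number.
print(kafka_datastore_url("dept", host="broker01", port=9092, partitioning=0))
```

Leaving host and port empty matches the case where broker information is resolved from the producer configuration file rather than from the URL.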
- Target datastores that use Confluent‑managed schemas can be written only using the APPLY or REPLICATE function.
- The relationship between the DESCRIPTION alias, Kafka topic, and Schema Registry subject is determined by the organization’s Schema Registry design. The examples shown are illustrative and based on the source table and application context (for example, the EMPLOYEE and DEPARTMENT tables in the IVP_HR Db2 database).
- In the examples, the Schema Registry subject is defined using the default Confluent Control Center convention, where the subject name matches the topic name with the suffix -value.
- The Confluent Schema Registry supports multiple topic and subject naming strategies. All are supported by SQData; however, some strategies may not be compatible with other tools, including Confluent Control Center.
- For AVRO formats that use Schema Registry integration, the schema ID is supplied at run time by Confluent based on the TOPIC and SUBJECT values specified on the source DESCRIPTION. See the Apply Engine Reference for alternative methods of assigning a schema ID.
- Kafka partition assignment can be controlled using the datastore URL options described above and/or explicitly controlled using the SETURLKEY function.
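The notes on subjects, schema IDs, and tombstones can be illustrated with the Confluent wire format, in which each serialized key or value starts with a zero magic byte and a 4‑byte big‑endian schema ID, followed by the Avro‑encoded data. The Python sketch below is a minimal illustration under stated assumptions and does not show how the Apply Engine is implemented: the function names, the topic name ivp_hr_department, and the schema ID 42 are hypothetical, and the payload bytes merely stand in for Avro‑encoded data.

```python
import struct

MAGIC_BYTE = 0  # first byte of a Confluent-framed message


def subject_for(topic, is_key=False):
    """Default topic-name convention: <topic>-key or <topic>-value."""
    return f"{topic}-{'key' if is_key else 'value'}"


def frame(schema_id, avro_payload):
    """Prefix Avro-encoded bytes with the magic byte and 4-byte big-endian schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload


def unframe(message):
    """Split a framed message into (schema_id, avro_payload)."""
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("not a Confluent-framed message")
    return schema_id, message[5:]


def tombstone(framed_key):
    """A delete event: a non-null key paired with a null value."""
    return (framed_key, None)


topic = "ivp_hr_department"           # hypothetical topic name
key_bytes = frame(42, b"\x02")        # 42 is a made-up schema ID
print(subject_for(topic), subject_for(topic, is_key=True))
print(unframe(key_bytes))
print(tombstone(key_bytes))
```

Because a tombstone carries no value, the key is the only part of the record a consumer can use to tell which row was deleted, which is why the key and its subject are both required for the CONFLUENT TOMBSTONE format.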