Kafka Datastores

Connect CDC (SQData) Apply engine

Version: Latest
Copyright: 2026
First publish date: 2000
Last edition: 2026-03-26
Last publication: 2026-03-26T20:24:24.831000

While Connect CDC (SQData) commonly targets relational databases, Kafka is used when change data must be processed as events for downstream applications or processes. In many z/OS environments, IBM MQ has traditionally been used for distributed event communication. Apache Kafka provides an alternative publish‑and‑subscribe platform for building real‑time data pipelines.

Connect CDC (SQData) supports Kafka by bridging z/OS and open systems platforms and publishing streams of captured datastore changes directly to Kafka topics in real time.

Kafka is typically used when requirements go beyond simple replication to relational targets such as Db2/LUW, Oracle, or Microsoft SQL Server, including:

  • Event‑driven processing
  • Real‑time integration with downstream applications
  • Populating large‑scale data repositories for analytics or exploratory workloads where future questions may not yet be defined

Kafka Architecture and SQData integration

Kafka provides four core APIs. Two are particularly relevant in heterogeneous z/OS and open systems environments:
Producer API
Enables applications to publish streams of events to one or more Kafka topics.
Connect API
Connects Kafka topics to external systems, such as relational databases, by capturing changes to source tables.

The Apply Engine uses the Producer API, treating Kafka as a supported target datastore. It writes messages to Kafka topics in JSON, AVRO, or another supported format, using data captured by SQData capture agents.

Connect CDC (SQData)'s capture, publish, and apply architecture provides a two‑platform solution for z/OS environments where the Kafka Connect API is not natively supported. High‑performance capture agents paired with an Apply Engine running on Linux enable direct, point‑to‑point transfer of captured data from source to target. When properly configured, captured data is written directly to Kafka topics without intermediate staging or temporary storage.

Environmental Requirements

Connect CDC (SQData) supports Kafka targets for change data capture on both z/OS and open systems platforms. The following environmental requirements apply to the Apply Engine for Kafka:

  • The Apply Engine for Kafka is supported only on Linux.
    Note: Although the Apply Engine has been implemented on IBM AIX, Linux is the supported platform.
  • The Kafka external library librdkafka.so is required and must be version 0.8 or later. The library can be downloaded from GitHub.

  • The Kafka library must be available in the system library path or specified using the environment variable SQDATA_KAFKA_LIBRARY, which points to the library location.
  • The Kafka target datastore is identified by a Kafka broker URL, which includes the Kafka cluster host name and port, the fully qualified Kafka topic name, and an optional partition.
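The library requirement above can be checked before the engine is started. The following is a minimal pre-flight sketch, not part of the product: the function name is hypothetical, and checking SQDATA_KAFKA_LIBRARY before the system library path is an assumption, since the text does not state the lookup order.

```python
import ctypes.util
import os


def resolve_kafka_library(env=None):
    """Return a candidate librdkafka location, or None if nothing is found.

    Assumption: an explicit SQDATA_KAFKA_LIBRARY value takes precedence
    over whatever the system library path provides.
    """
    env = os.environ if env is None else env
    explicit = env.get("SQDATA_KAFKA_LIBRARY")
    if explicit:
        return explicit
    # Falls back to the platform's normal shared-library lookup.
    return ctypes.util.find_library("rdkafka")
```

A None result suggests that librdkafka is neither on the system library path nor named by the environment variable.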

Kafka Producer Configuration

Kafka producer behavior, including security and broker discovery, is configured using the sqdata_kafka_producer.conf file. This file is read by the Apply Engine at startup and supports all configuration properties provided by librdkafka.

The following examples illustrate common configuration patterns. Actual values depend on the Kafka cluster configuration.

Example: SSL
security.protocol=SSL
ssl.ca.location=/app/certificates/dev/abc_root_ca.cert
ssl.certificate.location=/home/<kafka_app_user>/kafkassl/client.pem
ssl.key.location=/home/<kafka_app_user>/kafkassl/client.key
ssl.key.password=test1234
metadata.broker.list=<broker_host_01>:<port>,<broker_host_02>:<port>,<broker_host_03>:<port>
Note: ssl.certificate.location is the path to the client's certificate (public key, PEM format), and ssl.key.location is the path to the client's private key. Both are used for authentication.

Example: SSL (truststore and keystore)
security.protocol=SSL
ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
ssl.truststore.password=test1234
ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=test1234
ssl.key.password=test1234
metadata.broker.list=<broker_host_01>:<port>,<broker_host_02>:<port>,<broker_host_03>:<port>

Example: SASL_SSL (Kerberos)
security.protocol=SASL_SSL
sasl.kerberos.service.name=kafka
sasl.kerberos.principal=<kafka_app_user@domain>
sasl.kerberos.keytab=/app/kafkalib/<kafka_app_user>.keytab
metadata.broker.list=<broker_host_01>:<port>,<broker_host_02>:<port>,<broker_host_03>:<port>
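Because the producer file is a list of key=value properties, a small script can sanity-check it before the Apply Engine reads it. The sketch below is illustrative only: the function names are hypothetical, treating lines that start with # as comments is an assumption, and the broker ports in the usage example are placeholders. The accepted security.protocol values and the bootstrap.servers alias follow the librdkafka configuration reference.

```python
KNOWN_PROTOCOLS = {"PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"}


def parse_producer_conf(text):
    """Parse key=value lines into a dict; a later duplicate key overwrites an earlier one."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: '#' marks a comment
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        props[key.strip()] = value.strip()
    return props


def check_producer_conf(props):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if not (props.get("metadata.broker.list") or props.get("bootstrap.servers")):
        problems.append("no broker list configured")
    protocol = props.get("security.protocol", "PLAINTEXT").upper()
    if protocol not in KNOWN_PROTOCOLS:
        problems.append(f"unknown security.protocol: {protocol}")
    return problems


sample = """security.protocol=SASL_SSL
sasl.kerberos.service.name=kafka
metadata.broker.list=broker01:9093,broker02:9093
"""
print(check_producer_conf(parse_producer_conf(sample)))
```

A check like this catches only structural mistakes. Whether the certificates, keytab, and principal are valid can be confirmed only against the Kafka cluster itself.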

Kafka Datastore Syntax

Use the DATASTORE command to define Kafka as a target datastore for the Apply Engine.
------------------------------------------------------------
--       DATASTORE SECTION
------------------------------------------------------------
-- SOURCE DATASTORE
DATASTORE cdc://server:port/capture/target
  OF UTSCDC
  AS CDCIN
  DESCRIBED BY GROUP SOURCE_TABLES;
                                            
-- TARGET DATASTORE
DATASTORE kafka://[<hostname>[:<port_number>]] / [<kafka_topic_id>][/ | /<partition> | /key | /root_key]     
   OF AVRO FORMAT [CONFLUENT | CONFLUENT TOMBSTONE | CONTAINER | PLAIN]
   AS TARGET
   KEY IS DEPTNO, MGRNO
   KEY SUBJECT <kafka_topic_id>-key
   DESCRIBED BY GROUP SOURCE_TABLES;

Parameters

<hostname>:<port_number>

Optional. Specifies a Kafka broker host name and TCP/IP port.

Kafka broker information is typically resolved dynamically using the Kafka producer configuration file (sqdata_kafka_producer.conf) located in the Apply Engine working directory at startup. This file can include any configuration properties supported by librdkafka; however, most deployments specify only a limited subset, such as broker endpoints and producer‑specific security settings.

Kafka security configuration is cluster‑specific. For background information, see Apache Kafka security and Confluent security. Configuration details should be defined in coordination with the Kafka cluster administrator.

<kafka_topic_id> | <prefix>_*_<suffix>

Optional. Specifies the Kafka topic name used as the target datastore. The topic may be defined explicitly using a fully qualified Kafka topic ID, or dynamically using a wildcard (*).

When a wildcard is used, the topic name is resolved at runtime. The * is replaced with the alias name of the source DESCRIPTION by default, or with the name specified using the TOPIC <name> clause in the DESCRIPTION. The wildcard may be preceded and/or followed by additional characters to form the complete topic name (for example, <prefix>_*_<suffix>).

Dynamic topic naming is useful when generating multiple topics or when topic names are long or programmatically derived. Whether Kafka topics must be created in advance depends on the Kafka cluster configuration.
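The wildcard substitution described above can be sketched as follows. This is an illustration of the naming rule, not the engine's implementation: the function name and the sample pattern hr_*_cdc are hypothetical, and preserving the alias's case is an assumption.

```python
def resolve_topic(pattern, alias, topic_override=None):
    """Replace the * in a topic pattern with the TOPIC name if one is given, else the alias.

    A pattern without a wildcard is returned unchanged (an explicit topic name).
    """
    name = topic_override if topic_override else alias
    return pattern.replace("*", name, 1)


# Alias of the source DESCRIPTION is used by default.
print(resolve_topic("hr_*_cdc", "EMPLOYEE"))
# A TOPIC <name> clause on the DESCRIPTION takes the alias's place.
print(resolve_topic("hr_*_cdc", "EMPLOYEE", topic_override="emp"))
```

With one pattern and a group of source descriptions, each description resolves to its own topic, which is what makes the wildcard form convenient for multi-table replication.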

[/<partition> | /key | /root_key | / ]

Specifies how Kafka determines the target partition for produced messages.

If omitted, Kafka uses its default partitioning behavior, distributing messages across available partitions. A specific partition number may be specified; however, using explicit partition numbers is not recommended, as it introduces additional administrative overhead when topics are modified.

The /key option is used for relational, VSAM, and keyed file sources and instructs Kafka to derive the partition from a key value. By default, relational sources use the concatenated list of source key columns. VSAM and keyed file sources require a KEY IS clause on each source DESCRIPTION. A KEY IS clause may also be specified for relational sources to override the default key selection. Using /key ensures that successive changes to the same row or record are written to the same partition and processed by consumers in capture order.
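The ordering guarantee rests on a deterministic mapping from key to partition. The sketch below shows the idea with a CRC32 hash. It is not necessarily the partitioner used by the Apply Engine or by a given librdkafka configuration, and the function name, the plain string concatenation of key columns, and the sample values are assumptions made for illustration.

```python
import zlib


def partition_for_key(key_columns, num_partitions):
    """Map a row's key columns to a partition number in [0, num_partitions).

    The key columns are concatenated and hashed, so the same key
    always yields the same partition for a fixed partition count.
    """
    key = "".join(str(value) for value in key_columns).encode("utf-8")
    return zlib.crc32(key) % num_partitions


# Two successive changes to the same row (same DEPTNO, MGRNO) land together.
first = partition_for_key(["D11", "000060"], 6)
second = partition_for_key(["D11", "000060"], 6)
print(first == second)
```

The mapping depends on the partition count, which is one reason that changing the number of partitions on a live topic can redirect existing keys to different partitions.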

The /root_key option applies only to IMS sources and uses the root segment key to determine the target partition. This ensures that all segments captured under the same IMS root are written to the same partition and processed together in capture order.

A single / must be specified as a placeholder when the SETURLKEY function is used to define a custom partitioning key.
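The URL components described so far can be summarized with a small parser. This is a reading aid under stated assumptions, not the engine's parser: the class, the function, and the label strings ("default", "seturlkey", "partition:<n>") are hypothetical, and the spaces around / in the syntax diagram are treated as notation rather than as part of the URL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KafkaUrl:
    host: Optional[str]
    port: Optional[int]
    topic: Optional[str]
    partitioning: str  # "default", "key", "root_key", "seturlkey", or "partition:<n>"


def parse_kafka_url(url):
    """Split kafka://[host[:port]]/[topic][/option] into its parts."""
    prefix = "kafka://"
    if not url.startswith(prefix):
        raise ValueError("not a kafka:// URL")
    authority, _, path = url[len(prefix):].partition("/")
    host, _, port = authority.partition(":")
    topic, sep, option = path.partition("/")
    if not sep:
        partitioning = "default"      # option omitted: Kafka's default behavior
    elif option == "":
        partitioning = "seturlkey"    # bare trailing / placeholder
    elif option in ("key", "root_key"):
        partitioning = option
    elif option.isdigit():
        partitioning = f"partition:{option}"
    else:
        raise ValueError(f"unknown partition option: {option!r}")
    return KafkaUrl(host or None, int(port) if port else None, topic or None, partitioning)


print(parse_kafka_url("kafka:///hr_*_cdc/key"))
```

In the usage line the host is left empty, which matches the common case where brokers are resolved from sqdata_kafka_producer.conf.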

OF

Specifies the format used to write Kafka topic messages. Kafka topics can be written in JSON or AVRO format. When AVRO is selected, one of the following AVRO encoding formats must be specified:

  • CONFLUENT — Writes AVRO‑encoded messages using the Confluent wire format and Schema Registry.
  • CONFLUENT TOMBSTONE — Writes tombstone records, defined as records with a non‑null key and a null value, typically used to represent delete events. For additional details, refer to the Kafka documentation.
  • CONTAINER — Writes AVRO messages using the standard Avro container file format.
  • PLAIN — Writes AVRO‑encoded messages without Confluent framing or container formatting.
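To make the difference between the CONFLUENT and PLAIN encodings concrete, the sketch below frames an already-encoded Avro payload in the Confluent wire format: one magic byte (0), a 4-byte big-endian schema ID, then the Avro binary data. The function names are hypothetical, the payload bytes are placeholders rather than output of a real Avro encoder, and the sketch says nothing about how the Apply Engine builds its messages internally.

```python
import struct

MAGIC_BYTE = 0  # first byte of every Confluent-framed message


def frame_confluent(schema_id, avro_payload):
    """Prefix an Avro-encoded payload with the 5-byte Confluent header."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload


def unframe_confluent(message):
    """Return (schema_id, avro_payload) from a Confluent-framed message."""
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("not a Confluent-framed message")
    return schema_id, message[5:]


def tombstone(key_bytes):
    """A tombstone is a record with a non-null key and a null value."""
    return key_bytes, None


framed = frame_confluent(42, b"\x02")
print(framed.hex())
print(unframe_confluent(framed))
```

A PLAIN message would carry only the payload bytes, so a consumer has to obtain the schema by some other means.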

AS

Assigns an alias to the Kafka target datastore. The alias is used to reference the target datastore within the Apply Engine script (for example, in REPLICATE, SETURL, or SETURLKEY statements).

The alias TARGET is commonly used by convention.

KEY IS
KEY SUBJECT

KEY IS defines the Kafka message key, which identifies Kafka records and is used for partitioning and record identity. KEY SUBJECT specifies the Schema Registry subject under which the key schema is registered and which is used to serialize the key.

Both are required when using CONFLUENT TOMBSTONE format, as delete events are represented by records with a non‑null key and a null value, and record identification relies entirely on the key.

DESCRIBED BY GROUP

Specifies the DESCRIPTION group that defines the metadata (such as table or segment structure, keys, topics, and schema details) used by the datastore.
Note:
  • Target datastores that use Confluent‑managed schemas can be written only using the APPLY or REPLICATE function.
  • The relationship between the DESCRIPTION alias, Kafka topic, and Schema Registry subject is determined by the organization’s Schema Registry design. The examples shown are illustrative and based on the source table and application context (for example, the EMPLOYEE and DEPARTMENT tables in the IVP_HR Db2 database).
  • In the examples, the Schema Registry subject is defined using the default Confluent Control Center convention, where the subject name matches the topic name with the suffix -value.
  • The Confluent Schema Registry supports multiple topic and subject naming strategies. All are supported by SQData; however, some strategies may not be compatible with other tools, including Confluent Control Center.
  • For AVRO formats that use Schema Registry integration, the schema ID is supplied at run time by Confluent based on the TOPIC and SUBJECT values specified on the source DESCRIPTION. See the Apply Engine Reference for alternative methods of assigning a schema ID.
  • Kafka partition assignment can be controlled using the datastore URL options described above and/or explicitly controlled using the SETURLKEY function.
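The subject naming convention referred to in these notes (topic name plus -value, and -key for the key schema as in the KEY SUBJECT clause) can be written out as a short helper. The function name and the sample topic are hypothetical, and other Schema Registry naming strategies produce different subjects.

```python
def default_subjects(topic):
    """Return the key and value subject names under the topic-name convention."""
    return {"key": f"{topic}-key", "value": f"{topic}-value"}


print(default_subjects("hr_EMPLOYEE_cdc"))
```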