Configure engine

Connect CDC (SQData) Change Data Capture

Product type: Software
Portfolio: Integrate
Product family: Connect
Product: Connect > Connect CDC (SQData)
Version: Latest
Language: English
Copyright: 2024
First publish date: 2000
ft:lastEdition: 2024-09-05
ft:lastPublication: 2024-09-05T15:00:09.754973

This Capture Agent supports two types of Engines: the Apply Engine, which provides maximum control over the replication process and supports a variety of target datastores, and the Replicator Engine, which is designed for maximum streaming replication performance but offers limited target datastore options.

The function of an Apply Engine may be simple replication, data transformation, event processing, source datastore unload, or a more sophisticated active/active data replication scenario. The actions performed by an Apply Engine are described by an Engine Script, the complexity of which depends entirely on the intended function and the business rules required to describe that function.

The most common function performed by an Apply Engine is to process data from one of the Change Data Capture (CDC) agents, applying business rules to transform that data so that it can be applied or efficiently replicated to a Target datastore of any type on any operating platform.

Follow these steps to configure an Apply Engine:

  1. Determine requirements

    Identify the type of the target datastore, the platform the Apply Engine will run on, and the data transformations required, if any, to map the source data to the target data structures.

  2. Prepare Apply Engine Environment

    Once the platform and type of target datastore are known, the environment on that platform must be prepared, including the installation of Connect CDC SQData and any other components required by the target datastore. Connect CDC SQData also uses your existing native TCP/IP network to publish data captured on one platform to Engines running on any other platform. Factors including performance requirements and network latency should be considered when selecting the location of the system on which the Engine will execute.

  3. Configure Engine Controller Daemon

    The Engine Controller Daemon is the same program, SQDaemon, as the Capture Controller Daemon, but it provides local and remote management and control of Engines, Utilities and other User agents on the platform where they execute. Precisely recommends using an Engine Controller Daemon to simplify operation, including the optional automatic startup of Engine agents following platform restart.

  4. Create Apply Engine Script

    The Apply Engine uses a SQL-like scripting language capable of a wide range of operations, from replication of identical source and target structures using a single command to complex transformations based on business rules. Connect CDC SQData commands and functions provide full procedural control of data filtering, mapping and transformation, including manipulation of data at its most elemental level if required.

  5. End-to-end Component Verification

    Confirm successful Change Data Capture through target datastore content validation.
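
One way to approach the content validation in step 5 is to compare rows read from the source and target datastores by key. The following is a minimal sketch, not a Connect CDC SQData utility: the function names, the `EMP_NO` and `NAME` columns, and the sample rows are all hypothetical, and it assumes the rows have already been fetched from each datastore as dictionaries.

```python
from typing import Any, Dict, Iterable, List, Sequence, Tuple

Row = Dict[str, Any]
Key = Tuple[Any, ...]


def index_by_key(rows: Iterable[Row], key_columns: Sequence[str]) -> Dict[Key, Row]:
    """Index rows by the values of their key columns."""
    return {tuple(row[col] for col in key_columns): row for row in rows}


def compare_datastores(
    source_rows: Iterable[Row],
    target_rows: Iterable[Row],
    key_columns: Sequence[str],
) -> Dict[str, List[Key]]:
    """Report keys that are missing, unexpected, or different on the target."""
    source = index_by_key(source_rows, key_columns)
    target = index_by_key(target_rows, key_columns)
    return {
        # Present in the source but never applied to the target.
        "missing": sorted(k for k in source if k not in target),
        # Present in the target but absent from the source.
        "unexpected": sorted(k for k in target if k not in source),
        # Present in both, but with differing column values.
        "mismatched": sorted(
            k for k in source if k in target and source[k] != target[k]
        ),
    }


# Hypothetical sample data standing in for rows read from each datastore.
source = [
    {"EMP_NO": 1, "NAME": "ADAMS"},
    {"EMP_NO": 2, "NAME": "BAKER"},
    {"EMP_NO": 3, "NAME": "CLARK"},
]
target = [
    {"EMP_NO": 1, "NAME": "ADAMS"},
    {"EMP_NO": 2, "NAME": "BAKERS"},
    {"EMP_NO": 4, "NAME": "DAVIS"},
]
report = compare_datastores(source, target, ["EMP_NO"])
print(report)
```

With the sample data, the report lists key `(3,)` as missing, `(4,)` as unexpected, and `(2,)` as mismatched. An empty report would suggest that the target content matches the source for the rows compared.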

The Replicator Engine is controlled by a simple configuration file that merely identifies source and target datastores. Its primary purpose is to operate like a utility, offering high-performance source-to-target replication with a focus on streaming targets such as Kafka. It runs only on Linux.

The Replicator Engine operates in two modes:

  • As a single-purpose version of the Apply Engine designed for pure replication of captured relational source data to selected target datastores. The objective is to provide a utility-like replication solution that requires minimal configuration and eliminates, to the extent possible, all maintenance by supporting what we refer to as Schema Evolution. For Db2 z/OS only, changes to Db2 schemas automatically generate altered JSON schemas for Kafka and, when using AVRO, automatically update the Confluent Schema Repository and the AVRO-formatted Kafka topic payload. The Replicator Engine performs no unnecessary data transformations and provides no ability to inject business rules that affect the data written to the target. Limited options provide global control of Kafka topic header information.
  • As a Distributor for data captured by either the IMS TM Exit or the IMS Log Capture agent. In this mode, the IMS CDC data is written to special-purpose Kafka topics containing CDCRAW data partitioned by the full or partial key of the root segments. Partitioning the Kafka targets allows subsequent parallel processing of the CDCRAW data by Apply Engines configured as Kafka consumers.
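
The effect of partitioning by the full or partial key of the root segment can be illustrated with a small sketch. It is not the Replicator Engine's actual partitioner: the CRC32 hash, the function names, the key values, and the partition count are all assumptions chosen for illustration. The point it shows is that records sharing the same (partial) key always land in the same partition, so a single consumer sees all changes for that key.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Optional


def partition_for(root_key: bytes, num_partitions: int,
                  key_length: Optional[int] = None) -> int:
    """Map a root segment key to a partition number.

    key_length=None hashes the full key; otherwise only the leading
    key_length bytes (a partial key) are hashed.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    partition_key = root_key if key_length is None else root_key[:key_length]
    # CRC32 is an assumed stand-in for whatever stable hash is really used.
    return zlib.crc32(partition_key) % num_partitions


def distribute(root_keys: List[bytes], num_partitions: int,
               key_length: Optional[int] = None) -> Dict[int, List[bytes]]:
    """Group keys by the partition they would be written to."""
    partitions: Dict[int, List[bytes]] = defaultdict(list)
    for key in root_keys:
        partitions[partition_for(key, num_partitions, key_length)].append(key)
    return dict(partitions)


# Hypothetical root segment keys; the first two share an 8-byte prefix.
keys = [b"CUST0001A", b"CUST0001B", b"CUST0002A", b"CUST0003A"]
by_full_key = distribute(keys, num_partitions=4)
by_partial_key = distribute(keys, num_partitions=4, key_length=8)
print(by_full_key)
print(by_partial_key)
```

With `key_length=8`, `CUST0001A` and `CUST0001B` are guaranteed to share a partition because only their common prefix is hashed. With the full key they may or may not.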

Follow these steps to configure the RDBMS Replicator Engine for JSON or AVRO formatted Kafka topics:

  1. Determine requirements

    Identify the type of the target datastore: JSON or AVRO formatted Kafka topics.

  2. Prepare Replicator Engine Environment

    The Replicator Engine runs on Linux, and preparation requires only the installation of Connect CDC SQData and any other components required by the target datastore. A Kafka target, for example, requires the open source librdkafka C language API and, if a Confluent Schema Repository is to be used, the libcurl API, in addition to access to the Kafka cluster and the optional Confluent Repository. Like all Engine-side processing, the Replicator Engine also uses your existing native TCP/IP network to receive data captured and published on another platform. Factors including performance requirements and network latency should be considered when selecting the location of the system on which the Replicator Engine will execute.

  3. Configure Engine Controller Daemon

    The Engine Controller Daemon is the same program, SQDaemon, as the Capture Controller Daemon, but it provides local and remote management and control of Engine, Utility and other User agents on the platform where they execute. Precisely recommends using an Engine Controller Daemon to simplify operation, including the optional automatic startup of Engine agents following platform restart.

  4. Create Replicator Engine Configuration Script

    The Replicator Engine uses a very simple configuration script with very few options and no data transformation logic. See the Replicator Engine Reference for details.

  5. End-to-end Component Verification

    Confirm successful Change Data Capture through target datastore content validation.
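
Step 2 above names librdkafka and libcurl as prerequisites for a Kafka target. As a rough pre-installation check, the sketch below uses Python's standard `ctypes.util.find_library` to look for both shared libraries on the Linux host. The function names are hypothetical, and the check is only indicative: `find_library` searches the standard linker locations, so a library installed in a non-standard directory may be reported as not found.

```python
from ctypes.util import find_library
from typing import Dict, List, Optional


def check_kafka_prerequisites(
    use_schema_repository: bool = False,
) -> Dict[str, Optional[str]]:
    """Look up the shared libraries that a Kafka target depends on.

    Returns a mapping of library name to the name the linker resolved,
    or None when the library was not found in the standard locations.
    """
    required = ["rdkafka"]  # librdkafka: Kafka client API
    if use_schema_repository:
        required.append("curl")  # libcurl: needed for the schema repository
    return {name: find_library(name) for name in required}


def missing_libraries(found: Dict[str, Optional[str]]) -> List[str]:
    """List the libraries that could not be located."""
    return sorted(name for name, path in found.items() if path is None)


found = check_kafka_prerequisites(use_schema_repository=True)
for name, location in found.items():
    print(f"lib{name}: {location or 'not found'}")
```

Passing `use_schema_repository=False` skips the libcurl lookup, which matches the case where JSON formatted topics are written without a Confluent Schema Repository.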

Notes:

  • See Db2/z Straight Replication for a sample Apply Engine script, and see the Apply and Replicator Engine References for a full explanation of the capabilities provided by both types of Engine.
  • See Add Engine Controller Daemon for an example of the configuration.