Prepare target apply environment

Connect CDC (SQData) Kafka Quickstart

Product: Connect CDC (SQData)
Version: Latest
Copyright 2024; first published 2000; last published 2024-07-30

This section prepares the Linux environment where the Engine will operate.

Note: Before you begin, make sure Connect CDC (SQData) is installed on Linux.

Create variable directories

Once the Linux, UNIX, and Windows source and target systems and datastores have been identified, you can configure the Capture Agents, Apply Engines, and Controller Daemons. You will need to create directories and files for the variable portions of the configuration.

The recommended location for the base product installation is:
/opt/sqdata or
/home/<sqdata_user>/sqdata
If an environment variable is used, the recommended variable is:
SQDATA_DIR
Controller Daemons, Capture Agents, and Engines require directories and files for the variable portions of their configurations. Just as the location of the base product installation can be modified, the location of the variable directories can be adjusted to conform to the operating system and to accommodate areas of responsibility, including the associated application and environments such as Test and Production. The locations most commonly used on Linux, AIX, and Windows are:
/var/opt/sqdata[/<application>[/<environment>]] or
/home/<sqdata_user>[/<application>[/<environment>]] or simply
/home/sqdata[/<application>[/<environment>]]
If an environment variable is used, the recommended variable is:
SQDATA_VAR_DIR
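Assuming the common locations above, the two variables might be set as follows; the paths are illustrative, so substitute your site's actual layout:

```shell
# Illustrative locations -- adjust to your installation.
export SQDATA_DIR=/opt/sqdata          # base product installation
export SQDATA_VAR_DIR=/var/opt/sqdata  # variable (working) directories
```

Adding the same exports to the sqdata user's shell profile keeps them available across sessions.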
While only the base variable directory is required and the daemon directory location is optional, the recommended structure is:
  • <SQDATA_VAR_DIR>/daemon - The working directory used by the Daemon; it contains two subdirectories.
  • <SQDATA_VAR_DIR>/daemon/cfg - A configuration directory that contains two configuration files.
  • <SQDATA_VAR_DIR>/daemon/logs - A logs directory; though not required, it is suggested for storing the log files used by the Controller Daemon. Its location must match the file locations specified in the Global section of the sqdagents.cfg file created in the section "Setup Controller Daemon".

Additional directories should be created for each Capture agent running on the system. Precisely recommends the structures described below:
  • <SQDATA_VAR_DIR>/<type>cdc - The working directory of each Capture agent, where <type> might be ORA (Oracle) or UDB (Db2/LUW).
  • <SQDATA_VAR_DIR>/<type>cdc/data - A data directory, also required by each Capture agent. Files are allocated in this directory as needed by the CDCStore Storage Agent when transient data exceeds the allocated in-memory storage. This location must match the "data_path" specified in the Storage agent configuration (.cab file) described in Setup and Configure Sources. In production, a dedicated file system is required with this directory as its mount point.

    Example:

    The following commands will create the directories described above:
    $ mkdir -p <SQDATA_VAR_DIR>/daemon --mode=775
    $ mkdir -p <SQDATA_VAR_DIR>/daemon/cfg --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/daemon/logs --mode=775
    $ mkdir -p <SQDATA_VAR_DIR>/<type>cdc --mode=775
    $ mkdir -p <SQDATA_VAR_DIR>/<type>cdc/data --mode=775
Note: The User-ID(s) under which the Capture and Engine agents and the Controller Daemon will run must be authorized for Read/Write access to these directories.
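As a sketch, the tree can be created and opened up to those user-IDs in one pass. The scratch-directory default, the oracdc type, and the group-write approach below are assumptions for illustration:

```shell
# Use a scratch directory unless SQDATA_VAR_DIR is already set.
SQDATA_VAR_DIR="${SQDATA_VAR_DIR:-$(mktemp -d)}"

# Create the daemon and capture directories with group read/write (775).
mkdir -p --mode=775 \
    "$SQDATA_VAR_DIR/daemon/cfg" \
    "$SQDATA_VAR_DIR/daemon/logs" \
    "$SQDATA_VAR_DIR/oracdc/data"

# Keep group members (the Capture, Engine, and Daemon user-IDs) able to
# read and write; run "chown -R <sqdata_user> ..." as root if the tree
# must belong to a dedicated account.
chmod -R g+rwX "$SQDATA_VAR_DIR"
```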

Create application directory structure

Connect CDC (SQData) Apply and Replicator Engines share a number of operational components, including NaCl Keys and the Connect CDC (SQData) daemon. The Linux directory structure described below should be used for Apply Engines.

The Connect CDC (SQData) variable directory <SQDATA_VAR_DIR> works for Capture Agents and the Controller Daemon. Apply Engine script development, however, requires a structure for similar items from different platforms, such as DDL from Db2 and Oracle. The following directory nodes are recommended for script development and parts management.

In the table below, it is only necessary to create the directories that are needed for the source and target types used at the site. For example, if you are not replicating from IMS, you do not need the IMSSEG and IMSDBD directories.

<SQDATA_VAR_DIR>/<type>cdc/<directory_name>   Description
ENGINE        Main Engine scripts
CDCPROC       CDC Engine Called Procedures referenced by #INCLUDE
LOADPROC      Load (UnLoad) Engine Called Procedures referenced by #INCLUDE
DSDEF         Datastore Definitions referenced by #INCLUDE
<TYPE>DDL     RDBMS-specific DDL, e.g. DB2DDL, ORADDL, MSSQLDDL
IMSSEG        IMS Segment Copybooks
IMSDBD        IMS DBDs
<TYPE>COB     System-specific COBOL copybooks, e.g. VSAMCOB, SEQCOB (sequential files)
XMLDTD        XML Document Type Definitions that will be used in a DESCRIPTION command
<TYPE>CSR     RDBMS-specific Cursors, e.g. DB2CSR, ORACSR
<TYPE>LOAD    RDBMS-specific Load Control, e.g. DB2LOAD, ORALOAD
Note:
  • If your environment includes the z/OS platform, it is recommended that z/OS references use upper case.
  • Engine scripts are platform specific and cannot be used on another platform (i.e., between z/OS and UNIX).
  • Called Procedures can be used with little or no change on another platform, even when they contain platform-specific Functions, unless they require direct access to a datastore on another platform, a typical requirement.
  • Part locations usually refer to the last node of standard z/OS Partitioned Datasets and of the UNIX or Windows directory hierarchy.

Unzip the Basic_Linux_Parts_Structure.zip file to create the full Linux directory structure along with sample parts and shell scripts.
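For example, assuming the archive has been downloaded to the current directory and SQDATA_VAR_DIR is set, commands along these lines would extract it (the guard keeps the step a no-op if the file is absent):

```shell
SQDATA_VAR_DIR="${SQDATA_VAR_DIR:-$(mktemp -d)}"

# Extract the sample parts and shell scripts into the variable directory.
if [ -f Basic_Linux_Parts_Structure.zip ]; then
    unzip -o Basic_Linux_Parts_Structure.zip -d "$SQDATA_VAR_DIR"
    ls "$SQDATA_VAR_DIR"
fi
```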

Commands similar to the following may be used to create the recommended directory structures:
$ mkdir -p <SQDATA_VAR_DIR>/DB2DDL --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/ORADDL --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/IMSDBD --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/IMSSEG --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/ENGINE --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/CDCPROC --mode=775

The nature of Replicator Engine Configuration scripts does not require a complicated structure. Precisely recommends that you consider how your source database and applications are structured, since it may be desirable to maintain Replicator Engine Configurations in a similar structure:

<SQDATA_VAR_DIR>/<type>cdc/<directory_name> Description
ENGINE Replicator Engine Configuration Scripts
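For instance, a Db2/LUW capture feeding a Replicator Engine might use a udbcdc node; the type name here is illustrative:

```shell
# Create the Replicator Engine configuration directory under the
# (assumed) udbcdc capture working directory.
SQDATA_VAR_DIR="${SQDATA_VAR_DIR:-$(mktemp -d)}"
mkdir -p --mode=775 "$SQDATA_VAR_DIR/udbcdc/ENGINE"
```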

Resolve external library requirements

librdkafka.so provides the C-language API for the Producer, Consumer, and Admin clients and is required by the Engines to connect to Kafka. The required version of the Kafka external library is 0.80 or higher. The library can be installed using a distribution-provided package or downloaded from GitHub.

Note: Precisely recommends using the most current version of this open source library whenever possible.

Kcat, a command-line tool from the developers of librdkafka that uses the same API, should also be installed from https://github.com/edenhill/kcat. It is a prerequisite and should be used to test and diagnose all installation-specific configurations and connection issues. Once kcat is working, the same configuration parameters can be used for Connect CDC (SQData). Kcat can also confirm topic content and Engine execution by acting as a local Kafka consumer, providing end-to-end validation of data replication.
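A typical kcat smoke test looks like the following. The broker address and topic name are placeholders, and the commands only run once a broker has been configured:

```shell
KAFKA_BROKER="${KAFKA_BROKER:-}"         # e.g. broker1:9092 -- set for your site
TEST_TOPIC="${TEST_TOPIC:-sqdata.test}"  # hypothetical test topic

if command -v kcat >/dev/null 2>&1 && [ -n "$KAFKA_BROKER" ]; then
    # List brokers, topics, and partitions to verify connectivity.
    kcat -b "$KAFKA_BROKER" -L

    # Produce one test record, then consume it back from the beginning.
    echo 'connectivity test' | kcat -b "$KAFKA_BROKER" -t "$TEST_TOPIC" -P
    kcat -b "$KAFKA_BROKER" -t "$TEST_TOPIC" -C -o beginning -e
fi
```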

libcurl is required by the Engines to communicate with the Confluent Schema Registry when using AVRO-formatted Kafka topics. libcurl can be installed using a distribution-provided package, if available, or built from source downloaded from https://curl.se/download.html. For more information about AVRO and the Confluent platform, see https://www.confluent.io/download/.
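To check whether a suitable libcurl is already present before building from source, the dynamic-linker cache can be queried; this is a sketch, and package names vary by distribution:

```shell
# Look for libcurl in the linker cache; install the distribution package
# (e.g. libcurl4 on Debian/Ubuntu, libcurl on RHEL) if nothing is listed.
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | grep -i libcurl || echo "libcurl not found in linker cache"
fi
```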


Set external library path

If the Kafka external library is not already in the system library path, the environment variable SQDATA_KAFKA_LIBRARY must be set to point to the external library.
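For example, the variable can be exported before starting the Engine; the path is an assumption, so point it at wherever your distribution installs librdkafka:

```shell
# Tell the Engine where to load the Kafka client library from.
export SQDATA_KAFKA_LIBRARY=/usr/lib64/librdkafka.so
```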