This section prepares the Linux environment where the Engine will operate.
Create variable directories
Once the Linux, UNIX, and Windows source and target systems and datastores have been identified, you can configure the Capture Agents, Apply Engines, and Controller Daemons. You will need to create directories and files for the variable portions of the configuration.
Location | Referred to as |
---|---|
/opt/sqdata or /home/<sqdata_user>/sqdata | SQDATA_DIR |
/var/opt/sqdata[/<application>[/<environment>]], /home/<sqdata_user>[/<application>[/<environment>]], or simply /home/sqdata[/<application>[/<environment>]] | SQDATA_VAR_DIR |
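As a convenience, these locations can be set as shell variables so that later commands can reference them. A minimal sketch, assuming illustrative values only; the paths and the <application>/<environment> names (payroll, prod) are hypothetical examples, not defaults:

```shell
# Illustrative only: substitute your site's installation and variable
# directory paths. "payroll" and "prod" stand in for <application> and
# <environment> and are hypothetical.
SQDATA_DIR=/opt/sqdata
SQDATA_VAR_DIR=/var/opt/sqdata/payroll/prod
export SQDATA_DIR SQDATA_VAR_DIR
```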
- <SQDATA_VAR_DIR>/daemon - The working directory used by the Controller Daemon, which also contains two subdirectories.
- <SQDATA_VAR_DIR>/daemon/cfg - A configuration directory that contains two configuration files.
- <SQDATA_VAR_DIR>/daemon/logs - A logs directory. Though not required, it is suggested for storing log files used by the Controller Daemon. Its location must match the file locations specified in the Global section of the sqdagents.cfg file created in the section "Setup Controller Daemon".
- <SQDATA_VAR_DIR>/<type>cdc - The working directory of each Capture Agent, where <type> might be ORA (Oracle) or UDB (Db2/LUW).
- <SQDATA_VAR_DIR>/<type>cdc/data - A data directory required by each Capture Agent. Files are allocated in this directory as needed by the CDCStore Storage Agent when transient data exceeds the allocated in-memory storage. The location must match the "data_path" specified in the Storage Agent configuration (.cab file) described in Setup and Configure Sources. In production, a dedicated file system is required with this directory as its mount point.
Example:

The following commands create the directories described above:

$ mkdir -p <SQDATA_VAR_DIR>/daemon --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/daemon/cfg --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/daemon/logs --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/<type>cdc --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/<type>cdc/data --mode=775
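Because production requires a dedicated file system mounted at the Capture Agent's data directory, a check such as the following can confirm the mount before starting a capture. This is a sketch: the path is illustrative, and it assumes the `mountpoint` utility from util-linux is available.

```shell
# Warn if the CDCStore data directory is not its own mount point.
# Substitute your actual <SQDATA_VAR_DIR>/<type>cdc/data path.
DATA_DIR=/var/opt/sqdata/oracdc/data
if mountpoint -q "$DATA_DIR"; then
    echo "$DATA_DIR is a dedicated mount point"
else
    echo "WARNING: $DATA_DIR is not a mount point" >&2
fi
```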
Create application directory structure
Connect CDC (SQData) Apply and Replicator Engines share a number of operational components, including both NaCl keys and the Connect CDC (SQData) daemon. The Linux directory structure described below should be used for Apply Engines.
The Connect CDC (SQData) variable directory <SQDATA_VAR_DIR> location works for both Capture Agents and the Controller Daemon. Apply Engine script development, however, requires a structure for similar items from different platforms, such as DDL from Db2 and Oracle. The following directory nodes are recommended for script development and parts management.
In the table below, it is only necessary to create the directories needed for the source and target types used at your site. For example, if you are not replicating from IMS, you do not need the IMSSEG and IMSDBD directories.
<SQDATA_VAR_DIR>/<type>cdc/<directory_name> | Description |
---|---|
ENGINE | Main Engine scripts |
CDCPROC | CDC Engine Called Procedures referenced by #INCLUDE |
LOADPROC | Load (UnLoad) Engine Called Procedures referenced by #INCLUDE |
DSDEF | Datastore Definitions referenced by #INCLUDE |
<TYPE>DDL | RDBMS-specific DDL, e.g. DB2DDL, ORADDL, MSSQLDDL, etc. |
IMSSEG | IMS Segment Copybooks |
IMSDBD | IMS DBDs |
<TYPE>COB | System-specific COBOL copybooks, e.g. VSAMCOB, SEQCOB (sequential files) |
XMLDTD | XML Document Type Definitions that will be used in a DESCRIPTION command |
<TYPE>CSR | RDBMS-specific Cursors, e.g. DB2CSR, ORACSR, etc. |
<TYPE>LOAD | RDBMS-specific Load Control, e.g. DB2LOAD, ORALOAD, etc. |
- If your environment includes the z/OS platform, it is recommended that z/OS references use upper case.
- Engine scripts are platform specific and cannot be used on another platform (i.e., z/OS and UNIX).
- Called Procedures can be used with little or no change on another platform, even when they contain platform-specific Functions, unless they require direct access to a datastore on another platform, which is a typical requirement.
- Part locations usually refer to the last node of standard z/OS Partitioned Datasets and of the UNIX or Windows directory hierarchy.
Unzip the Basic_Linux_Parts_Structure.zip file to create the full Linux directory structure along with sample parts and shell scripts. Alternatively, create only the directories you need:
$ mkdir -p <SQDATA_VAR_DIR>/DB2DDL --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/ORADDL --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/IMSDBD --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/IMSSEG --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/ENGINE --mode=775
$ mkdir -p <SQDATA_VAR_DIR>/CDCPROC --mode=775
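The individual commands above can also be driven from a single loop. This is a sketch: the directory list is an illustrative subset of the table, and the mktemp fallback exists only to keep the example self-contained; in practice SQDATA_VAR_DIR is your variable directory.

```shell
# Create the part directories in one pass; trim the list to the source and
# target types used at your site. The mktemp fallback is for illustration
# only, so the example runs even when SQDATA_VAR_DIR is unset.
SQDATA_VAR_DIR=${SQDATA_VAR_DIR:-$(mktemp -d)}
for d in ENGINE CDCPROC LOADPROC DSDEF DB2DDL ORADDL IMSSEG IMSDBD; do
    mkdir -p --mode=775 "$SQDATA_VAR_DIR/$d"
done
```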
Replicator Engine configuration scripts do not by their nature require a complicated structure. Precisely recommends, however, that you consider how your source database and applications are structured, since it may be desirable to maintain Replicator Engine configurations in a similar structure:
<SQDATA_VAR_DIR>/<type>cdc/<directory_name> | Description |
---|---|
ENGINE | Replicator Engine Configuration Scripts |
Resolve external library requirements
librdkafka.so provides the C-language API for Producer, Consumer, and Admin clients and is required by the Engines to connect to Kafka. The required version of this Kafka external library is 0.8.0 or higher. The library can be installed using a distribution-provided package or downloaded from GitHub.
kcat, a command-line tool from the developers of librdkafka that uses the same API, should also be installed from https://github.com/edenhill/kcat. It is a prerequisite and should be used to test and diagnose all installation-specific configuration and connection issues. Once kcat is working, the same configuration parameters can be used for Connect CDC (SQData). kcat can also be used to confirm topic content and Engine execution by acting as a local Kafka consumer, providing end-to-end validation of data replication.
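For example, connection settings proven with kcat can be captured in a small properties file and reused for the Engines. This is a sketch only: the file name and broker address are hypothetical, and the keys shown are standard librdkafka configuration properties.

```properties
# kafka.properties (hypothetical) - client settings shared by kcat and
# the Engines; substitute your broker address and security settings.
bootstrap.servers=broker1.example.com:9092
security.protocol=plaintext
```

With kcat, the same file can then be passed on the command line, for example `kcat -F kafka.properties -t <topic> -C`, to consume a topic and confirm its content.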
libcurl is required by the Engines to communicate with the Confluent Schema Registry when using AVRO-formatted Kafka topics. libcurl can be installed using a distribution-provided package, if available, or built from source downloaded from https://curl.se/download.html. For more information about AVRO and the Confluent Platform, see https://www.confluent.io/download/.
Other functionality can also be used:

- SSL - While disabled by default, SSL can be used for encryption and authentication by the Connect CDC (SQData) Kafka client. For more information, see https://github.com/edenhill/librdkafka/wiki/Using-SSL-with-librdkafka and http://kafka.apache.org/documentation.html#security_ssl.
- A variety of Kafka dashboard tools, both open source and proprietary, can be used to monitor the flow of data into Kafka.
Set external library path
If the Kafka external library is not already in the system library path, the environment variable SQDATA_KAFKA_LIBRARY must be set to point to the external library.
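A minimal sketch of setting the variable, assuming librdkafka was installed under /usr/local/lib; the path is an example, not a default, and should point to wherever librdkafka.so actually resides on your system:

```shell
# Example only: point the Engines at librdkafka when it is outside the
# system library path. Locate yours with: ldconfig -p | grep librdkafka
export SQDATA_KAFKA_LIBRARY=/usr/local/lib/librdkafka.so
```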