DATASTORE - connect_cdc_sqdata - Latest

Connect CDC (SQData) Apply engine

Product type
Software
Portfolio
Integrate
Product family
Connect
Product
Connect > Connect CDC (SQData)
Version
Latest
Language
English
Product name
Connect CDC (SQData)
Title
Connect CDC (SQData) Apply engine
Copyright
2024
First publish date
2000
ft:lastEdition
2024-07-30
ft:lastPublication
2024-07-30T20:19:56.898694

The DATASTORE command is used to specify the characteristics of source and target datastores. An Engine script may reference only one primarysource DATASTORE but may read a second source datastore, typically a parameter file or may write or "apply" to multiple target datastores. Prior to specifying the DATASTORE command in the script, one or more DESCRIPTION commands must be provided to describe the structure of each record in the datastore.

  • A CDC Datastore may contain all the changed data for multiple physical Tables or all the different type of segments in an IMS database. Depending on the nature of the Engine processing the CDC Datastore one or more physical target Datastores may be affected data.
  • Relational Source datastore can contain only a single Table, an Engine can create multiple physical target datastores.
  • A hierarchical or flat file source datastore may contain many different segments or record layouts which could result in one or more physical target datastores.
Syntax
DATASTORE  cdc://[<host_name_or_address>[:<port_number>]]/<agent_name>/<engine_name>*
        | cdcs://[<host_name_or_address>[:<port_number>]]/<agent_name>/<engine_name>
        | tcp://<hostname>/<tcpip_port>
        | mqs://[<qmgr_name>]/<queue_name> [:<correl_id>]
        | kafka://[<hostname>[:<port_number>]] / [<kafka_topic_id>][/<partition> | key | root_key]
        | hdfs://<hostname>[:<port_number>]/<hdfs_file_name>
        | *
        | RDBMS
        | file://[<relative_path.> | <full_path.>] <file_name>
        | <DD:ddname>[* | (<member_name>)]
        | <schema.table_name>
        | <IMS_DBD_name>	
      OF <DATASTORE_TYPE> AVRO | BINARY | DB2LOAD | DELIMITED | TEXT | HSSUNLOAD | IMSCDC | IMSDB | JSON | ORACLECDC | RELATIONAL | UTSCDC | VSAM | VSAMCDC | XMLCDC
        [CHRDEL(‘<delchr>’)]
        [COLDEL(‘<delchr>’)]
        [RECDEL(‘<delchr>’)]
        [CCSID <code> | ASCII | EBCDIC ]
        [STAGING SIZE <nG>]                   -- file size in Gigabytes
        [STAGING DELAY <mm>]                  -- rotate every mm minutes
      AS <datastore_alias>
      DESCRIBED BY <description_alias1> [,<description_alias2>,…, <description_aliasn>] |  DUMMY
      DESCRIBED BY GROUP <group_name1> [,<group_name2>,…}
      [ACCEPT ALL]
      [BYPASS CHGCHECK]
      [CHECK FULL BEFORE IMAGE]
      [COMPENSATE]
      [COMPENSATE WITH WARNING]
      [NO COMPENSATION]
      [EXCEPTION <exception_datastore>]
      [FOR CHANGE|INSERT]
      [FORCE QUALIFIER <high_level>
      [FORMAT [CONFLUENT | CONTAINER | PLAIN]
      [IMSPATHDATA]
      [KEEP NULL FIELDS]
      [KEY IS <key1> [,<key2>...,<keyn>]]
      [OPERATION IS <change_op_field>]
      [PROCESS COMMIT]
      [QUERY </+ <SQL Select and Where clause> +/>]
      [RECONNECT]
      [RRS]
      [SINGLE IMAGE]
      [STRIP TRAILING SPACES]
      [WTO ON COMPENSATION EVERY <n> MINUTES]
      [WTO ON EXCEPTION ONCE]
;
Keyword and Parameter Descriptions
Keyword Description
cdc://[<host_name_or_address>[:<port_number>]] / <agent_name> / <engine_name>

URL syntax type that specifies a Change Data Capture (CDC)/publisher subscription. It is comprised of the characters cdc :// followed by the qualified name of the CDCStore/CDCzLog Publisher and subscribing Engine:

[<host_name_or_address>[:<port_number>]] May be omitted to specify Localhost when the Engine runs on the same system as the Capture AND uses Default Port 2626. Host name only may be omitted and only Port number specified, if Localhost but not using Default Port 2626.
<agent_name>

Specifies the alias name assigned to the Capture/Publisher agent specified in the sqdagents.cfg file for the SQDaemon.

<engine_name>

Specifies the name of the Engine subscribing to the Capture/publisher specified in the capture/publisher configuration .CAB file (see the applicable the Apply Engine Capture Reference). UPPER CASE Engine names recommended for z/OS compatibility.

cdcs://[<host_name_or_address>[:<port_number>]] / <agent_name> / <engine_name> Identical to URL syntax type "cdc" except that "cdcs" specifies that a secure TLS connection will be made to zOS (ONLY) Daemon, Capture and Publisher components already configured to run under IBM's Application Transparent Transport Layer Security (AT-TLS) .
tcp://<hostname>/<tcpip_port>

URL syntax type that specifies a direct "tcp/ip" connection consisting of the remote host name or address and port number. Direct tcp/ip connections are only supported between two Engines, both of which must specify the respective remote host port. An example of specifying a TCP/IP port is provided later in this section.

mqs:// [<qmgr_name>] /<queue_name> [<:correl_id>]

URL syntax type that specifies the physical name of a IBM MQ (queue_name). An optional queue manager name (qmgr_name) can be specified before the queue name (/queue_name). Note, there must be a total of three (3) "/" marks between the "mqs:" and the queue name. An optional correlation ID (correl_id), preceded by a colon (:), can be specified after the queue name (queue_name).

Note: You can specify the correlation_id as a hexadecimal (hex) value and translate it to either ASCII or EBCDIC. When the correlation_id must be translated, it must be enclosed between E”corrid” (for EBCDIC) or A”corrid” (for ASCII), or X”hex_char_of_corrid” (for hexadecimal).
kafka://[<hostname>[:<port_number>]] / [<kafka_topic_id>] [/<partition> | key | root_key] URL syntax type that specifies a Kafka target. Optionally identify specific Kafka Broker Host name, TCP/IP port, unique Kafka Topic id and partition.
Note: Various options are available for specifying the Kafka URL including a list of Brokers specified at runtime, dynamically generated Topic ID's and source key based partitioning, see Kafka Datastores.
hdfs://<hostname>[:<port_number>]/<hdfs_file_name> URL syntax type that specifies a HDFS (Hadoop) target. Optionally specifies a specific HDFS Host name, TCP/IP port and unique Hadoop HDFS file name. See HDFS (Hadoop) Datastores.
<file_name> Specifies the qualified (using full or relative path) physical name of the file. The file_name can also be the fully qualified name of the physical file enclosed in double quotes (“).
file://[<relative_path.> | <full_path.>] <file_name>

Specifies the qualified (using full or relative path) physical name of the file.

<DD:ddname>{(<member_name>)] Specify the physical file on the z/OS platform by specifying DD:ddname, where ddname is the name of the DD statement in zOS JCL that specifies the file or partitioned dataset name containing the optional <member_name>.
<schema.table_name> Specifies the name of a relational database schema and table or view.
<IMS_DBD_name> Specifies the name of an IMS database. The OF <datastore_type> datastore type must be IMSDB.
OF <datastore_type>

The type of datastore that will be processed must be one of the following Keywords. Some of the datastore types have additional Keywords / Parameters, also specified below.

 

AVRO - Specifies that the target datastore utilizes AVRO schemas, which are based on JSON but that separate the data structure description from a much more compact data payload. The AVRO Type Datastores also provide for schema evolution driven by changes in the source datastore Descriptions and/or custom target datastore Descriptions. For more details see both AVRO Type Datastores , JSON Type Datastores below and the OPTION: USE AVRO COMPATIBLE NAMES .

  BINARY - Specifies a binary/flat file (i.e. a file created by a COBOL program).
 

DB2LOAD - Specifies a file containing data in DB2 Unload utility format.

 
DELIMITED - Specifies a file containing delimited data where up to three different types of delimiter characters can be specified as follows:
  • [CHRDEL(‘delchr’)] Specifies the character delimiter that is used to surround character fields, where delchr is the delimiter. The default is double quotes (“). The delimiter can be entered as a hex value. For example if the character representing the hex value FD is the delimiter, you can enter CHRDEL(‘\xFD’).

  • [COLDEL (‘delchr’)] Specifies the delimiter that is used between columns, where delchr is the delimiter. The default is a comma (,). The delimiter can be entered as a hex value (see the CHRDEL keyword).

  • [RECDEL (‘delchr’)] Specifies the delimiter that is used between records, where delchr is the delimiter. The default is CRLF (carriage return/line feed). The delimiter can be entered as a hex value (see the CHRDEL keyword).

  TEXT - Specifies the file output in text format.
  HSSUNLOAD – A file (input only) containing IMS database unloaded data using the HSSR unload utility.
  IMSCDC – Specifies a source datastore that is populated by the Connect CDC (SQData) IMS Data Capture Agent.
  IMSDB - Specifies an IMS database with one (1) or more segments. This is the name of the IMS database as it appears in the corresponding IMS database descriptor (DBD).
  JSON - Specifies a target datastore that utilizes JavaScript Object Notation (JSON), a text based lightweight data-interchange format. For more details see both JSON Type Datastores and AVRO Type Datastores .
 

ORACLECDC - Specifies a source datastore that is populated by the Connect CDC SQData Oracle Data Capture Agent.

  RELATIONAL - Specifies a relational database table or view.
 

UTSCDC - Specifies a source datastore that is populated by the Connect CDC SQData Db2 Data Capture or UDB (Db2 LUW) Capture Agent.

 

VSAM - Specifies a VSAM datastore.

  VSAMCDC – Specifies a source datastore that is populated by either the Connect CDC SQData VSAM Data Capture Agent or Keyed File Compare Capture Agent.
 

XMLCDC - Specifies an XML formatted datastore that is usually used in conjunction with one (1) of the Connect CDC SQData Data Capture Agents (i.e. for IMS, DB2, VSAM, etc.) where IBM MQs are used. This type of datastore writes entire XML records as a single message rather than multiple messages.

  CCSID <code> Optional parameter that specifies the character code that represents the data in the datastore. Using this parameter instructs the Apply Engine to translate the data from the character code specified here to character code of the system in which the Engine is executing. If the code parameter is not specified, the Apply Engine uses the character code of the operating system platform where the data was captured or in the case of target datastores, where the Engine is running.
 

STAGING SIZE <nG> - hdfs:// and file:// URL's ONLY, maximum file size in Gigabytes, once this limit is reached a new file will be created. A time stamp is appended to the file name, for either URL type before the file extension if one exists, indicating the time when the file was created. For example, a file name of employee.dat will be show up as employee.2017-05-27-13.18.41.dat.

 

STAGING DELAY <mm> - hdfs:// and file:// URL's ONLY, rotate files every <mm> minutes. A time stamp is appended in the same fashion as for STAGING SIZE.

 

ASCII - Denotes data stored in ASCII format. This format is most commonly found in UNIX and Windows environments.

 

EBCDIC - Denotes data stored in EBCDIC format. This format is most commonly found in z/OS environments.

AS <datastore_alias>

Specifies the alias name of the datastore, may contain only dash (-) and underscore (_) separators. All references to this datastore in the Apply Engine script will be through this alias name. Though not required or a default, the alias CDCIN is used for source datastores by convention in most CDC Apply Engine scripts regardless of source or platform. Adopting this convention will simplify both development, maintenance and diagnosis across multiple implementations and applications. CDCIN will be used in this manner throughout all product documentation. Similar conventions such as TARGET will often be used for target or output datastores, particularly when there is a single or primary "target" datatastore.

DESCRIBED BY <description_alias> | DUMMY

Specifies the layout of one (1) or more record types found in this datastore. Each record type must already be defined via a DESCRIPTION command, before its alias can be used. Exception datastores require DUMMY to be specified as the desc_alias.

DESCRIBED BY GROUP <group_name> Specifies the <group_name> , previously specified by a BEGIN GROUP Command that contains one or more DESCRIPTION entries.
[ACCEPT ALL] Specifies that Source records with no corresponding/matching Descriptions will simply be bypassed, rather than raising an exception that would otherwise terminate processing.
[BYPASS CHGCHECK] This parameter forces the engine to pass a source record to the script even if none of the mapped fields in the source have changed, as commonly occurs when only a few of the source fields are required for target mapping. This makes it possible to generate “activity” target records when “any”source field has changed.
[CHECK FULL BEFORE IMAGE]

Applies to CDC REPLICATE Command processing only when both Source and Targets are either IMS or VSAM datastores. Specifies that the full before Source image will be compared to the Target image on both Updates and Deletes and that any discrepancies found will raise an Exception unless COMPENSATE is also specified for the Datastore.

[COMPENSATE]

Applies to CHANGE and REPLICATE Command processing only. Specifies that Compensation will insure that a record/row in the target datastore exists and is current whenever an Insert or Update has occurred in the Source datastore.

[COMPENSATEWITH WARNING]

Applies to CHANGE and REPLICATE Command processing only. Specifies that when Compensation occurs a before and after image of the data will be written to the Engine Report.

[NO COMPENSATION]

Applies to CHANGE and REPLICATE Command processing only. Specifies that Compensation will not occur when whenever an Insert or Update has occurred in the Source datastore and the data already exists or does not exist in the target.

[DEFAULT QUALIFIER <high_level> ]

Specifies a Default high level qualifier to associate with a Datastore. This parameter is typically used to specify the owner or schema name associated with a Relational Target datastore when one is not specified in a Description. It is often supplied with a substitution Parser Directive. See also FORCE QUALIFIER

[EXCEPTION <exception_datastore>]

Applies to both Source and Target exceptions and specifies that an Exception datastore is used for storing records that cause errors and the name of the associated Exception datastore. For example, if a source record contains numeric fields with invalid data, the Apply Engine can redirect those records to the Exception datastore where the data can be corrected or the Apply Engine script can be modified to handle the exception, after which the data in the Exception datastore can be recycled back through the Engine. Target exceptions . If an Exception datastore is not specified, the Engine terminates once it encounters a record that causes errors.

[FOR <action>

Specifies the type of "action" to be taken on the target datastore. If no FOR <action> is specified, the default action will depend on the OF <datastore_type> of the target datastore. Detailed examples of CHANGE and INSERT are found with the PROCESS Command.

CHANGE

This is the default Action for Relational and IMS database target datastores. If the source datastore contains CDC records, CHANGE instructs the Engine to Apply changed data based on the change operation (i.e. insert, update, delete) of the source transaction. The CHANGE action also performs compensation by insuring that a record/row in the target datastore exists and is current whenever an Insert or Update has occurred in the Source datastore. For example if the changed data record was the result of an insert, an insert is performed on the target unless a target with a matching key is present, then an update is performed. Similarly, if the changed data record was the result of an update, an update is performed on the target if a target with a matching key is present, otherwise an insert is performed. Deletes will only be performed if a matching key is present in the target datastore.

The CHANGE action requires the knowledge of Unique target keys. They may be specified three different ways: Using the KEY IS clause on the target DATASTORE definitions; as part of the DDL contained in the DESCRIPTIONS specified using the DESCRIBED BYGROUP <group_name>; or optionally by allowing the Engine to retrieve the Unique Keys specified in the target database catalog. While handy we recommend not depending on the database catalog as it can be incomplete.

The Change action also requires the OPERATION IS keyword which specifies the changed data capture (CDC) change operation field. The CDC change operation can be accessed using the CDCOP function.

Note: The CHANGE action does not work for datastores that do not allow single record deletes or the specification of keys such as sequential flat files.
INSERT

Appends (adds) data records/rows to the target datastore. It is the default Action for non-Relational target datastores such as IBM MQ, files and TCP/IP Ports.

[FORCE QUALIFIER <high_level> ]

Specifies the high level qualifier to associate with a Datastore. This parameter is typically used to specify the owner or schema name associated with a Relational Target datastore and will override the high level qualifier that may be part of the DDL DESCRIPTION. It is often supplied with a substitution Parser Directive.

[FORMAT [CONFLUENT | CONTAINER | PLAIN]

Specifies the type, if any of the Schema associated with AVRO Type Datastores.

[IMSPATHDATA]

This parameter is valid only for the IMSDB datastore type not IMSCDC. IMSPATHDATA instructs the ENGINE to retrieve PATH data in addition to the IMS segment specified. This parameter is not needed when reading an entire IMS Database as in an Unload Engine. In that case the parent segment data will remain accessible and can be directly referenced using <parent_description>.<parent_field> syntax.

[KEEP NULL FIELDS]

JSON datastores by default suppress "Key-value" pairs that contain a "Null" value. This parameter is valid only for the JSON datastore type and instructs the Apply Engine to include source fields/columns that contain "Null" values when generating JSON formatted output, eg: "<data item>": null

[KEY IS <key1> [,<key2>...,<keyn>]]

There are two different use cases for the Datatsore Key IS parameter:

1. For Relational Targets when the a Target RDBMS table has NO Keys defined in the Target database catalog. In that case the Apply Engine can only perform processing by treating every column in the Target as a Key in the WHERE clause used to retrieve data for update.

2. On Relational Source Descriptions to identify the Key Columns that will be used to partition data written to Kafka.

[OPERATION IS <change_op_field>]

Specifies that the named variable containing the change operation value of a Change Data Capture (CDC) record. Precisely recommends using V_CHGOP as a naming convention for this variable which is determined using the CDCOP< source_datastore_alias> Function.

[PROCESS COMMIT]

Exposes the Unit-of-Work (UOW) "commit" record to be identified and explicitly acted on by the Apply Engine. Use cases that involve special handling after all the CDC records in a UOW have been processed can be identified by a CDC Change operation value of "C" returned from the CDCOP<source_datastore_alias> Function.

[QUERY </+ <SQL Select and Where clause> +/>]

Provides support for an in-line SELECT clause used by an Apply Engine used to directly read a Relational Source and perform partial Target Reloads, Refreshes or other operations, see QUERY Keyword for details.

[RECONNECT] Instructs the Engine to periodically attempt to reconnect to the Capture after a connection has been lost.
[RRS] Applies only to VSAM processing and specifies that IBM's Resource Recovery Manager will manage two phase commits.
[SINGLE IMAGE]

Applies to REPLICATE Command processing only and only for JSON or AVRO Target Datastore Types. Note, that visibility to Key changes will be lost because the Before Image is suppressed.

[STRIP TRAILING SPACES]

Applies to REPLICATE Command processing only and only for JSON or AVRO Target Datastore Types. Removes trailing spaces from CHAR data type fields.

[WTO ON COMPENSATE EVERY <n> MINUTES]

Applies to REPLICATE Command processing only. Specifies that when Compensation occurs a "Compensation Occurred" message will be written to the System Log/Console (zOS WTO) no more than once every <n> minutes.

[WTO ON EXCEPTION ONCE]

Applies to both Source and Target datastores for which an EXCEPTION Datastore has been specified. Specifies that when the first (once) record is written to the Exception Datastore, an "EXCEPTION Occurred" message will be written to the System Log/Console (zOS WTO).

Notes:

  1. The DESCRIBED BY GROUP parameter provides the means to specify multiple descriptions for the Source datastore. In the case of a Relational Target datastore it allows a single DATASTORE command to define multiple physical target Tables so long as they are part of the same "Database".
  2. The physical datastore name must be enclosed in double quotes (") if any part of the name is also an Apply Engine reserved word.