Distribution key requirements - Connect_CDC - connect_cdc_mimix_share - Latest

Connect CDC Advanced User Guide

Product type
Software
Portfolio
Integrate
Product family
Connect
Product
Connect > Connect CDC (MIMIX Share)
Version
Latest
Language
English
Product name
Connect CDC
Title
Connect CDC Advanced User Guide
Copyright
2024
First publish date
2003
Last edition
2024-07-19
Last publish date
2024-07-19T23:30:25.334335

For each mapping to a target table, at least one target column must be specified as a Distribution Key, that is, a column or set of columns that uniquely identifies each row in the target table.  

Connect CDC Director automatically selects the target table primary key as the Distribution Key in any mapping automatically generated. If there is no primary key, Connect CDC Director uses a unique index.

The user uses the Mapping page and set a Distribution Key (DKey) for any mappings to target tables that have neither a primary key nor a unique index.

If the source table contains multiple records whose column values map to the same target Distribution Key value, then:

  • Only one of these records is inserted at the target if the target Distribution Key is a unique index because the target DBMS prevents a duplicate unique value.  

  • Multiple records with the same Distribution Key value are inserted at the target if the target Distribution Key is not a unique index.

However, an update or delete to either of these source records produces an error trying to replicate the update or delete to the target because there are multiple target records to update or delete. This condition prevents multiple rows on the target table from being updated or deleted and posts an error message.

It is strongly recommended:

  • The source columns that map to the target table Distribution Keys comprise a unique index of the source table.

  • The target columns of the Distribution Key comprise a unique index in the target table.

Although Connect CDC Director initially makes target table primary keys the default Distribution Keys, you can change them.

  • If a table has no primary key but at least one unique index, Connect CDC Director defines the first unique index as the Distribution Key.

  • If a table has no primary key or unique index, Connect CDC Director selects the target columns corresponding to the source table keys, if possible.

Violation of any of the following rules produces a model validation error or warning:

  • Every table mapping (except Journal Mappings, see note below) in an EDMM must have a defined Distribution Key, which may consist of one or more columns.

  • The sum of the lengths of all  columns in a table cannot exceed 256. This restriction does not apply to Oracle sources.

  • Each Distribution Key column must be mapped. The input parameters to a method that produces a target Distribution Key column may be key columns, constants, system variables, or non-key columns.

  • A column used as a Distribution Key may not be:

  • long column, that is, one that contains data-only large objects

  • "medium-sized" column, that is, one whose datatype is categorized Character but whose length exceeds a certain value (254 for DB2, 255 for MS SQL Server)

  • bit type

  • binary type

  • Approximate numeric (REAL, FLOAT, DOUBLE) types, but this is only a WARNING.

  • In bi-directional replication, the Distribution Keys in each direction must be mapped in a “reversible” fashion. For example, if the mapping from A to B concatenates two fields in A to form a Distribution Key in B, the the mapping from B to A must separate the field in B into the two fields in A.

  • In protected replication involving Oracle or DB2, at least one Distribution Key column must be defined as NOT NULL at each server. If only other DBMSs are involved, all key columns may be nullable.