One activity often overlooked when discussing Change Data Capture is the initial load of the target datastores and the method used to bring source and target into full synchronization. Modifications to source datastores, overlooked requirements, business rule changes affecting filters or transformations, or even operational issues may also create the need to refresh all or a subset of the CDC/Apply targets. Several methods support initial load and refresh; choose among them based on all applicable factors, including performance, ease of configuration, and operational impact:
- Native database unload/reload utilities may be available to unload the source datastore and load the target datastore. They are, however, generally restricted to source and target datastores of the same type (RDBMS, IMS, etc.).
- A special Connect CDC (SQData) Unload engine can either read the source datastore locally and write records to be loaded by a database utility, or read the source datastore remotely and write directly to another target such as Kafka. When the source and target datastores are not identical, Precisely recommends using a special version of the already tested Engine script for the initial load of the target datastore. This approach has the additional benefit of providing a mechanism for "refreshing" target datastores if an "out of synchronization" situation occurs because of an operational problem or a business rule change affecting filters or transformations.
- Third-party remote disk mirroring is often the only practical solution when large-scale, disaster-type replication systems are being implemented.
Note: The method selected for the initial load of the target datastore must also account for concurrent source database activity. The source capture and target apply processes must ensure that source and target synchronization is achieved, often through a "catch-up" phase during which Connect CDC (SQData) performs compensation.
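To illustrate why a catch-up phase needs compensation, the following is a minimal Python sketch, not Connect CDC (SQData) code or configuration. It assumes that "compensation" means applying captured changes tolerantly to a target that was loaded while the source was still changing: an insert for a row that already exists is treated as an update, an update for a missing row is treated as an insert, and a delete for a missing row is ignored. The function name `apply_with_compensation`, the in-memory dictionary standing in for the target datastore, and the change-record layout are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]
# Hypothetical change record: (operation, primary key, after-image of the row)
Change = Tuple[str, int, Row]


def apply_with_compensation(target: Dict[int, Row],
                            changes: Iterable[Change]) -> Dict[str, int]:
    """Apply captured changes to a target that may already reflect some of them.

    Returns counts of how often each kind of compensation was needed.
    """
    compensated = {"insert_as_update": 0, "update_as_insert": 0, "delete_missing": 0}
    for op, key, row in changes:
        if op == "insert":
            if key in target:
                # The unload already picked this row up: overwrite instead of failing.
                compensated["insert_as_update"] += 1
            target[key] = dict(row)
        elif op == "update":
            if key not in target:
                # The row was not in the unload: create it from the after-image.
                compensated["update_as_insert"] += 1
            target[key] = dict(row)
        elif op == "delete":
            if key in target:
                del target[key]
            else:
                # The row never reached the target: nothing to remove.
                compensated["delete_missing"] += 1
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return compensated
```

In this sketch, replaying the changes captured during the load leaves the target matching the source even when the unload and the change stream overlap, because every operation is safe to apply regardless of whether the load already reflected it. Once the backlog is drained, the apply process can return to its normal behavior.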