Determine HDFS data requirements

Connect CDC (SQData) HDFS Quickstart

Copyright: 2024
First publish date: 2000
Last edition: 2024-07-30
Last publish date: 2024-07-30T20:10:32.610182

While your HDFS may eventually contain many different types of data, Precisely recommends that you start with only a few sources. That usually means a subset of segments in a legacy IMS database or a small number of relational database tables. Your data may also come from several platforms, so pick just one to get started.

Because most implementations use Connect CDC (SQData) change data capture to collect the data sent to HDFS, it is easy to forget that downstream consumers may need data that has not changed in some time and therefore has never been published to HDFS. There are several methods for performing an "Initial Load", and they vary depending on where the source data is hosted: Mainframe, Linux, or Windows. See the Initial Load sections of the applicable Change Data Capture reference documentation for details. Precisely also recommends giving special consideration to the HDFS file names and metadata associated with this historical data.
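
As one way to think about file names for historical data, the following minimal Python sketch builds an HDFS path that keeps initial-load files apart from change data capture files. Everything in it is an assumption made for illustration: the function name `hdfs_target_path`, the directory layout, the `load_type=` folder, the timestamp format, and the `.json` extension are hypothetical and are not part of Connect CDC (SQData) or its configuration.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Hypothetical set of load types; "initial" marks historical data,
# "cdc" marks data produced by change data capture.
LOAD_TYPES = ("initial", "cdc")


def hdfs_target_path(base_dir, source, table, load_type, as_of):
    """Build an illustrative HDFS path for one file.

    The layout (base/source/table/load_type=.../table_timestamp.json)
    is an assumed convention, not a product requirement.
    """
    if load_type not in LOAD_TYPES:
        raise ValueError(f"load_type must be one of {LOAD_TYPES}")
    # Format the timestamp in UTC so names sort chronologically.
    stamp = as_of.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = (
        PurePosixPath(base_dir)
        / source
        / table
        / f"load_type={load_type}"
        / f"{table}_{stamp}.json"
    )
    return str(path)


if __name__ == "__main__":
    when = datetime(2024, 7, 30, 12, 0, 0, tzinfo=timezone.utc)
    print(hdfs_target_path("/data/landing", "ims", "orders", "initial", when))
    print(hdfs_target_path("/data/landing", "ims", "orders", "cdc", when))
```

Keeping the load type in the path is only one option. The same distinction could instead be carried in the metadata stored with each file, which may suit consumers that read both kinds of data together.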