The Quick Start approach is a step-by-step guide to installing, configuring, testing, and operating the Connect CDC SQData Capture and Apply Engine components that will populate your Hadoop HDFS:
- Determine your initial HDFS target data requirements
- Prepare the Source Data Capture and Target Apply Engine environments
- Configure the Engine Controller Daemon
- Determine how you will control the structure of the target HDFS files
- Create an HDFS no-map replication Apply Engine script
Once these steps have been completed, you will be able to run an end-to-end test of each component in standalone mode. This allows you to work out any security or environmental issues before running alongside other Apply Engines in a shared Capture/Publisher/SQDaemon configuration.
After all components are working properly and your first HDFS file has been successfully populated, you are ready to add more source/target structure interfaces to your configuration.
This Quick Start is intended to supplement, not replace, other documents, including the various Data Capture and the Apply and Replicator Engine Reference documentation. We recommend you familiarize yourself with the Precisely portal, where you can learn more about the overall architecture of Connect CDC (SQData) and its approach to Change Data Capture. The answers to many of the questions that inevitably arise during initial installation, configuration, and testing will be found in those documents.
Processing data placed into HDFS is beyond the scope of this Quick Start, as are the many tools that can be used to "consume" that data.