Replication to AVRO containers, stored either as plain files or in HDFS (Hadoop file system), can be accomplished with minimal configuration and provides excellent performance.
Prerequisites
- AVRO container targets are restricted to a single worker, and each target must have its own individual file.
- As a consequence, the target URL must be a generic URL, and the substitution parameter for the generic URL must be unique for each target.
- By default, the qualified name of the source is used as the substitution parameter. This can be overridden using the MAPPINGS section, either static or dynamic.
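The uniqueness requirement above can be illustrated with a small Python sketch. It is not part of the product; it only models the rule under the assumption that the `*` in a generic URL is the placeholder that receives the substitution parameter. The function names, the source names, and the `mappings` dictionary are hypothetical stand-ins for the behavior described in the prerequisites.

```python
from __future__ import annotations


def expand_target_url(generic_url: str, substitution: str) -> str:
    """Replace the single '*' placeholder in a generic URL (assumed convention)."""
    if generic_url.count("*") != 1:
        raise ValueError("expected exactly one '*' placeholder in the generic URL")
    return generic_url.replace("*", substitution)


def resolve_targets(
    generic_url: str,
    sources: list[str],
    mappings: dict[str, str] | None = None,
) -> dict[str, str]:
    """Give every source its own target URL and reject duplicate substitutions.

    `mappings` is a hypothetical stand-in for an override of the default
    substitution value, which is the qualified source name.
    """
    mappings = mappings or {}
    seen: dict[str, str] = {}      # substitution value -> source that claimed it
    targets: dict[str, str] = {}   # source -> expanded target URL
    for source in sources:
        value = mappings.get(source, source)
        if value in seen:
            raise ValueError(
                f"substitution '{value}' is used by both {seen[value]} and {source}"
            )
        seen[value] = source
        targets[source] = expand_target_url(generic_url, value)
    return targets


if __name__ == "__main__":
    # Hypothetical qualified source names and prefix/suffix.
    urls = resolve_targets("hdfs:///cdc_*_daily", ["HR.EMPLOYEE", "HR.DEPARTMENT"])
    for source, url in urls.items():
        print(source, "->", url)
```

If two sources resolved to the same substitution value, they would share one file, which is exactly what the single-file-per-target restriction rules out; the sketch raises an error in that case.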
Examples
- AVRO Container formatted file or HDFS (Hadoop file system) target where every source object (i.e. table) is written to a unique file using AVRO CONTAINER formatting and default file rotation.
REPLICATE DB2 cdc://<host_name>:<sqdaemon_port>/<publisher_name>/<subscription_name> TO AVRO CONTAINER [file | hdfs]:///* ;
- AVRO Container formatted file or HDFS (Hadoop file system) target where every source object (i.e. table) is written to a file whose name combines the object name with a common prefix and suffix, using specific file rotation parameters specified in an OPTIONS statement.
REPLICATE DB2 cdc://<host_name>:<sqdaemon_port>/<publisher_name>/<subscription_name> TO AVRO CONTAINER [file | hdfs]:///<prefix>_*_<suffix> ; OPTIONS ROTATE SIZE 100M, ROTATE DELAY 30
Note: AVRO Container targets use a file rotation method controlled by a size and/or a delay. For Hadoop (HDFS), if an OPTIONS statement does not define a size and delay, a delay of one hour is applied by default. A delay of one hour means that the file is rotated one hour after the first record was written to it, unless it has already been rotated for another reason.
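The rotation rule in the note can be modeled with a short Python sketch. This is an illustration of the described behavior, not the product's implementation. The class and field names are hypothetical, thresholds are expressed in bytes and seconds for simplicity, and treating a value exactly at the threshold as a trigger is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RotationPolicy:
    """Hypothetical model of size- and/or delay-based file rotation."""
    max_bytes: Optional[int] = None            # size threshold, if any
    max_delay_seconds: Optional[float] = None  # delay threshold, if any

    def should_rotate(
        self,
        bytes_written: int,
        first_write_time: Optional[float],
        now: float,
    ) -> bool:
        # The delay is measured from the first record written to the file,
        # so a file with no records yet never rotates.
        if first_write_time is None:
            return False
        if self.max_bytes is not None and bytes_written >= self.max_bytes:
            return True
        if (
            self.max_delay_seconds is not None
            and now - first_write_time >= self.max_delay_seconds
        ):
            return True
        return False


# Default described in the note for HDFS when no size and delay are defined:
# a delay of one hour.
HDFS_DEFAULT = RotationPolicy(max_delay_seconds=3600)

if __name__ == "__main__":
    policy = RotationPolicy(max_bytes=100 * 1024 * 1024, max_delay_seconds=1800)
    # A small file whose first record was written 1800 seconds ago rotates on delay.
    print(policy.should_rotate(bytes_written=4096, first_write_time=0.0, now=1800.0))
```

Whichever threshold is reached first triggers the rotation, which matches the note's wording that the delay applies only if the file has not already been rotated for another reason.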