Size transient storage pool - connect_cdc_sqdata - Latest

Connect CDC (SQData) Change Data Capture

Product type: Software
Portfolio: Integrate
Product family: Connect
Product: Connect > Connect CDC (SQData)
Version: Latest
Language: English
Product name: Connect CDC (SQData)
Title: Connect CDC (SQData) Change Data Capture
Copyright: 2024
First publish date: 2000
Last edition: 2024-09-05
Last publish date: 2024-09-05T15:00:09.754973

The CDCStore Storage Agent uses a memory-mapped storage pool to speed captured change data on its way to Engines. It is designed to do so without "landing" the data after it has been mined from a database log. Configuring the Storage Agent requires specifying both the memory used to cache changed data and the disk storage used when not enough memory can be allocated to hold large units of work and other concurrent workload.

Memory is allocated in 8MB blocks, with a minimum allocation of 4 blocks (32MB) of system memory. The disk storage pool is similarly allocated in files made up of 8MB blocks. Ideally, the memory allocated would be large enough to hold the log generated by the longest-running transaction AND all other transactions running concurrently, but in practice that will almost certainly be impractical, if not impossible.
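The block arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `memory_blocks` is hypothetical, and rounding a requested size up to a whole number of blocks is an assumption, since the text states only the 8MB block size and the 4-block minimum.

```python
import math

BLOCK_MB = 8     # block size stated in the text
MIN_BLOCKS = 4   # minimum allocation stated in the text (4 x 8MB = 32MB)

def memory_blocks(requested_mb):
    """Return (blocks, megabytes) for a requested cache size.

    Assumption: a request is rounded UP to whole 8MB blocks and
    never falls below the 4-block minimum.
    """
    blocks = max(MIN_BLOCKS, math.ceil(requested_mb / BLOCK_MB))
    return blocks, blocks * BLOCK_MB

print(memory_blocks(10))    # a request below the minimum is raised to 4 blocks
print(memory_blocks(100))   # 100MB is not a multiple of 8, so it rounds up
```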

Ultimately, there are two situations to be avoided which govern the size of the disk storage pool:

Large Units of Work - While never advisable, some batch processes update very large amounts of data before committing. Such large units of work are often unintentional or even accidental, but they must still be accommodated. The storage pool must be able to hold the entire unit of work or a DEADLOCK condition will be created.

Archived Logs - Depending on workload, database logs will eventually be archived, at which point the data remains accessible to the Capture Agent but at a higher cost in CPU and I/O. Under normal circumstances, Engines consume captured data in a timely fashion, so a CDCStore FULL condition is something to be aware of but not necessarily concerned about. If, however, the cause is a stopped Engine, the outage may last long enough for uncaptured data to be archived.
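As a rough sanity check for the first situation, the sketch below compares the largest expected unit of work with the capacity of the disk storage pool. The function and parameter names are hypothetical, and the check deliberately ignores the memory cache, which is a conservative assumption rather than documented product behavior.

```python
BLOCK_MB = 8  # block size stated in the text

def disk_pool_capacity_mb(blocks_per_file, file_count):
    """Total disk pool capacity in MB when every file is fully allocated."""
    return blocks_per_file * file_count * BLOCK_MB

def can_hold_unit_of_work(largest_uow_mb, blocks_per_file, file_count):
    """True if the largest expected unit of work fits in the disk pool alone.

    Assumption: the memory cache is not counted, so the result is conservative.
    """
    return largest_uow_mb <= disk_pool_capacity_mb(blocks_per_file, file_count)

# 8 files of 32 blocks each give 2048MB of disk pool.
print(disk_pool_capacity_mb(32, 8))
print(can_hold_unit_of_work(3000, 32, 8))  # a 3000MB unit of work would not fit
```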

The environment and workload may make it impossible to allocate enough memory to cache a worst-case or even an average workload. We therefore recommend two methods for sizing the storage pool, depending on the availability of logging information.

If detailed statistics are available:

  1. Gather information to estimate the worst-case log space utilization (the longest-running Db2 transaction AND all other Db2 transactions running concurrently). We will refer to this number, in MB, as MAX.
  2. Gather information to estimate the log space consumed by an "average size" Db2 transaction and multiply it by the average number of concurrent transactions. We will refer to this number, in MB, as AVG.
  3. Plan to allocate disk files in your storage pool that are as large as the log space consumed by the average (AVG) concurrent transaction workload. Divide the value of AVG by 8 (the number of MB in each block). This gives the Number-of-Blocks in a single file.
  4. Divide the value of MAX by 8 (the number of MB in each block) and again by the Number-of-Blocks to calculate the number of files to allocate, which we will refer to as Number-of-Files. Note that dividing the value of MAX by AVG and rounding to the nearest whole number should result in the same value for Number-of-Files.

Example:

Number-of-Blocks = AVG / 8 (MB per block)

Number-of-Files = MAX / 8 / Number-of-Blocks (which is the same as Number-of-Files = MAX / AVG)
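The two formulas can be sketched as a small Python helper. The function name `size_from_statistics` and the sample figures are hypothetical. The sketch rounds up rather than to the nearest whole number, which is an assumption made so that the pool is never sized below MAX.

```python
import math

BLOCK_MB = 8  # block size stated in the text

def size_from_statistics(max_mb, avg_mb):
    """Return (Number-of-Blocks, Number-of-Files) from MAX and AVG in MB.

    Assumption: both results are rounded UP to whole numbers.
    """
    if max_mb <= 0 or avg_mb <= 0:
        raise ValueError("MAX and AVG must be positive")
    number_of_blocks = math.ceil(avg_mb / BLOCK_MB)
    number_of_files = math.ceil(max_mb / BLOCK_MB / number_of_blocks)
    return number_of_blocks, number_of_files

# Illustrative figures only: MAX = 4096MB, AVG = 512MB.
blocks, files = size_from_statistics(4096, 512)
print(blocks, files)  # 64 blocks per file, 8 files (the same as MAX / AVG)
```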

If detailed statistics are NOT available:

  1. Precisely recommends using a larger number of small disk files in the storage pool and suggests beginning with 256MB files. Dividing 256MB by the 8MB block size gives the Number-of-Blocks in a single file, 32.
  2. Precisely recommends allocating a total storage pool of 2GB (2048MB) as the starting point. Divide that number by 256MB to calculate the Number-of-Files required to hold 2GB of active LOG which would be 8.

Example:

Number-of-Blocks = 256MB / 8MB = 32

Number-of-Files = 2048MB / 256MB = 8

Use these values to configure the CDCStore Storage Agent in the next section.

Notes:

  • Remember that it is possible to adjust these values once experience has been gained and performance observed. See the section "Display Storage Agent Statistics" in the Operations section below.
  • Think of the value for Number-of-Files as a file Extent. Another file is allocated only when the MEMORY cache is full, all of the Blocks (Number-of-Blocks) in the first file have been used, and none have been released before additional 8MB Blocks are required to accommodate an existing incomplete unit of work or other concurrent units of work.
  • While Number-of-Blocks and Number-of-Files can be adjusted dynamically, the new values apply only to newly allocated files. Changes to MEMORY require stopping and restarting the Storage Agent.
  • Multiple Directories can also be allocated, but this is only practical if the file system itself fills and a second directory becomes necessary.
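The Extent behavior described in the notes can be approximated with a simplified model. It is a sketch for reasoning about sizing, not a description of the Storage Agent's internals: the function name `files_in_use` is hypothetical, and the model assumes data fills the memory cache first and then spills to disk files one at a time, with no blocks released along the way.

```python
import math

BLOCK_MB = 8  # block size stated in the text

def files_in_use(in_flight_mb, memory_mb, number_of_blocks, number_of_files):
    """Return (files_needed, pool_full) for a given amount of in-flight data.

    Simplified assumptions: memory fills first, then each disk file fills
    completely before the next one is allocated, and nothing is released.
    """
    blocks_needed = math.ceil(in_flight_mb / BLOCK_MB)
    blocks_in_memory = memory_mb // BLOCK_MB
    spilled_blocks = max(0, blocks_needed - blocks_in_memory)
    files_needed = math.ceil(spilled_blocks / number_of_blocks)
    return files_needed, files_needed > number_of_files

# 32MB of memory with 8 files of 32 blocks (256MB) each.
print(files_in_use(20, 32, 32, 8))    # fits in memory, no file allocated
print(files_in_use(288, 32, 32, 8))   # fills memory plus exactly one file
print(files_in_use(289, 32, 32, 8))   # one more block forces a second file
```

Under these assumptions, a workload that exceeds memory plus all configured files reports `pool_full` as `True`, which corresponds to the situation the sizing steps are meant to avoid.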