Connect CDC (SQData) provides several methods for handling both source and target data exceptions. The objective is to provide the means to maintain continuous operation in spite of exceptions encountered, while providing an appropriate level of notification regarding the nature and severity of the exception. Options exist to ensure that data is not lost and the provide the means to recover with minimal impact. The most common type are source data exceptions caused literally by bad data allowed by business applications or interfaces to be stored in a source datastore. Target exceptions are generally environmental in nature and most often occur when a change made in the target environment has not been accommodated by an Engine script or when a resource in the target environment is unavailable.
Source exceptions
Connect CDC (SQData) is designed to ensure that bad data is not propagated to any target datastores. The developer of the script is responsible for detecting conditionally invalid data and determining how this type of exception will be handled using the tools provided. There are three basic bad data conditions, two of which an Apply Engine can detect on its own and one that must be considered when designing an Apply Engine script:
Invalid Data - Source data does not match the type and format of Data Description
Some types of non-relational datastores, notably IMS, VSAM and flat files leave the validation of data type and format up to the application. Consequently programmers have accidentally or intentionally allowed data that would normally be considered invalid, such as character data in a numeric field to be stored. Usually this was do to an error in programming but sometimes a field may serve subtle dual purposes depending on some other logical data condition. COBOL redefines of group data structures and fields are a common cause of such errors, particularly in IMS databases. Relational datastores can appear to exhibit a similar problem, however, rather than bad data it will be caused by using out of date DDL to describe a table, which should never happen in the production environment. The Apply engine will detect Invalid Data conditions and instantly terminate before the Source datastore record is made accessible to the mapping and processing logic in the Apply Engine script.
Undefined Data - Source data does not match the structure of Data Description
All source datastores provide some means of describing the structure of the data they contain. If the Apply Engine script is not told of all possible source data structures, it will necessarily fail when one is encountered. This type of error is generally introduced when the script is designed and corrected during testing. It can however, be encountered when a new structure is introduced by an application and Apply Engine scripts have not been subsequently updated. The Apply Engine will also detect this condition and terminate before the Source datastore record is made accessible to the mapping and processing logic in the script.
Business Rule Invalid Data - Source data does not meet a required logical business rule
- It was simply not considered a possible exception when the script was written.
- It many not be possible to define the necessary logic to effect a repair.
- The repair may be outside the scope of the Apply Engine scripts responsibility.
Without planning and intervention, the Apply engine will, as described above, terminate the instant it encounters a bad source record. Script Development describes how to handle exception conditions by modifying the source data. It also describes the recommended use of exception datastores during testing, to capture the exception data for diagnosis and also how to the exceptions can be recycle if necessary, in both near real-time and asynchronous change data capture/propagation processes
Target exceptions
Once an Apply Engine script has been through the normal development and testing cycle, no target Exceptions should be encountered. When one occurs it is almost certainly due to a change in the nature of a target datastore or the environment that will require modification of the script. The recommended the approach is to allow the engine to simply fail. While this will ensure that no data is lost, its can, depending on the volume of data involved, require immediate attention. If the problem is environmental, such as a lack of space, allowing the engine to fail is usually the best approach. Once the problem is resolved, the Engine can be restarted and it will automatically resume from the point of failure.
Other target exceptions, particularly those involving relational target datastores can be accommodated in other ways:
Changes to target structure - for example a new column is added and no default value is specified.
Changes to target keys - for example a unique index is added that results in duplicate keys.
These types of exceptions can be handled by allowing the Apply Engine to create an Exception Datastore containing the SQL statements that failed when they were attempted by the Engine. This makes it possible to make the necessary modifications to the target datastore or even to the SQL data itself and then simply process the SQL statements directly outside of Connect CDC SQData.
The generated SQL will contain a where clause containing keys from the database catalog ensuring that all necessary keys. Data columns in the SQL will come from the Target DESCRIPTION specified in the script.