Configuration tab

Spectrum Data Quality Guide

Product type
Software
Portfolio
Verify
Product family
Spectrum
Product
Spectrum > Quality > Spectrum Quality
Version
23.1
Language
English
Product name
Spectrum Data Quality
First publish date
2007
Last updated
2024-03-04
Published on
2024-03-04T22:52:13.486265

This tab is displayed in the Exception Monitor Options dialog box.

Disable exception monitor
Turns Exception Monitor on or off. If you disable Exception Monitor, records will simply pass through the stage and no action will be taken. This is similar in effect to removing Exception Monitor from the dataflow.
Stop job after reaching exception limit
Specifies whether to halt job execution when the specified number of records meet the exception conditions.
Maximum number of exception records
If Stop job after reaching exception limit is selected, use this field to specify the maximum number of exception records to allow before halting job execution. For example, if you specify 100, the job will stop once the 101st exception record is encountered.
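The limit behavior described above can be sketched in a few lines of Python. This is only an illustration of the counting rule (a limit of 100 halts the job at the 101st exception), not the product's implementation. The function name, its parameters, and the choice not to process the record that triggers the halt are assumptions made for the example.

```python
def run_with_exception_limit(records, is_exception, max_exceptions):
    """Process records until the exception count exceeds max_exceptions.

    Hypothetical helper: returns (processed_records, exception_count, halted).
    """
    processed = []
    exception_count = 0
    for record in records:
        if is_exception(record):
            exception_count += 1
            if exception_count > max_exceptions:
                # The (max_exceptions + 1)th exception halts the job.
                # Assumption: the triggering record is not processed.
                return processed, exception_count, True
        processed.append(record)
    return processed, exception_count, False
```

With a limit of 100, the first 100 exception records are allowed through and the run ends as soon as one more is encountered.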
Report only (do not create exceptions)
Tracks records that meet exception conditions and reports the statistics on the Data Quality Performance page in the Data Stewardship Portal, but does not create exceptions for those records.
Return all records in exception's group
Specifies whether to return all records belonging to an exception record's group instead of just the exception record. For example, a match group (based on a MatchKey) contains four records. One is the Suspect record, one is a duplicate that scored 90, and two are unique records that scored 80 and 83. If you have a condition that says that any record with a MatchScore between 80 and 89 is an exception, by default just the records with a match score of 80 and 83 would be sent to the exception port. However, if you enable this option, all four records would be sent to the exception port.
Enable this option if you want data stewards to be able to compare the exception record to the other records in the group. By comparing all the records in the group, data stewards may be able to make more informed decisions about what to do with an exception record. For example, in a matching situation a data steward could see all candidates to determine if the exception is a duplicate of the others.
Note: If the input data does not contain a field named "CollectionNumber", this option will be disabled.
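The match group example above can be reproduced with a short Python sketch. It is an illustration only: the function name, the "Role" field, and the use of None for the Suspect record's score are assumptions made for the example, while MatchKey and MatchScore come from the description above.

```python
def select_for_exception_port(records, is_exception, group_field=None):
    """Return the records that would go to the exception port.

    With group_field=None only the exception records are returned.
    Otherwise every record sharing a group value with an exception is returned.
    """
    exceptions = [r for r in records if is_exception(r)]
    if group_field is None:
        return exceptions
    exception_groups = {r[group_field] for r in exceptions}
    return [r for r in records if r[group_field] in exception_groups]


def is_exception(record):
    # Condition from the example: MatchScore between 80 and 89.
    score = record["MatchScore"]
    return score is not None and 80 <= score <= 89


records = [
    {"MatchKey": "K1", "Role": "Suspect", "MatchScore": None},
    {"MatchKey": "K1", "Role": "Duplicate", "MatchScore": 90},
    {"MatchKey": "K1", "Role": "Unique", "MatchScore": 80},
    {"MatchKey": "K1", "Role": "Unique", "MatchScore": 83},
    {"MatchKey": "K2", "Role": "Suspect", "MatchScore": None},  # unrelated group
]

default_output = select_for_exception_port(records, is_exception)
grouped_output = select_for_exception_port(records, is_exception, group_field="MatchKey")
```

Here default_output holds the two records that scored 80 and 83, and grouped_output holds all four records in group K1. The record in group K2 is returned in neither case because its group contains no exception.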
Group by
If you selected Return all records in exception's group, choose the field by which to group the records.
Note: The "CollectionNumber" input field will not appear in this list because it is not a valid selection for the Group by feature.
Revalidation service
Select the service you want to run when you revalidate records from this dataflow. The service runs when a user saves edited records in the Portal Exception Editor. Status is changed to Failed for records that fail revalidation. Successfully revalidated records are reprocessed or approved depending on the selection for Action after revalidation.
In an approval flow, successfully revalidated records are passed to the next approval level. For the last approval level in an approval flow, revalidated records are either released for reprocessing or retained in the repository as Resolved, depending on the selection for Action after revalidation.
Action after revalidation
Specifies whether to reprocess records or approve records that have been successfully revalidated.
  • Reprocess records—Choose this option to reprocess records that are successfully revalidated. The revalidated records are removed from the repository for reprocessing.
  • Approve records—Choose this option to approve records that are successfully revalidated. The approved records are retained in the repository and their status changed to Resolved.
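The outcomes described for Revalidation service and Action after revalidation can be summarized as a small decision function. This is a sketch under stated assumptions: the enum, the function name, and the returned descriptions are hypothetical, and only "Failed" and "Resolved" are status names taken from the text above.

```python
from enum import Enum


class Action(Enum):
    REPROCESS = "Reprocess records"
    APPROVE = "Approve records"


def outcome_after_revalidation(passed, action, is_last_approval_level=True):
    """Describe what happens to a record once the revalidation service has run.

    For a dataflow without an approval flow, leave is_last_approval_level=True.
    """
    if not passed:
        return "Failed"
    if not is_last_approval_level:
        # In an approval flow, the record moves on to the next level.
        return "passed to next approval level"
    if action is Action.REPROCESS:
        # The record is removed from the repository for reprocessing.
        return "released for reprocessing"
    # The record is retained in the repository with status Resolved.
    return "Resolved"
```

For example, outcome_after_revalidation(True, Action.APPROVE) returns "Resolved", while a record that fails revalidation returns "Failed" regardless of the selected action.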
Match exception records using match field
Uses match fields to match input records against exception records in the repository. Enable this option if your input contains records that previously generated exceptions but are now corrected in the input.

The input records will be evaluated against the conditions and then matched against the existing exception records in the repository. If an input record passes the conditions and matches an exception record, that exception record will be removed from the repository. If an input record does not pass the conditions and matches an exception record, that exception record will be updated and retained in the repository. Additionally, if duplicates exist in the repository, only one matched exception per dataflow will be updated; all others for that dataflow will be deleted.
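The matching rules above can be sketched as follows. This is a simplified model, not the product's implementation: the repository is represented as a plain list of dictionaries for a single dataflow, and the function name, parameter names, and sample field names are assumptions. Creating new exceptions for unmatched input records is left out, and removing every duplicate when a corrected record matches is also an assumption.

```python
def reconcile(repository, input_records, match_fields, passes_conditions):
    """Return the repository contents after matching input records against it."""

    def key(record):
        # The match key is built from the values of the match fields.
        return tuple(record[f] for f in match_fields)

    result = list(repository)
    for record in input_records:
        record_key = key(record)
        if not any(key(e) == record_key for e in result):
            continue  # no existing exception matches this input record
        # Drop every matching exception, including duplicates.
        result = [e for e in result if key(e) != record_key]
        if not passes_conditions(record):
            # Still an exception: keep exactly one updated copy.
            result.append(dict(record))
    return result
```

An input record that now passes the conditions clears its exception from the repository. One that still fails leaves a single updated exception behind, even if the repository previously held duplicates for the same key.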

Optimized for single records or small batches
This option becomes available when you check Match exception records using match field. When this option is not checked (the default), the server loads all existing exception records for the current dataflow and stage into memory before processing the incoming exception records. This setting is recommended when the repository has a low number of existing exception records and a high number of new exception records or updates. It typically involves a longer initial load time and a higher memory requirement, but it is faster when processing larger batches, such as daily, weekly, or monthly updates.

Checking this option is recommended when the repository has a high number of existing exception records and a relatively low number of new exception records or updates. In this mode the server queries the repository for existing exception records as each input record is read in. It typically involves a shorter initial load time and a lower memory requirement, and it is faster when processing a few records in real time.
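The trade-off between the two modes can be illustrated with a toy in-memory repository that counts how it is accessed. Everything here is hypothetical (the class, its methods, and the "key" field) and serves only to contrast one up-front load with one query per input record.

```python
class FakeRepository:
    """Stand-in for the exception repository that counts access patterns."""

    def __init__(self, exceptions):
        self._rows = list(exceptions)
        self.full_loads = 0
        self.lookups = 0

    def load_all(self):
        self.full_loads += 1
        return list(self._rows)

    def find_by_key(self, key):
        self.lookups += 1
        return [row for row in self._rows if row["key"] == key]


def match_preloaded(repo, input_keys):
    """Option unchecked: load everything once, then match in memory."""
    index = {}
    for row in repo.load_all():
        index.setdefault(row["key"], []).append(row)
    return {k: index.get(k, []) for k in input_keys}


def match_per_record(repo, input_keys):
    """Option checked: query the repository as each input record is read."""
    return {k: repo.find_by_key(k) for k in input_keys}
```

Both functions return the same matches. The first pays for one full load (more memory, slower start) and then needs no further repository access, while the second starts immediately but issues one query per input record.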

Match fields
Provides a list of all input fields used to build a key to match an exception record in the repository. You must define at least one match field if you checked the Match exception records using match field check box.