Options - dataflow_designer - spectrum_quality_1 - 23.1

Spectrum Data Quality Guide

Version: 23.1
Language: English
First publish date: 2007
Last edition: 2024-03-04
  1. In the Load match rule field, select one of the predefined match rules, which you can use as-is or modify to suit your needs. To create a new match rule without using a predefined match rule as a starting point, click New. You can only have one custom rule in a dataflow.
    Note: Do not use special characters while creating a new rule.
    Note: The Dataflow Options feature in Enterprise Designer enables the match rule to be exposed for configuration at runtime.
  2. Click Group By to select a field to use for grouping records in the match queue. Intraflow Match only attempts to match records against other records in the same match queue.
  3. Select the Sort box to perform a pre-match sort of your input based on the field selected in the Group By field.
  4. Click Advanced to specify additional sort performance options.
    Note: The optimal sort performance settings depend on your server's hardware configuration. As a general guideline, settings that satisfy this inequality produce good sort performance:

    (InMemoryRecordLimit × MaxNumberOfTempFiles ÷ 2) >= TotalNumberOfRecords

  5. Click Express Match On to perform an initial comparison of express key values to determine whether two records are considered a match.
    You can generate an express key as part of generating a match key through Match Key Generator. For more information, see Match Key Generator.
  6. In the Initial Collection Number text box, specify the starting number to assign to the collection number field for duplicate records.

    The collection number identifies each duplicate record in a match queue. Unique records are assigned a collection number of 0. Each duplicate record is assigned a collection number starting with the value specified in the Initial Collection Number text box.

  7. Click Sliding Window to enable this matching method. For more information about Sliding Window, see Sliding Window Matching Method.
  8. Click Generate Data for Analysis to generate match results. For more information, see Analyzing Match Results.
  9. Optional: Uncheck Assign collection number 0 to unique records to generate collection numbers other than zero for unique records.
    The unique record collection numbers will then be in sequence with any other collection numbers. This option is checked by default to assign zeroes as collection numbers to unique records.
    For example, if your matching dataflow finds five records and the first three records are unique, the collection numbers are assigned as follows:

    Collection Number   Record Type
    1                   Unique
    2                   Unique
    3                   Unique
    4                   Duplicate/Suspect
    4                   Duplicate/Suspect

    If your matching dataflow finds five records and the last three are unique, the collection numbers are assigned as follows:

    Collection Number   Record Type
    1                   Duplicate/Suspect
    1                   Duplicate/Suspect
    2                   Unique
    3                   Unique
    4                   Unique

    If you leave this box checked, any unique records found in your dataflow are assigned a collection number of zero.
  10. Select the Return match rule name option to include the selected match rule name in the stage output.
  11. Select Return detailed match information to include detailed match information in the output for your match rule. For more information about the output fields, see Output.
    Note: Enabling this option reduces overall stage performance.
  12. For information about modifying the other options, see Building a Match Rule.
  13. Click Evaluate to evaluate how a suspect record scored against candidate records. For more information, see Interflow Match.
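
The sort guideline in step 4 and the collection numbering described in steps 6 and 9 can be sketched in a few lines of Python. This is an illustration only, not Spectrum code: the function names, the representation of a match queue as a list of groups, and the default starting number of 1 are assumptions made for the example. It also assumes that, when unique records receive 0, the numbering counter advances only for duplicate groups, which the guide does not state explicitly.

```python
from typing import List, Sequence


def sort_settings_cover(in_memory_record_limit: int,
                        max_temp_files: int,
                        total_records: int) -> bool:
    """Check the guideline from step 4:
    (InMemoryRecordLimit x MaxNumberOfTempFiles / 2) >= TotalNumberOfRecords.
    """
    return in_memory_record_limit * max_temp_files / 2 >= total_records


def assign_collection_numbers(groups: Sequence[Sequence[str]],
                              initial_number: int = 1,
                              zero_for_unique: bool = True) -> List[int]:
    """Return one collection number per record, in input order.

    `groups` is a hypothetical stand-in for match results: each inner
    sequence holds the records that matched each other, and a group of
    one record is treated as unique.
    """
    numbers: List[int] = []
    next_number = initial_number
    for group in groups:
        if not group:
            continue  # ignore empty groups
        if len(group) == 1 and zero_for_unique:
            # Option checked: unique records get collection number 0.
            # Assumption: the counter does not advance for them.
            numbers.append(0)
            continue
        # Duplicates (or unique records with the option unchecked)
        # take the next number in sequence.
        numbers.extend([next_number] * len(group))
        next_number += 1
    return numbers


if __name__ == "__main__":
    # Three unique records followed by one duplicate pair.
    queue = [["a"], ["b"], ["c"], ["d", "e"]]
    print(assign_collection_numbers(queue, zero_for_unique=False))
    print(assign_collection_numbers(queue, zero_for_unique=True))
    print(sort_settings_cover(1_000_000, 10, 5_000_000))
```

With the option unchecked, the two record layouts from step 9 yield the same sequences as the tables in that step. With the option checked, the unique records receive 0 under the assumptions stated above.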