Analyzing Match Rule Changes - spectrum_quality_1 - 23.1

Spectrum Data Quality Guide

You can use the Match Analysis tool in Enterprise Designer to view in detail the effect that a match rule change has on a dataflow's match results. To do this, run the dataflow, change the match rule, run the dataflow again, and then compare the two runs in the Match Analysis tool. This procedure describes how to do this.

Important: When comparing match results, the input data used for the baseline and comparison runs must be identical. Using different input data can cause misleading results. Observe the following to help ensure an accurate comparison:
  • Use the same input files or tables
  • Sort the data in the same way prior to the matching stage
  • Use the same Candidate Finder queries when using Transactional Match
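
One way to confirm that the baseline and comparison runs read the same input file is to compare checksums of the file before each run. The following Python sketch is an illustration only and is not part of Spectrum. The function names `file_sha256` and `same_input` are hypothetical, and the sketch assumes the input is a file on disk rather than a database table.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large input files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_input(baseline_path, comparison_path):
    """True if both files have byte-for-byte identical content."""
    return file_sha256(baseline_path) == file_sha256(comparison_path)
```

A matching checksum only shows that the file content is identical. It does not cover the sort order applied inside the dataflow or the Candidate Finder queries, which still need to be checked separately.
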
  1. In Enterprise Designer, open the dataflow you want to analyze.
  2. For each Interflow Match, Intraflow Match, or Transactional Match stage whose matching you want to analyze, double-click the stage and select the Generate data for analysis check box.
    Important: Enabling the Generate data for analysis option reduces performance. You should turn this option off when you are finished using the Match Analysis tool.
  3. Select Run > Run Current Flow.
    Note: For optimal results, use data that will produce 100,000 or fewer records. The more match results, the slower the performance of the Match Analysis tool.
  4. In the dataflow's matcher stage or stages, make the match rule changes you want, then run the dataflow again.

    For example, if you want to test the effect of increasing the threshold value, change the threshold value and run the dataflow again.

  5. When the dataflow finishes running, select Tools > Match Analysis.

    The Browse Match Results dialog box appears with a list of dataflows that have match results that can be viewed in the Match Analysis tool. If the job you want to analyze is not listed, open the dataflow and make sure that the matching stage has the Generate data for analysis check box selected.

    Tip: If there are a large number of dataflows and you want to filter the dataflows, select a filter option from the Show only jobs where drop-down list.
  6. On the left side of the Match Analysis pane is a list of the matcher stages, one per run. Select the matcher stage in the run that you want to use as the baseline for comparison, then click Baseline. Then select the run you want to compare against the baseline and click Compare.

You can now compare summary match results, such as the total number of duplicate records, as well as detailed record-level information that shows how each record was evaluated against the match rules.

Example of Match Results Comparison

For example, say you run a job named HouseholdRelationshipsAnalysis and want to test the effect of a change to the Household Match 2 stage. You first run the job using the original settings, then you modify the match rules in the Household Match 2 stage and run the job again. In the Match Analysis tool, the run with a job ID of 10 is the run with the original settings, so you set it as the baseline. The run with a job ID of 13 is the run with the modified match rule. When you click Compare, you can see that the modified match rule produced one more duplicate record and one fewer unique record than the original match rule.

Given below is a representation of the Match Analysis screen.

                         Baseline   Comparison   Changes
Input Records                  11           11         0
Duplicate Records              10           11         1
Unique Records                  1            0        -1
Match Groups                    3            3         0
Duplicate Collections           3            3         0
Express Matches                 0            0         0
Average Score                99.8         99.8       0.0
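
Each value in the Changes column above equals the Comparison value minus the Baseline value. The following Python sketch reproduces that arithmetic using the figures from this example. It is an illustration only: the dictionaries and the `changes` function are hypothetical and do not represent how the Match Analysis tool stores or calculates its results.

```python
# Summary figures copied from the example table above.
baseline = {
    "Input Records": 11,
    "Duplicate Records": 10,
    "Unique Records": 1,
    "Match Groups": 3,
    "Duplicate Collections": 3,
    "Express Matches": 0,
    "Average Score": 99.8,
}
comparison = {
    "Input Records": 11,
    "Duplicate Records": 11,
    "Unique Records": 0,
    "Match Groups": 3,
    "Duplicate Collections": 3,
    "Express Matches": 0,
    "Average Score": 99.8,
}


def changes(baseline, comparison):
    """Return comparison minus baseline for every summary metric."""
    # Round to one decimal place to match the precision shown for Average Score.
    return {name: round(comparison[name] - baseline[name], 1) for name in baseline}


for name, delta in changes(baseline, comparison).items():
    print(f"{name}: {delta}")
```

A positive change means the comparison run produced more of that item than the baseline run, and a negative change means it produced fewer.
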