Determining Matches - trillium_discovery - trillium_quality - 17.1

First publish date
2008

The linking process compares records to determine how similar they are. The level of similarity is decided by the following elements: matching attributes, comparison routine, score, grade, category, and pattern ID.

  • Matching attributes. Attributes to compare. The common matching attributes include business and personal names, primary and secondary address and geography components.
Note: The maximum number of matching attributes is 50.
  • Comparison routine. Matching algorithm. Trillium provides optimal comparison algorithms tailored to the type of attributes. For example, the BUSNAME routine is designed to compare business name attributes. See Relationship Linker Comparison Routines for a list of comparison routines.
  • Score. A numeric value from 0 to 100. Each comparison routine uses its own scoring system to measure the similarity of records. For example, the ABSOLUTE routine returns a score of 100 for an exact match and 0 in all other cases.
  • Grade. A letter grade from A to E that corresponds to the numeric score. You can specify up to five grade thresholds. For example, a score of 100 is grade A, a score of 95 is grade B, and so on.
  • Category. The type of match represented by the pattern of grades. The options are P (Pass), S (Suspect), and F (Fail).
  • Pattern ID. A unique, user-defined identification number for the combination of matching attributes and grades. Patterns define the criteria that determine whether the match is considered a pass, suspect, or fail.
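
The relationship between score and grade can be sketched in a few lines of Python. This is only an illustration of the idea, not Trillium's implementation: the threshold values below 95, the treatment of each threshold as a minimum score, and the use of plain string equality for the ABSOLUTE-style comparison are all assumptions made for the example, and the function names are hypothetical.

```python
# Hypothetical grade thresholds: the minimum score needed for each grade.
# Only "100 is A" and "95 is B" come from the text; the rest are assumed.
GRADE_THRESHOLDS = [("A", 100), ("B", 95), ("C", 90), ("D", 80), ("E", 0)]


def absolute_score(left: str, right: str) -> int:
    """Score in the style described for ABSOLUTE: 100 for an exact match, else 0.

    Plain string equality is an assumption about what "exact match" means.
    """
    return 100 if left == right else 0


def grade_for(score: int, thresholds=GRADE_THRESHOLDS) -> str:
    """Return the first grade whose minimum score is met (assumed rule)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for letter, minimum in thresholds:
        if score >= minimum:
            return letter
    raise ValueError("no threshold covers this score")


if __name__ == "__main__":
    print(grade_for(absolute_score("PO BOX 12", "PO BOX 12")))
    print(grade_for(96))
```

Keeping the thresholds in an ordered list makes it straightforward to try different cut-offs without touching the grading logic.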

Default rules are provided for each country. Using the Relationship Linker Rules Editor and the Data Comparison Calculator, you can build your own matching patterns and specify how matches are determined for your records.

Example

Matching Attributes and Score

In this example, there are four attributes to compare: business name, street name, house number, and PO Box number. You use the BUSNAME comparison routine for business name and set score thresholds of 100, 90, and 80. This means that if two records match exactly on business name, the comparison receives a score of 100 (an A). If the number of word errors is more than one third of the greater number of words in the two attributes, it receives a score of 90 (a B). The STREETS, HOUSENO, and ABSOLUTE routines are used for the other attributes.

Grade Pattern List

Based on the matching attributes and score, you create a category and pattern list.

  • P100 AAAA. If the record receives a score of 100 (an A) on business name, street name, house number, and PO Box, it is a Pass (P) with Pattern ID 100.
  • P110 AA-A. If the record receives an A on business name, street name, and PO Box, it is a Pass (P) with Pattern ID 110. The wildcard (-) in the house number position indicates that any grade is accepted.
  • S200 AB-A. If the record receives an A on business name and PO Box and a score of 96 (a B) on street name, with a wildcard for house number, it is a Suspect (S) with Pattern ID 200.
  • F999 BABB. If the record receives a B on business name, house number, and PO Box and an A on street name, it is a Fail (F) with Pattern ID 999.
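
The pattern list above can be read as a small lookup table. The following Python sketch shows one way such a table could be evaluated against a string of grades given in the order business name, street name, house number, PO Box. It is a minimal illustration only: the first-match-wins ordering, the `None` result for unlisted grade combinations, and the function and variable names are assumptions, not documented Relationship Linker behavior.

```python
# (category, pattern ID, grade pattern) taken from the example list.
# "-" is the wildcard: any grade is accepted in that position.
PATTERNS = [
    ("P", 100, "AAAA"),
    ("P", 110, "AA-A"),
    ("S", 200, "AB-A"),
    ("F", 999, "BABB"),
]


def classify(grades: str, patterns=PATTERNS):
    """Return (category, pattern_id) for the first matching pattern, else None.

    Evaluating patterns in list order is an assumption of this sketch.
    """
    for category, pattern_id, pattern in patterns:
        same_length = len(pattern) == len(grades)
        if same_length and all(p in ("-", g) for p, g in zip(pattern, grades)):
            return category, pattern_id
    return None


if __name__ == "__main__":
    for grades in ("AAAA", "AACA", "ABEA", "BABB", "CCCC"):
        print(grades, classify(grades))
```

Because P100 appears before P110, a record graded AAAA is reported with Pattern ID 100 even though it would also satisfy the wildcard pattern AA-A.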

After it runs, the Relationship Linker or Reference Matcher stores the pattern ID for matched and suspect records in the following attributes: lev1_matched_pattern, lev1_suspect_pattern, lev2_matched_pattern, and lev2_suspect_pattern. From the pattern IDs in the output, you can determine why records were matched or flagged as suspects. See Viewing Relationship Linker Results for how to review the results of the Relationship Linker process.
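
As a rough illustration of tracing pattern IDs back to their reasons, the Python sketch below looks up each populated pattern attribute in a local table built from the example patterns. The record layout (a dictionary keyed by attribute name, with IDs stored as digits or left empty) is an assumption made for the example, as are the helper name and the wording of the notes; the actual output format may differ.

```python
# Output attributes that carry pattern IDs, as named in the text.
PATTERN_FIELDS = (
    "lev1_matched_pattern",
    "lev1_suspect_pattern",
    "lev2_matched_pattern",
    "lev2_suspect_pattern",
)

# Hypothetical notes keyed by pattern ID, based on the example pattern list.
PATTERN_NOTES = {
    100: "P100 AAAA: all four attributes graded A",
    110: "P110 AA-A: A on business name, street name, PO Box; any house number grade",
    200: "S200 AB-A: A on business name and PO Box, B on street name",
}


def explain(record: dict) -> dict:
    """Map each populated pattern attribute to a note about its pattern ID."""
    reasons = {}
    for field in PATTERN_FIELDS:
        raw = str(record.get(field, "")).strip()
        if raw.isdigit():  # skip empty or non-numeric values
            pattern_id = int(raw)
            reasons[field] = PATTERN_NOTES.get(
                pattern_id, f"pattern {pattern_id} (not in the local list)"
            )
    return reasons


if __name__ == "__main__":
    sample = {"lev1_matched_pattern": "110", "lev1_suspect_pattern": ""}
    print(explain(sample))
```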

See the following topics to learn how to set up or modify the matching rules.