Multi-Matching

Product type: Software
Portfolio: Verify
Product family: Trillium
Product: Trillium > Trillium Discovery; Trillium > Trillium Quality
Version: 17.1
Language: English
Product name: Trillium Quality and Discovery
Title: Trillium Control Center
Topic type: Overview, Administration, Configuration, Installation, Reference, How Do I
First publish date: 2008

TSS enables you to include multi-matching processes in a project. Multi-matching is a process for more complex matching: you define multiple window keys, one for each matching criterion, and run multiple Relationship Linkers on the same set of records, each using a different window key.

Multi-matching is performed in the following steps:

  1. Two or more Relationship Linker processes are run on the same set of rows using different window keys.
  2. Match results from each process are recorded in a Link file.
  3. Transitivity of the links is resolved using the Resolve utility. Example: if a = b and b = c, then a = c.
  4. Resolved results are applied to the first Relationship Linker output by the File Update utility.
  5. A final Relationship Linker process is run.

Example

Suppose a customer opened an account 10 years ago, has since married and changed addresses, and is now making purchases from another section of your business. To match the customer record from 10 years ago with the current customer record, a social security number is the most likely attribute to match on. In that case, multiple matches are performed: the first match is based on records with the same last name and address; the second match is based on records with either the same social security number and name, or the same social security number and address.

Input Rows

The following rows are used as the input to the first match, which matches rows with the same last name and address using the window key named Window Key.

| Name         | Address        | Postal Code | Soc Sec   | Window Key |
|--------------|----------------|-------------|-----------|------------|
| John Smith   | 5 Main St      | 01876       | 000904598 | 018MNSM    |
| John T Smith | 5 Main St      | 01876       | 000000000 | 018MNSM    |
| Jane Smith   | 5 Main St      | 01876       |           | 018MNSM    |
| Jane Jones   | 5 Main St      | 01876       | 000578398 | 018MNJO    |
| Judy Jones   | 5 Main St      | 01876       |           | 018MNJO    |
| Jane Jones   | 25 Elm Rd      | 01754       | 000578398 | 017ELJO    |
| Jane Jones   | 25 Elm Rd      | 01754       | 000000000 | 017ELJO    |
| John Smith   | 16 Linnell Cir | 01543       | 000904598 | 015LNSM    |

First Match Results

Four groups of rows are created during the match. Rows with the same last name and address are matched together. The output of this first match becomes the input to the second match. The second match uses Social Security as the window key and matches on either social security number and name, or social security number and address. Before running the second match, the input must be sorted by the Social Security window key. See Sort utility for how to specify the sort keys.

Also, make sure to specify the linking settings. The Link file will contain the links between the results of the first match and the results of the second match. In this case, Link selection flag is set to Passes only and Link source id is lev2_matched_in_lev1_matched. See Using a Link File for details.

| Name         | Address        | Postal Code | Soc Sec   | Window Key | lev2_matched_in_lev1_matched |
|--------------|----------------|-------------|-----------|------------|------------------------------|
| John Smith   | 5 Main St      | 01876       | 000904598 | 018MNSM    | 00000001                     |
| John T Smith | 5 Main St      | 01876       | 000000000 | 018MNSM    | 00000001                     |
| Jane Smith   | 5 Main St      | 01876       |           | 018MNSM    | 00000001                     |
| Jane Jones   | 5 Main St      | 01876       | 000578398 | 018MNJO    | 00000002                     |
| Judy Jones   | 5 Main St      | 01876       |           | 018MNJO    | 00000002                     |
| Jane Jones   | 25 Elm Rd      | 01754       | 000578398 | 017ELJO    | 00000003                     |
| Jane Jones   | 25 Elm Rd      | 01754       | 000000000 | 017ELJO    | 00000003                     |
| John Smith   | 16 Linnell Cir | 01543       | 000904598 | 015LNSM    | 00000004                     |
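
The grouping in the first match can be illustrated with a short Python sketch. This is not how the Relationship Linker is implemented or configured: the function name first_match is hypothetical, and an exact string comparison on last name and address stands in for the product's actual comparison rules. Rows are only compared when they share a window key, and each resulting group receives a sequential eight-digit ID like the one shown in lev2_matched_in_lev1_matched.

```python
from itertools import count

# Input rows from the example: (name, address, postal code, soc sec, window key).
ROWS = [
    ("John Smith", "5 Main St", "01876", "000904598", "018MNSM"),
    ("John T Smith", "5 Main St", "01876", "000000000", "018MNSM"),
    ("Jane Smith", "5 Main St", "01876", "", "018MNSM"),
    ("Jane Jones", "5 Main St", "01876", "000578398", "018MNJO"),
    ("Judy Jones", "5 Main St", "01876", "", "018MNJO"),
    ("Jane Jones", "25 Elm Rd", "01754", "000578398", "017ELJO"),
    ("Jane Jones", "25 Elm Rd", "01754", "000000000", "017ELJO"),
    ("John Smith", "16 Linnell Cir", "01543", "000904598", "015LNSM"),
]


def first_match(rows):
    """Give every (window key, last name, address) group a sequential ID.

    Assumption: exact equality replaces the linker's real matching rules.
    """
    ids = {}
    next_id = count(1)
    result = []
    for name, address, _postal, _ssn, window_key in rows:
        group = (window_key, name.split()[-1], address)
        if group not in ids:
            # IDs are zero-padded to eight digits, as in the tables.
            ids[group] = f"{next(next_id):08d}"
        result.append((name, address, ids[group]))
    return result


for row in first_match(ROWS):
    print(row)
```

Run on the input rows, the sketch produces the same four groups as the table above: three rows in 00000001, two each in 00000002 and 00000003, and one in 00000004.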

Second Match Results

The next table shows the result of the second match. Rows with the same Social Security number and name, or Social Security number and address, are matched together.

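| Name       | Address        | Postal Code | Soc Sec   | Window Key | lev2_matched_in_lev1_matched |
|------------|----------------|-------------|-----------|------------|------------------------------|
| John Smith | 5 Main St      | 01876       | 000904598 | 018MNSM    | 00000001                     |
| John Smith | 16 Linnell Cir | 01543       | 000904598 | 015LNSM    | 00000004                     |
| Jane Smith | 5 Main St      | 01876       | 000578398 | 018MNSM    | 00000001                     |
| Jane Jones | 5 Main St      | 01876       | 000578398 | 018MNJO    | 00000002                     |
| Jane Jones | 25 Elm Rd      | 01754       | 000578398 | 017ELJO    | 00000003                     |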

Link File

The output of the Link file would look like this. The first row shows that the row with the value of "00000001" in lev2_matched_in_lev1_matched links to the row with the value of "00000004". The second row shows that "00000001" also links to "00000002", and the third row shows that "00000002" links to "00000003".

| From_link | To_link  |
|-----------|----------|
| 00000001  | 00000004 |
| 00000001  | 00000002 |
| 00000002  | 00000003 |
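
To see where these three links come from, the following Python sketch derives them from the second match results. It is an illustration only: build_links is a hypothetical helper, not part of the product, and exact string equality stands in for the Relationship Linker's comparison rules. Two rows are linked when they share a social security number and either a name or an address, and when they came from different first-match groups.

```python
from itertools import combinations

# Second match results from the example:
# (name, address, soc sec, lev2_matched_in_lev1_matched).
SECOND_MATCH_ROWS = [
    ("John Smith", "5 Main St", "000904598", "00000001"),
    ("John Smith", "16 Linnell Cir", "000904598", "00000004"),
    ("Jane Smith", "5 Main St", "000578398", "00000001"),
    ("Jane Jones", "5 Main St", "000578398", "00000002"),
    ("Jane Jones", "25 Elm Rd", "000578398", "00000003"),
]


def build_links(rows):
    """Return (from_link, to_link) pairs between first-match groups.

    Assumption: exact equality replaces the linker's real matching rules.
    """
    links = []
    for a, b in combinations(rows, 2):
        same_ssn = a[2] == b[2]
        same_name_or_address = a[0] == b[0] or a[1] == b[1]
        crosses_groups = a[3] != b[3]
        if same_ssn and same_name_or_address and crosses_groups:
            link = (a[3], b[3])
            if link not in links:  # record each link only once
                links.append(link)
    return links


for from_link, to_link in build_links(SECOND_MATCH_ROWS):
    print(from_link, to_link)
```

Jane Smith at 5 Main St and Jane Jones at 25 Elm Rd share a social security number but neither a name nor an address, so the sketch records no direct link between 00000001 and 00000003. That relationship only appears once the links are resolved.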

Resolve Process

The resolve process uses the Resolve utility to take the Link file(s) from the second through Nth Relationship Linkers, find all indirect links, and reconcile the linking relationships. The following table shows the output from the resolve process. Because the Link file shows that 00000001 matched 00000002 and 00000002 matched 00000003, 00000001 also matches 00000003. See Resolve utility for how to configure the Resolve process.

| From_link | To_link  |
|-----------|----------|
| 00000002  | 00000001 |
| 00000003  | 00000001 |
| 00000004  | 00000001 |
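
Resolving transitive links is a connected-components problem, and one common way to sketch it is with a union-find (disjoint-set) structure. The Python below is an assumption about how such a resolution could be done, not a description of the Resolve utility's internals. The function name resolve and the choice of the lowest ID as the surviving key are illustrative, chosen to be consistent with the table above.

```python
def resolve(links):
    """Map every linked ID to the lowest ID in its connected set.

    Assumption: the lowest ID survives, as in the example output.
    """
    parent = {}

    def find(x):
        # Follow parent pointers to the set's representative,
        # shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            low, high = sorted((root_a, root_b))
            parent[high] = low  # keep the lowest ID as representative

    # Emit one (from_link, to_link) row per ID that is not its own root.
    return [(k, find(k)) for k in sorted(parent) if find(k) != k]


LINKS = [
    ("00000001", "00000004"),
    ("00000001", "00000002"),
    ("00000002", "00000003"),
]

for from_link, to_link in resolve(LINKS):
    print(from_link, to_link)
```

With the Link file from this example as input, the sketch prints the three rows shown in the table: 00000002, 00000003, and 00000004 each resolve to 00000001.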

File Update Process

In a multi-matching project, the File Update utility is used to update the match keys on the first Relationship Linker’s output. The output is then sorted on the updated key to group all records together with their resolved match set. This example uses the output file from the Resolve process as the transaction file. If the transaction match key (From_link) is the same as the master match key (lev2_matched_in_lev1_matched), the process updates the master file with the match key values from the transaction file. See File Update utility for how to configure the File Update process.

| Name         | Address        | Postal Code | Soc Sec   | Window Key | lev2_matched_in_lev1_matched | New Match Key |
|--------------|----------------|-------------|-----------|------------|------------------------------|---------------|
| John Smith   | 5 Main St      | 01876       | 000904598 | 018MNSM    | 00000001                     | 00000001      |
| John T Smith | 5 Main St      | 01876       | 000000000 | 018MNSM    | 00000001                     | 00000001      |
| Jane Smith   | 5 Main St      | 01876       |           | 018MNSM    | 00000001                     | 00000001      |
| Jane Jones   | 5 Main St      | 01876       | 000578398 | 018MNJO    | 00000002                     | 00000001      |
| Judy Jones   | 5 Main St      | 01876       |           | 018MNJO    | 00000002                     | 00000001      |
| Jane Jones   | 25 Elm Rd      | 01754       | 000578398 | 017ELJO    | 00000003                     | 00000001      |
| Jane Jones   | 25 Elm Rd      | 01754       | 000000000 | 017ELJO    | 00000003                     | 00000001      |
| John Smith   | 16 Linnell Cir | 01543       | 000904598 | 015LNSM    | 00000004                     | 00000001      |
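
The update described above amounts to a keyed lookup from the transaction file into the master file. The following Python sketch shows the idea under stated assumptions: file_update is a hypothetical function, not the File Update utility's interface, and master rows with no matching transaction are assumed to keep their existing key as the new match key, which is what the table shows for the 00000001 rows.

```python
# Master file: (name, address, lev2_matched_in_lev1_matched) from the first match.
MASTER = [
    ("John Smith", "5 Main St", "00000001"),
    ("John T Smith", "5 Main St", "00000001"),
    ("Jane Smith", "5 Main St", "00000001"),
    ("Jane Jones", "5 Main St", "00000002"),
    ("Judy Jones", "5 Main St", "00000002"),
    ("Jane Jones", "25 Elm Rd", "00000003"),
    ("Jane Jones", "25 Elm Rd", "00000003"),
    ("John Smith", "16 Linnell Cir", "00000004"),
]

# Transaction file: (From_link, To_link) rows from the resolve process.
TRANSACTIONS = [
    ("00000002", "00000001"),
    ("00000003", "00000001"),
    ("00000004", "00000001"),
]


def file_update(master, transactions):
    """Append a new match key to each master row, then sort on it.

    Assumption: rows without a matching transaction keep their own key.
    """
    new_key = dict(transactions)  # From_link -> To_link
    updated = [
        (name, address, lev1, new_key.get(lev1, lev1))
        for name, address, lev1 in master
    ]
    # Sorting on the new key groups each resolved match set together.
    return sorted(updated, key=lambda row: row[3])


for row in file_update(MASTER, TRANSACTIONS):
    print(row)
```

For this example every row ends up with the new match key 00000001, so the sort leaves all eight rows in a single group.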

Final Relationship Linker Process

The last step is to run the final sort and Relationship Linker processes. The values of the new match key now represent the resolved match sets from both Relationship Linkers. You can use this attribute as the window key to group all records together and run the final Relationship Linker.