Use Case: Identifying Duplicates in Your Data (Relationship Linker, Trillium Quality and Trillium Discovery 17.1)

Product name: Trillium Quality and Discovery (Trillium Control Center)
Version: 17.1
Language: English
First publish date: 2008

Scenario

You have an entity containing individual and business customer names and addresses. The data in the entity will be exported into the customer database and used for business purposes such as marketing campaigns and loyalty programs. The entity appears to contain a number of duplicates because the original data came from various sources. Before updating the database, you need to identify all possible duplicates in the entity so that you can eliminate unnecessary data.

Solution

TS Quality helps you find matching or duplicate data based on common data content. This is called "linking data." Linking identifies the relationship (or lack thereof) between records at business and consumer levels. The linking processes use various sets of match rules and comparison routines to help determine matches for further investigation.

The Relationship Linker process in TS Quality compares rows with other rows in the same entity and identifies matching rows. This process is included as part of a standard Quality project and all country-specific match rules are pre-defined in the country project templates.
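The idea of comparing rows with other rows in the same entity can be illustrated with a minimal Python sketch. This is not how the Relationship Linker is implemented. The field names (name, address), the normalize helper, and the 0.85 threshold are hypothetical stand-ins for the product's match rules and comparison routines.

```python
from difflib import SequenceMatcher
from itertools import combinations

def normalize(value):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())

def similarity(a, b):
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def find_matches(rows, threshold=0.85):
    """Compare each row with every other row and return matching index pairs.

    A pair matches when both the name and the address similarity reach the
    threshold. This rule is a hypothetical example, not a Trillium match rule.
    """
    matches = []
    for (i, left), (j, right) in combinations(enumerate(rows), 2):
        if (similarity(left["name"], right["name"]) >= threshold
                and similarity(left["address"], right["address"]) >= threshold):
            matches.append((i, j))
    return matches

# Illustrative rows only; they are not taken from the sample file.
rows = [
    {"name": "John Smith", "address": "12 Main Street"},
    {"name": "john  smith", "address": "12 Main Street"},
    {"name": "Acme Corp", "address": "400 Industrial Way"},
]
print(find_matches(rows))  # [(0, 1)]
```

Comparing every pair of rows grows quadratically with the number of rows, which is one reason matching tools usually restrict comparisons to smaller groups of candidate rows.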

Basic relationship linking is a four-step process. The steps are described in the topics that follow.

Sample File

These topics use the sample file (input.csv) included with Trillium. To use this file, you must select the "Install delimited sample data files" option during the Repository Server installation. By default, the file is located in the Repository's import/data folder (for example, C:/ProgramData/Trillium Software/Data/import/data) and is accessed through the "Default delimited file connection" in the Create Entity Wizard.
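Before creating the entity, it can be useful to confirm that the sample file is present and to look at its first few rows outside the product. The following Python sketch does this under stated assumptions: the file is UTF-8 text, its delimiter is one of comma, semicolon, tab, or pipe, and the preview_delimited function name is hypothetical. None of these details come from the Trillium documentation.

```python
import csv
from pathlib import Path

def preview_delimited(path, max_rows=5):
    """Return up to max_rows rows from a delimited file, guessing the delimiter."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Sample file not found: {path}")
    # Assumption: the file is UTF-8; undecodable bytes are replaced, not fatal.
    with path.open(newline="", encoding="utf-8", errors="replace") as handle:
        sample = handle.read(4096)
        handle.seek(0)
        try:
            # Guess the delimiter from the first few kilobytes of the file.
            dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
        except csv.Error:
            # Fall back to plain comma-separated values if guessing fails.
            dialect = csv.excel
        reader = csv.reader(handle, dialect)
        return [row for _, row in zip(range(max_rows), reader)]
```

For a default installation, a call such as preview_delimited("C:/ProgramData/Trillium Software/Data/import/data/input.csv") would return the first five rows as lists of strings, or raise FileNotFoundError if the sample data files were not installed.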

If you create an entity and a US Name and Address project using input.csv as input, you can obtain the same results that these steps show. When you create the project, use Line 01 through Line 05 as the Parser inputs, then run the entire project by right-clicking the project name in the Navigation View and selecting Run. This generates the window keys required for this use case.
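The role of window keys can be pictured with a short Python sketch, assuming that a window key acts as a grouping key so that only rows sharing the same key are compared with each other. The key recipe used here (first three letters of the last name plus the postal code) and the field names last_name and postal_code are hypothetical and are not the keys that the project generates.

```python
from collections import defaultdict
from itertools import combinations

def window_key(row):
    """Build a hypothetical key: first 3 letters of last name + postal code."""
    last_name = row["last_name"].strip().upper()
    return last_name[:3] + row["postal_code"].strip()

def candidate_pairs(rows):
    """Group row indexes by window key and return the pairs inside each group."""
    windows = defaultdict(list)
    for index, row in enumerate(rows):
        windows[window_key(row)].append(index)
    pairs = []
    for members in windows.values():
        # Only rows that share a window key are ever paired for comparison.
        pairs.extend(combinations(members, 2))
    return pairs

# Illustrative rows only; they are not taken from the sample file.
rows = [
    {"last_name": "Smith", "postal_code": "02110"},
    {"last_name": "Smithe", "postal_code": "02110"},
    {"last_name": "Jones", "postal_code": "73301"},
    {"last_name": "Smith", "postal_code": "73301"},
]
print(candidate_pairs(rows))  # [(0, 1)]
```

In this sketch, four rows produce one candidate pair instead of the six pairs that an all-against-all comparison would examine.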