Use Case: Profiling and Parsing Unstructured Business Data

Trillium Control Center

Product type: Software
Portfolio: Verify
Product family: Trillium
Product: Trillium > Trillium Discovery; Trillium > Trillium Quality
Version: 17.1
Language: English
Product name: Trillium Quality and Discovery
Topic type: Overview, Administration, Configuration, Installation, Reference, How Do I
First publish date: 2008

Scenario

You have an entity that contains loan application data, such as property ID, credit score, and a mortgage description in free-form text format. To use this data for business purposes, you want to analyze and standardize the mortgage description attribute.

Solution

Trillium provides a complete workflow for processing unstructured business data, from profiling to parsing. You first analyze the attribute and identify potential categories for its words and phrases. You then create a word definition table, assign the selected words and phrases to the categories, and optionally define recodes and synonyms. Next, you import the word definition table into the Business Data Parser (BDP) process and run the BDP. If you have an ignore word table defined, you can also import it for use in the BDP to recode small or insignificant words to commas.
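The idea behind this workflow can be illustrated with a short conceptual sketch. It is not Trillium code and does not use any Trillium API. The sample descriptions, the category names, and the table contents are all hypothetical, and the real BDP is likely to behave differently in detail. The sketch only shows how word frequencies can suggest categories, and how a word definition table and an ignore word table could then be applied to free-form text.

```python
import re
from collections import Counter

# Hypothetical free-form values; these are NOT taken from sample.dat.
DESCRIPTIONS = [
    "30 yr fixed rate mortgage for the property",
    "15 year adjustable rate of the loan",
    "30 YR FIXED with a balloon payment",
]

# Hypothetical ignore word table: insignificant words are recoded to commas.
IGNORE_WORDS = {"for", "the", "of", "with", "a"}

# Hypothetical word definition table: word -> (category, recode).
WORD_DEFINITIONS = {
    "yr": ("TERM_UNIT", "YEAR"),
    "year": ("TERM_UNIT", "YEAR"),
    "fixed": ("RATE_TYPE", "FIXED"),
    "adjustable": ("RATE_TYPE", "ADJUSTABLE"),
}


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def profile(values):
    """Count how often each token occurs across all values."""
    counts = Counter()
    for value in values:
        counts.update(tokenize(value))
    return counts


def parse(text):
    """Return (value, category) pairs after applying both tables."""
    result = []
    for token in tokenize(text):
        if token in IGNORE_WORDS:
            result.append((",", "SEPARATOR"))
        elif token in WORD_DEFINITIONS:
            category, recode = WORD_DEFINITIONS[token]
            result.append((recode, category))
        else:
            result.append((token.upper(), "UNKNOWN"))
    return result


if __name__ == "__main__":
    # Frequent tokens are candidates for the word definition table.
    for token, count in profile(DESCRIPTIONS).most_common(5):
        print(token, count)
    print(parse(DESCRIPTIONS[0]))
```

In this sketch, tokens that remain in the UNKNOWN category after parsing are the ones you would review next and either add to the word definition table or to the ignore word table.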

Basic profiling and parsing of unstructured data is a four-step process. These steps are described in the topics that follow.

Sample File

These topics use the sample file (sample.dat) included with Trillium. To use this file, you must install the sample data files during the Repository Server installation. By default, the file is located in the Repository's import/data folder (for example, C:/ProgramData/Trillium Software/Data/import/data) and is accessed through the default TSQ file connection in the Create Entity Wizard.

If you create an entity and a business data project using sample.dat as input, you can obtain results similar to those shown in these steps.
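Before you create the entity, you may want to confirm that the sample file was installed. The following is a minimal sketch that assumes the example location given above. The function name find_sample_file is hypothetical, and the folder on your system may differ depending on how the Repository Server was installed.

```python
from pathlib import Path

# Example location quoted in this topic; adjust it for your installation.
DEFAULT_DATA_DIR = Path("C:/ProgramData/Trillium Software/Data/import/data")


def find_sample_file(data_dir=DEFAULT_DATA_DIR, name="sample.dat"):
    """Return the path to the sample file if it exists, otherwise None."""
    candidate = Path(data_dir) / name
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_sample_file()
    if found:
        print(f"Found sample file: {found}")
    else:
        print("sample.dat not found; check that the sample data files were installed")
```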