Step 1: Perform Phrase Analysis

Trillium Control Center

Product type: Software
Portfolio: Verify
Product family: Trillium
Product: Trillium > Trillium Discovery; Trillium > Trillium Quality
Version: 17.1
Language: English
Product name: Trillium Quality and Discovery
Title: Trillium Control Center
Topic type: Administration, Overview, How Do I, Configuration, Reference, Installation
First publish date: 2008

The first step in profiling and parsing unstructured business data is to run phrase analysis on the attribute. Phrase analysis searches the attribute values for words and phrases, using a word range (the minimum and maximum number of words per phrase) and a frequency threshold that you specify. You can also direct the analysis to ignore certain words. Use the results to create a word definition table, which is later imported for use in the BDP process.

To analyze word phrases in an attribute

  1. Select the Entities tab on the Discover bar.
  2. Open the input entity (Sample) you use for profiling and parsing.
  3. Right-click the attribute Mortgage Description and select Attribute Properties. The Attribute Properties window opens.
  4. Select the Analysis tab.
  5. In the Derived Metadata Rules section, specify the numeric range and frequency as follows:

    Phrases, between 1 and 2 words per phrase, and a frequency count of over: 1

    Review the data to decide the shortest and longest phrases to analyze. In this sample, we want to analyze all single words and two-word phrases such as "interest only" and "adjustable mortgage," so the range is set to 1 and 2.

  6. (Optional) Click Configure Ignore words. Ignore word tables must be added or imported to the Control Center in advance. This sample uses the ignore word table ignore_mortgage.
  7. (Optional) In the Available Ignore Words list, select the ignore word table (ignore_mortgage) and click Add. The table appears in the Selected Ignore words list.
  8. (Optional) Click OK to close the Configure Ignore words window.
  9. Check Analyze Now and click OK.
  10. Run the analysis now or schedule a time for the job to run later.
  11. After the job finishes, right-click the Mortgage Description attribute in the Navigation View and select Drill down to Metadata.
  12. Double-click Phrases to open a List View of phrases for the attribute.
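The settings chosen in the steps above (1 to 2 words per phrase, a frequency count of over 1, and an optional ignore word table) can be pictured with a small Python sketch. This is a conceptual illustration only, not Trillium's implementation: the function name count_phrases, the sample values, and the ignore word "with" are hypothetical, and the sketch assumes that values are lowercased, split on whitespace, and stripped of ignore words before phrases are formed.

```python
from collections import Counter


def count_phrases(values, min_words=1, max_words=2, min_count=1, ignore_words=()):
    """Count phrases of min_words..max_words consecutive words across all values.

    Only phrases that occur more than min_count times are returned.
    """
    ignore = {w.lower() for w in ignore_words}
    counts = Counter()
    for value in values:
        # Assumption: lowercase, split on whitespace, drop ignore words first.
        words = [w for w in value.lower().split() if w not in ignore]
        for n in range(min_words, max_words + 1):
            # Slide a window of n words across the value.
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    # "A frequency count of over: 1" is read here as strictly greater than 1.
    return {phrase: c for phrase, c in counts.items() if c > min_count}


# Hypothetical attribute values, loosely modeled on a mortgage description.
sample = [
    "interest only mortgage",
    "adjustable mortgage with interest only option",
    "fixed rate mortgage",
]

phrases = count_phrases(sample, min_words=1, max_words=2, min_count=1,
                        ignore_words=["with"])

# List the most frequent phrases first, similar in spirit to a list view.
for phrase, count in sorted(phrases.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{count:>3}  {phrase}")
```

Widening the word range or lowering the frequency threshold increases the number of candidate phrases, which is why it helps to review the data before choosing these values.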

     

The phrase analysis is complete. You can now move on to Step 2, Create a Word Definition Table.