Analyzing Word Phrases in an Attribute

Product name: Trillium Quality and Discovery, Trillium Control Center

You can configure the properties of an attribute to include phrase analysis. Phrase analysis searches for word phrases using a word range and a uniqueness threshold that you specify. You can also direct the analysis to ignore certain words.

Use the results of the analysis to find and compare word phrases that represent similar string values but may have a different phrasing, spelling, or meaning.
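
As a rough illustration of how the three settings described above interact (word range, uniqueness threshold, ignore words), the following Python sketch counts phrases in a list of attribute values. It is not the Control Center's implementation. The function name `extract_phrases` and its parameters are hypothetical, and the sketch assumes that a phrase is a run of adjacent words, that matching is case-insensitive, and that the threshold means "occurs more than N times".

```python
from collections import Counter


def extract_phrases(values, min_words=1, max_words=5, over_count=1, ignore_words=()):
    """Count phrases (runs of adjacent words) across attribute values.

    Hypothetical helper for illustration only; all names are assumptions.
    """
    ignored = {word.lower() for word in ignore_words}
    counts = Counter()
    for value in values:
        # Lowercase, split on whitespace, and drop ignored words.
        words = [w for w in value.lower().split() if w not in ignored]
        # A phrase needs at least one word, even if the range starts at 0.
        for size in range(max(min_words, 1), max_words + 1):
            for start in range(len(words) - size + 1):
                counts[" ".join(words[start:start + size])] += 1
    # Assumption: keep phrases that occur MORE than over_count times.
    return {phrase: n for phrase, n in counts.items() if n > over_count}


values = ["Acme Trading Co", "Acme Trading Company", "The Acme Trading Co"]
result = extract_phrases(values, min_words=2, max_words=2,
                         over_count=1, ignore_words=["the"])
print(result)  # {'acme trading': 3, 'trading co': 2}
```

In this sample, "trading company" occurs only once and falls below the threshold, while the ignore word "the" never takes part in a phrase.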

To analyze word phrases in an attribute

  1. In the Navigation View, right-click an attribute and select Edit attribute. The Attribute Properties window opens.
  2. Click the Analysis tab.
  3. Select the Phrases, between option. Enter a numeric range to specify the number of words required to make up a phrase before it will be analyzed. The default is between 1 and 5.
    Note: The higher the words-per-phrase value, the longer the analysis takes to run, so it is recommended that you keep these numbers low. For example, if your data contains only single-word phrases, set the range between 0 and 1. If your data contains one-word and two-word phrases, set the range between 1 and 2. Phrases of two or more words are considered multi-word phrases.
  4. Choose a uniqueness indicator by specifying how many times a phrase must occur before it is selected for analysis. Use the up and down arrows next to "and a frequency count of over:" to enter this value. The default value is 1.
  5. Select the Classify phrases on the basis of metaphones option if you want phrases classified by their metaphone (phonetic) encoding.
  6. To ignore certain words during phrase analysis:
    1. Click Configure Ignore words. The Configure Ignore Words window opens, displaying all available ignore word tables in the Available Ignore Words list. (The list is populated only after ignore word tables have been added or imported to the Control Center.) For more information, see Ignore Words.
    2. From the Available Ignore Words list, select one or more ignore word tables and click Add. Click Add All if you want to use words in all available ignore word tables. The selected tables display in the Selected Ignore Words list.
    3. Click OK to close the Configure Ignore Words window.
  7. Check Analyze Now to indicate you want to analyze all rows that have not yet been analyzed.
  8. Click OK to run the phrase analysis.
    Note: If you check the Phrases option, but do not check Analyze Now, the phrase analysis will run with the next scheduled job.
  9. (Optional) After the job finishes, verify the total phrases analyzed:
    1. In the Navigation View, click the attribute name.
    2. In the attribute's Summary Data View, on the Content Summary tab, expand Consistency. The number of analyzed phrases displays, along with those for Pattern Count and Mask Count.
    3. Click Phrases to open a List View of phrases for the attribute.
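
The note in step 3 about keeping the words-per-phrase range low can be illustrated by counting how many candidate phrases a single value produces. The sketch below assumes that a phrase is any run of adjacent words, which may differ from how the product tokenizes data, and `candidate_phrase_count` is a hypothetical helper name.

```python
def candidate_phrase_count(word_count, min_words, max_words):
    """Number of runs of adjacent words whose length falls in the given range.

    Assumption for illustration: every run of adjacent words is a candidate.
    """
    low = max(min_words, 1)  # a phrase needs at least one word
    return sum(max(word_count - size + 1, 0) for size in range(low, max_words + 1))


# Candidates for one 10-word value under different ranges.
print(candidate_phrase_count(10, 1, 1))  # 10
print(candidate_phrase_count(10, 1, 2))  # 19
print(candidate_phrase_count(10, 1, 5))  # 40
```

Under this assumption, widening the range from 1-1 to 1-5 quadruples the candidates for a 10-word value, which is consistent with the longer run times the note warns about.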