Ignore Words - trillium_quality - trillium_discovery - Latest

When you run a phrase analysis for one or more attributes, certain repeated words can make it difficult to fully understand the results. The ignore words tables provide a convenient way to instruct the phrase analysis process to ignore these small, insignificant or repeated words or phrases that can obscure the results.

Simply put, these tables contain words and phrases that are ignored during phrase analysis of one or more attributes. Ignore words tables can also be used in the BDP with similar effect (that is, certain words or phrases are ignored when the BDP process is run).
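As a rough illustration of why this helps, the following Python sketch counts phrases in a handful of made-up values, first without and then with an ignore words list. It is a toy model, not the product's phrase analysis: the function name `phrase_counts`, the sample values, and the case-sensitive, whole-word comparison are all assumptions made for the example.

```python
from collections import Counter

def phrase_counts(values, ignore_words=()):
    """Count each value as one phrase after dropping ignored words (toy model)."""
    ignored = set(ignore_words)  # assumption: exact, case-sensitive comparison
    counts = Counter()
    for value in values:
        # Split on white-space and keep only the words that are not ignored.
        kept = [word for word in value.split() if word not in ignored]
        if kept:
            counts[" ".join(kept)] += 1
    return counts

# Hypothetical attribute values, used only for illustration.
values = ["Acme Inc", "Acme Ltd", "Acme", "Globex Inc"]

print(phrase_counts(values))                                # four distinct phrases
print(phrase_counts(values, ignore_words=["Inc", "Ltd"]))   # "Acme" now counted three times
```

With the suffixes ignored, the three "Acme" variants collapse into a single phrase, which is the kind of pattern that repeated words would otherwise hide.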

You can create or import ignore words tables and modify them by adding or removing words and phrases. When you configure phrase analysis, you can select which ignore words tables to use in the Attribute Properties dialog.

Note: By default, phrase analysis is not run against ignore words tables. You need to configure the phrase analysis process to look up an ignore words table and ignore the words in it.

How Ignore Words are Determined

An ignore word can be a single word, including punctuation. An ignore word can also be made up of multiple words, including punctuation. Words are separated by one or more white-space characters.

Multiple spaces are not removed from an ignore word; however, when a value is processed for ignore words, multiple spaces are treated as a single space.

Ignore words consisting of multiple words are matched with your data regardless of whether there is any white-space between the words, in either the ignore word or the data.
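The matching rules above can be sketched in Python as follows. This is one possible reading of the rules, not the product's actual implementation: the helper names `build_ignore_pattern` and `strip_ignore_words`, the sample address values, and the choice to model "ignoring" as removing the matched text are assumptions made for illustration.

```python
import re

def build_ignore_pattern(ignore_word):
    """Compile a pattern for one ignore word.

    The component words may be separated by any amount of white-space,
    including none, and the whole match must be delimited by white-space
    or by the start or end of the value.
    """
    tokens = ignore_word.split()
    body = r"\s*".join(re.escape(token) for token in tokens)
    return re.compile(r"(?<!\S)" + body + r"(?!\S)")

def strip_ignore_words(value, ignore_words):
    """Remove every ignore word from value, then collapse leftover spaces."""
    # Try longer entries first so a multi-word entry is handled before
    # any shorter entry it might contain.
    for word in sorted(ignore_words, key=len, reverse=True):
        if not word.strip():
            continue  # skip empty entries
        value = build_ignore_pattern(word).sub(" ", value)
    # Multiple spaces are treated as a single space.
    return " ".join(value.split())

print(strip_ignore_words("PO   Box 123 Main St.", ["PO Box", "St."]))  # 123 Main
print(strip_ignore_words("POBox 123 Main St.", ["PO Box", "St."]))     # 123 Main
```

Both calls give the same result because the two-word entry "PO Box" matches the data whether the words are separated by several spaces or by none, and the punctuation in "St." is treated as part of the ignore word.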

Note: It is recommended that you define no more than 25,000 ignore words for phrase analysis. Exceeding this limit can consume a significant amount of system resources and slow down processing.