Adding Word and Phrase Definitions in BDP - trillium_discovery - trillium_quality - 17.1

Inline Quality and Discovery

Version: 17.1
Language: English
Product name: Trillium Quality and Discovery

You use the BDP Word Definitions Tool to create the business data categories, along with the word and phrase definitions, that tell the BDP process how to identify and standardize your business data.

Note: A phrase is a series of adjacent words.

You can also add word and phrase definitions in the Parser Tuner. You must be familiar with parser syntax and conventions before adding or customizing definitions in the Parser Tuner. For more information, see Adding a Definition to a Custom Definition Table. You can also define masks. For information, see Adding Mask Definitions.

Guidelines:
  • You must run Phrase Analysis before you can define words and phrases using the Word Definitions Tool.
  • The Word Definitions Tool does not support the customization and parsing of Asian (double-byte/Unicode) data.
  • You cannot use single quotes (' ') in the Word Definitions Tool.
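
Because the tool rejects single quotes, it can help to screen candidate words and phrases before you enter them. The following is a minimal sketch, not part of Trillium: the function name find_single_quotes and the sample phrases are hypothetical, and the check assumes the restriction refers to the straight apostrophe character (').

```python
# Hypothetical pre-check; this is not part of the Word Definitions Tool.
# Assumption: the restriction applies to the straight single quote (').

def find_single_quotes(phrases):
    """Return the phrases that contain a straight single quote."""
    return [phrase for phrase in phrases if "'" in phrase]


# Sample values invented for illustration.
candidates = ["ACME", "MEN'S SHIRT", "STAINLESS STEEL"]

# Phrases listed here would need to be reworded before being entered.
print(find_single_quotes(candidates))  # ["MEN'S SHIRT"]
```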

To add word and phrase definitions

  1. Analyze words and phrases in the input attribute.
  2. From the Navigation or Quality Project View, right-click the Business Data Parser process and select Edit Process. The BDP editing window opens.
  3. Click Tools.
  4. Click Word Definitions. The Word Definitions window opens.
  5. Click Categories. The Output Categories window opens.
  6. Enter the name of your output categories in the Single and/or Multi columns using the following guidelines:
    • Single. The categories in the Single column do not allow concatenation in the output; you only specify one word or phrase in a single category. Create a maximum of 25 single categories. The number for each category corresponds to the BP_USER1 - BP_USER25 output attributes.
    • Multi. The categories in the Multi column allow concatenation of words and phrases in the output. All words and phrases found for a multi output category are therefore concatenated in the output attribute for that category. Create a maximum of 475 multi categories. The number for each category corresponds to the BP_USER26 - BP_USER500 output attributes. Define multi categories if you plan to extract multiple words/phrases using a substring pattern. (For more information, see Substring Patterns.)
    Note: By default, a business data project includes only 50 output attributes (BP_USER1 - BP_USER50) in the output schema. If you define more than 50 categories, you must add the extra attributes to the output schema.
  7. Click OK.
  8. To define the first word/phrase, click the first field in the Word/Phrase column and enter a word/phrase or select one from the drop-down list of original data values. All words and phrase combinations are available in the drop-down list.
    Note: To move between rows and columns, press the Tab or arrow keys or right-click with your mouse. Do not press Enter/Return while you are still entering definitions; doing so saves your work and closes the tool.

    Note the following guidelines:

    • Use the Filter Phrase list to filter the drop-down list of available words/phrases by entering a word, phrase, letter, or number. This restricts the list to entries that include the specified string.
    • When you add a word/phrase to the list of definitions, it is removed from the drop-down list.
    • When you delete a word/phrase from the list of definitions, it is added to the end of the drop-down list.

  9. Click in the Category column and select a category from the drop-down list.

    Note the following guidelines:

    • To assign one category to several words/phrases at once, leave their Category fields empty, then select the category in any one of them. Every value in the Word/Phrase column that does not yet have a category receives the selected category. For example, enter a word/phrase in each of rows 10 through 14, then select the category BRAND in any of the uncategorized rows; the Category column for rows 10 through 14 displays BRAND.
    • This is a required field; you must select a category for each word and phrase.
    • To assign a definition to several words/phrases at once, click Multi-Phrase.... In the Assign Multiple Phrases window, select the words/phrases to include, then select the category to which you want to add them. Optionally, assign a position, recode, and classification to all selected words/phrases.

  10. To define a position for the word/phrase, select a value in the Pos column. This field is required. Do one of the following:
    • If the physical location of the word/phrase in the output row is irrelevant, accept the default position, DEF.
    • If the definition applies only when the word/phrase occurs at the beginning or end of the input attribute, select BEG or END from the drop-down list.
  11. (Optional) To recode the word/phrase, in the Recode column enter a recode value.
  12. (Optional) To add a classification, in the Classification column enter a classification value.
  13. (Optional) To create a synonym of a defined word/phrase, click Synonyms.... The Synonyms window opens. Enter or select a synonym in the Synonym column. In the Of column, select the defined word/phrase for which you are creating a synonym. For more information about using synonyms in the parser, see Synonyms.
    Note: You cannot create a synonym for words or phrases that have not yet been defined.
  14. When you are finished defining words/phrases, press Enter/Return or click OK to save your work and close the tool.
  15. To verify your definitions in the Customized Definitions table, click Launch Parser Tuner.... The Parser Tuner opens in a separate window. In the BDP editing window, the field next to the Launch Parser Tuner button displays an "Application running" message and lists any files transferred between the applications. If necessary, you can work in the Parser Tuner and the Control Center independently at the same time.
    Note: When you finish editing the BDP, always click Finish before you run the process to save your work and close the editing window. If you close the editing window without clicking Finish, your definitions and edits are not saved.
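
The category numbering described in step 6 can be sketched in a few lines of Python. This is an illustrative model only, not a Trillium API: the names output_attribute and beyond_default_schema are hypothetical, and the sketch assumes that categories are counted from 1 within each column, so that the first multi category corresponds to BP_USER26.

```python
# Illustrative model of the step 6 numbering rules; these helpers are
# hypothetical and are not part of any Trillium interface.

SINGLE_MAX = 25            # single categories: BP_USER1 - BP_USER25
MULTI_MAX = 475            # multi categories:  BP_USER26 - BP_USER500
DEFAULT_SCHEMA_SIZE = 50   # BP_USER1 - BP_USER50 exist in the schema by default


def output_attribute(kind, number):
    """Return the BP_USER attribute for the Nth category in a column.

    Assumption: `number` is the 1-based position within the Single or
    Multi column.
    """
    if kind == "single":
        if not 1 <= number <= SINGLE_MAX:
            raise ValueError(f"single categories are limited to 1-{SINGLE_MAX}")
        return f"BP_USER{number}"
    if kind == "multi":
        if not 1 <= number <= MULTI_MAX:
            raise ValueError(f"multi categories are limited to 1-{MULTI_MAX}")
        return f"BP_USER{SINGLE_MAX + number}"
    raise ValueError("kind must be 'single' or 'multi'")


def beyond_default_schema(attribute):
    """Return True if the attribute is outside BP_USER1 - BP_USER50."""
    return int(attribute[len("BP_USER"):]) > DEFAULT_SCHEMA_SIZE


print(output_attribute("single", 3))       # BP_USER3
print(output_attribute("multi", 1))        # BP_USER26
print(output_attribute("multi", 475))      # BP_USER500
print(beyond_default_schema("BP_USER51"))  # True
```

An attribute for which beyond_default_schema returns True is one that is not in the default output schema and would have to be added to it.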