Special Entries - trillium_discovery - trillium_quality - 17.1

Trillium Parser Tuner

First publish date
2008

The Parser uses the following special entries in its definition files:

US City Name Changes

Enter city name change entries with an underscore (_) as the last character of the entry. The underscore tells the Parser that the entry is a city name change and that it should look up the recoded entry in the City Directory Table. This directory is used for city verification and correction, and is based on a lookup by primary and secondary geography (such as state or city).

For example:

'MABEVERLEY_' GEOG DEF ATT=CITY-CHG, REC='MABEVERLY'

'CASAN FRAN_' GEOG DEF ATT=CITY-CHG, REC='CASAN FRANCISCO'
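
The structure of these entries can be illustrated with a minimal Python sketch. It is not part of the product: the regular expression is a hypothetical grammar inferred from the two examples (written with straight quotes), and the idea that the first two characters of the key and the recode are a state code is an assumption based on those examples only.

```python
import re

# Hypothetical grammar for one city-change line, inferred from the examples;
# the real definition-file syntax may allow more variation.
ENTRY_RE = re.compile(
    r"^'(?P<key>[^']*)'\s+GEOG\s+DEF\s+ATT=CITY-CHG,\s*REC='(?P<rec>[^']*)'$"
)

def parse_city_change(line):
    """Return (key, recode) for a city-change entry, or None if the line is not one."""
    m = ENTRY_RE.match(line.strip())
    if m is None or not m.group("key").endswith("_"):
        return None  # no trailing underscore: not treated as a city-change entry
    return m.group("key")[:-1], m.group("rec")

def split_state_city(value):
    # Assumption: the first two characters are a state code, as the examples suggest.
    return value[:2], value[2:]

key, rec = parse_city_change("'CASAN FRAN_' GEOG DEF ATT=CITY-CHG, REC='CASAN FRANCISCO'")
print(split_state_city(key))  # ('CA', 'SAN FRAN')
print(split_state_city(rec))  # ('CA', 'SAN FRANCISCO')
```

The recode, not the key, is the value that would be looked up in the City Directory Table.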

Non-US City Name Changes

Countries other than the US use an additional level of city name changes. This level allows for further city verification and correction based on a more complex City Directory Table. As with US entries, an underscore is required as the last character of the entry.

1. Post Town

'GLOUCESTERSHIRE CHELTENHAN_' GEOG DEF ATT=CITY-CHG, REC='CHELTENHAM'

In this example, the program looks up Cheltenham as a valid post town in Gloucestershire county. Note that the recode contains only the corrected spelling of the post town.

2. Locality

'CHELTENHAM GOTHERINGTEN_' GEOG DEF ATT=CITY-CHG, REC='GOTHERINGTON'

In this example, the program looks up Gotherington as a valid locality in the post town of Cheltenham. Note that the recode contains only the corrected spelling of the locality.
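
The two-level lookup described in these examples can be sketched in Python as follows. The `DIRECTORY` dictionary is a hypothetical stand-in for the City Directory Table, filled only with the names used above, and `verify_recode` is an illustrative helper, not a product function.

```python
# Hypothetical stand-in for the City Directory Table:
# parent geography -> set of valid names one level below it.
DIRECTORY = {
    "GLOUCESTERSHIRE": {"CHELTENHAM"},   # county -> post towns
    "CHELTENHAM": {"GOTHERINGTON"},      # post town -> localities
}

def verify_recode(key, recode, directory=DIRECTORY):
    """Check that `recode` is valid under the parent geography named in `key`."""
    if not key.endswith("_"):
        raise ValueError("city-change keys must end with an underscore")
    body = key[:-1]
    for parent, children in directory.items():
        # The key starts with the parent geography, followed by the misspelled name.
        if body.startswith(parent + " "):
            return parent, recode in children
    return None, False

print(verify_recode("GLOUCESTERSHIRE CHELTENHAN_", "CHELTENHAM"))  # ('GLOUCESTERSHIRE', True)
print(verify_recode("CHELTENHAM GOTHERINGTEN_", "GOTHERINGTON"))   # ('CHELTENHAM', True)
```

In both cases the recode carries only the corrected name; the parent geography comes from the entry key.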

Multiple Definitions for One Entry

Occasionally, an entry has multiple meanings. This is often the case when a word has a meaning for more than one line type. Enter the first definition in the standard way. Subsequent definitions must be INDENTED under the initial operational value.

For example:

'CENTER' NAME DEF ATT=BUS
         STREET END ATT=SEC-TYPE,REC=CTR
         GEOG DEF REC=CENTER
Note: For Geography definitions, tokens are allowed without an attribute.
Note: For the BDP, all entries in the customized definitions table (except patterns) must span only one line.
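
As a rough illustration of the indentation rule, the following Python sketch groups indented continuation lines under the quoted entry that precedes them. The function `group_definitions` is hypothetical and assumes that an entry line starts with a straight single quote in the first column; the real Parser may apply different rules.

```python
def group_definitions(lines):
    """Group indented continuation lines under the quoted entry that precedes them."""
    entries = {}
    current = None
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        if raw.startswith("'"):
            # New entry: a quoted token followed by its first definition.
            token, _, first_def = raw[1:].partition("'")
            current = token
            entries[current] = [first_def.strip()]
        elif raw[0].isspace() and current is not None:
            # Indented line: another definition for the current entry.
            entries[current].append(raw.strip())
        else:
            raise ValueError(f"unexpected line: {raw!r}")
    return entries

table = [
    "'CENTER' NAME DEF ATT=BUS",
    "         STREET END ATT=SEC-TYPE,REC=CTR",
    "         GEOG DEF REC=CENTER",
]
for definition in group_definitions(table)["CENTER"]:
    print(definition)
```

Run on the CENTER example, the sketch yields three definitions for one token, one per line type.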

Patterns

A pattern consists of categories/attribute types and/or intrinsic attributes, which include any alpha, numeric, or special character representation of a data element. Token (word/phrase) identification is converted into meaningful information through pattern processing. Patterns are created in the same table as the word/phrase definition entries. The Parser understands the difference between a definition and a pattern and processes each appropriately. Therefore, you do not have to add definitions in a particular order. For organizational purposes, however, it makes sense to organize the entries by type.

Editing Patterns

Use the following methods to modify existing patterns:

  • Modify an existing pattern by adding another tag to the first line, using the MODIFY operation. For example: 'ALPHA ALPHA' MODIFY PATTERN NAME REC='GVN-NM1(1) SRNM(1)'
  • Fix a bad pattern by associating its undefined, highlighted elements with a defined category/attribute type. A pattern that the Parser does not recognize becomes a bad pattern and is identified in the Parser Tuner's Parsing Exceptions Analyzer.
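
To make the parts of a pattern entry easier to see, here is a minimal Python sketch that splits the MODIFY example into its components. The grammar is hypothetical and inferred from that single example (written with straight quotes); reading NAME as the line type is an assumption, and the sketch does not interpret the number in parentheses after each tag.

```python
import re

# Hypothetical grammar inferred from the single MODIFY example.
PATTERN_RE = re.compile(
    r"^'(?P<pattern>[^']*)'\s+(?P<op>\w+)\s+PATTERN\s+(?P<line_type>\w+)\s+"
    r"REC='(?P<rec>[^']*)'$"
)
# A recode tag such as GVN-NM1(1): a name followed by a number in parentheses.
TAG_RE = re.compile(r"(?P<tag>[A-Z0-9-]+)\((?P<n>\d+)\)")

def parse_pattern_entry(line):
    """Return the parts of a pattern entry, or None if the line is not a pattern."""
    m = PATTERN_RE.match(line.strip())
    if m is None:
        return None
    tags = [(t.group("tag"), int(t.group("n"))) for t in TAG_RE.finditer(m.group("rec"))]
    return {
        "elements": m.group("pattern").split(),
        "operation": m.group("op"),
        "line_type": m.group("line_type"),  # assumption: NAME is the line type
        "tags": tags,
    }

entry = parse_pattern_entry("'ALPHA ALPHA' MODIFY PATTERN NAME REC='GVN-NM1(1) SRNM(1)'")
print(entry["elements"])  # ['ALPHA', 'ALPHA']
print(entry["tags"])      # [('GVN-NM1', 1), ('SRNM', 1)]

# A word/phrase definition does not match, so the two kinds of entry can share one table.
print(parse_pattern_entry("'CENTER' NAME DEF ATT=BUS"))  # None
```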

For more information, see Pattern Structure and Correcting Pattern Problems.