The Parser uses the following special entries in its definition files:
US City Name Changes
Enter a city name change entry with an underscore (_) as its last character. The underscore tells the Parser that the entry is a city name change and that it must look up the recoded value in the City Directory Table. This directory is used for city verification and correction, and is based on a primary-geography/secondary-geography lookup (such as state or city).
'MABEVERLEY_' GEOG DEF ATT=CITY-CHG, REC='MABEVERLY'
'CASAN FRAN_' GEOG DEF ATT=CITY-CHG, REC='CASAN FRANCISCO'
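The lookup described above can be sketched in Python. This is only an illustration, not the Parser's implementation: the tables `CITY_CHANGES` and `CITY_DIRECTORY` and the function `resolve_city` are hypothetical names, and the sketch assumes that the first two characters of each entry and recode are a state code, which the examples suggest but the text does not state.

```python
# Hypothetical stand-ins for the definition entries and the City Directory Table.
CITY_CHANGES = {
    "MABEVERLEY_": "MABEVERLY",
    "CASAN FRAN_": "CASAN FRANCISCO",
}
CITY_DIRECTORY = {("MA", "BEVERLY"), ("CA", "SAN FRANCISCO")}


def resolve_city(state, city):
    """Return the corrected (state, city) pair, or None if nothing applies."""
    # Assumption: the entry key is state code + city + trailing underscore.
    recode = CITY_CHANGES.get(f"{state}{city}_")
    if recode is None:
        return None
    # Assumption: the recode also starts with the two-letter state code.
    corrected = (recode[:2], recode[2:])
    # Verify the recoded value against the (hypothetical) directory.
    return corrected if corrected in CITY_DIRECTORY else None


print(resolve_city("MA", "BEVERLEY"))
print(resolve_city("CA", "SAN FRAN"))
```

In this sketch a recode is accepted only when the directory confirms it, which mirrors the verification step the text describes.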
Non-US City Name Changes
Countries other than the US use an additional level of city name changes. This level allows for additional city verification and correction based on a complex City Directory Table. As with US entries, an underscore is required as the last character of the entry.
1. Post Town
'GLOUCESTERSHIRE CHELTENHAN_' GEOG DEF ATT=CITY-CHG, REC='CHELTENHAM'
In this example, the program looks up Cheltenham as a valid post town in Gloucestershire county. Note that the recode contains only the corrected spelling of the post town.
2. Locality
'CHELTENHAM GOTHERINGTEN_' GEOG DEF ATT=CITY-CHG, REC='GOTHERINGTON'
In this example, the program looks up Gotherington as a valid locality in the post town of Cheltenham. Note that the recode contains only the corrected spelling of the locality.
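The two levels can be pictured as one parent/child lookup, sketched below in Python. The names `NON_US_CITY_CHANGES`, `DIRECTORY`, and `correct_place` are hypothetical, and the single flat set of (parent, child) pairs is an assumed simplification of the complex City Directory Table.

```python
# Hypothetical tables; the real City Directory Table is more complex.
NON_US_CITY_CHANGES = {
    "GLOUCESTERSHIRE CHELTENHAN_": "CHELTENHAM",   # county + post town
    "CHELTENHAM GOTHERINGTEN_": "GOTHERINGTON",    # post town + locality
}
DIRECTORY = {
    ("GLOUCESTERSHIRE", "CHELTENHAM"),
    ("CHELTENHAM", "GOTHERINGTON"),
}


def correct_place(parent, name):
    """Return the corrected child name if it is valid under parent, else None."""
    # The entry key is the parent geography, a space, the name, and an underscore.
    recode = NON_US_CITY_CHANGES.get(f"{parent} {name}_")
    # The recode holds only the corrected child name, so pair it with the parent.
    if recode is not None and (parent, recode) in DIRECTORY:
        return recode
    return None


print(correct_place("GLOUCESTERSHIRE", "CHELTENHAN"))
print(correct_place("CHELTENHAM", "GOTHERINGTEN"))
```

Because the recode carries only the corrected name, the parent geography has to come from the entry itself when the result is verified.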
Multiple Definitions for One Entry
Occasionally, an entry has multiple meanings. This is often the case when a word has a meaning for more than one line type. Enter the first definition in the standard way. Subsequent definitions must be INDENTED under the initial operational value.
'CENTER' NAME DEF ATT=BUS
         STREET END ATT=SEC-TYPE,REC=CTR
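One way to picture an entry with several definitions is a table keyed first by the word and then by line type. The Python sketch below is illustrative only: `DEFINITIONS` and `lookup_definition` are hypothetical names, and the second keyword of each definition (DEF, END) is carried through without interpretation.

```python
# Hypothetical in-memory view of the CENTER entry and its two definitions.
DEFINITIONS = {
    "CENTER": {
        "NAME": {"keyword": "DEF", "attribute": "BUS", "recode": None},
        "STREET": {"keyword": "END", "attribute": "SEC-TYPE", "recode": "CTR"},
    },
}


def lookup_definition(token, line_type):
    """Return the definition of token for the given line type, or None."""
    return DEFINITIONS.get(token, {}).get(line_type)


print(lookup_definition("CENTER", "NAME"))
print(lookup_definition("CENTER", "STREET"))
```

The same word resolves to a business attribute on a name line and to a secondary street type, recoded to CTR, on a street line.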
Patterns
A pattern consists of categories/attribute types and/or intrinsic attributes, which include any alpha, numeric, or special character representation of a data element. Pattern processing converts token (word/phrase) identification into meaningful information. Patterns are created in the same table as the word/phrase definition entries. The Parser distinguishes a definition from a pattern and processes each appropriately, so you do not have to add definitions in a particular order. For readability, however, it makes sense to group the entries by type.
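As a rough illustration of pattern processing, the Python sketch below reduces each token to an intrinsic attribute, joins the attributes into a pattern, and looks the pattern up in a table. The names `PATTERNS`, `intrinsic`, and `match_pattern` are hypothetical, as are the labels NUMERIC and OTHER; only ALPHA and the `'ALPHA ALPHA'` name pattern with its recode appear in this document. Category lookups from definitions are left out for brevity.

```python
# Hypothetical pattern table keyed by (line type, pattern).
PATTERNS = {
    ("NAME", "ALPHA ALPHA"): "GVN-NM1(1) SRNM(1)",
}


def intrinsic(token):
    """Classify a token by its characters (labels other than ALPHA are assumed)."""
    if token.isalpha():
        return "ALPHA"
    if token.isdigit():
        return "NUMERIC"
    return "OTHER"


def match_pattern(line_type, text):
    """Return (pattern, recode); recode is None for an unrecognized pattern."""
    pattern = " ".join(intrinsic(token) for token in text.split())
    return pattern, PATTERNS.get((line_type, pattern))


print(match_pattern("NAME", "JOHN SMITH"))
print(match_pattern("NAME", "JOHN 42"))
```

A lookup that returns no recode corresponds to a pattern the table does not recognize.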
Editing Patterns
Use the following methods to modify existing patterns:
- Modify an existing pattern by adding another tag to the first line with the MODIFY operation. For example: 'ALPHA ALPHA' MODIFY PATTERN NAME REC='GVN-NM1(1) SRNM(1)'
- When the Parser finds a pattern it does not recognize, that pattern becomes a bad pattern, which is identified in the Parser Tuner's Parsing Exceptions Analyzer. You fix a bad pattern by associating its undefined, highlighted elements with a defined category/attribute type.
For more information, see Pattern Structure and Correcting Pattern Problems.