Patterns or sequences of words, known as token masks, can be used for recognition by the Parser and Postal Matcher processes. The token mask entries are used mainly to convert unknown tokens into known tokens, so that the entire string can be recognized. See Token Masks.
Example 1
'LSFX' ATT=TOKEN-MASK REC='LSFM'
where,
L = Last name
S = Space character
F = First name
X = Unknown token
M = Merge with previous token
The pattern LSFX matches an input string that contains the following sequence of tokens: a known last name, a space character, a known first name, and an unknown word. The REC value LSFM merges the trailing unknown token with the previous token and defines the merged token as a first name.
Example 2
'4U' ATT=TOKEN-MASK REC='4M'
where,
4 = Level 4 token
U = Unknown word
M = Merge with previous token
The pattern 4U matches an input string that contains the following sequence of tokens: a level 4 token followed by an unknown word. The REC value 4M merges the trailing unknown token with the previous token and defines the merged token as a level 4 token.
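The merge behavior in both examples can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Parser or Postal Matcher is implemented: the function name apply_token_mask, the (text, code) token representation, the sample words, and the choice to join merged tokens with a single space are all assumptions made for the sketch.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (text, one-character type code); hypothetical representation

MERGE = "M"  # REC character meaning "merge with previous token"


def apply_token_mask(tokens: List[Token], pattern: str, rec: str) -> List[Token]:
    """Recode tokens whose type codes match `pattern`; otherwise return them unchanged."""
    if len(pattern) != len(rec):
        raise ValueError("pattern and rec must have the same length")
    codes = "".join(code for _, code in tokens)
    if codes != pattern:
        return list(tokens)  # the mask does not apply to this input
    result: List[Token] = []
    for (text, _), new_code in zip(tokens, rec):
        if new_code == MERGE:
            if not result:
                raise ValueError("no previous token to merge with")
            prev_text, prev_code = result.pop()
            # Assumption: merged text is joined with one space and
            # keeps the type code of the previous token.
            result.append((prev_text + " " + text, prev_code))
        else:
            result.append((text, new_code))
    return result


# Example 1: 'LSFX' with REC='LSFM' (sample words are made up)
example_1 = apply_token_mask(
    [("SMITH", "L"), (" ", "S"), ("JOHN", "F"), ("PAUL", "X")], "LSFX", "LSFM"
)
# Example 2: '4U' with REC='4M' (sample words are made up)
example_2 = apply_token_mask([("FLOOR", "4"), ("ABC", "U")], "4U", "4M")

print(example_1)  # [('SMITH', 'L'), (' ', 'S'), ('JOHN PAUL', 'F')]
print(example_2)  # [('FLOOR ABC', '4')]
```

In both cases the unknown token disappears as a separate entry and its text is absorbed into the preceding known token, so the whole string ends up consisting of known tokens.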