Positions

Trillium Parser Tuner

Product type: Software
Portfolio: Verify
Product family: Trillium
Product: Trillium > Trillium Quality; Trillium > Trillium Discovery
Version: Latest
Language: English
Product name: Trillium Quality and Discovery
Title: Trillium Parser Tuner
Copyright: 2024
First publish date: 2008
Last updated: 2024-10-18
Published on: 2024-10-18T14:59:24.246276

A token may be defined in relation to its position within the name or address line. There are several types of positions:

  • BEG (Beginning)
  • END (Ending)
  • DEF (Default)
  • Beg-Word
  • End-Word

BEG

The first token in an attribute, that is, the token before the first space character.

Example

Attribute 1: 王 立文 先生
Attribute 2: 东城区 长安大街 一号

In the example above, '王' and '东城区' are at the beginning positions of their respective attributes.

END

The last token in an attribute, that is, the token after the last space character.

Attribute 1: 王 立文 先生
Attribute 2: 东城区 长安大街 一号

In the example above, '先生' and '一号' are at the end of their respective attributes.
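The BEG and END definitions can be illustrated with a short Python sketch. This is not how Parser Tuner itself is implemented; it assumes tokens are separated by whitespace, and the function name `beg_end_tokens` is hypothetical.

```python
def beg_end_tokens(attribute: str) -> tuple[str, str]:
    """Return the (BEG, END) tokens of a whitespace-delimited attribute.

    Illustrative only: assumes every token is separated by whitespace.
    """
    tokens = attribute.split()
    if not tokens:
        raise ValueError("attribute contains no tokens")
    # BEG is the token before the first space; END is the token after the last.
    return tokens[0], tokens[-1]


for attribute in ("王 立文 先生", "东城区 长安大街 一号"):
    beg, end = beg_end_tokens(attribute)
    print(f"BEG={beg} END={end}")
```

For an attribute that contains a single token, this sketch reports the same token as both BEG and END.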

DEF

When the physical location of the word in the line is irrelevant, use DEF (Default). A default word may appear anywhere on the line, including the beginning or end. If the position is omitted from the entry, Default is assumed.

Beg-Word

The first token in an attribute, any token that has a space after it, or any token at the beginning of a longer string whose parts may not be separated by space characters.

Attribute 1: 王 立文 先生
Attribute 2: 东城区 长安大街 一号

In the example above, every token except '先生' and '一号' is in a Beg-Word position.

End-Word

The last token in an attribute, any token that has a space character before it, or any token at the end of a longer string.

Attribute 1: 王 立文 先生
Attribute 2: 东城区 长安大街 一号

In the example above, every token except '王' and '东城区' is in an End-Word position.
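As a rough illustration of the Beg-Word and End-Word rules for space-separated tokens, the following Python sketch labels each token in an attribute. It is an assumption-laden model, not the product's logic: the function name `word_positions` is hypothetical, and the sketch ignores the case of tokens embedded in a longer string without spaces.

```python
def word_positions(attribute: str) -> list[tuple[str, set[str]]]:
    """Label each whitespace-delimited token as Beg-Word and/or End-Word.

    Illustrative only: covers the space-separated case, not tokens
    embedded in a longer unspaced string.
    """
    tokens = attribute.split()
    last = len(tokens) - 1
    labelled = []
    for i, token in enumerate(tokens):
        positions = set()
        # Beg-Word: first token in the attribute, or a token with a space after it.
        if i == 0 or i < last:
            positions.add("Beg-Word")
        # End-Word: last token in the attribute, or a token with a space before it.
        if i == last or i > 0:
            positions.add("End-Word")
        labelled.append((token, positions))
    return labelled


for token, positions in word_positions("王 立文 先生"):
    print(token, sorted(positions))
```

Under these assumptions, a middle token such as '立文' receives both labels, while the first token is only Beg-Word and the last token is only End-Word.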

A variation on End-Word is the position END-NUMBER, which lets the user search for character data that is immediately followed by a numeral. For example, in "王2350大街", END-NUMBER finds "王" but End-Word does not, because "王" is not at the end of the string. This position exists because Asian data may appear with or without space characters.
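The difference between the two positions can be sketched in Python with a regular-expression lookahead. The helper names `matches_end_number` and `matches_end_word` are hypothetical, and the matching rules are a simplified reading of the descriptions above rather than the product's actual behavior.

```python
import re


def matches_end_number(token: str, text: str) -> bool:
    """True if `token` occurs in `text` immediately followed by a digit."""
    # (?=\d) is a lookahead: it requires a digit next without consuming it.
    # In Python 3, \d on str patterns also matches full-width digits.
    return re.search(re.escape(token) + r"(?=\d)", text) is not None


def matches_end_word(token: str, text: str) -> bool:
    """True if `token` ends any whitespace-delimited string in `text`."""
    return any(word.endswith(token) for word in text.split())


text = "王2350大街"
print(matches_end_number("王", text), matches_end_word("王", text))
print(matches_end_number("大街", text), matches_end_word("大街", text))
```

In this sketch, "王" satisfies only the END-NUMBER check, while "大街" satisfies only the End-Word check, because it closes the unspaced string.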