STREETS Routine

Inline Quality and Discovery

Version: 17.1
Language: English
Product name: Trillium Quality and Discovery

The STREETS routine compares street names using the following logic:

  1. Before performing the comparison, the routine changes all periods (.) to blanks and all ampersands (&) to plus signs (+).
  2. A spelling algorithm is then applied to common abbreviated street words.
  3. If the spelling algorithm yields a score less than 80, a modified sound algorithm is applied.
  4. If the score is still less than 80 for numeric streets, a word comparison routine is applied.
  5. If the two fields do not match exactly, the routine calculates the two field lengths, excluding trailing blanks.
Note: For the STREETS routine, when an exact match occurs only after either input field has been modified, whether by a routine modifier or by the STREETS match algorithm, the returned score is 98.
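
The character substitutions in step 1 and the length calculation in step 5 can be illustrated with a minimal Python sketch. The function names normalize_street and trimmed_length are hypothetical and are not part of the product; the sketch only mirrors the behavior described above and does not attempt the spelling, sound, or word comparison algorithms.

```python
def normalize_street(value: str) -> str:
    """Step 1: change periods to blanks and ampersands to plus signs."""
    return value.translate(str.maketrans({".": " ", "&": "+"}))


def trimmed_length(value: str) -> int:
    """Step 5: field length with trailing blanks excluded."""
    return len(value.rstrip(" "))


print(normalize_street("ST. JOHN & MAIN"))  # ST  JOHN + MAIN
print(trimmed_length("MAIN ST   "))         # 7
```

Note that the period is replaced by a blank rather than removed, so "ST. JOHN" keeps its original length and contains two consecutive blanks after the substitution.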

The default order of operations for scoring for the STREETS routine is as follows:

  1. Test for blank fields. If both fields are blank, return a score of 88. If only one field is blank, return a score of 80.
  2. Test for an exact match of the non-modified input fields. For an exact match, return a score of 100.
  3. Apply any of the following Routine Modifiers in the listed order:
    • ALPHANUM
    • NOCASE
    • DECOMP
    • DI
    • NI
  4. Apply the STREETS routine logic. The TYPE modifier is applied as part of the STREETS routine logic.
  5. Determine scoring.
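
The first steps of this order of operations can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's implementation: the function names are hypothetical, ALPHANUM is assumed to keep only letters and digits, NOCASE is assumed to make the comparison case-insensitive, and the DECOMP, DI, NI, and TYPE modifiers are omitted because their behavior is not described in this topic.

```python
from typing import Optional


def apply_modifiers(value: str, modifiers: frozenset) -> str:
    """Apply the supported modifiers in the listed order (ALPHANUM, then NOCASE)."""
    if "ALPHANUM" in modifiers:
        # Assumption: ALPHANUM keeps only letters and digits.
        value = "".join(ch for ch in value if ch.isalnum())
    if "NOCASE" in modifiers:
        # Assumption: NOCASE compares without regard to case.
        value = value.upper()
    return value


def preliminary_score(a: str, b: str, modifiers: frozenset = frozenset()) -> Optional[int]:
    """Return 88, 80, 100, or 98 when an early rule applies, else None."""
    a_blank, b_blank = not a.strip(), not b.strip()
    if a_blank and b_blank:
        return 88   # blank versus blank
    if a_blank or b_blank:
        return 80   # blank versus nonblank
    if a == b:
        return 100  # exact match of the non-modified fields
    if apply_modifiers(a, modifiers) == apply_modifiers(b, modifiers):
        return 98   # exact match only after a modifier was applied
    return None     # left to the STREETS routine logic and scoring


print(preliminary_score("IBM", "ibm", frozenset({"NOCASE"})))      # 98
print(preliminary_score("I BM", "I-BM", frozenset({"ALPHANUM"})))  # 98
```

A return value of None means that none of the early rules applied, so the score would be determined by the STREETS routine logic and the scoring rules.
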
Table 1.  Scoring for STREETS

Score    Description
100      For an exact match.
90-99    Varying degrees of acceptable differences.
95       For neither field value blank and one field an exact starting substring (6 characters or more in length) of the other, but the difference in length is not greater than two (2) characters.
88       For an exact match of blank field value versus blank. Maximum field length allowed is 100 bytes.
80       For blank field versus nonblank field.
0        For comparing numeric streets, if street numbers are different. For example: 1232ND STREET and 1242ND STREET

Deduct from 100:
– 3      For each non-matched double character.
– 4      For each character error and for each extra character if there are extra characters.
– 1      For each extra character if the last character was a mismatch.
– 2      For other character errors (transposition, insertion, mismatch, extra characters at the end of either field after taking insertions and doubled characters into consideration).
– 10     If the number of character errors is more than 25% of the length of the shorter field.
– 25     If the number of character errors is more than 50% of the length of the shorter field.

Finally, add:
+1       If the length of either field is at least nine (9) characters.
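
Two of the fixed-score rules in Table 1, the score of 95 for a starting substring and the score of 0 for differing street numbers, can be sketched in Python. The function names are hypothetical, and the sketch makes several assumptions: the 6-character minimum applies to the shorter field, trailing blanks are ignored, exact matches have already been handled by an earlier step, and the street number is the run of digits at the start of the field.

```python
import re
from typing import Optional


def is_close_prefix(a: str, b: str) -> bool:
    """Score-95 rule: one field starts the other, with at most 2 extra characters."""
    a, b = a.rstrip(" "), b.rstrip(" ")
    if not a or not b:
        return False  # neither field may be blank
    shorter, longer = sorted((a, b), key=len)
    return (
        len(shorter) >= 6                    # assumption: minimum applies to the substring
        and longer.startswith(shorter)
        and len(longer) - len(shorter) <= 2
    )


def leading_number(value: str) -> Optional[int]:
    """Assumption: the street number is the run of digits at the start of the field."""
    match = re.match(r"\s*(\d+)", value)
    return int(match.group(1)) if match else None


def numeric_streets_differ(a: str, b: str) -> bool:
    """Score-0 rule: both fields are numeric streets with different street numbers."""
    na, nb = leading_number(a), leading_number(b)
    return na is not None and nb is not None and na != nb


print(is_close_prefix("MAPLEWOOD", "MAPLEWOOD A"))               # True
print(numeric_streets_differ("1232ND STREET", "1242ND STREET"))  # True
```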

Score    Description
90       If a higher score can be achieved by checking the number of successful word comparisons. This applies when more than 2 words are compared and all but one word agree.
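
The word comparison rule can be approximated with the following Python sketch. It is an interpretation rather than the product's algorithm: the function name is hypothetical, words are assumed to be separated by blanks and compared position by position, and both fields are assumed to contain the same number of words.

```python
from typing import Optional


def word_comparison_score(a: str, b: str) -> Optional[int]:
    """Return 90 when more than 2 words are compared and all but one agree."""
    words_a, words_b = a.split(), b.split()
    # Assumption: both fields must contain the same number of words.
    if len(words_a) != len(words_b) or len(words_a) <= 2:
        return None
    disagreements = sum(1 for x, y in zip(words_a, words_b) if x != y)
    return 90 if disagreements == 1 else None


print(word_comparison_score("OLD MILL CREEK ROAD", "OLD MILL CREAK ROAD"))  # 90
print(word_comparison_score("MAIN STREET", "MAIN STREAT"))                  # None
```

Because the rule applies only when it yields a higher score, a caller would keep the larger of this value and the score already calculated.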

Examples

"IBM" vs "ibm" with NOCASE = 98

"I BM" vs "I-BM" with ALPHANUM = 98