The Global Rules Table (rtrules1.win/unx) defines the rules that apply to the data from all countries. Using this table, the Router matches data from input records to the system tables (Global Geography Tables: GLOBRTR.len/ben and APGLBRTR.tbl) containing data specific to a country. When a match is found, a weight is assigned to the identified word or phrase. The assigned weight indicates how likely it is that the word or phrase belongs to the country from which the table match was made.
Parameter |
Description |
---|---|
DROP_PERIOD |
When set to Y, it enables the following:
Example:
|
DUPLICATE_CITY_NAME |
Overrides the order in COUNTRY_LIST for specific cities that appear in multiple countries. City names that appear in more than one city table are extracted to this table.
Entry format: City name, country1, country2
Countries are listed in the most likely order of where the city would be found. The Router only returns a weight for the country that is highest ranked in the DUPLICATE_CITY_NAME entry.
Example:
DUPLICATE_CITY_NAME Toronto,CA,US,AU
COUNTRY_LIST us,ca
If the input record contained just "Toronto", with no state or province identifier and no postcode mask, then the Router would only give weight to Canada (even though US is first in the country list). |
FIX_UTF8 |
Specifies that the Global Data Router will attempt to fix incorrect UTF8 encoding. Use this parameter when you suspect that UTF8 data is really in a different code page.
Example:
|
GEOG_RECODE |
Allows for recodes from the Global Geography Table to be applied to the input string. The program uses recode and synonym phrases that are found in the table.
Example:
If the Global Geography Table contains ‘ONT’ RECODE ‘ON’, then if ‘ont’ is found in the input, it is replaced with ‘on’. |
LOAD_FILES |
When set to Y, it indicates that the Global Geography Table should be loaded to memory at program startup. This improves performance for real-time applications. Note: Typically, this parameter is NOT used in batch mode.
LOAD_FILES must be set to Y when used with Trillium Director.
Example:
|
MAX_ADDITIONAL_POSTCODES |
Contains the maximum number of postal codes to add to the weight. Limits the amount of weight added when a city name has been identified. This works in conjunction with WEIGHT_ADDITIONAL_POSTCODES. See the description for WEIGHT_ADDITIONAL_POSTCODES for an explanation of the interaction of these two parameters.
Example:
|
MIN_LENGTH |
Specifies the minimum length that an Asian Level1, Level2, or Level3 entry must have to count as a match.
Example: Asian data tends to not have spaces to separate the text. The same combination of characters can be used to represent different words, so sometimes very short combinations might represent a city name, or be part of a longer string. |
MIN_WEIGHT |
Sets the minimum weight value that must be accumulated for the record to be assigned to a country.
Example: If the total weight for a country is less than this value, a match is not made. |
RECODE_ALL |
Uses user-defined values. Performs recodes for every country.
Example:
This example would ensure that a value like "CO ANTRIM" does not get sent to the US (since the program might think that "co" means Colorado). |
SPACE_AFTER_PERIOD |
If Y, then this parameter inserts a space, prior to matching the string to the lookup tables, if the program encounters a period in the text and the next character is not a space. This allows text such as "S.Diego" to convert to "S. Diego". If "S." is recoded to "San", then the program gets a match in the Global Geography Table.
Example:
|
STREET_TYPE |
A list of names that are street identifiers. If the word preceding the street type looks like a city name, then the resulting city match is ignored. When working with multiple countries, it is best to put all street types for the various languages in this list, so the program knows to which country each type belongs.
Example:
R,st
L,rue de
The identifiers (L = left, R = right) indicate the position of the identifier relative to the street name. "Rue" always comes to the left of the street name, and "road" to the right. |
TRANSLATE_CHAR |
This parameter converts characters from one form to another.
Format: TRANSLATE_CHAR ab, cd, ef
Here ‘a’ is translated to ‘b’, ‘c’ is translated to ‘d’, and ‘e’ is translated to ‘f’.
Example:
This would translate ‘Ø’ to ‘O’ and ‘A’ to ‘E’. |
TRANSLATE_TABLE |
Contains a translation table to convert accented characters into non-accented ones. Many internal city-state tables are entered without accents. This table allows the text to be converted for each country so that accented characters will match up. This table should be entered before the country sections of the rules table. Within the country section, use USE_TRANSLATE_TABLE to reference a particular table.
Example:
|
UNICODE_RANGE |
The range of characters and weight for each character in that range. This enables you to give weight to characters that should only occur in the designated country. The low and high range values are entered in hexadecimal. Example:
|
UTF8_ENCODING |
Enter an alternate code page to try if the data is supposed to be in UTF8 but is not properly formatted. You can have one of these for each country in the rules table. The default is ASCII.
Example:
|
WEIGHT_ADDITIONAL_CITY_MATCH |
For city name matches, this value is added if another city match is made for a particular entry in a different country. The result is added to the total weight, giving more weight for more than one city match.
Example:
In this case, if the city of ‘Portsmouth’ got a match in the US as well as the UK, a value of 10 would be applied twice. |
WEIGHT_ADDITIONAL_POSTCODES |
This value is added for any city name that has more than one postal code. It is multiplied by the lesser of two values: the MAX_ADDITIONAL_POSTCODES setting, and the number of postal codes for this entry in the Global Geography Table minus one. For example, if "Boston" has 15 postal codes and MAX_ADDITIONAL_POSTCODES is set to 10, subtracting 1 from 15 gives 14. Since 14 is greater than 10, the weight is multiplied by 10. If the weight is zero, it is not used in the calculation.
Example: |
WEIGHT_ADDITIONAL_WORDS_IN_CITY |
Add this weight if the city name contains two words. Default = 15
Example: |
WEIGHT_ATT_CITY |
Add this weight for all entries in the Global Geography Table that have attribute assignments of "att=city". This is useful for foreign countries that may have data that is not represented in the native-language spelling. For instance, in Italy, the capital is "Roma". However, the name could be entered with the English spelling of "Rome." This value is in the Global Geography Table, so it would be categorized as being in Italy.
Example: |
WEIGHT_ATT_STATE |
Add this weight for all entries in the Global Geography Table that have attribute assignments of "att=state." This weight is similar to "WEIGHT_ATT_CITY", except that it uses state names instead of city names.
Example: |
WEIGHT_COUNTRY_CODE |
Add this weight if the country code for this record is in the appropriate column. This is typically a large value that exceeds the threshold, since the country code, if present, is likely correct.
Example: |
WEIGHT_COUNTRY_NAME |
Add this weight if the country name appears anywhere in the record. This should not exceed the threshold.
Example:
|
WEIGHT_COUNTRY_NAME_LAST |
Add this weight if the country name is the last item in the input data. If the country name is in a particular attribute, then put this attribute LAST in the list of attributes to pick up.
Example: |
WEIGHT_EMAIL_EXTENSION |
Add this weight if an e-mail attribute exists in the input and the E-mail Attribute setting is specified in the Advanced settings window.
Example:
|
WEIGHT_FOUND_ENDINGS |
Add this weight if an ending from the Global Geography Table, or from the ADD_ENDING parameter for a particular country in your Router Rules Table, is found. For instance, if the ending –weg is in the table and the word "Arborweg" is in the record, the desired weight is added.
Example: |
WEIGHT_LEVEL1 | Specifies the value to add if there is a match at the state, province, and county level entry. |
WEIGHT_LEVEL2 | Specifies the value to add if there is a match at the city level entry. |
WEIGHT_LEVEL3 | Specifies the value to add if there is a match at the locality and neighborhood level entry. |
WEIGHT_LEVEL4 | Specifies the value to add if a match exists at the dependent locality and secondary neighborhood level entry. |
WEIGHT_NO_POSTCODE_MATCH | Specifies the value to SUBTRACT if there is a match on the city, but not on the postal code. |
WEIGHT_NO_STATE_MATCH | Specifies the value to SUBTRACT if there is a match on the city, but not on the state/province. |
WEIGHT_POBOX |
Add this weight if there is a word in the record that refers to a post office box for that country. These items are in the Global Geography Table, with an attribute of "ATT=PBOX".
Example: |
WEIGHT_POSTCODE_MASK |
Add this weight when there is a match on the position and pattern described in POSTCODE_MASK.
Example:
|
WEIGHT_POSTCODE_MASK_ANYWHERE |
Add this weight if the POSTCODE_MASK is matched anywhere in the record. The entire record is then searched for the postal code pattern. This is useful in countries like Canada or the UK, where postal codes can be very distinctive.
Example:
|
WEIGHT_SECONDARY_GEOG_MATCH |
Add this weight if there is a secondary match at a particular level: for example, if the city is matched and then the state or postal codes associated with this city are also matched. Consider this example:
And
If the program looked this up in the table, it would find "Billerica". When checked further, it would find a match on "MA", and also on postal code "01821". This would add 2 x 100 to the weight value for the state and postal code matches. |
WEIGHT_STATE_POSTCODE_RANGE |
The program adds this weight when the Router cannot find an exact postcode match, but the postcode is in the correct range for the state according to the STATE_POSTCODE_RANGE table (created in the rules file).
Example:
This will not get a match on "Billorica", but since the range for Massachusetts contains the sectional center 018, it will add this weight. |
WEIGHT_SYNONYM | If a synonym is found in the Global Geography Table, this weight is added to the overall total. |
WEIGHT_TEMP_TABCIT | Specifies the value to subtract from the WEIGHT_LEVEL2 value if there is a match in the city changes file. |
WEIGHT_THREE_CHAR_LEVEL2 |
Add this weight if the city name consists of three characters. This is for countries that have very short city names that might occur in other countries as fillers. When the three-character city is matched, instead of WEIGHT_LEVEL2, this will add a much smaller weight so that if there is other information pointing to a different country, this does not override it.
|
WEIGHT_THREE_WORDS_IN_CITY |
Add this weight if the city name contains three or more words. Default = 500
Example: |
WEIGHT_THRESHOLD |
This is a user-defined value that the total computed weight is compared against. When the total weight is greater than or equal to this value, no more data comparison is performed and the country of origin is determined.
Example: If the computed weight equals 100 or greater, processing stops and the country of origin is determined. |
WEIGHT_TWO_CHAR_LEVEL2 |
Add this weight if the city name consists of two characters. This is for countries that have very short city names that might occur in other countries as fillers. When the two-character city is matched, instead of WEIGHT_LEVEL2, this will add a much smaller weight so that if there is other information pointing to a different country, this does not override it.
|
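
A few of the behaviors described in the table can be illustrated with a small sketch. The Python below is not the Router's implementation; it is a minimal model, based only on the descriptions above, of the WEIGHT_ADDITIONAL_POSTCODES calculation, the SPACE_AFTER_PERIOD preprocessing, and TRANSLATE_CHAR pairs. The function names, the per-postcode weight of 5, and the sample string "ØSTAD" are hypothetical and chosen only for illustration.

```python
import re


def additional_postcode_weight(weight, postcode_count, max_additional):
    """Weight added for a city that has several postal codes.

    Follows the WEIGHT_ADDITIONAL_POSTCODES description: the weight is
    multiplied by the lesser of MAX_ADDITIONAL_POSTCODES and
    (postcode_count - 1). A zero weight contributes nothing.
    """
    if weight == 0 or postcode_count <= 1:
        return 0
    return weight * min(max_additional, postcode_count - 1)


def space_after_period(text):
    """Insert a space after any period not already followed by whitespace."""
    return re.sub(r"\.(?=\S)", ". ", text)


def translate_chars(text, pairs):
    """Apply TRANSLATE_CHAR-style two-character pairs such as "ØO, AE"."""
    mapping = {}
    for pair in pairs.split(","):
        pair = pair.strip()
        if len(pair) == 2:
            # First character is the source, second is the replacement.
            mapping[ord(pair[0])] = pair[1]
    return text.translate(mapping)


# Boston case from the table: 15 postal codes, MAX_ADDITIONAL_POSTCODES = 10.
# The weight of 5 is an assumed value, not a documented default.
boston_weight = additional_postcode_weight(5, 15, 10)  # 5 * min(10, 14)
san_diego = space_after_period("S.Diego")
translated = translate_chars("ØSTAD", "ØO, AE")
```

With these assumed inputs, `boston_weight` is 50, `san_diego` is "S. Diego", and `translated` is "OSTED". The cap in `additional_postcode_weight` shows why MAX_ADDITIONAL_POSTCODES matters: without it, a city with many postal codes could outweigh all other evidence in the record.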