Advanced Customer Data Parser Settings - trillium_discovery - trillium_quality - Latest

Inline Quality and Discovery

Product type
Software
Portfolio
Verify
Product family
Trillium
Product
Trillium > Trillium Quality
Trillium > Trillium Discovery
Version
Latest
Language
English
Product name
Trillium Quality and Discovery
Title
Inline Quality and Discovery
Copyright
2024
First publish date
2008
Last updated
2024-10-18
Published on
2024-10-18T15:10:12.949492

You can specify the following advanced settings to configure the Customer Data Parser (CDP) to meet your business needs. Changing one or more of these settings may have a significant impact on the CDP output. Whenever you change one of these settings, we recommend that you run the CDP process and analyze the results to confirm that the change has the effect you want.

To specify Advanced Parser settings

  1. From the Navigation or Project View, right-click the Customer Data Parser process and select Edit Process.
  2. Select Options from the menu.
  3. Click Advanced. The Advanced Parser Settings window opens.
  4. Make the changes you want, using the following table as a guide.
  5. When you have made the changes, click Back and then click Finish.
    Option Description
    Assign gender option

    From the drop-down list, select one of the following options:

    • Base gender on title and all given names. (Default)
    • Base gender on title and first given name.
    • Base gender on title and first given name that is not an initial.

    This setting addresses problems that arise in cultures where a person may have both a male and female given name, which, with the default setting, would return a review code of 019.
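
    The difference between the three options can be sketched as follows. This is a minimal illustration, not the CDP's implementation: the lookup tables, the function name `assign_gender`, the option keys, and the return values `CONFLICT` and `UNKNOWN` are all hypothetical. `CONFLICT` stands in for the mixed-gender case that the text says produces review code 019 under the default setting.

```python
# Hypothetical lookup tables; the real CDP tables are not shown in this document.
NAME_GENDER = {"JOSE": "M", "MARIA": "F", "JOHN": "M"}
TITLE_GENDER = {"MR": "M", "MRS": "F", "MS": "F"}


def assign_gender(title, given_names, option="all"):
    """Pick which given names take part in the gender decision, then decide."""
    if option == "first":
        names = given_names[:1]
    elif option == "first_non_initial":
        # Skip tokens that are a single letter, with or without a period.
        names = [n for n in given_names if len(n.strip(".")) > 1][:1]
    else:  # "all" mirrors the default option
        names = given_names

    found = set()
    if title:
        found.add(TITLE_GENDER.get(title.strip(".").upper()))
    for name in names:
        found.add(NAME_GENDER.get(name.strip(".").upper()))
    found.discard(None)  # ignore titles and names missing from the tables

    if len(found) == 1:
        return found.pop()
    return "CONFLICT" if found else "UNKNOWN"
```

    With these sample tables, `assign_gender("", ["Jose", "Maria"])` reports a conflict, while the same input with `option="first"` resolves to `M`.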

    Attribute to identify rows in exceptions file

    From the drop-down list, select the attribute to write out with each bad pattern or city problem entry in the parsing exceptions file. The attribute must contain a unique value for each record (such as a sequence number, social security number, or account number). By default, this attribute is not defined.

    Pre-processing options for house numbers

    From the drop-down list, select the type of house number processing. By default, the CDP applies minimal pre-processing of house numbers before street pattern processing.

    • No pre-processing - disable pre-processing
    • Minimal pre-processing - a fractional number such as "1 1/2" becomes "11/2", so that the whole value is treated as a single HSNO token. The fraction portion must be 3 characters in length and include the '/'.
    • Full pre-processing - This option handles fractions just as the Minimal Pre-Processing option does. In addition, it tells the CDP to convert a number like "2420-36" to "2420 36". This conversion does not apply to New York, New Jersey and Hawaii.
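
    The three modes can be approximated with two regular expressions. This is a sketch under stated assumptions: the function name, the mode keys, and the use of two-letter state codes for the exemption are hypothetical, and the CDP's actual rules may differ.

```python
import re

# Hypothetical codes for the states the text exempts from the hyphen conversion
# (New York, New Jersey, Hawaii).
HYPHEN_EXEMPT_STATES = {"NY", "NJ", "HI"}


def preprocess_house_number(text, mode="minimal", state=None):
    """Illustrative approximation of the three pre-processing modes."""
    if mode == "none":
        return text
    # Minimal: join a whole number and a 3-character fraction ("1 1/2" -> "11/2").
    text = re.sub(r"\b(\d+) (\d/\d)\b", r"\1\2", text)
    if mode == "full" and state not in HYPHEN_EXEMPT_STATES:
        # Full: also split a hyphenated number into two ("2420-36" -> "2420 36").
        text = re.sub(r"\b(\d+)-(\d+)\b", r"\1 \2", text)
    return text
```

    For example, `preprocess_house_number("2420-36 OAK AVE", mode="full")` returns `"2420 36 OAK AVE"`, while passing `state="NY"` leaves the hyphen in place.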

    Split hyphenated house numbers into two attributes

    If you have house number entries that contain a hyphen (for example 12-25) and you want to split the entry into house number and apartment number, select one of the following options:

    • House no. + Apartment no. - the first number will be considered house number and the second will be considered apartment number.
    • Apartment no. + House no. - the first number will be considered apartment number and the second will be considered house number.

    The house number value will be written to the pr_house_number attribute and the apartment number will be written to the pr_dwelling1_name attribute.

    Note: This option is primarily used for Netherlands (NL). It should not be used for Australia (AU), Canada (CA), United Kingdom (GB), or United States (US).
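
    As a rough illustration of the two orderings, the following sketch splits a hyphenated entry and returns the values under the output attribute names given above. The function name and the `order` keys are hypothetical.

```python
def split_hyphenated_house_number(value, order="house_apartment"):
    """Return a dict keyed by the output attribute names described above."""
    first, sep, second = value.partition("-")
    first, second = first.strip(), second.strip()
    if not sep or not first or not second:
        # Not a hyphenated pair: leave everything in the house number.
        return {"pr_house_number": value, "pr_dwelling1_name": ""}
    if order == "house_apartment":
        return {"pr_house_number": first, "pr_dwelling1_name": second}
    # Any other value is treated as "Apartment no. + House no."
    return {"pr_house_number": second, "pr_dwelling1_name": first}
```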
    Street ordinal options

    From the drop-down list, select the street ordinal option.

    • Add street ordinals to street lines - add street ordinal. This is the default for US. For example, "10 22 Street" becomes "10 22nd Street". However, ordinals are not added if the street is made up of multiple words.
    • Remove street ordinals from street lines - remove street ordinal if it is there.
    • Do nothing (keep inputs) - do not add or remove street ordinal. This is the default for all countries except US.
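
    The add and remove behaviors can be sketched in a few lines. The sketch assumes a simplified street line of the form "house number, numeric street name, street type"; the function names are hypothetical and the CDP's own pattern handling is more involved.

```python
import re


def ordinal_suffix(n):
    """English ordinal suffix: 1 -> st, 2 -> nd, 3 -> rd, 11 to 13 -> th."""
    if 10 <= n % 100 <= 20:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def add_street_ordinal(line):
    # Only handles the simplified shape "<house no> <numeric street> <type>".
    match = re.fullmatch(r"(\d+) (\d+) (\w+)", line)
    if not match:
        return line
    house, street, kind = match.groups()
    return f"{house} {street}{ordinal_suffix(int(street))} {kind}"


def remove_street_ordinal(line):
    # Drop an ordinal suffix that directly follows a number.
    return re.sub(r"\b(\d+)(?:st|nd|rd|th)\b", r"\1", line, flags=re.IGNORECASE)
```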
    Spelling algorithm for matching city names

    From the drop-down list, select the spelling algorithm option to use when the input city name does not match the city table.

    • No spelling algorithm - the spelling algorithm is not used.
    • Basic spelling algorithm - the spelling algorithm is applied when the first four (4) characters of the two names being compared are identical.
    • Enhanced spelling algorithm - the enhanced spelling algorithm is applied. The CDP handles spaces, special characters and words correctly. For example, it will drop spaces, swap hyphens and spaces, change accented characters to non-accented characters, and swap short words such as "de" for "da."
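
    The following sketch illustrates the kind of comparison each option implies. It is not the CDP's spelling algorithm: `basic_candidate` only shows the four-character precondition from the Basic option, and `enhanced_key` shows one way to make names comparable regardless of spaces, hyphens, and accents. Both function names are hypothetical, and the short-word swaps ("de" for "da") are not covered.

```python
import unicodedata


def basic_candidate(input_city, table_city):
    """Basic rule from the text: compare only when the first four characters match."""
    return input_city[:4].upper() == table_city[:4].upper()


def enhanced_key(city):
    """Build a comparison key that ignores accents, hyphens, and spaces."""
    decomposed = unicodedata.normalize("NFKD", city)
    # Combining marks carry the accents after decomposition; drop them.
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_accents.replace("-", "").replace(" ", "").upper()
```

    Two spellings that differ only in accents, hyphens, or spacing then produce the same key, for example "Saint-Étienne" and "SAINT ETIENNE".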
    Use city changes file

    Check this box to use the secondary city lookup table (city changes file). The city changes file specifies characters and words used for enhanced table lookup. The default file is provided and accessed from the Advanced Rules tab. You can customize this file. For more information, see City Changes File.

    Note: To use this option, you must select Enhanced spelling algorithm for the Spelling algorithm for matching city names option.
    Maximum length for user definitions

    By default, the user definitions file, which includes all user-defined parsing customization definitions, allows entries up to 1000 characters. Specify a larger value (up to 4000 characters) if your definitions require it.

    Copy original data to Parser original output attributes

    Check this box to always retain the original input data in the parser output file. If this option is not selected, the CDP retains the original data unless the word/phrase is a synonym of another word or phrase. In that case, the synonym value is stored in the output attribute identified as Original. By default, this setting is disabled.

    Interpret concatenee and next word as a surname

    Check this box to define a combined token (also called a concatenated token) as a SURNAME attribute. Otherwise, the CDP defines the combined token as an ALPHA attribute. By default, this setting is enabled.

    Activate business recognition functions

    Check this box to enable the setting you selected for the business recognition function. See Identifying Business Names for more information. By default, this setting is enabled.

    Disable automatic business line type identification

    Check this box to disable automatic business line type identification and force the CDP to use the patterns for one pass. See Identifying Business Names for more information. By default, this setting is disabled.

    Remove special characters in table lookups

    Uncheck this box to include hyphens and slashes for a table lookup. When this setting is enabled (which is the default), the CDP separates the special characters before performing a lookup.

    Reverse names separated by commas before pattern lookup

    Check this box to reverse names that are separated by commas before pattern lookup. By default, this setting is enabled.

    Split concatenated names before parsing

    The CDP normally does some pre-processing prior to name processing; this includes name concatenation and name splitting. Check this box to split concatenated names before name processing. By default, this setting is enabled.

    Split three-character names into initials

    Check this box to split three-character names into initials. For example, for the entry "BPL", "B" is parsed to the first name attribute, "P" to the middle name attribute, and "L" to the last name attribute. If this option is not selected, "BPL" remains "BPL". By default, this setting is disabled.

    Start new logical line after CARE-OF attribute

    Controls whether a CARE-OF attribute sets logical beginning and ending positions (such as GIVEN-NAME1 or TITLE). By default, this setting is disabled.

    Try to match a single word preceding a street line

    Check this box to attempt to match lines that have a single token above an identified street line to the table. This setting ensures that lines with a single token above an identified street line are given a single attempt to match a pattern to the table. The token must be identified as a single intrinsic attribute (ALPHA, ALPHA-1SPECIAL, and so forth). By default, this setting is disabled.

    Write street patterns to log file

    Check this box to print to the log file the street pattern as it exists after the first six Street Rules are run, but before any other Street Rules are run. By default, this setting is enabled.

    Move city/province from end of geography line to a new line

    Check this box to split geography lines and move city/province into a new line. By default, this setting is disabled, which means the geography line is not split.

    Hyphenate concatenee and surname

    Check this box to recode a concatenee and surname by inserting a hyphen between them. For example, when this box is checked, the CDP recodes VAN DAMME as VAN-DAMME. Otherwise, the CDP recodes a concatenee and surname by removing the space between them (for example, VANDAMME).

    Note: The Window Key process has a rule that builds the window key from the characters that follow a hyphen rather than from the beginning of the value. Hyphenating the concatenee and surname can therefore produce a better window key, which limits the window size and helps prevent over-matching.
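
    The two recoding styles can be illustrated with a short sketch. The concatenee list and the function name are hypothetical; the CDP takes its concatenee definitions from its own tables.

```python
# Hypothetical, tiny concatenee list for illustration only.
CONCATENEES = {"VAN", "DE", "MC", "O"}


def recode_concatenee(name, hyphenate=True):
    """Join a concatenee to the word that follows it, with or without a hyphen."""
    words = name.split()
    out = []
    i = 0
    while i < len(words):
        if words[i].upper() in CONCATENEES and i + 1 < len(words):
            joiner = "-" if hyphenate else ""
            out.append(words[i] + joiner + words[i + 1])
            i += 2  # the following word has been consumed
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)
```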
    Eliminate duplicate dwellings

    Check this box to eliminate duplicate dwelling information. For example, if the pr_dwel1 and pr_dwel2 attributes contain identical information and this option is not selected, the US Postal Matcher builds the pr_gout_deliver_addr field with data from both pr_dwel1 and pr_dwel2, resulting in duplicate information.

    Write Z lines to misc address lines

    Check this box to write Z lines to the miscellaneous address attributes.

    Note: A Z line is one the CDP has identified as a miscellaneous line type.
    Do not change APT token at end of line to ALPHA

    Check this box if you do not want to change an APT token at the end of a line to ALPHA. By default, if an APT token is the last token on a line (that is, it is not followed by an apartment number), the CDP changes its attribute to ALPHA. However, in some countries (such as Portugal) an APT token at the end of the line does represent an APT and the attribute should not be changed.

    Return first word if multi-word city is not matched

    Check this box to use the first word in a multi-word city name when a match is not found for the full city name. Otherwise, the CDP returns the last word in the city name. This option enables you to determine on a country by country basis how you want the CDP to process city names with multiple words.

    Assign street attribute to a building

    Check this box to parse buildings as street elements instead of name elements.

    Turn off conversion of plus signs to ampersands

    Check this box if you do not want to convert all plus signs (+) to ampersands (&). By default, the CDP converts all plus signs to ampersands.

    Check street lines preceding complex line

    Check this box if you want the CPLX01 function to look for previous street lines in addition to succeeding street lines. Otherwise, the CPLX01 function does not check previous lines for street lines.

    Flag name line changes

    Check this box to flag changes to the original data that appear on a name line. If there is a change, a value of '1' is stored in the change flag attributes.

    Note: You must add the change flag attributes to the output schema by selecting Add Parser Outputs in the Schema Editor.
    Flag street line changes

    Check this box to flag changes to the original data that appear on a street line. If there is a change, a value of '1' is stored in the change flag attributes.

    Note: You must add the change flag attributes to the output schema by selecting Add Parser Outputs in the Schema Editor.
    Flag geography line changes

    Check this box to flag changes to the original data that appear on a geography line. If there is a change, a value of '1' is stored in the change flag attributes.

    Note: You must add the change flag attributes to the output schema by selecting Add Parser Outputs in the Schema Editor.
    Write misc data to neighborhood

    Check this box to obtain neighborhoods when there is miscellaneous data identified between a street line and a geography line. The output is stored in the pr_neigh1 and pr_neigh2 attributes.

    Note: This option should be used when you know that a country uses neighborhoods. It is selected by default for India (IN) and Hong Kong (HK). If used with the Write Z lines to misc address lines option, the Write Z lines option takes precedence over this option.
    Output all name data

    Check this box to write out any name data that does not match the found name pattern, including any unmatched token that is not part of the pattern. The unmatched name data is stored in the pr_name_relation attributes.

    Use TSS city table

    Check this box to use the city table provided by Precisely. Use this option when you have created a user-defined template using ww_proj (with a dummy city table) and subsequently purchased the country-specific city table.

    Note: This option is enabled only when the country-specific city table exists.
    Additional geography lookup

    Check this box to validate the province/city/postcode combination against the auxiliary city table and update or append information in the output. This option returns a flag (Y/N) in the pr_verified_geography attribute. For Hong Kong, this option also returns the island for some cities in the pr_sub_city attribute.

    Note: This option is not available for the following countries: Basic Countries (ZZ), China (CN), United Kingdom (GB), Japan (JP), Korea (KR), Singapore (SG), and Taiwan (TW).
    Note: For Canada (CA), Netherlands (NL), and Portugal (PT), the flag in the pr_verified_geography attribute is not available.
    Lookup all city combinations

    Check this box to look up all word combinations for city when the city lookup process is performed for a multi-word input. By default, the lookup is not performed for all combinations when the initial full string is not a city.
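
    One way to picture the difference is to enumerate the word combinations that would be tried. The sketch below assumes that "all word combinations" means every contiguous run of words, tried longest first; that ordering, the function names, and the set-based city table are assumptions for illustration, not the CDP's documented behavior.

```python
def city_word_combinations(text):
    """Return every contiguous run of words, longest first."""
    words = text.split()
    n = len(words)
    combos = []
    for length in range(n, 0, -1):
        for start in range(0, n - length + 1):
            combos.append(" ".join(words[start:start + length]))
    return combos


def lookup_city(text, city_table, all_combinations=False):
    """Look up the full string only, or every combination when the option is on."""
    candidates = city_word_combinations(text) if all_combinations else [text]
    for candidate in candidates:
        if candidate.upper() in city_table:
            return candidate
    return None
```

    With a table containing only "NEW YORK", the input "EAST NEW YORK" is not found by default but is found when `all_combinations=True`.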