Specifying Parser Options for Japan - trillium_discovery - trillium_quality - 17.1

Trillium Control Center

Product type: Software
Portfolio: Verify
Product family: Trillium
Product: Trillium > Trillium Discovery; Trillium > Trillium Quality
Version: 17.1
Language: English
Product name: Trillium Quality and Discovery
Title: Trillium Control Center
Topic type: Administration, Overview, How Do I, Configuration, Reference, Installation
First publish date: 2008

After specifying the Parser input, specify the rest of the options in the Options tab of the Japanese Customer Data Parser.

To specify Parser options for Japan

  1. From the Navigation or Project View, right-click the Customer Data Parser process and select Edit Process. You can also double-click the process to open it for editing.
  2. (Optional) In Name Parsing Mode, select Personal Name Parsing for personal names, Business Name Parsing for business names, or Both to parse both types of names. Default is Both.
  3. (Optional) Specify the open and end Characters used to mark comments. They must be registered as a pair. Default is "( )". Double-byte characters and symbols can be used.
    Note: To add characters to the default characters, list the additional pair after them, for example "()「 」". A double quotation mark (") cannot be registered.
  4. (Optional) Specify how to Convert Kana characters. This setting handles the small katakana characters in the hankaku field. See Converting Small Kana for Japan for details.
  5. (Optional) Specify Characters that cannot end a token. If a specified character is at the end of a token during the token separation process, separation at that position becomes invalid. No character is specified by default. A double quotation mark cannot be used. For example, if you specify "&", the data "Trillium&Software" will not be separated into "Trillium&" and "Software". The program looks for only one character.
  6. (Optional) Specify Characters that cannot begin a token. If a specified character is at the beginning of a token during the token separation process, separation at that position becomes invalid. Default: -゙ァィゥェォ\ョ゜ッャュ. For example, the data "ヤマダジドウキャッシング" will not be separated into "ジドウキ" and "ャッシング". A double quotation mark cannot be used. The program looks for only one character.
  7. (Optional) Specify the PNP Delimiter, the character you want to use to delimit words other than spaces (zenkaku/hankaku). Default is NULL.
  8. (Optional) Specify the Maximum Number of name records to generate if there is more than one personal name. Default is 5.
  9. (Optional) Specify additional options based on the following information.
    Option Description

    Separate Contact Name from Branch Name?

    Specifies whether to separate the contact name from the branch name in Business Name Parsing mode. See Separating Branch Name.

    Create Multiple Output Records?

    Specifies that multiple output records are created when multiple personal names are processed. You can set this option to create one record for multiple persons or one record per person (maximum 5 records). This option is valid only when the One Person per Input Record? option is turned off.

    Note: To use this option, all five sets of personal name attributes (pr_pnp_last_01-05, and so on) must be specified in the output schema.

    Input: 山田 太郎 花子

    Output 1: 山田 太郎

    Output 2: 山田 花子

    Separate Titles within a Field?

    Specifies whether to separate titles within the field. This option is also used to separate contact name in the Business Name Parsing process. This option is checked by default.

    Only include masks with unknown tokens

    By default, a parsing exceptions file includes the final mask tokens (pr_mask_final) for all records processed. If you select this option, only records with unknown tokens will be selected for the exceptions file. You can review the exceptions file using the AP Parser Tuner.

    Reverse first/last names

    Specifies whether to reverse personal names. If this option is checked and there is an ASCII name, the first token is treated as the first name and the second token as the last name. Asian names are normally in last/first order.

    Delete Comments?

    Specifies whether comments will be deleted. This option is checked by default. See Deleting Comments.

    Separate Honorifics within a Field?

    Specifies whether to separate honorifics within the field. This option is also used to separate contact name in the Business Name Parsing process. This option is checked by default.

    One Person per Input Record?

    Selects single person mode or multiple person mode for personal name records. If each record is assumed to contain only one person, use single person mode. If a record might contain multiple persons, use multiple person mode. This option is checked by default.

    Note: In the multiple person mode, the Parser assumes that there are multiple first names.

    Output exceptions records

    When checked, exception records (records where pr_status/pr_h_status is other than 0) are written to a separate exceptions file. Otherwise, the exception records are written to the output entity.

    Note: These exception records are not the same as the ones displayed in the AP Parser Tuner.
  10. Click Finish.
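
The token boundary rules in steps 5 and 6 can be pictured with a short Python sketch. It is an illustration only and not the Parser's implementation: the function name is_valid_split, its parameters, and the SMALL_KANA set (a subset of the step 6 default) are assumptions made for this example.

```python
# Hypothetical illustration of the "cannot end a token" and
# "cannot begin a token" rules. Not Trillium code or a Trillium API.

# Assumed subset of the step 6 default characters (small katakana only).
SMALL_KANA = "ァィゥェォッャュョ"


def is_valid_split(text, pos, cannot_end="", cannot_begin=""):
    """Return True if text may be separated into text[:pos] and text[pos:].

    A split is rejected when the left token would end with a character
    listed in cannot_end, or when the right token would begin with a
    character listed in cannot_begin. Only the single character on each
    side of the split position is checked.
    """
    if pos <= 0 or pos >= len(text):
        return False  # a split must leave two non-empty tokens
    if text[pos - 1] in cannot_end:
        return False
    if text[pos] in cannot_begin:
        return False
    return True


# Step 5 example: "&" may not end a token.
business = "Trillium&Software"
after_amp = business.index("&") + 1  # position just after "&"
amp_split_default = is_valid_split(business, after_amp)
amp_split_blocked = is_valid_split(business, after_amp, cannot_end="&")

# Step 6 example: small kana may not begin a token.
name = "ヤマダジドウキャッシング"
before_small_ya = name.index("ャ")  # position just before "ャ"
kana_split_no_rule = is_valid_split(name, before_small_ya)
kana_split_blocked = is_valid_split(name, before_small_ya, cannot_begin=SMALL_KANA)
```

In this sketch, both example splits are allowed when no characters are registered and rejected once "&" or the small kana are supplied, which mirrors the behavior described in the two steps.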