The following table lists the Quality functions that you can use in the Expression Builder. The Country column indicates the countries that use each function.
Function Name |
Country |
Description |
---|---|---|
ASCTOFULL |
China Japan Korea Taiwan |
Transforms all half-width ASCII characters (single-byte) in an attribute to their full-width (double-byte) representation.
|
ASCTOHALF |
China Japan Korea Taiwan |
Transforms all full-width ASCII characters (double-byte) in an attribute to their half-width (single-byte) representation.
|
CJKTOARABICNUM |
China Japan Korea Taiwan |
Transforms Chinese number symbols in an attribute to their Arabic decimal equivalents.
Note: Apply this function only to attributes in which Chinese number symbols represent numeric values. Otherwise, number symbols that form part of a word are also converted; for example, 千葉県 returns 1000葉県.
|
CJKTOFULL |
China Japan Korea Taiwan |
Transforms half-width characters in an attribute to their full-width form. For Japan, this function automatically composes kana sound marks (dakuten and handakuten) appropriately. See Japanese full-width and half-width characters for details.
|
CJKTOHALF |
China Japan Korea Taiwan |
Transforms full-width characters in an attribute to their half-width form. For Japan, this function automatically decomposes kana sound marks (dakuten and handakuten) appropriately. See Japanese full-width and half-width characters for details.
|
CTOSIMPCHINESE |
China Taiwan |
Transforms Traditional Chinese characters in an attribute to their Simplified Chinese equivalent.
|
CTOTRADCHINESE |
China Taiwan |
Transforms Simplified Chinese characters in an attribute to their Traditional Chinese equivalent.
|
DEDUPE | All |
Separates the value in an attribute into tokens and returns the deduplicated, delimited list of tokens. The function deduplicates the attribute value by first searching with the maximum number of tokens per phrase, and then repeating the search with the number of tokens per phrase reduced by 1 each time, until that number reaches the specified minimum.
Note: You can use the DEDUPE function in the Transformer and the Set Selection utility.
General Guidelines
Guidelines for the Set Selection Utility
|
JCOMBINE |
Japan |
Transforms spacing form sound marks (dakuten and handakuten) in an attribute to combining form. Usually used before JCOMPOSE.
If the sound marks cannot be merged with the preceding character (such as "ア"), they will be written out in hankaku in the output. If you need those sound marks to be in zenkaku in the output, use JSMARK after JCOMPOSE (JCOMBINE + JCOMPOSE + JSMARK). See Japanese Sound Marks for details. |
JCOMPOSE |
Japan |
Merges combining form sound marks (dakuten and handakuten) with their base characters to build dakuten and handakuten characters. It is recommended to run JCOMBINE first.
If the sound marks cannot be merged with the preceding character (such as "ア"), they will be written out in hankaku in the output. If you need those sound marks to be in zenkaku in the output, use JSMARK after JCOMPOSE (JCOMBINE + JCOMPOSE + JSMARK). See Japanese Sound Marks for details. |
JDECOMPOSE |
Japan |
Separates combining form sound marks (dakuten and handakuten) from their base characters. Usually used before JSMARK. See Japanese Sound Marks for details.
|
JHIRAGANASTOL |
Japan |
Transforms small yo-on and soku-on characters in an attribute to their large equivalents.
Zenkaku Large: あいうえおつやゆよわアイウエオツヤユヨワ Small: ぁぃぅぇぉっゃゅょゎァィゥェォッャュョヮ
Hankaku Large: ｱｲｳｴｵﾂﾔﾕﾖ Small: ｧｨｩｪｫｯｬｭｮ
|
JKANATOROMAN |
Japan |
Transforms hiragana and full-width katakana characters in an attribute to Hebon-style (Hepburn) romaji. See Romaji characters.
|
JROMANTOKANA |
Japan |
Transforms romaji (Hebon) characters in an attribute to full-width katakana. See Romaji characters.
|
JSMARK |
Japan |
Transforms combining form sound marks (dakuten and handakuten) in an attribute to spacing form. Usually used after CJKTOFULL or JDECOMPOSE. See Japanese full-width and half-width characters for details.
|
JTOHIRAGANA |
Japan |
Transforms full-width katakana characters in an attribute to hiragana. To convert half-width katakana characters to hiragana, run CJKTOFULL first, and then run JTOHIRAGANA. See Japanese full-width and half-width characters for details.
|
JTOKATAKANA | Japan |
Transforms hiragana characters in an attribute to full-width katakana. See Japanese full-width and half-width characters for details.
|
KTOROMAN | Korea |
Transforms Korean Hangul characters in an attribute to their romanized forms.
|
MATCH | All |
Compares attributes and/or values and returns a match score based on the Relationship Linker Comparison routines and modifiers. All comparison routines are available for this usage.
|
PROXIMITY | All |
Returns the calculated distance between two coordinates, based on the DISTANCE Relationship Linker routine. Each coordinate is made up of two numbers, one for latitude and one for longitude. Distance is measured in kilometers (KM), miles (MI), or nautical miles (NM). Use this function in a Transformer expression to append the calculated distance to a new attribute or to use the distance as part of a conditional statement.
|
UNIQUE_ID | All |
Generates universally unique identifiers (UUIDs) for use as permanent, unique record identifiers. A UUID is a unique 36-character key that is used to maintain high-volume records in the database. You can use UUIDs, for example, to determine record or attribute changes for sorted files and to manage multiple views of matched relationships. UUIDs are represented as 32 hexadecimal digits, displayed in five groups separated by hyphens in the form 8-4-4-4-12, for a total of 36 characters (32 hexadecimal digits and four hyphens).
Example: f18e79d0-d474-494e-8290-7e09c4b9679d
You can configure the Quality processes to generate UUIDs by creating a new attribute and assigning the UNIQUE_ID function to that attribute.
Note: The attribute that contains the unique IDs must be at least 36 characters long, and the attribute type must be ASCII.
|
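
The width conversion that ASCTOFULL and ASCTOHALF perform can be pictured as a fixed code point offset between ASCII and the Unicode full-width forms. The following Python sketch is illustrative only and is not the product's implementation. The names `asc_to_full` and `asc_to_half` are hypothetical, and mapping the space character to the ideographic space (U+3000) is an assumption.

```python
# Illustrative sketch only; not the product's implementation.
# Printable ASCII (U+0021..U+007E) and the full-width forms
# (U+FF01..U+FF5E) differ by a constant offset.
FULLWIDTH_OFFSET = 0xFEE0
IDEOGRAPHIC_SPACE = "\u3000"  # assumption: space maps to U+3000


def asc_to_full(text: str) -> str:
    """Map half-width ASCII characters to their full-width forms."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == " ":
            out.append(IDEOGRAPHIC_SPACE)
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + FULLWIDTH_OFFSET))
        else:
            out.append(ch)  # leave everything else untouched
    return "".join(out)


def asc_to_half(text: str) -> str:
    """Map full-width ASCII forms back to half-width ASCII."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == IDEOGRAPHIC_SPACE:
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:
            out.append(chr(code - FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)
```

In this sketch, characters outside the ASCII range, such as kana or kanji, pass through unchanged, so the two functions are inverses of each other for any input.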
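
The relationship between JCOMBINE, JCOMPOSE, JDECOMPOSE, and JSMARK can be approximated with Unicode normalization. The following Python sketch is a rough analogy under stated assumptions, not the product's implementation: the function names are hypothetical, only the full-width spacing marks (U+309B, U+309C) are handled, and NFC/NFD normalization also affects non-Japanese characters, which the product functions may not do.

```python
import unicodedata

# Illustrative sketch only; not the product's implementation.
# Combining marks: U+3099 (dakuten), U+309A (handakuten).
# Spacing marks:   U+309B (dakuten), U+309C (handakuten).
_SPACING_TO_COMBINING = str.maketrans({"\u309B": "\u3099", "\u309C": "\u309A"})
_COMBINING_TO_SPACING = str.maketrans({"\u3099": "\u309B", "\u309A": "\u309C"})


def combine_marks(text: str) -> str:
    """Spacing form to combining form (compare JCOMBINE)."""
    return text.translate(_SPACING_TO_COMBINING)


def compose(text: str) -> str:
    """Merge combining marks with base characters (compare JCOMPOSE)."""
    return unicodedata.normalize("NFC", text)


def decompose(text: str) -> str:
    """Separate combining marks from base characters (compare JDECOMPOSE)."""
    return unicodedata.normalize("NFD", text)


def to_spacing_marks(text: str) -> str:
    """Combining form to spacing form (compare JSMARK)."""
    return text.translate(_COMBINING_TO_SPACING)
```

For example, `compose(combine_marks(...))` mirrors the JCOMBINE + JCOMPOSE sequence, and `to_spacing_marks(decompose(...))` mirrors JDECOMPOSE followed by JSMARK. A mark that follows a character with no precomposed form stays as a separate combining mark after `compose`.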
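
To give a sense of the kind of calculation PROXIMITY returns, the following Python sketch computes a great-circle distance with the haversine formula and converts it to KM, MI, or NM. This is an assumption for illustration: the source does not state which formula the DISTANCE routine uses, and the function name `great_circle_distance`, the argument order, and the Earth radius constant are all hypothetical.

```python
import math

# Illustrative sketch only; the actual DISTANCE routine may use a
# different formula or Earth model.
EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumption)
KM_PER_UNIT = {"KM": 1.0, "MI": 1.609344, "NM": 1.852}


def great_circle_distance(lat1, lon1, lat2, lon2, unit="KM"):
    """Haversine distance between two points given in decimal degrees."""
    unit = unit.upper()
    if unit not in KM_PER_UNIT:
        raise ValueError(f"unsupported unit: {unit}")
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
    return distance_km / KM_PER_UNIT[unit]
```

A result like this could then be compared against a threshold, which corresponds to the conditional-statement use described for PROXIMITY.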
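
The 8-4-4-4-12 format described for UNIQUE_ID can be checked outside the product. The following Python sketch uses the standard `uuid` module to generate an identifier of the same shape and a regular expression to validate it. The helper names `new_record_id` and `looks_like_uuid` are hypothetical, and the use of random (version 4) UUIDs is an assumption; the source does not state which UUID version the product generates.

```python
import re
import uuid

# Illustrative sketch only; UNIQUE_ID itself runs inside the product.
# 32 hexadecimal digits in five hyphen-separated groups: 8-4-4-4-12.
UUID_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def new_record_id() -> str:
    """Return a random UUID as a 36-character string (assumes version 4)."""
    return str(uuid.uuid4())


def looks_like_uuid(value: str) -> bool:
    """Check that a value has the 8-4-4-4-12 hexadecimal layout."""
    return UUID_PATTERN.match(value) is not None
```

A check like `looks_like_uuid` can help confirm that the target attribute is long enough: a value truncated to fewer than 36 characters no longer matches the pattern.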