The CDP standardizes name and address data by identifying individual elements and recoding them according to country-specific rules and tables. The CDP completes its process in seven major steps:
Step 1: Assign Intrinsic Attributes
An intrinsic attribute describes the type of data in a token (for example, ALPHA or 1NUMERIC).
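Example:

Line 1

| Input | Mr | John | E | Smith | Jr |
|---|---|---|---|---|---|
| Intrinsic Attribute | ALPHA | ALPHA | 1ALPHA | ALPHA | ALPHA |

Line 2

| Input | 1 | Lexington | Rd |
|---|---|---|---|
| Intrinsic Attribute | 1NUMERIC | ALPHA | ALPHA |

Line 3

| Input | Billerica | MA | 01821 |
|---|---|---|---|
| Intrinsic Attribute | ALPHA | ALPHA | NUMERIC |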
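The classification in this example can be sketched in a few lines of Python. This is only an illustration, not the CDP's actual rule set: the function name `intrinsic_attribute` and the `MIXED` label are hypothetical, and the rule that a leading `1` marks a single-character token is an assumption read off the example (`E` becomes 1ALPHA, `1` becomes 1NUMERIC).

```python
def intrinsic_attribute(token: str) -> str:
    """Classify a token by the kind of characters it contains (illustrative rule)."""
    if token.isdigit():
        kind = "NUMERIC"
    elif token.isalpha():
        kind = "ALPHA"
    else:
        kind = "MIXED"  # hypothetical label, not taken from the CDP documentation
    # Assumption: a leading "1" marks a single-character token, as in E -> 1ALPHA.
    return f"1{kind}" if len(token) == 1 else kind


# Tokenize the three example lines on whitespace and classify each token.
for line in ["Mr John E Smith Jr", "1 Lexington Rd", "Billerica MA 01821"]:
    print([(token, intrinsic_attribute(token)) for token in line.split()])
```

Run against the three example lines, the sketch reproduces the intrinsic attributes listed in the tables above.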
Step 2: Assign Specific Attributes
A specific attribute identifies the token as a particular name or address element, such as CITY or POSTCODE.
Example:

Line 1

| Input | Mr | John | E | Smith | Jr |
|---|---|---|---|---|---|
| Intrinsic Attribute | ALPHA | ALPHA | 1ALPHA | ALPHA | ALPHA |
| Specific Attribute | TITLE-PREFIX | GIVEN-NAME1 | | | GENERATION |

Line 2

| Input | 1 | Lexington | Rd |
|---|---|---|---|
| Intrinsic Attribute | 1NUMERIC | ALPHA | ALPHA |
| Specific Attribute | | | STREET-TYPE |

Line 3

| Input | Billerica | MA | 01821 |
|---|---|---|---|
| Intrinsic Attribute | ALPHA | ALPHA | NUMERIC |
| Specific Attribute | | | POSTCODE |
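One way to picture this step is a word lookup that returns a specific attribute when a token is recognized and nothing otherwise. The sketch below assumes that model. `WORD_TABLE`, `specific_attribute`, and the five-digit postal code rule are all hypothetical stand-ins for the country-specific rules and tables, and the entries cover only this example.

```python
from typing import Optional

# Hypothetical stand-in for a country-specific word table (example entries only).
WORD_TABLE = {
    "MR": "TITLE-PREFIX",
    "JOHN": "GIVEN-NAME1",
    "JR": "GENERATION",
    "RD": "STREET-TYPE",
}


def specific_attribute(token: str) -> Optional[str]:
    """Return a specific attribute for a recognized token, else None."""
    key = token.upper()
    if key in WORD_TABLE:
        return WORD_TABLE[key]
    # Assumption: a five-digit number is treated as a US postal code.
    if len(key) == 5 and key.isdigit():
        return "POSTCODE"
    return None  # token keeps only its intrinsic attribute


for line in ["Mr John E Smith Jr", "1 Lexington Rd", "Billerica MA 01821"]:
    print([(token, specific_attribute(token)) for token in line.split()])
```

Tokens that return `None` (such as `E`, `Smith`, and `Lexington`) correspond to the empty Specific Attribute cells in the example.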
Step 3: Assign Line Types
Using the attributes assigned in Step 2, the CDP identifies the type of each line and reassesses the attributes based on that line type. For example, if the attributes on a line include STREET-NAME and STREET-TYPE, the CDP identifies the line type as Street. The CDP identifies four line types:
- Name (N)
- Street (S)
- Geography (G)
- Miscellaneous (M)
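Example:

Line 1

| Input | Mr | John | E | Smith | Jr |
|---|---|---|---|---|---|
| Line Type = N | TITLE-PREFIX | GIVEN-NAME1 | 1ALPHA | ALPHA | GENERATION |

Line 2

| Input | 1 | Lexington | Rd |
|---|---|---|---|
| Line Type = S | | | STREET-TYPE |

Line 3

| Input | Billerica | MA | 01821 |
|---|---|---|---|
| Line Type = G | | | POSTCODE |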
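A minimal sketch of line typing follows, assuming each line is classified by which specific attributes appear on it. The attribute groupings, the order of the checks, and the name `line_type` are illustrative assumptions. The real CDP also reassesses attributes after typing, which this sketch does not attempt.

```python
# Illustrative groupings of specific attributes by the line type they suggest.
STREET_ATTRS = {"STREET-NAME", "STREET-TYPE"}
GEOGRAPHY_ATTRS = {"CITY", "STATE", "POSTCODE"}
NAME_ATTRS = {"TITLE-PREFIX", "GIVEN-NAME1", "GENERATION"}


def line_type(attributes):
    """Return N, S, G, or M based on the attributes present on a line."""
    found = set(attributes)
    if found & STREET_ATTRS:
        return "S"  # Street
    if found & GEOGRAPHY_ATTRS:
        return "G"  # Geography
    if found & NAME_ATTRS:
        return "N"  # Name
    return "M"      # Miscellaneous: nothing recognizable on the line


print(line_type(["TITLE-PREFIX", "GIVEN-NAME1", "1ALPHA", "ALPHA", "GENERATION"]))
print(line_type(["1NUMERIC", "ALPHA", "STREET-TYPE"]))
print(line_type(["ALPHA", "ALPHA", "POSTCODE"]))
```

Checking set intersections keeps the rule independent of token order, so a single STREET-TYPE anywhere on the line is enough to type it as Street in this sketch.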
Step 4: Process Geography Lines
After identifying the line types, the CDP parses each line in detail. Geography lines are parsed first, using the city tables.
Example:

Line 3 (Line type = G)

| Input | Billerica | MA | 01821 |
|---|---|---|---|
| Final Attribute | CITY | STATE | POSTCODE |
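The geography step can be illustrated with a small lookup against a city table. The sketch below is an assumption about how such a lookup might work, not the CDP's algorithm: `CITY_TABLE` holds one hypothetical entry, `parse_geography` is an invented name, and `UNKNOWN` is a placeholder label for tokens the table does not confirm.

```python
# Hypothetical city table: (city, state) pairs known to be valid.
CITY_TABLE = {("BILLERICA", "MA")}


def parse_geography(tokens):
    """Label the tokens of a geography line as CITY, STATE, and POSTCODE."""
    cleaned = [token.strip(",") for token in tokens]
    postcode = None
    if cleaned and cleaned[-1].isdigit():
        postcode = cleaned.pop()  # trailing number is taken as the postal code
    labeled = []
    matched = False
    if len(cleaned) >= 2:
        # Assumption: the last remaining token is the state, the rest is the city.
        city, state = " ".join(cleaned[:-1]), cleaned[-1]
        if (city.upper(), state.upper()) in CITY_TABLE:
            labeled += [(city, "CITY"), (state, "STATE")]
            matched = True
    if not matched:
        labeled += [(token, "UNKNOWN") for token in cleaned]
    if postcode is not None:
        labeled.append((postcode, "POSTCODE"))
    return labeled


print(parse_geography("Billerica, MA 01821".split()))
```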
Step 5: Process Name Lines
Next, the CDP parses the name lines. The CDP looks up the original pattern of the name line in the country-specific word and pattern definitions table and recodes the line based on the recoded pattern.
Example:

Line 1 (Line type = N)

| Input | Mr | John | E | Smith | Jr |
|---|---|---|---|---|---|
| Final Attribute | TITLE-PREFIX | GIVEN-NAME1 | GIVEN-NAME2 | SURNAME | GENERATION |
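The pattern lookup described in this step can be sketched as a dictionary from an original pattern to its recoded pattern. `NAME_PATTERNS` and `recode_name_line` are hypothetical names, and the single table entry is taken from this example rather than from a real pattern definitions table.

```python
# Hypothetical pattern table: original attribute pattern -> recoded pattern.
NAME_PATTERNS = {
    ("TITLE-PREFIX", "GIVEN-NAME1", "1ALPHA", "ALPHA", "GENERATION"):
        ("TITLE-PREFIX", "GIVEN-NAME1", "GIVEN-NAME2", "SURNAME", "GENERATION"),
}


def recode_name_line(tokens, pattern):
    """Pair each token with its final attribute, or return None if the pattern is unknown."""
    recoded = NAME_PATTERNS.get(tuple(pattern))
    if recoded is None:
        return None  # no matching pattern in the table
    return list(zip(tokens, recoded))


tokens = ["Mr", "John", "E", "Smith", "Jr"]
pattern = ["TITLE-PREFIX", "GIVEN-NAME1", "1ALPHA", "ALPHA", "GENERATION"]
print(recode_name_line(tokens, pattern))
```

Matching the whole line pattern at once is what lets the ambiguous tokens be resolved by position: `E` (1ALPHA) becomes GIVEN-NAME2 and `Smith` (ALPHA) becomes SURNAME.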
Step 6: Process Street Lines
Street lines are processed in the same way as name lines. The CDP looks up the original pattern of the street line in the country-specific word and pattern definitions table and recodes the line based on the recoded pattern.
Example:

Line 2 (Line type = S)

| Input | 1 | Lexington | Rd |
|---|---|---|---|
| Final Attribute | HSNO | STREET-NAME | STREET-TYPE |
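For the street line, the sketch below also shows one plausible way the original pattern might be assembled before the lookup: use the specific attribute where a token has one, and fall back to the intrinsic attribute otherwise. That fallback rule is an assumption inferred from the Step 3 example. `STREET_PATTERNS`, `build_pattern`, and `recode_street_line` are hypothetical names.

```python
# Hypothetical pattern table for street lines (single example entry).
STREET_PATTERNS = {
    ("1NUMERIC", "ALPHA", "STREET-TYPE"): ("HSNO", "STREET-NAME", "STREET-TYPE"),
}


def build_pattern(intrinsic, specific):
    """Use the specific attribute where one exists, otherwise the intrinsic one."""
    return tuple(s if s is not None else i for i, s in zip(intrinsic, specific))


def recode_street_line(tokens, intrinsic, specific):
    """Map final attributes to tokens, or return None if the pattern is unknown."""
    recoded = STREET_PATTERNS.get(build_pattern(intrinsic, specific))
    if recoded is None:
        return None
    return dict(zip(recoded, tokens))


print(recode_street_line(
    ["1", "Lexington", "Rd"],
    ["1NUMERIC", "ALPHA", "ALPHA"],
    [None, None, "STREET-TYPE"],
))
```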
Step 7: Generate Output
Based on the final attributes, the CDP generates standardized output in the PR_ ("Parser Repository") fields, where it can be analyzed and corrected.
Example:

| Line | CDP Input |
|---|---|
| Line 1 | Mr John E Smith Jr |
| Line 2 | 1 Lexington Rd |
| Line 3 | Billerica, MA 01821 |

| CDP Output Field | Value |
|---|---|
| Pr Name Prefix Recoded 01 | MR |
| Pr Given Name1 Recoded 01 | JOHN |
| Pr Given Name2 Recoded 01 | E |
| Pr Surname1 Recoded 01 | SMITH |
| Pr Name Generation Recoded 01 | JR |
| Pr House Number Recoded | 1 |
| Pr Street Name Recoded | LEXINGTON |
| Pr Street Type1 Recoded | RD |
| Pr City Name Recoded | BILLERICA |
| Pr State Recoded | MA |
| Pr Postal Code | 01821 |
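The final step can be pictured as a mapping from final attributes to output fields. In the sketch below, the pairing of attributes with field names is inferred from this example only, and uppercasing stands in for whatever recoding the CDP's tables actually perform. `FIELD_NAMES` and `generate_output` are hypothetical names.

```python
# Attribute-to-field mapping inferred from the example output (an assumption).
FIELD_NAMES = {
    "TITLE-PREFIX": "Pr Name Prefix Recoded 01",
    "GIVEN-NAME1": "Pr Given Name1 Recoded 01",
    "GIVEN-NAME2": "Pr Given Name2 Recoded 01",
    "SURNAME": "Pr Surname1 Recoded 01",
    "GENERATION": "Pr Name Generation Recoded 01",
    "HSNO": "Pr House Number Recoded",
    "STREET-NAME": "Pr Street Name Recoded",
    "STREET-TYPE": "Pr Street Type1 Recoded",
    "CITY": "Pr City Name Recoded",
    "STATE": "Pr State Recoded",
    "POSTCODE": "Pr Postal Code",
}


def generate_output(labeled_tokens):
    """Build an output record from (token, final attribute) pairs."""
    return {
        FIELD_NAMES[attribute]: token.upper()  # uppercasing stands in for recoding
        for token, attribute in labeled_tokens
        if attribute in FIELD_NAMES
    }


record = generate_output([
    ("Mr", "TITLE-PREFIX"), ("John", "GIVEN-NAME1"), ("E", "GIVEN-NAME2"),
    ("Smith", "SURNAME"), ("Jr", "GENERATION"),
    ("1", "HSNO"), ("Lexington", "STREET-NAME"), ("Rd", "STREET-TYPE"),
    ("Billerica", "CITY"), ("MA", "STATE"), ("01821", "POSTCODE"),
])
for field, value in record.items():
    print(f"{field}: {value}")
```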