Standardizing Personal Names

Spectrum Data Quality Guide

Product: Spectrum Data Quality
Version: 23.1
Language: English
First published: 2007
Last published: 2024-03-04

This procedure shows how to create a dataflow that takes personal name data (for example, "John P. Smith"), identifies common nicknames of the same name, and creates a standard version of the name that can then be used to consolidate redundant records.

Note: Before beginning, make sure that your input data has a field named "Name" that contains the full name of the person.
  1. If you have not already done so, load the following tables onto the Spectrum Technology Platform server:
    • Open Parser Base
    • Open Parser Enhanced Names

    Use the Data Normalization database load utility to load these tables. For instructions on loading tables, see the Installation Guide.

  2. In Enterprise Designer, create a new dataflow.
  3. Drag a source stage onto the canvas.
  4. Double-click the source stage and configure it. See the Dataflow Designer's Guide for instructions on configuring source stages.
  5. Drag an Open Name Parser stage onto the canvas and connect it to the source stage.

    For example, if you are using a Read from File stage, your dataflow would look like this:

    [Figure: Read from File in dataflow]
  6. Drag a Table Lookup stage onto the canvas and connect it to the Open Name Parser stage.

    Your dataflow should now look like this:

    [Figure: Open Name Parser connected to Table Lookup in dataflow]
  7. Double-click the Table Lookup stage on the canvas.
  8. In the Source field, select FirstName.
  9. In the Destination field, select FirstName.

    Specifying the same field as both the source and the destination updates that field in place with the standardized version of the name.

  10. In the Table field, select NickNames.xml.
  11. Click OK.
  12. Click OK again to close the Table Lookup Options window.
  13. Drag a sink stage onto the canvas and connect it to the Table Lookup stage.

    For example, if you are using a Write to File sink, your dataflow would now look like this:

    [Figure: Write to File in dataflow]
  14. Double-click the sink stage and configure it. See the Dataflow Designer's Guide for instructions on configuring sink stages.

You now have a dataflow that takes personal names and standardizes the first name, replacing nicknames with the standard form of the name.
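
To make the logic of the dataflow concrete, the following Python sketch imitates the two processing stages outside of Spectrum. It is an illustration only, not how the product is implemented. The nickname dictionary is a small hypothetical stand-in for the NickNames.xml table, the whitespace-based parser is a naive substitute for Open Name Parser, and the function names, the MiddleName and LastName output fields, and the uppercase lookup keys are all assumptions made for this example.

```python
# Hypothetical nickname table. The real NickNames.xml table ships with the
# product; these few entries are invented for illustration.
NICKNAMES = {
    "BILL": "WILLIAM",
    "BOB": "ROBERT",
    "LIZ": "ELIZABETH",
}


def parse_name(full_name):
    """Naive stand-in for Open Name Parser: split the Name field on spaces."""
    parts = full_name.replace(".", "").split()
    if not parts:
        return {"FirstName": "", "MiddleName": "", "LastName": ""}
    return {
        "FirstName": parts[0],
        "MiddleName": " ".join(parts[1:-1]),
        "LastName": parts[-1] if len(parts) > 1 else "",
    }


def standardize_first_name(record):
    """Stand-in for Table Lookup with FirstName as source and destination.

    The field is overwritten only when the lookup finds a match; otherwise
    the original value is kept.
    """
    key = record["FirstName"].upper()
    if key in NICKNAMES:
        record["FirstName"] = NICKNAMES[key].title()
    return record


def run_dataflow(rows):
    """Source -> parse -> lookup -> sink, applied to a list of dict rows."""
    output = []
    for row in rows:
        record = {**row, **parse_name(row["Name"])}
        output.append(standardize_first_name(record))
    return output


def consolidation_key(record):
    """Key that groups records referring to the same standardized name."""
    return (record["FirstName"].upper(), record["LastName"].upper())
```

With this sketch, the rows "Bill Smith" and "William Smith" both end up with the first name "William" and therefore share the same consolidation key, which is the property that a later matching or deduplication step can rely on.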