Understanding data patterns gives you the information you need to make decisions about standards used in your data quality projects.
Examining patterns helps identify format deviations and anomalies in your data. Drilling
down to data rows from a pattern adds context to help determine whether a pattern is
correct or not. You can then export the data rows to your local or server system.
Guidelines:
When you examine patterns, you drill down from a data source to see the pattern values and pattern metadata for a selected attribute. You can then drill down further to see the attribute rows that contain the pattern.
Note the following guidelines when examining each pattern type:
- Character Patterns. If you have attributes that need to conform to a fixed format, such as a date or currency format, examine character patterns to find inconsistencies and errors. See Character Pattern Types for information about choices for pattern encoding.
- Masks. You can examine an attribute for a unique mask pattern. A mask is a description of a word, phrase, or number that identifies each character as alphabetic, numeric, or special. The mask pattern is the shape produced by this encoding and shows the qualities shared by the words, phrases, or numbers that match it. Before you examine a mask, note the encoding conventions used by the mask pattern.
- Metaphones. If an attribute has a large number of values compared to the number of metaphones, you may have multiple misspelled values.
- Soundexes. Discovery Center groups data values that have been
analyzed as having similar sounds and identifies them as soundexes.
Examining soundexes helps you find duplicated data and misspellings. Note: Soundexes are not available for numeric values and non-ASCII encoded data.
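The mask and soundex ideas above can be illustrated with a minimal sketch. It is not Discovery Center's implementation: the "a" and "9" mask symbols are an assumed convention chosen for illustration, the soundex function follows the classic American Soundex rules, and the names mask, soundex, and group_by are hypothetical helpers.

```python
from collections import defaultdict

# Standard American Soundex digit groups.
_SOUNDEX_CODES = {
    letter: digit
    for letters, digit in (
        ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
        ("l", "4"), ("mn", "5"), ("r", "6"),
    )
    for letter in letters
}


def mask(value: str) -> str:
    """Encode letters as 'a' and digits as '9'; keep special characters.

    The 'a'/'9' symbols are an assumption, not the product's convention.
    """
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("a")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)


def soundex(word: str) -> str:
    """Return the four-character American Soundex code, or '' if the
    value contains no ASCII letters (for example, a numeric value)."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate letters with the same code
        code = _SOUNDEX_CODES.get(c, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
        if len(result) == 4:
            break
    return result.ljust(4, "0")


def group_by(values, key):
    """Group values under the pattern that key() produces for each one."""
    groups = defaultdict(list)
    for value in values:
        groups[key(value)].append(value)
    return dict(groups)


names = ["Smith", "Smyth", "Smithe", "Jones"]
print(group_by(names, soundex))
# {'S530': ['Smith', 'Smyth', 'Smithe'], 'J520': ['Jones']}
print(mask("AB-1234"))
# aa-9999
```

A group that holds several spellings, such as S530 here, is the kind of result that points to possible duplicates or misspellings. The same group_by helper works with any encoding function, so a metaphone encoder could be passed in place of soundex to compare the number of values with the number of metaphones.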
Examine all patterns in an attribute
To examine all patterns in an attribute
- Open a data source.
- Click the Attribute Details tab.
- In the Attribute Name list, select the attribute that contains the patterns you want to examine. A tab named for the attribute opens below the Data Source: Name panel showing an overview of the attribute's metadata. Rows showing metadata for discovered character patterns, masks, soundexes, and metaphones are highlighted in blue. Note the value. This is the number of unique patterns in the attribute.
- Double-click a pattern. The pattern: attribute_name tab opens showing all patterns of the selected type in the attribute, along with metadata such as pattern frequency and distribution %.
- To see pattern values, double-click a row. The Values: Selected pattern tab opens showing the pattern values, frequency, distribution %, length, and other metadata.
- To see data rows that contain the pattern values, double-click a row. The Data Rows: Selected pattern Value tab opens.
- (Optional) Export selected rows to your local or server system as a .CSV file. You can export all rows to your server system. See Exporting Tab Rows.
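After rows are exported, the pattern frequency and distribution % figures shown in the tabs can be approximated outside the product. The following is a rough sketch under stated assumptions: the exported .CSV file is assumed to have a header row, the column name "phone" and the sample data are invented for illustration, and shape and pattern_distribution are hypothetical helpers, not part of Discovery Center.

```python
import csv
import io
from collections import Counter

# Invented sample standing in for an exported .CSV file.
SAMPLE_CSV = """id,phone
1,555-1234
2,555-9876
3,5551234
4,555-0000
"""


def shape(value: str) -> str:
    """Assumed encoding: letters become 'a', digits become '9'."""
    return "".join(
        "a" if ch.isalpha() else "9" if ch.isdigit() else ch
        for ch in value
    )


def pattern_distribution(csv_text: str, column: str, encode):
    """Return (pattern, frequency, distribution %) tuples for one column,
    most frequent pattern first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(encode(row[column]) for row in reader)
    total = sum(counts.values())
    return [
        (pattern, count, round(100 * count / total, 2))
        for pattern, count in counts.most_common()
    ]


for pattern, freq, pct in pattern_distribution(SAMPLE_CSV, "phone", shape):
    print(pattern, freq, pct)
# 999-9999 3 75.0
# 9999999 1 25.0
```

A low-frequency pattern such as 9999999 in this sample is a candidate format deviation worth drilling into.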
Examine pattern values, metadata, and associated data rows
To examine pattern values, metadata, and associated data rows