Table Details in Profiling Results - discovery - 23.1

Spectrum Discovery Guide

Product type: Software
Portfolio: Verify
Product family: Spectrum
Product: Spectrum > Discovery
Version: 23.1
Language: English
Product name: Spectrum Discovery
Title: Spectrum Discovery Guide
Topic type: How Do I, Reference, Overview
First publish date: 2007
Last edition: 2024-02-07
Last publication: 2024-02-07T17:21:58.768552

Click any of the table names in the left pane to display these details:
  • Completeness: The percentage of complete and incomplete rows detected in your profiled data.
    Note: Click the Incomplete Rows or Complete Rows hyperlink to view the records in that category.
  • Table Summary: Displays these details for every column in the table:
    • Column Name: The name of the column
    • Data Type: The data type of the column
    • Completeness (%): The percentage of records that have a value in the column
    • Uniqueness: Uniqueness of the data contained in the column
    • Detected Type: Displays the semantic types, such as email, phone, city, first name, and last name, detected in the string values of this column.
    • Other Stats: Displays various other statistics such as Min Length, Max Length, and Text Patterns
  • Null Count Frequency: Displays the number of null values in every row of the table
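
The statistics listed above can be approximated outside the product to sanity-check a profile. The following Python sketch is illustrative only and is not how Spectrum Discovery computes its results. The sample rows, the function names, and the exact definitions (uniqueness as the share of distinct non-null values, and null count frequency as a tally of rows by the number of nulls they contain) are assumptions made for this example.

```python
from collections import Counter

# Hypothetical sample data; None and "" are both treated as null here.
rows = [
    {"name": "Ann", "email": "ann@example.com", "city": "Oslo"},
    {"name": "Bob", "email": None, "city": "Oslo"},
    {"name": "Ann", "email": None, "city": None},
    {"name": "Dee", "email": "dee@example.com", "city": "Rome"},
]

def is_null(value):
    return value is None or value == ""

def column_completeness(rows, column):
    """Percentage of rows that have a non-null value in the column."""
    filled = sum(1 for row in rows if not is_null(row[column]))
    return 100.0 * filled / len(rows)

def column_uniqueness(rows, column):
    """Percentage of non-null values that are distinct (assumed definition)."""
    values = [row[column] for row in rows if not is_null(row[column])]
    return 100.0 * len(set(values)) / len(values) if values else 0.0

def null_count_frequency(rows):
    """Tally of rows by the number of null values each row contains."""
    return Counter(sum(is_null(v) for v in row.values()) for row in rows)

def row_completeness(rows):
    """Percentage of rows in which no column is null."""
    complete = sum(1 for row in rows if not any(is_null(v) for v in row.values()))
    return 100.0 * complete / len(rows)
```

With this sample, the email column is half complete, and two of the four rows contain no nulls at all.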

Viewing Outliers

You can view a summary of the outliers detected for each column in your table by clicking the Outlier Analysis tab. This tab displays category-wise occurrences of any pattern, value, length, or frequency for a column that falls outside the range of other observations.

For example, if the permissible length of Country Names is up to 14 characters and a string of 15 characters is detected in your data, that string is categorized as a Length Outlier. The supported categories are Length Outliers, Frequency Outliers, Text Pattern Outliers, Numeric Outliers, Semantic Type Outliers, and Data Type Outliers.
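
The length example above can be sketched in a few lines of Python. This is a simplified illustration, not the product's detection logic: it assumes a fixed maximum length supplied by the caller, whereas the way Spectrum Discovery derives the permissible range is not described here. The function name and the sample values are hypothetical.

```python
def length_outliers(values, max_length):
    """Return the values whose length exceeds the permissible maximum."""
    return [value for value in values if len(value) > max_length]

# Hypothetical Country Names column; "Solomon Islands" has 15 characters.
countries = ["France", "United Kingdom", "Solomon Islands", "Peru"]
outliers = length_outliers(countries, max_length=14)
```

Here only the 15-character value is flagged, because "United Kingdom" is exactly 14 characters long and stays within the limit.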

Viewing Malformed Records

In Flat File profiling, you can view malformed records in your table by clicking the Malformed Records tab. This tab displays the Category and Count of the malformed records. A record is treated as malformed if it falls into one of these categories:
  • Rows with fewer fields than the number of defined columns
  • Rows with more fields than the number of defined columns

    You can also display a preview of the malformed records by clicking the Category. The preview displays the Record Number, Record, and the Reason for categorizing a record as malformed.
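
To see which records a flat file is likely to report as malformed before you profile it, you can run a similar field-count check yourself. The Python sketch below is an approximation under stated assumptions: the file is comma-delimited, has no header row, and records are numbered from 1. The function name and the reason strings are hypothetical and do not reproduce the product's output.

```python
import csv
import io

def find_malformed(text, expected_fields, delimiter=","):
    """Return (record number, record, reason) for rows with the wrong field count."""
    malformed = []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for record_number, fields in enumerate(reader, start=1):
        if len(fields) < expected_fields:
            reason = "fewer fields than defined columns"
        elif len(fields) > expected_fields:
            reason = "more fields than defined columns"
        else:
            continue  # field count matches the defined columns
        malformed.append((record_number, delimiter.join(fields), reason))
    return malformed

# Hypothetical three-column file: record 2 is short, record 3 is long.
sample = "1,Ann,Oslo\n2,Bob\n3,Cy,Rome,extra\n"
report = find_malformed(sample, expected_fields=3)
```

Each tuple in the result mirrors the Record Number, Record, and Reason columns of the preview.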

Viewing Duplicate Records

You can view a summary of the duplicate records detected in your table by clicking the Duplicate Record Analysis tab.
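
As a rough illustration of what a duplicate summary captures, the Python sketch below counts records that match exactly on a chosen set of columns. It is an assumption for this example that duplicates are exact matches; the matching logic that Spectrum Discovery applies is not described in this topic. The function name and the sample records are hypothetical.

```python
from collections import Counter

def duplicate_summary(rows, key_columns):
    """Map each repeated key (tuple of column values) to its occurrence count."""
    counts = Counter(tuple(row[column] for column in key_columns) for row in rows)
    return {key: count for key, count in counts.items() if count > 1}

# Hypothetical records; the first and third match on both columns.
records = [
    {"name": "Ann", "city": "Oslo"},
    {"name": "Bob", "city": "Rome"},
    {"name": "Ann", "city": "Oslo"},
]
summary = duplicate_summary(records, ["name", "city"])
```

Choosing fewer key columns widens the definition of a duplicate: matching on name alone would group any records that share a name, regardless of city.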

You can also resolve the duplicate records found in your data from the Duplicate Record Analysis tab by clicking the Resolve Duplicates button. Clicking the button takes you to the smart rule creation page of the Prepare module, where you can select the columns from your data to use for resolving the duplicates. To learn more about how to create smart rules, see Preparing Quality Rules.
Note: Duplicates can be resolved only for Flat File or Connection type data sources. For a model-type data source, clicking the Resolve Duplicates button displays an error.