Viewing Column Profile Statistics - discovery - 23.1

Spectrum Discovery Guide

The column profile statistics let you review the completeness, uniqueness, frequency, and patterns of a column's data values so that you can identify potentially flawed values that could lead to poor analytical results. If you find flawed values, you can investigate them further, correct them, and build a better match rule.

The Profile Statistics panel lets you:
  1. Select a column name from the Column drop-down list to view the statistics for that column.
    Note: The statistics reflect the last time the profile ran, which could be days, months, or years ago. If the source data may have changed since then and you need current statistics, re-run the profile by clicking the Run Profile button.
  2. View detailed profiling statistics for the columns by clicking More details next to Profile Statistics. This opens the Data Profiling Results page with the details of the corresponding columns. For more information, see Viewing Data Profiling Results.
    Note: Profile statistics are available only if a profile has been run against the selected column. If the selected column has never been profiled, run a profile against it at least once: click the Run Profile button, which opens the Add Profile page. For more details on how to create and run a profile, see Creating a Profile.
The Profile Statistics panel displays the following profiling statistics for a column:
  1. Completeness: Displays how complete the records in the column are. The legend shows the percentage of Complete, Null, and Empty String values detected in the column.
  2. Uniqueness: Displays how unique the data in the column is. The legend shows the percentage of each of these statistics:
    • Unique: Records with no duplicates in the data source.
    • NonUnique: Records that have duplicates in the data source.
    • Distinct: The different values present in your data source, with each value counted once regardless of whether it is unique or non-unique.
  3. Frequency: Displays how often each data value occurs in the column.
    Note: Frequency is displayed only if the column's data type is string.
  4. Patterns: Displays the patterns of the data values in the column.
    Note: Patterns are displayed only if the column's data type is string.
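
To make the four statistics above concrete, the following Python sketch computes comparable figures for a small in-memory column. It is an illustration only and not how Spectrum Discovery calculates its results. The function names `profile_column` and `to_pattern`, the choice of denominators for the percentages, and the pattern notation (letters shown as A, digits shown as 9) are all assumptions made for this example.

```python
from collections import Counter


def to_pattern(value):
    """Map letters to 'A' and digits to '9'; keep other characters (assumed notation)."""
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)


def pct(part, whole):
    """Percentage rounded to one decimal place; 0.0 when the denominator is zero."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def profile_column(values):
    """Compute illustrative profile statistics for one column (hypothetical helper)."""
    total = len(values)
    nulls = sum(1 for v in values if v is None)
    empties = sum(1 for v in values if v == "")
    present = [v for v in values if v is not None and v != ""]

    counts = Counter(present)
    unique = sum(1 for c in counts.values() if c == 1)     # records with no duplicates
    non_unique = sum(c for c in counts.values() if c > 1)  # records that have duplicates
    distinct = len(counts)                                 # each value counted once

    stats = {
        # Assumption: completeness percentages are relative to all records.
        "completeness": {
            "Complete": pct(len(present), total),
            "Null": pct(nulls, total),
            "Empty String": pct(empties, total),
        },
        # Assumption: uniqueness percentages are relative to complete records.
        "uniqueness": {
            "Unique": pct(unique, len(present)),
            "NonUnique": pct(non_unique, len(present)),
            "Distinct": pct(distinct, len(present)),
        },
        "frequency": None,
        "patterns": None,
    }

    # Frequency and patterns are produced only for string columns.
    if present and all(isinstance(v, str) for v in present):
        stats["frequency"] = dict(counts)
        stats["patterns"] = dict(Counter(to_pattern(v) for v in present))
    return stats
```

For example, calling `profile_column(["NY", "CA", "NY", None, "", "TX1"])` treats four of the six records as complete, counts "NY" as non-unique because it appears twice, and groups the values under the patterns "AA" and "AA9". A column of integers would return `None` for both frequency and patterns, mirroring the notes above.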