Data profiling allows you to examine data and collect statistics or informative summaries about that data. The results of data profiling can help you to:
- Understand the data as the first critical step of any data engineering project.
- Determine data quality rules and requirements that will support a more thorough data quality assessment in a later step.
Fusion supports the following data profiling operations:
- Column Frequency Analysis
- Columns Profile
- Join Analysis
- Reference Discovery
There is a corresponding subsection in the Specifications section of the navigation tree for each type of data profiling. After you create a data profile specification of the corresponding type, you can:
- Run it and view the result by right clicking the specification name.
- Create an operation in a scenario that runs the specification.
Column Frequency Analysis
Column frequency analysis gives you the frequency distribution of values in a column. The result is a table with two columns: the first contains each unique value of the input column, and the second contains the number of times that value appears in the input column.
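To make the shape of the result concrete, here is a minimal Python sketch of the same computation done outside Fusion. It is an illustration only, not Fusion's implementation: the `column_frequency` function and the `customers` sample table are hypothetical, and the ordering by descending count is a choice made for this example.

```python
from collections import Counter


def column_frequency(rows, column):
    """Return (value, count) pairs for one column, most frequent first.

    Hypothetical helper for illustration; not part of Fusion.
    """
    counts = Counter(row[column] for row in rows)
    return counts.most_common()


# Hypothetical input table, represented as a list of row dictionaries.
customers = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "DE"},
    {"id": 3, "country": "US"},
    {"id": 4, "country": None},
    {"id": 5, "country": "US"},
]

# Two-column result: unique value, number of occurrences.
result = column_frequency(customers, "country")
for value, count in result:
    print(value, count)
```

Each row of `result` corresponds to one row of the target table described above: a unique value from the input column and its count. In this sketch a missing value (`None`) is counted like any other value; how Fusion treats nulls is not stated here.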
Create a Column Frequency Profile Specification
To create a column frequency profile specification:
- Right click Specifications -> Profiling -> Column Frequency in the navigation tree and select Create profile.
- Enter the profile name and click the Create button. The current window will show the column frequency specification parameters.
- Enter the parameters: type or select the source, space, table, and column for which you want to compute the frequency distribution, then enter the name of the target table where the result will be stored.
Run a Column Frequency Profile Specification
To run a column frequency profile specification:
- Right click the column frequency profile specification, select Run, and wait for completion.
- Right click the column frequency profile specification and select View result.
Alternatively, you can create a ColumnFrequencyProfile operation in a scenario, providing the name of the specification as a parameter, and run it.
Delete a Column Frequency Profile Specification
To delete a column frequency profile specification:
- Right click the specification and select Delete.