The Profile Data Legacy node allows you to gather information about any field in your data set. Using this node generates a sheet with fields containing useful information about each profiled field. Fields generated by the Profile Data Legacy node are as follows.
field
The name of the field you are profiling.
uniqueCount
The number of unique values found in the field.
nullsCount
The number of records in the field that contain null values. Null values are those that equal '', NULLF(), or those where the value is empty.
emptyCount
The number of records that are null or have a string length of 0.
max
Numeric fields: The maximum numeric value in the field.
String fields: The string that appears at the end of the list, when all values in the field are sorted alphabetically.
Date fields: The most recent date in the field.
min
Numeric fields: The minimum numeric value in the field.
String fields: The string that appears at the beginning of the list, when all values in the field are sorted alphabetically.
Date fields: The least recent date in the field.
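The max and min rules above can be illustrated with a minimal Python sketch. This is not the node's implementation: the function name profile_min_max is hypothetical, nulls are assumed to be represented by None and skipped, and Python compares strings by code point, which matches alphabetical order only for same-case text.

```python
from datetime import date

def profile_min_max(values):
    """Return (min, max) for a list of same-typed values, skipping None."""
    present = [v for v in values if v is not None]
    if not present:
        return None, None
    # Numbers compare numerically, strings by sort order, dates chronologically.
    return min(present), max(present)

numbers = [100, 125, None, 175, 200]
strings = ["one", "two", "three", "three", "four"]
dates = [date(2021, 3, 1), date(2019, 7, 15), date(2023, 1, 9)]

print(profile_min_max(numbers))  # smallest and largest number
print(profile_min_max(strings))  # first and last string when sorted
print(profile_min_max(dates))    # least recent and most recent date
```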
values
The values field generates a column where each record is an array containing all unique values for a profiled field and counts of each unique value.
For example, consider the following data set:
field1 |
---|
one |
two |
three |
three |
four |
Profiling field1 would produce the following record in the values field:
values = [{'value': 'one', 'two', 'three', 'four'}, {'totalCount': 1, 1, 2, 1}]
When viewing the values field for an individual profiled field, you can also use the arrow navigation buttons to view the values fields for other profiled fields.
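The counting behind the values field can be sketched in Python as follows. The function name profile_values and the dictionary layout are assumptions chosen to mirror the example record. They are not the node's actual output format.

```python
from collections import Counter

def profile_values(records):
    """Return each unique value and its count, in first-seen order."""
    counts = Counter(records)  # Counter keeps insertion order (Python 3.7+)
    return {
        "value": list(counts.keys()),
        "totalCount": list(counts.values()),
    }

field1 = ["one", "two", "three", "three", "four"]
print(profile_values(field1))
```

Here 'three' appears twice in field1, so its count is 2 and every other count is 1.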
patterns
The patterns field generates a column where each record is an array containing all unique value patterns for a profiled field and counts of each unique value pattern.
For example, consider the following data set:
measure1 | string2 |
---|---|
1 | x |
10 | xy |
100 | xyz |
Profiling measure1 would produce the following record in the patterns field:
pattern = [{'pattern': N, NN, NNN}, {'totalCount': 1, 1, 1}]
Profiling string2 would produce the following record in the patterns field:
pattern = [{'pattern': a, aa, aaa}, {'totalCount': 1, 1, 1}]
When viewing the patterns field for an individual profiled field, you can also use the arrow navigation buttons to view the patterns fields for other profiled fields.
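A rough Python sketch of pattern profiling is shown below. It assumes, based only on the examples above, that each digit maps to N and each letter maps to a. How the node treats uppercase letters, spaces, or punctuation is not stated here, so this sketch simply leaves other characters unchanged. The names to_pattern and profile_patterns are hypothetical.

```python
from collections import Counter

def to_pattern(value):
    """Map digits to 'N' and letters to 'a'; keep other characters as-is."""
    out = []
    for ch in str(value):
        if ch.isdigit():
            out.append("N")
        elif ch.isalpha():
            out.append("a")
        else:
            out.append(ch)  # assumption: punctuation and spaces are preserved
    return "".join(out)

def profile_patterns(records):
    """Return each unique pattern and its count, in first-seen order."""
    counts = Counter(to_pattern(r) for r in records)
    return {
        "pattern": list(counts.keys()),
        "totalCount": list(counts.values()),
    }

print(profile_patterns([1, 10, 100]))
print(profile_patterns(["x", "xy", "xyz"]))
```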
Custom Counters
Custom Counters allow you to create expressions to apply to individual fields, which are used to create a count of values that satisfy the expression.
For example, consider the following data set:
id | value |
---|---|
001 | 100 |
002 | 125 |
003 | 150 |
004 | 175 |
005 | 200 |
If you create a Custom Counter field named greaterThan150 using the expression value > 150, greaterThan150 would equal 2, because only the values 175 and 200 exceed 150.
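Conceptually, a Custom Counter counts the records for which an expression is true. The Python sketch below illustrates that idea for values above 150 in the sample data. The custom_counter helper and the use of a lambda are illustrative assumptions, not the node's expression syntax.

```python
rows = [
    {"id": "001", "value": 100},
    {"id": "002", "value": 125},
    {"id": "003", "value": 150},
    {"id": "004", "value": 175},
    {"id": "005", "value": 200},
]

def custom_counter(rows, predicate):
    """Count the rows for which the predicate returns True."""
    return sum(1 for row in rows if predicate(row))

# Count records whose value is strictly greater than 150.
greater_than_150 = custom_counter(rows, lambda row: row["value"] > 150)
print(greater_than_150)
```

The record with value 150 is not counted, because the comparison is strict.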
Value Limit
The Value Limit parameter controls how many values can be displayed in the Profile Data Legacy node's values and patterns array fields.
For the values and patterns fields, if more values are found than the Value Limit, no values will be displayed.
By default, Value Limit is set to 10,000. This is the value that will be used if the parameter is left blank.
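The Value Limit behavior can be sketched in Python as follows. This assumes that the limit is compared against the number of unique values and that a blank parameter is represented by None. Both are assumptions made for illustration, and limited_values is a hypothetical name.

```python
DEFAULT_VALUE_LIMIT = 10_000  # used when the parameter is left blank

def limited_values(records, value_limit=None):
    """Return the unique values, or nothing if they exceed the limit."""
    limit = DEFAULT_VALUE_LIMIT if value_limit is None else value_limit
    unique = list(dict.fromkeys(records))  # unique values, first-seen order
    if len(unique) > limit:
        return []  # over the limit: no values are displayed at all
    return unique

print(limited_values(["a", "b", "b"], value_limit=2))  # within the limit
print(limited_values(["a", "b", "c"], value_limit=2))  # over the limit
```

Note that the result is all or nothing: exceeding the limit does not truncate the list to the first values found.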