Profile Data Legacy - Data360_DQ+ - Latest

Copyright: 2024
First publish date: 2016
Last edition: 2024-07-09
Last publication: 2024-07-09T15:09:58.774265
Note: This node is deprecated and has been replaced by the Profile Data Node.

 

The Profile Data Legacy node allows you to gather information about any field in your data set. The node generates a sheet whose fields describe each profiled field. The fields generated by the Profile Data Legacy node are as follows.

Note: When building your analysis, remember that profile information is representative of your data sample. To obtain an actual profile of the entire data set, you need to execute your analysis.

field

The name of the field you are profiling.

uniqueCount

The number of unique values found in the field.

nullsCount

The number of records in the field that contain null values. Null values are those that equal '', NULLF(), or those where the value is empty.

emptyCount

The number of records that are null or have a string length of 0.

max

Numeric fields: The maximum numeric value in the field.

String fields: The string that appears at the end of the list, when all values in the field are sorted alphabetically.

Date fields: The most recent date in the field.

Note: Null values are not evaluated by max profiling.

min

Numeric fields: The minimum numeric value in the field.

String fields: The string that appears at the beginning of the list, when all values in the field are sorted alphabetically.

Date fields: The least recent date in the field.

Note: Null values are not evaluated by min profiling.
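The max and min rules above can be illustrated with a minimal Python sketch. This is not the node's implementation: the function name profile_min_max is hypothetical, nulls are represented as None, and Python's default string ordering (which is case-sensitive) stands in for the alphabetical sort described above.

```python
from datetime import date


def profile_min_max(values):
    """Return (min, max) for one field, skipping nulls (None in this sketch)."""
    present = [v for v in values if v is not None]
    if not present:
        # Assumption: a field holding only nulls has no min or max.
        return None, None
    return min(present), max(present)


# Numeric field: smallest and largest number.
numeric_min, numeric_max = profile_min_max([3, None, 10, 7])

# String field: first and last value in sorted order.
string_min, string_max = profile_min_max(["pear", "apple", None, "fig"])

# Date field: least recent and most recent date.
date_min, date_max = profile_min_max([date(2024, 7, 9), None, date(2016, 1, 1)])
```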

values

The values field generates a column where each record is an array containing all unique values for a profiled field and counts of each unique value.

For example, consider the following data set:

field1

one

two

three

three

four

Profiling field1 would produce the following record in the values field:

values = [{'value': 'one', 'two', 'three', 'four'}, {'totalCount': 1, 1, 2, 1}]

When viewing the values field for an individual profiled field, you can also use the arrow navigation buttons to view the values fields for other profiled fields.
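As a rough illustration of how the values record for field1 could be derived, the following Python sketch counts each unique value in order of first appearance. The function name profile_values and the list-based layout are assumptions made for illustration; they do not describe how the node stores the array internally.

```python
from collections import Counter


def profile_values(records):
    """Pair each unique value with its count, in order of first appearance."""
    counts = Counter(records)  # Counter keeps insertion order (Python 3.7+)
    return [
        {"value": list(counts.keys())},
        {"totalCount": list(counts.values())},
    ]


field1 = ["one", "two", "three", "three", "four"]
values = profile_values(field1)
```

Here, values pairs 'three' with a count of 2 and every other value with a count of 1, matching the record shown above.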

patterns

The patterns field generates a column where each record is an array containing all unique value patterns for a profiled field and counts of each unique value pattern.

For example, consider the following data set:

measure1    string2
1           x
10          xy
100         xyz

Profiling measure1 would produce the following records in the patterns field:

pattern = [{'pattern': N, NN, NNN}, {'totalCount': 1, 1, 1}]

Profiling string2 would produce the following records in the patterns field:

pattern = [{'pattern': a, aa, aaa}, {'totalCount': 1, 1, 1}]

When viewing the patterns field for an individual profiled field, you can also use the arrow navigation buttons to view the patterns fields for other profiled fields.
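The pattern examples above suggest that each digit maps to N and each letter maps to a. The Python sketch below applies that mapping and counts the resulting patterns. The names to_pattern and profile_patterns are hypothetical, and the treatment of uppercase letters and other characters (mapped to a and left unchanged, respectively) is an assumption, since the examples cover only digits and lowercase letters.

```python
from collections import Counter


def to_pattern(value):
    """Map digits to 'N' and letters to 'a'; keep other characters as-is (assumed)."""
    out = []
    for ch in str(value):
        if ch.isdigit():
            out.append("N")
        elif ch.isalpha():
            out.append("a")
        else:
            out.append(ch)
    return "".join(out)


def profile_patterns(records):
    """Pair each unique pattern with its count, in order of first appearance."""
    counts = Counter(to_pattern(r) for r in records)
    return [
        {"pattern": list(counts.keys())},
        {"totalCount": list(counts.values())},
    ]


numeric_patterns = profile_patterns([1, 10, 100])
string_patterns = profile_patterns(["x", "xy", "xyz"])
```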

Custom Counters

Custom Counters allow you to create expressions to apply to individual fields, which are used to create a count of values that satisfy the expression.

For example, consider the following data set:

id     value
001    100
002    125
003    150
004    175
005    200

If you create a Custom Counter field named greaterThan150 using the expression value > 150, greaterThan150 would equal 2, because two values (175 and 200) are greater than 150.
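The counting behavior of a Custom Counter can be sketched in Python as follows. The helper custom_counter is hypothetical, and a Python lambda stands in for the node's expression syntax, which is not shown here; the example counts the values in the sample data that are greater than 150.

```python
rows = [
    {"id": "001", "value": 100},
    {"id": "002", "value": 125},
    {"id": "003", "value": 150},
    {"id": "004", "value": 175},
    {"id": "005", "value": 200},
]


def custom_counter(rows, predicate):
    """Count the rows for which the predicate evaluates to True."""
    return sum(1 for row in rows if predicate(row))


# Counts 175 and 200; 150 itself is not greater than 150.
greater_than_150 = custom_counter(rows, lambda row: row["value"] > 150)
```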

Value Limit

The Value Limit parameter controls how many values can be displayed in the Profile Data Legacy node's values and patterns array fields.

For the values and patterns fields, if more values are found than the Value Limit, no values will be displayed.

By default, Value Limit is set to 10,000. This is the value that will be used if the parameter is left blank.
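The Value Limit rule can be summarized with a short Python sketch. The function apply_value_limit is hypothetical, and an empty list is used here to represent "no values displayed"; how the node actually renders that case is not specified above.

```python
DEFAULT_VALUE_LIMIT = 10_000  # used when the parameter is left blank


def apply_value_limit(unique_values, value_limit=None):
    """Return the values to display, or nothing if the limit is exceeded."""
    limit = DEFAULT_VALUE_LIMIT if value_limit is None else value_limit
    if len(unique_values) > limit:
        # More unique values than the limit: display none of them.
        return []
    return list(unique_values)


within_limit = apply_value_limit(["one", "two", "three"], value_limit=3)
over_limit = apply_value_limit(["one", "two", "three", "four"], value_limit=3)
default_limit = apply_value_limit(range(10_001))
```

Note that exceeding the limit does not truncate the list to the first values found; in this sketch, as in the description above, nothing is returned at all.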