Performs a quick statistical analysis of numeric input data, using the sum, min, max, average, count, first, last, standard deviation, variance, and null count functions.
To configure this node:
- In the FieldList property, specify the names of the fields over which you want to run the analysis. You can choose fields by selecting the Input Fields option in the property menu, or you can type the names of the fields in the following format:
fields.<field name>
Tip: If you do not enter any field names, the node analyzes all input fields.
- If you want to group the records by specific fields to produce more granular results, select the fields that you want to group by in the GroupBy property.
The GroupBy property is a multi-field picker, a property type which is found on a number of nodes. For more information on this property type, see Multi-field picker.
- If you want to exclude any statistical functions from the evaluation, set the corresponding property to False. For example, to exclude the average function, set the IncludeAverage property to False.
Tip: All functions run by default, but you can turn them off individually to optimize performance. In particular, when working with large data sets, we recommend that you set IncludeSum and IncludeAverage to False if you do not need them.
By default, the node outputs the statistics for all fields to one record per input field per group. To change this behavior, you can set the WideOutput property to True to output the statistics for all fields to one line per group.
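The following plain-Python sketch illustrates the kind of per-group statistics described above. It is not the node's implementation: the record layout, the `quick_stats` helper, and the output key names are hypothetical, and it assumes that NULL values (represented here as `None`) are skipped when calculating the sum, average, minimum, and maximum.

```python
from itertools import groupby
from statistics import mean

# Hypothetical input records; None stands in for a NULL value.
records = [
    {"region": "East", "sales": 10.0},
    {"region": "East", "sales": None},
    {"region": "East", "sales": 30.0},
    {"region": "West", "sales": 5.0},
]

def quick_stats(rows, group_field, value_field):
    """Return one dict of statistics per group for a single numeric field."""
    results = []
    # itertools.groupby only merges adjacent rows, so sort on the group field first.
    ordered = sorted(rows, key=lambda r: r[group_field])
    for key, group in groupby(ordered, key=lambda r: r[group_field]):
        values = [r[value_field] for r in group]
        # Assumption: NULLs are ignored by the numeric functions.
        present = [v for v in values if v is not None]
        results.append({
            group_field: key,
            "field": value_field,
            "count": len(values),
            "null_count": len(values) - len(present),
            "sum": sum(present),
            "average": mean(present) if present else None,
            "min": min(present, default=None),
            "max": max(present, default=None),
            "first": values[0],
            "last": values[-1],
        })
    return results

stats = quick_stats(records, "region", "sales")
```

Each entry in `stats` corresponds to one record per input field per group, which mirrors the default (non-wide) output layout.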
Properties
FieldList
Optionally specify a comma-separated list of input fields over which you want to run the analysis, in the following format: fields.<fieldName1>, fields.<fieldName2>
If no value is specified, all input fields are analyzed.
GroupBy
Select or type the names of the fields that you want to group by.
By default, the node also sorts the data in ascending order (low to high). From the menu button to the right of the field name, you can change the sort order to Sort Descending (high to low), select Case Insensitive sorting, or, for more advanced cases, choose Compare Substrings. You can also Delete a selected field from the list.
If you have added multiple group by criteria, you can drag and drop the fields to reorder them if needed. The order of the fields determines which field the data will be sorted by first.
For advanced use cases, you can select the Advanced tab to type Python script to specify the fields that you want to group by. In this case, use the notation fields.<name>, separating each field reference with a comma. To sort in descending order, use the fn.desc function.
Example: fields.FirstName, fn.desc(fields.DOB)
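To illustrate the ordering that the example expression requests (FirstName ascending, then DOB descending within each name), here is a minimal sketch in plain Python. It does not use the node's `fields` or `fn` objects; the sample records are hypothetical, and the two-pass sort is only one way to reproduce the same ordering.

```python
from datetime import date

# Hypothetical sample records.
people = [
    {"FirstName": "Ann", "DOB": date(1980, 5, 1)},
    {"FirstName": "Bob", "DOB": date(1975, 1, 9)},
    {"FirstName": "Ann", "DOB": date(1992, 3, 14)},
]

# Python's sort is stable, so sort by the secondary key first
# (DOB, descending) and then by the primary key (FirstName, ascending).
by_dob_desc = sorted(people, key=lambda p: p["DOB"], reverse=True)
ordered = sorted(by_dob_desc, key=lambda p: p["FirstName"])

for person in ordered:
    print(person["FirstName"], person["DOB"].isoformat())
```

Records that share a FirstName end up adjacent, so they fall into the same group, and within that group the most recent DOB comes first.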
WideOutput
Optionally specify the output format. If set to True then the statistics for all fields will be output on one line per group. If set to False they will be output to one record per input field per group. The default value is False.
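The difference between the two layouts can be sketched as a pivot. The example below is illustrative only: the field names, the `to_wide` helper, and the `<field>_<statistic>` naming of the wide columns are assumptions, not the node's actual output names.

```python
# Hypothetical default output: one record per input field per group.
tall = [
    {"region": "East", "field": "sales", "sum": 40.0, "max": 30.0},
    {"region": "East", "field": "units", "sum": 7, "max": 4},
    {"region": "West", "field": "sales", "sum": 5.0, "max": 5.0},
    {"region": "West", "field": "units", "sum": 2, "max": 2},
]

def to_wide(rows, group_field):
    """Pivot tall rows into one row per group.

    Each statistic is prefixed with the name of the field it describes.
    """
    wide = {}
    for row in rows:
        out = wide.setdefault(row[group_field], {group_field: row[group_field]})
        for stat, value in row.items():
            if stat not in (group_field, "field"):
                out[f"{row['field']}_{stat}"] = value
    return list(wide.values())

wide = to_wide(tall, "region")
```

Here the four tall records collapse into two wide records, one for each value of `region`.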
IncludeCount
Optionally specify whether to calculate the number of records across the group defined by the GroupBy property. The default value is True.
IncludeNullCount
Optionally specify whether to calculate the number of NULL values of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludeSum
Optionally specify whether to calculate the sum of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludeAverage
Optionally specify whether to calculate the average of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludeMin
Optionally specify whether to calculate the minimum value of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludeMax
Optionally specify whether to calculate the maximum value of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludeFirst
Optionally specify whether to output the first value of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludeLast
Optionally specify whether to output the last value of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludeSampleStdev
Optionally specify whether to calculate the sample standard deviation of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludePopulationStdev
Optionally specify whether to calculate the population standard deviation of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludeSampleVariance
Optionally specify whether to calculate the sample variance of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
IncludePopulationVariance
Optionally specify whether to calculate the population variance of each field defined in the FieldList property across the group defined by the GroupBy property. The default value is True.
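The sample and population variants differ only in the divisor: the population statistics divide the sum of squared deviations by n, while the sample statistics divide by n - 1. The sketch below uses Python's standard `statistics` module on made-up values to show the difference; it demonstrates the standard definitions and is not taken from the node itself.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical values for one field within one group (mean is 5.0).
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population statistics divide the sum of squared deviations by n.
population_variance = pvariance(values)
population_stdev = pstdev(values)

# Sample statistics divide by n - 1, so they are slightly larger.
sample_variance = variance(values)
sample_stdev = stdev(values)

print(population_variance, population_stdev)
print(sample_variance, sample_stdev)
```

Note that in Python the sample functions raise `StatisticsError` when given fewer than two values. How the node reports sample statistics for a group that contains a single record is not covered here.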
SortInput
Optionally specify whether the input will be sorted based on the fields specified in the GroupBy property. The default value is True.
UnsortedInputBehavior
Optionally specify the behavior when input data has not been sorted. Choose from:
- Error - The node will fail if the input records are not sorted according to the GroupBy criteria.
- Log - If the input records are not sorted according to the GroupBy criteria, a warning is logged the first time the problem is encountered, but the node continues processing.
- Ignore - No action is taken if the input records are not sorted according to the GroupBy criteria.
The default value is Error.
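The three behaviors can be pictured with the following minimal sketch. The `check_sorted` helper, the logger name, and the exception type are hypothetical; the sketch only shows the general pattern of failing, warning once, or silently continuing when a group key arrives out of order.

```python
import logging

logger = logging.getLogger("quick_stats_sketch")

def check_sorted(keys, behavior="Error"):
    """Scan group keys in input order and react to the first out-of-order key.

    Returns True if the keys are in ascending order, False otherwise.
    """
    previous = None
    for position, key in enumerate(keys):
        if previous is not None and key < previous:
            if behavior == "Error":
                # Fail as soon as an unsorted record is found.
                raise ValueError(
                    f"input not sorted at record {position}: "
                    f"{key!r} follows {previous!r}"
                )
            if behavior == "Log":
                # Warn once, at the first problem, then carry on.
                logger.warning("input not sorted at record %d", position)
            # "Ignore" falls through with no action.
            return False
        previous = key
    return True
```

For example, `check_sorted(["a", "b", "c"])` returns True, while `check_sorted(["b", "a"], behavior="Log")` logs one warning and returns False.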
Inputs and outputs
Inputs: Input Records.
Outputs: Statistics.