Threshold Test (Superseded) - Data360_Analyze - 3 - 3.12

Data360 Analyze Server Help

Product type
Software
Portfolio
Verify
Product family
Data360
Product
Data360 Analyze
Version
3.12
Language
English
Product name
Data360 Analyze
Title
Data360 Analyze Server Help
Copyright
2023
First publish date
2016

Identifies records which, when a running aggregate is calculated, exceed a designated threshold.

Note: This node has been superseded by the Threshold Test node which provides similar functionality. The Threshold Test (Superseded) node is provided for backwards compatibility, but where possible it is recommended that you use the new Threshold Test node.

This node accepts an input dataset, calculates a running total of the records in each group, and places each input record in one of two output bins. One bin contains the records whose running totals fall within the threshold, and the other bin contains those exceed the threshold.

Note: This node does not require the input dataset to be sorted on the GroupBy property. When the value of a group changes in the input, a new group is formed.

Consider the input dataset in Table 1 that has two groups of Blue, separated by a group of Green.


Record Id          color       count
------------------------------------
1                  Blue        31
2                  Blue        18
3                  Green       93
4                  Blue         2

This node treats the input data as having 3 groups of color: Blue, Green and Blue, and resets the running total for each group as shown in Table 2.

Table 2. Running Totals of each group


color     Record Id       running total
---------------------------------------
Blue      1                  31            -- reset running total
Blue      2                  49
Green     3                  93            -- reset running total
Blue      4                  231           -- reset running total

Input

Input data can be in any format and contain any metadata types that allows an expression (specified in the InputExpr property) to evaluate to a single value that is numeric: integer, double, or long.

Output

  • Under - Records whose running total is under the threshold specified in the Threshold property.
  • Over - Records whose running total is above the threshold specified in the Threshold property.

Both Under and Over outputs contain all of the input fields from the input data, and the following output fields:

  • _RunningTotal - The running total of the records in the group. This field can be renamed via the RunningTotalFieldName property.
  • _RowId - The row id of each record within its group. In each group, all the records are numbered sequentially, starting with one for the first record. The row id of a record in a group allows a user to easily identify at which point the running total crosses the threshold. This field can be renamed via the RowIdFieldName property.

Properties

InputExpr

Specify the values used in the calculation of the running total for the input data. This expression is evaluated against each input row and must evaluate to a single value that is either an integer, a double, or a long value. This expression can convert data from one type to another. For example, it could convert two dates from two different fields into the number of days between those two dates.

A value is required for this property.

Threshold

Specify the numeric threshold that separates the records in the input data into the output bins.

A value is required for this property.

GroupBy

Specify the expression controlling how to aggregate.

A value is required for this property.

IncludeThreshold

Optionally specify whether the threshold comparison is inclusive or not.

For a record whose running total is exactly the threshold, it will be placed in the bin 'under' if this property is true. Otherwise, the record will be in the bin 'over'.

The default value is True.

RunningTotalFieldName

Optionally specify the field name for the running total.

By default, the running total is output in a field named '_RunningTotal'. However, this name can be changed by typing in a new value here. This is useful in situations where there is already a field with this name.

RowIdFieldName

Optionally specify the field name for the row id.

By default, the row id is output in a field named '_RowId'. However, this name can be changed by typing in a new value here. This is useful in situations where there is already a field with this name.

Inputs and outputs

Inputs: input.

Outputs: under, over.