This deprecated node identifies records which, when a running aggregate is calculated, exceed a designated threshold.
This node accepts an input dataset, calculates a running total of the records in each group, and places each input record in one of two output bins. One bin contains the records whose running totals fall within the threshold, and the other bin contains those exceed the threshold.
Consider the input dataset in Table 1 that has two groups of Blue, separated by a group of Green.
Record Id color count
------------------------------------
1 Blue 31
2 Blue 18
3 Green 93
4 Blue 2
This node treats the input data as having 3 groups of color: Blue, Green and Blue, and resets the running total for each group as shown in Table 2.
Table 2. Running Totals of each group
color Record Id running total
---------------------------------------
Blue 1 31 -- reset running total
Blue 2 49
Green 3 93 -- reset running total
Blue 4 231 -- reset running total
Input
Input data can be in any format and contain any metadata types that allows an expression (specified in the InputExpr property) to evaluate to a single value that is numeric: integer, double, or long.
Output
- Under - Records whose running total is under the threshold specified in the Threshold property.
- Over - Records whose running total is above the threshold specified in the Threshold property.
Both Under and Over outputs contain all of the input fields from the input data, and the following output fields:
- _RunningTotal - The running total of the records in the group. This field can be renamed via the RunningTotalFieldName property.
- _RowId - The row id of each record within its group. In each group, all the records are numbered sequentially, starting with one for the first record. The row id of a record in a group allows a user to easily identify at which point the running total crosses the threshold. This field can be renamed via the RowIdFieldName property.
Properties
InputExpr
Specify the values used in the calculation of the running total for the input data. This expression is evaluated against each input row and must evaluate to a single value that is either an integer, a double, or a long value. This expression can convert data from one type to another. For example, it could convert two dates from two different fields into the number of days between those two dates.
A value is required for this property.
Threshold
Specify the numeric threshold that separates the records in the input data into the output bins.
A value is required for this property.
GroupBy
Specify the expression controlling how to aggregate.
A value is required for this property.
IncludeThreshold
Optionally specify whether the threshold comparison is inclusive or not.
For a record whose running total is exactly the threshold, it will be placed in the bin 'under' if this property is true. Otherwise, the record will be in the bin 'over'.
The default value is True.
RunningTotalFieldName
Optionally specify the field name for the running total.
By default, the running total is output in a field named '_RunningTotal'. However, this name can be changed by typing in a new value here. This is useful in situations where there is already a field with this name.
RowIdFieldName
Optionally specify the field name for the row id.
By default, the row id is output in a field named '_RowId'. However, this name can be changed by typing in a new value here. This is useful in situations where there is already a field with this name.
Inputs and outputs
Inputs: input.
Outputs: under, over.