Threshold Test - Data360_Analyze - Latest

Data360 Analyze Server Help

Product type
Software
Portfolio
Verify
Product family
Data360
Product
Data360 Analyze
Version
Latest
Language
English
Product name
Data360 Analyze
Title
Data360 Analyze Server Help
Copyright
2024
First publish date
2016
Last updated
2024-11-28
Published on
2024-11-28T15:26:57.181000

Identifies records which, when a running aggregate is calculated, exceed a designated threshold.

This node accepts an input dataset, calculates a running total of the records in each group, and places each input record in one of two output bins. One bin contains the records whose running totals fall within the threshold, and the other bin contains those exceed the threshold.

Note: This node does not require the input dataset to be sorted on the GroupBy property. When the value of a group changes in the input, a new group is formed.

Consider the input dataset in Table 1 that has two groups of Blue, separated by a group of Green.


Record Id          color       count
------------------------------------
1                  Blue        31
2                  Blue        18
3                  Green       93
4                  Blue         2

This node treats the input data as having 3 groups of color: Blue, Green and Blue, and resets the running total for each group as shown in Table 2.

Table 2. Running Totals of each group


color     Record Id       running total
---------------------------------------
Blue      1                  31            -- reset running total
Blue      2                  49
Green     3                  93            -- reset running total
Blue      4                  231           -- reset running total

Input

Input data can be in any format and contain any metadata types that allows an expression (specified in the InputExpr property) to evaluate to a single value that is numeric: integer, double, or long.

Output

  • Under - Records whose running total is under the threshold specified in the Threshold property.
  • Over - Records whose running total is above the threshold specified in the Threshold property.

Both Under and Over outputs contain all of the input fields from the input data, and the following output fields:

  • _RunningTotal - The running total of the records in the group. This field can be renamed via the RunningTotalFieldName property.
  • _RowId - The row id of each record within its group. In each group, all the records are numbered sequentially, starting with one for the first record. The row id of a record in a group allows a user to easily identify at which point the running total crosses the threshold. This field can be renamed via the RowIdFieldName property.

Properties

InputExpression

Specify the values used in the calculation of the running total for the input data. This expression is evaluated against each input row and must evaluate to a single value that is either an integer, a double, or a long value. This expression can convert data from one type to another. For example, it could convert two dates from two different fields into the number of days between those two dates.

A value is required for this property.

Threshold

Specify the numeric threshold that separates the records in the input data into the output bins. A value is required for this property.

GroupBy

Specify the expression controlling how to aggregate. A value is required for this property.

IncludeThreshold

Optionally specify whether the threshold comparison is inclusive or not.

For a record whose running total is exactly the threshold, it will be placed in the bin 'under' if this property is true. Otherwise, the record will be in the bin 'over'. The default value is True.

RunningTotalFieldName

Optionally specify the field name for the running total.

By default, the running total is output in a field named '_RunningTotal'. However, this name can be changed by typing in a new value here. This is useful in situations where there is already a field with this name.

RowIdFieldName

Optionally specify the field name for the row id.

By default, the row id is output in a field named '_RowId'. However, this name can be changed by typing in a new value here. This is useful in situations where there is already a field with this name.

DefaultValueForNullText

Optionally specify the value to be used for Null input string or unicode fields when they are referenced in an expression in the InputExpr property.

Either choose the Null value from the drop-down list, or enter a different default value. The default value is an empty string.

DefaultValueForNullNumber

Optionally specify the value to be used for Null input numeric fields when they are referenced in an expression in the InputExpr property.

Choose one of the following from the drop-down list, or enter a different value:

  • 0
  • 1
  • Null - Null numeric values are left null when used in an expression.

The default value is 0.

Note: This property is applied to all numeric fields. Floating point numbers cannot be provided.

DefaultValueForNullBoolean

Optionally specify the value to be used for Null input boolean fields when they are referenced in an expression in the InputExpr property.

Choose from one of the following options:

  • True
  • False
  • Null

The default value is False.

DefaultValueForNullDate

Optionally specify the value to be used for Null input date fields when they are referenced in an expression in the InputExpr property.

Choose one of the following from the drop-down list, or enter a different default date value in the format yyyy-MM-dd:

  • Today - Null date values that are used in an expression are set to the date when the node began execution.
  • Null - Null date values are left null when used in an expression.
  • 1970-01-01

The default value is 1970-01-01.

DefaultValueForNullTime

Optionally specify the value to be used for Null input time fields when they are referenced in an expression in the InputExpr property. Choose one of the following from the drop-down list, or enter a different default time value in the format HH:mm:ss:

  • Now - Null time values that are used in an expression are set to the time when the node began execution.
  • 00:00:00
  • Null - Null time values are left null when used in an expression.

The default value is 00:00:00.

DefaultValueForNullDateTime

Optionally specify the value to be used for Null input datetime fields when they are referenced in an expression in the InputExpr property.

Choose one of the following from the drop-down list, or enter a different default datetime value in the format yyyy-MM-dd HH:mm:ss:

  • Today - Null datetime values that are used in an expression are set to the start of the day when the node began execution.
  • Now - Null datetime values that are used in an expression are set to the value of datetime when the node began execution.
  • Null - Null datetime values are left null when used in an expression.
  • 1970-01-01 00:00:00

The default value is 1970-01-01 00:00:00.

Inputs and outputs

Inputs: input.

Outputs: under, over.