Duplicate Detection (Superseded) - Data360_Analyze - 3.12

Data360 Analyze Server Help

Product: Data360 Analyze
Version: 3.12
Language: English
Portfolio: Verify
Product family: Data360
Product name: Data360 Analyze
Title: Data360 Analyze Server Help
Copyright: 2023
First publish date: 2016

Detects duplicate data within specified fields, segregating the data into two outputs.

The first output contains all rows whose tested values occur only once. The second output contains all rows whose tested values occur more than once.
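The split into two outputs can be illustrated with a minimal Python sketch. This is not the node's implementation; the function name `split_by_occurrence` and the sample rows are hypothetical, and the sketch assumes rows are plain dictionaries compared on the listed fields.

```python
from collections import Counter


def split_by_occurrence(rows, fields):
    """Partition rows into (single, multiple) by the values in `fields`.

    Hypothetical helper for illustration only.
    """
    # Build one comparison key per row from the chosen fields.
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    # Rows whose key occurs exactly once go to the first list,
    # every row whose key occurs more than once goes to the second.
    single = [row for row, key in zip(rows, keys) if counts[key] == 1]
    multiple = [row for row, key in zip(rows, keys) if counts[key] > 1]
    return single, multiple


rows = [
    {"id": 1, "status": "open"},
    {"id": 2, "status": "open"},
    {"id": 1, "status": "closed"},
]
single, multiple = split_by_occurrence(rows, ["id"])
print(single)    # [{'id': 2, 'status': 'open'}]
print(multiple)  # [{'id': 1, 'status': 'open'}, {'id': 1, 'status': 'closed'}]
```

Note that every row in a duplicated group lands in the second list, including the first occurrence, which matches the description of the outputs above.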

Note: This node has been superseded by the Duplicate Detection node, which provides similar functionality but is implemented in Python rather than Data360 Analyze Script. The Duplicate Detection (Superseded) node is provided for backward compatibility; where possible, use the new Duplicate Detection node.

To remove duplicates, you can use the Remove Duplicates node.

Properties

InputExpr

Specify the expression, typically one or more field names, whose values are tested for duplicates.

Detecting rows that have duplicate values in a single field:

id

Detecting rows that have duplicate values across multiple fields:

id, 'type', status

Note that you may need to surround a field name in single quotes if it is also a reserved keyword in Data360 Analyze Script, as with 'type' in the example above.
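The difference between testing one field and testing several can be sketched in Python. The sketch assumes that multiple fields are compared as a combination, so a row counts as a duplicate only when all listed values match another row; the function name `duplicated_keys` and the sample data are hypothetical. The single quotes around 'type' are specific to Data360 Analyze Script and are not needed here.

```python
from collections import Counter


def duplicated_keys(rows, fields):
    """Return the sorted field-value combinations that occur more than once.

    Hypothetical helper; assumes fields are compared as a combined key.
    """
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


rows = [
    {"id": 1, "type": "A", "status": "open"},
    {"id": 1, "type": "B", "status": "open"},
    {"id": 2, "type": "A", "status": "open"},
    {"id": 2, "type": "A", "status": "open"},
]

# Single field: both id values repeat.
print(duplicated_keys(rows, ["id"]))                    # [(1,), (2,)]
# Multiple fields: only the rows that match on all three values repeat.
print(duplicated_keys(rows, ["id", "type", "status"]))  # [(2, 'A', 'open')]
```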

A value is required for this property.

ErrorIfDuplicates

Optionally specify whether to generate an error if any duplicates are detected.

The default value is True.
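The effect of this property can be sketched in Python as a flag that turns a detected duplicate into a failure. This is an illustration only: the exception class `DuplicatesFoundError`, the function `check_duplicates`, and its parameter name are hypothetical, and the sketch does not show whether the node still writes its outputs before the error is raised.

```python
from collections import Counter


class DuplicatesFoundError(Exception):
    """Hypothetical error type; not part of Data360 Analyze."""


def check_duplicates(values, error_if_duplicates=True):
    """Return the duplicated values, or raise if the flag is set.

    The default of True mirrors the documented default of the property.
    """
    counts = Counter(values)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    if duplicates and error_if_duplicates:
        raise DuplicatesFoundError(f"duplicate values detected: {duplicates}")
    return duplicates


# With the flag off, duplicates are reported without failing.
print(check_duplicates([1, 2, 2, 3], error_if_duplicates=False))  # [2]

# With the default setting, the same input raises an error.
try:
    check_duplicates([1, 2, 2, 3])
except DuplicatesFoundError as exc:
    print(f"failed: {exc}")
```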

Inputs and outputs

Inputs: Input to Validate.

Outputs: single occurrence, multiple occurrence.