Duplicate Detection (Deprecated) - Data360_Analyze - Latest

Data360 Analyze Server Help

Product: Data360 Analyze (Data360 product family)
Version: Latest
Language: English
Copyright: 2024
First publish date: 2016
Last updated: 2024-11-28

This deprecated node detects duplicate data within specified fields, segregating the data into two outputs.

CAUTION:
This node is deprecated and will not be supported in a future release. Use the Duplicate Detection node instead. It provides similar functionality, but its underlying code is Python rather than Data360 Analyze Script.

The first output contains all rows whose tested value occurs only once in the input. The second output contains all rows whose tested value occurs more than once, that is, every row involved in a duplicate.
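The split between the two outputs can be pictured with a short Python sketch. This is only an illustration of the behavior described above, not the node's implementation; the function name `split_by_duplicates` and the sample rows are hypothetical.

```python
from collections import Counter


def split_by_duplicates(rows, key_fields):
    """Partition rows into (single_occurrence, multiple_occurrence).

    Hypothetical helper: rows are dicts, key_fields names the fields to test.
    """
    # Build the comparison key for each row from the chosen fields.
    keys = [tuple(row[field] for field in key_fields) for row in rows]
    counts = Counter(keys)

    # Rows whose key appears exactly once go to the first output.
    single = [row for row, key in zip(rows, keys) if counts[key] == 1]
    # Every row whose key appears more than once goes to the second output.
    multiple = [row for row, key in zip(rows, keys) if counts[key] > 1]
    return single, multiple


# Sample data (made up for illustration).
rows = [
    {"id": 1, "type": "a", "status": "open"},
    {"id": 2, "type": "b", "status": "open"},
    {"id": 1, "type": "a", "status": "closed"},
]

single, multiple = split_by_duplicates(rows, ["id"])
```

With this sample data, testing on `id` alone sends the row with `id` 2 to the first list and both rows with `id` 1 to the second. Testing on `id`, `type`, and `status` together would place all three rows in the first list, because no combination of the three values repeats.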

To remove duplicates, you can use the Remove Duplicates node.

Properties

InputExpr

Specify the expression to test for duplicates.

Detecting rows that have duplicate values in a single field:

id

Detecting rows that have duplicate values across multiple fields:

id, 'type', status

Note that you may need to surround a field name in single quotes if it is also a reserved keyword in Data360 Analyze Script.

A value is required for this property.

ErrorIfDuplicates

Optionally specify whether to generate an error if any duplicates are detected.

The default value is True.
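As a rough sketch of what this option implies, the following Python function raises an error when any duplicated key is found and the flag is enabled. The function name `detect_duplicates`, its parameters, and the error type are assumptions made for illustration; how the node itself reports the error, and whether it still writes its outputs first, is not stated here.

```python
from collections import Counter


def detect_duplicates(rows, key_fields, error_if_duplicates=True):
    """Return the duplicated keys, or raise if the flag is set (hypothetical helper)."""
    # Count how often each combination of key field values occurs.
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    # Keep the keys seen more than once, in first-seen order.
    duplicated = [key for key, n in counts.items() if n > 1]

    # Mirror the ErrorIfDuplicates idea: fail when duplicates exist and the flag is on.
    if duplicated and error_if_duplicates:
        raise ValueError(f"{len(duplicated)} duplicated key(s) found")
    return duplicated
```

Calling the function with `error_if_duplicates=False` returns the duplicated keys without raising, which corresponds to setting the property to False so that downstream processing can continue.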

Inputs and outputs

Inputs: Input to Validate.

Outputs: single occurrence, multiple occurrence.