You can use the Uniqueness Check node to identify unique records in your data.
- Click the Add button to add a new check.
- Select the input field that you want to analyze.
- Select a match type. You must select Exact Match for the first check that you add. For any subsequent checks, you can choose Match by Expression instead:
  - Exact Match - The node checks for identical values in the selected field.
  - Match by Expression - The node uses the specified expression to identify matches. You can write an expression that allows tolerance-based matching for numeric fields, or one that is based on fuzzy matching for string fields.
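To illustrate what tolerance-based and fuzzy matching mean in practice, the following is a minimal Python sketch. It is not the node's expression syntax, which is not shown here. The function names `within_tolerance` and `fuzzy_equal`, the tolerance of 50, and the similarity threshold of 0.8 are all hypothetical choices made for illustration only.

```python
from difflib import SequenceMatcher


def within_tolerance(a, b, tolerance):
    """Return True when two numbers differ by no more than `tolerance`."""
    return abs(a - b) <= tolerance


def fuzzy_equal(a, b, threshold=0.8):
    """Return True when two strings are similar enough (case-insensitive).

    Similarity is the SequenceMatcher ratio, a value between 0.0 and 1.0.
    The 0.8 threshold is an assumption, not a value taken from the node.
    """
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold


# Numeric fields: 15000 and 15040 match with a tolerance of 50.
print(within_tolerance(15000, 15040, 50))   # True
print(within_tolerance(15000, 15100, 50))   # False

# String fields: a small spelling difference still counts as a match.
print(fuzzy_equal("John Smith", "Jon Smith"))  # True
print(fuzzy_equal("abc", "xyz"))               # False
```

With exact matching, none of these pairs would be treated as duplicates. An expression-based check lets you decide how close two values must be before they count as a match.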
The Uniqueness Check node outputs all fields that were input to the node, as well as the following two additional fields:
- IsUnique - Displays True if the record is unique, or displays False if the value is duplicated in another record.
- GroupId - An ID is assigned to each value that is duplicated. All records with the same value are assigned the same ID.
Example
You have the following loan data and want to identify any records that have a unique combination of values in the loan_amount and term fields:
member_id | loan_amount | term |
---|---|---|
60952255 | 30000 | 60 months |
60842012 | 30000 | 60 months |
60940830 | 15000 | 60 months |
60740622 | 15000 | 60 months |
60762118 | 15000 | 36 months |
60970398 | 18000 | 60 months |
60607015 | 12000 | 60 months |
60566817 | 12000 | 60 months |
- Click the Add button to add a new uniqueness check.
- Choose loan_amount as the Field to check, and select Exact Match.
- Add a second check, this time selecting term as the Field to check, and choose Exact Match.
The node identifies that there are two unique records based on the specified check criteria, as follows:
member_id | loan_amount | term | IsUnique | GroupId |
---|---|---|---|---|
60952255 | 30000 | 60 months | False | c9f9315710d8413f98b30728f9442dad |
60842012 | 30000 | 60 months | False | c9f9315710d8413f98b30728f9442dad |
60940830 | 15000 | 60 months | False | e0422e3ecd9410b8e3c33514b24f9a1 |
60740622 | 15000 | 60 months | False | e0422e3ecd9410b8e3c33514b24f9a1 |
60762118 | 15000 | 36 months | True | |
60970398 | 18000 | 60 months | True | |
60607015 | 12000 | 60 months | False | 2568a10d2846428ab682760c804aca88 |
60566817 | 12000 | 60 months | False | 2568a10d2846428ab682760c804aca88 |
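The result above can be reproduced outside the node with a short Python sketch. It is an approximation of the behavior described here, not the node's implementation. The function name `uniqueness_check` is hypothetical, and generating each GroupId with `uuid.uuid4().hex` is an assumption based only on the format of the IDs in the table.

```python
import uuid
from collections import defaultdict


def uniqueness_check(records, fields):
    """Add IsUnique and GroupId to each record, using exact matching on `fields`."""
    # Group record positions by the combination of values in the checked fields.
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(record[field] for field in fields)
        groups[key].append(index)

    results = [dict(record) for record in records]  # copy; leave the input untouched
    for indexes in groups.values():
        duplicated = len(indexes) > 1
        # Assumption: duplicates share a random 32-character hex ID,
        # and unique records get an empty GroupId.
        group_id = uuid.uuid4().hex if duplicated else ""
        for i in indexes:
            results[i]["IsUnique"] = not duplicated
            results[i]["GroupId"] = group_id
    return results


loans = [
    {"member_id": 60952255, "loan_amount": 30000, "term": "60 months"},
    {"member_id": 60842012, "loan_amount": 30000, "term": "60 months"},
    {"member_id": 60940830, "loan_amount": 15000, "term": "60 months"},
    {"member_id": 60740622, "loan_amount": 15000, "term": "60 months"},
    {"member_id": 60762118, "loan_amount": 15000, "term": "36 months"},
    {"member_id": 60970398, "loan_amount": 18000, "term": "60 months"},
    {"member_id": 60607015, "loan_amount": 12000, "term": "60 months"},
    {"member_id": 60566817, "loan_amount": 12000, "term": "60 months"},
]

checked = uniqueness_check(loans, ["loan_amount", "term"])
unique_ids = [r["member_id"] for r in checked if r["IsUnique"]]
print(unique_ids)  # [60762118, 60970398]
```

A record counts as a duplicate only when it matches another record on every checked field. That is why 60762118 is unique even though its loan_amount of 15000 appears in other records. The actual GroupId values differ on every run because they are randomly generated.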
Properties
Display Name
Specify a name for the node.
The default value is Uniqueness Check.
Uniqueness Checks
Field to check
Select a field to check for unique records.
Exact Match
Choose Exact Match if you want to identify records that have the same value.
Match by Expression
Choose Match by Expression if you want to enter an expression on which to base the match. Click the Edit button to open the expression editor.