Lookup (Deprecated) - Data360_Analyze - Latest

Data360 Analyze Server Help

Product type
Software
Portfolio
Verify
Product family
Data360
Product
Data360 Analyze
Version
Latest
Language
English
Product name
Data360 Analyze
Title
Data360 Analyze Server Help
Copyright
2024
First publish date
2016
Last updated
2024-11-28
Published on
2024-11-28T15:26:57.181000

This deprecated node enriches a large data set (left table) with a small data set (right table) by joining the two data sets based on common fields.

CAUTION:
This node has been deprecated and will not be supported in a future release. As an alternative, the Lookup node can be used to provide similar functionality, but the underlying code is Python rather than Data360 Analyze Script.

The Lookup node works in a similar way to the Join node and the Merge node and is an optimization of the inner and left joins.

An advantage of using the Lookup node is that the input data sets do not need to be sorted prior to using the node; however, the entire data set from the right table is loaded into memory, therefore, it is only recommended for use with a small data set on the right input. If you want to join two large data sets, it is recommend that you use the Merge node.

To merge two data sets using the Lookup node, in the InputKey property, specify the field names from the first (main) input data set that you want to base the join on. Then, in the LookupKey property, specify the input fields from the second (smaller) data set that you want to base the join on.

You can also use the Lookup node to join a single record (containing potentially multiple fields) to all the records in the other data set. In this case, enter 1 or true in both the InputKey and LookupKey properties.

Note: The data types of the input fields specified in the InputKey and LookupKey properties must match. For example, if the InputKey references a string input field, then the LookupKey must also reference an input field that contains string data. If the data types do not match, one of the fields must be converted prior to using the Lookup node.

For advanced users, it is also possible to define additional filtering criteria to be applied by the Lookup node while the data sets are merged; you can do this by entering Data360 Analyze Script in the Script property. See the Transform (Deprecated) or Split (Deprecated) node help topics for more information on the types of Script statements that can be used to refine your output.

Note: If there are any input fields that appear in both data sets, other than the fields that are specified in the InputKey and LookupKey properties, the node will fail. To avoid this error, you can specify additional Script in the Script property. For example, you may want to exclude the duplicate fields from the second data set.

In the output, if a left orphan is found, that is, if a record is not found in the lookup (right) data set, the right row contains a NULLvalue. When a match is found, both sides contain data.

Example

You have the following main data set containing a list of customers and locations:

Townstring Customer_Namestring
Liverpool Derek
Leeds Simon
Sheffield Alison
Chester David
Liverpool Emma
Liverpool Jane
Chester Mark
Sheffield Holly
Sheffield Stephen
Liverpool Wendy
Leeds Adrian

You want to enrich this data set with the name of the sales representative that is responsible for each customer:

townstring sales_repstring
Liverpool Jon
Chester Adam
Sheffield Paul
Leeds Claire

In the InputKey property, type:

Town

In the LookupKey property, type:

town

The output contains the following merged data set showing the areas that each sales rep are responsible for:

Townstring Customer_Namestring sales_repstring
Liverpool Derek Jon
Leeds Simon Claire
Sheffield Alison Paul
Chester David Adam
Liverpool Emma Jon
Liverpool Jane Jon
Chester Mark Adam
Sheffield Holly Paul
Sheffield Stephan Paul
Liverpool Wendy Jon
Leeds Adrian Claire

Properties

InputKey

Specify an expression or field name to determine the names of the input fields from the first (main) input data set that you want to base the join on.

A value is required for this property.

LookupKey

Specify an expression or field name to determine the names of the input fields from the lookup (smaller) data set that you want to base the join on.

A value is required for this property.

Script

Optionally specify Script to define additional filtering criteria to be applied by the Lookup node while the data sets are merged. The default code in the Script property outputs all records from the left input along with any matching fields from the right input.

The Script property is evaluated once for every row of data, regardless of whether or not a match is found.

If you only want to output fields that are found in both data sets, uncomment the line that states "where matchIsFound" (the matchIsFound variable is only true when a row in the main data set matches a row in the lookup data set).

DuplicateKeyBehavior

Optionally specify what to do if there are duplicate records in the lookup (right) data set, based on the values specified in the LookupKey property. Choose from:

  • Error - The node fails when it encounters a duplicate value.
  • Log - A warning is logged for each duplicate value. The lookup row that is emitted is the first match found.
  • Ignore - The first matching lookup value is used and no warning or error is logged.

The default value is Error.

Inputs and outputs

Inputs: data, lookup.

Outputs: out1, multiple optional.