Hash Split

Data360 Analyze Server Help

Product: Data360 Analyze. Version: Latest. First published 2016; last published 2025-02-20. Copyright 2025.

Splits the input record set into multiple streams to allow parallel processing of subsets of the input.

The node takes one input and a list of field names, and splits the input across any number of outputs by hashing the values in the specified fields. This lets a set of downstream nodes each process a smaller subset of the records at the same time, instead of a single node processing one very large data set.
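The splitting behavior described above can be illustrated with a minimal Python sketch. This is not the node's implementation: the function name `hash_split`, the use of CRC32 as the hash, and the dictionary record layout are all assumptions made for illustration.

```python
import zlib

def hash_split(records, split_fields, num_outputs):
    """Distribute records across num_outputs lists by hashing the split fields.

    Hypothetical helper for illustration only. The node's actual hash
    function is not documented here, so CRC32 stands in for it.
    """
    outputs = [[] for _ in range(num_outputs)]
    for record in records:
        # Build one key string from the values of the chosen fields.
        key = "\x1f".join(str(record[field]) for field in split_fields)
        # CRC32 gives the same result on every run, so the split is repeatable.
        index = zlib.crc32(key.encode("utf-8")) % num_outputs
        outputs[index].append(record)
    return outputs

# Example data (made up): 1000 records split across 3 outputs.
records = [{"customer_id": i, "amount": i * 10} for i in range(1000)]
outputs = hash_split(records, ["customer_id"], 3)
print([len(out) for out in outputs])
```

Every record lands in exactly one output, and records with equal values in the split fields always land in the same output.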

If you use Hash Split on two streams that you intend to combine in a Join operation, you must set the SplitFields property. Otherwise, the results will be inconsistent across the streams or non-deterministic.
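The following Python sketch shows why this matters. It assumes that both streams are split on the join key, with the same hash and the same number of outputs. The `bucket` helper, the CRC32 hash, and the sample data are hypothetical and are not taken from the product.

```python
import zlib

def bucket(record, split_fields, num_outputs):
    # Hypothetical stand-in for the node's hash: CRC32 over the field values.
    key = "\x1f".join(str(record[field]) for field in split_fields)
    return zlib.crc32(key.encode("utf-8")) % num_outputs

def split(records, split_fields, num_outputs):
    parts = [[] for _ in range(num_outputs)]
    for record in records:
        parts[bucket(record, split_fields, num_outputs)].append(record)
    return parts

NUM_OUTPUTS = 3
orders = [{"customer_id": i % 50, "order_id": i} for i in range(200)]
customers = [{"customer_id": i, "name": f"customer-{i}"} for i in range(50)]

# Both streams are split on the same field, so matching keys share an index.
order_parts = split(orders, ["customer_id"], NUM_OUTPUTS)
customer_parts = split(customers, ["customer_id"], NUM_OUTPUTS)

# Join each pair of partitions independently, as parallel Join nodes would.
joined = []
for order_part, customer_part in zip(order_parts, customer_parts):
    by_id = {c["customer_id"]: c for c in customer_part}
    for order in order_part:
        match = by_id.get(order["customer_id"])
        if match is not None:
            joined.append({**order, **match})

print(len(joined))  # 200: every order finds its customer in the paired partition
```

If the two streams were split on different fields, or without a defined key, an order and its customer could land in differently numbered outputs and the per-partition joins would miss matches.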

Tip: For optimal performance, the number of outputs should be odd, preferably a prime number.
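The help text does not state the reason for this tip. One general property of hash partitioning that may be relevant: when key values share a common factor with the number of outputs, a simple modulo-based assignment fills only some of the outputs. The sketch below demonstrates that effect with plain modulo as a deliberately weak, hypothetical hash. Whether the node behaves this way internally is an assumption.

```python
from collections import Counter

def bucket_counts(keys, num_outputs):
    # Plain modulo stands in for a weak hash (illustrative assumption).
    counts = Counter(key % num_outputs for key in keys)
    return [counts.get(i, 0) for i in range(num_outputs)]

keys = range(0, 4000, 4)  # 1000 keys, all multiples of 4

print(bucket_counts(keys, 4))  # [1000, 0, 0, 0]
print(bucket_counts(keys, 5))  # [200, 200, 200, 200, 200]
```

With four outputs, every key falls into the first output. With five outputs, a prime that shares no factor with the key stride, the records spread evenly.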

Properties

SplitFields

Specify a comma-separated list of fields on which to hash.

Inputs and outputs

Inputs: in1.

Outputs: out1, out2, and multiple optional outputs.