Splits the input record set into multiple streams to allow parallel processing of subsets of the input.
Takes one input and a list of field names, and splits the input across any number of outputs by hashing the values in the specified fields. This lets several downstream nodes run at the same time, each on a smaller share of the data, instead of one node processing a very large data set.
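The splitting behavior can be illustrated with a minimal Python sketch. This is not the node's implementation: the function name hash_split, the use of MD5, the field separator, and the sample records are all assumptions made for illustration. It only shows the general technique of hashing the chosen field values and using the result to pick an output.

```python
import hashlib


def hash_split(records, split_fields, num_outputs):
    """Partition records (dicts) into num_outputs lists by hashing split_fields.

    Hypothetical helper: the real node's hash function is not documented here.
    """
    outputs = [[] for _ in range(num_outputs)]
    for record in records:
        # Join the split-field values in order, using a separator that is
        # unlikely to appear in the data, to form the hash key.
        key = "\x1f".join(str(record[field]) for field in split_fields)
        digest = hashlib.md5(key.encode("utf-8")).digest()
        # Map the first 8 bytes of the digest onto an output index.
        index = int.from_bytes(digest[:8], "big") % num_outputs
        outputs[index].append(record)
    return outputs


# Sample data (made up for this example).
records = [
    {"customer": "A", "region": "east", "amount": 10},
    {"customer": "B", "region": "west", "amount": 25},
    {"customer": "A", "region": "east", "amount": 7},
    {"customer": "C", "region": "east", "amount": 3},
    {"customer": "B", "region": "west", "amount": 40},
]

# A comma-separated field list, parsed the way a SplitFields value might be.
split_fields = [name.strip() for name in "customer, region".split(",")]

outputs = hash_split(records, split_fields, 3)
for number, output in enumerate(outputs, start=1):
    print(f"out{number}: {len(output)} record(s)")
```

Because the output index depends only on the values of the split fields, records that share those values land on the same output in this sketch, which is what makes per-key processing downstream safe to run in parallel.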
Tip: For optimal performance, the number of outputs should be odd, preferably a prime number.
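One possible reason for this tip can be sketched in Python. The example assumes the simplest scheme, where an integer key is reduced modulo the number of outputs; the node's actual hash function may behave differently, and the helper name bucket_counts and the key values are hypothetical. Under that assumption, keys that share a common factor with the output count pile up on a few outputs, while a prime output count spreads them evenly.

```python
from collections import Counter


def bucket_counts(keys, num_outputs):
    """Count how many keys fall on each output under a plain modulo scheme.

    Assumption: output index = key % num_outputs (not the node's real hash).
    """
    counts = Counter(key % num_outputs for key in keys)
    return [counts.get(index, 0) for index in range(num_outputs)]


# 100 made-up keys that are all multiples of 4 (0, 4, 8, ..., 396).
keys = range(0, 400, 4)

even_counts = bucket_counts(keys, 4)   # every key lands on the first output
prime_counts = bucket_counts(keys, 5)  # keys spread evenly over five outputs

print("4 outputs:", even_counts)   # 4 outputs: [100, 0, 0, 0]
print("5 outputs:", prime_counts)  # 5 outputs: [20, 20, 20, 20, 20]
```

With four outputs, three of the downstream nodes receive nothing and the parallelism is lost; with five, each receives an equal share.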
Properties
SplitFields
Specify a comma-separated list of fields on which to hash.
Inputs and outputs
Inputs: in1.
Outputs: out1, out2, plus any number of optional additional outputs.