Record Joiner performs a SQL-style JOIN
operation to combine records from
different streams based on a relationship between fields in the streams. You can use
Record Joiner to join records from multiple files,
multiple databases, or any upstream channels in the flow. You must connect at least
two input channels to Record Joiner. The results of the
JOIN
operation are then written to one output channel.
Optionally, records that do not match the join condition can be written to a
separate output channel.
Using Record Joiner
- On the Spectrum Flow Designer Home page, click New.
- On the New Flow page, click Job, Service, or Subflow, as required and then click the correcsponding blank canvas.
- Click Ok.
- In the dialog box that appears, give a name to the Flow, Job, Service, or Subflow you are creating.
- Click Ok.
- From the Palette Panel drag the Record
Joiner stage to the canvas. Note: Record Joiner is one of the Control Stages.
- Drag all Sources (of the records that are to be joined) to the canvas and connect their Output Port to the Record Joiner Input Port.
- Drag the Sink for the joined records and connect its Input Port to the Record Joiner Output Port.
- Configure the Sources. See the documentation of the respective stage for field-level details.
- Click Record Joiner and configure the Join Definition as described below.
Join Definition
|
Field Resolution
This tab specifies which port's data to use in the joined record in cases where the same field name exists in more than one input port. For example, if you are performing a join on two sources of data, and each source contains a field named DateOfBirth, you can specify which port's data to use in the DateOfBirth field in the joined record.
If there are fields of the same name but with different data, and you want to preserve both fields' data in the joined record, you must rename one of the fields before the data is sent to Record Joiner. You can use the Transformer stage to rename fields.
Handling Records That Are Not Joined
In order for a record to be included in the Record Joiner output it must meet the join condition, or a join type must be selected that returns both joined records and those that did not meet the join condition. For example, a full join will return all records from all input ports regardless of whether a record meets the join condition. In the case of a join type that does not return all records from all ports, such as a left outer join or an inner join, only records that match the join condition are included in the Record Joiner output.
To capture the records that are not included in the result of the join operation, use the not_joined output port. The output from this port contains all records that were not included in the regular output port.
Records that come out of this port have the field InputPortIndex added to them. This field contains the number of the Record Joiner input port where the record came from. This allows you to identify the source of the record.
- For optimal performance of this stage, ensure two independent streams of records are joined to generate a consolidated output.
- If a single path is first branched using either a broadcaster or conditional router then re-joined back using a Record Joiner, the flow may stop responding. In case multiple stages are used between branching and joining, use the Sorter as close to the Record Joiner as possible.
Math
The Math stage handles mathematical calculations on a single data row and allows you to conduct a variety of math functions using one or more expressions. Data is input as strings but the values must be numeric or Boolean, based on the type of operation being performed on the data.
- Under Control Stages, click the Math stage and drag it to the canvas, placing it where you want on the flow.
- Connect the stage to other stages on the canvas.
- Double-click the Math stage. The Math Options dialog box appears, with the Expressions tab open. This view shows the input fields, the Calculator, and the Expressions canvas. Alternately, you can click the Functions tab to use functions instead of the Calculator.