Data flows are built by connecting nodes that import, transform, and then export or publish data to files or external systems.
Nodes which import data from files or other systems fetch data and then make it available on their output pin. These output pins are then connected to the input pins of downstream nodes which manipulate or analyze the data that they receive. The output pins of these nodes then pass on this transformed data to further nodes, building chains of transformations. You can then use publishing nodes to export the final data back out to files or an external system.
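As a rough conceptual sketch of this chain (not the product's API), plain Python functions can stand in for nodes, with each return value playing the role of an output pin. All function names and the sample rows below are hypothetical and chosen only for illustration.

```python
# Conceptual sketch only: plain functions stand in for nodes.
# None of these names come from the product.

def import_node():
    """Stand-in for an import node: makes rows available on its output."""
    return [{"name": "a", "value": 1}, {"name": "b", "value": 2}]


def transform_node(rows):
    """Stand-in for a transformation node: doubles each value."""
    return [{**row, "value": row["value"] * 2} for row in rows]


def publish_node(rows):
    """Stand-in for a publishing node: renders the rows as CSV text."""
    lines = ["name,value"]
    lines += [f"{row['name']},{row['value']}" for row in rows]
    return "\n".join(lines)


# Connecting an output pin to an input pin corresponds to passing one
# function's result to the next function in the chain.
csv_text = publish_node(transform_node(import_node()))
print(csv_text)
```

Each function only sees what the previous one hands it, which mirrors how a downstream node only sees the data arriving on its input pin.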
To build your data flow:
- Drag and drop nodes from the Nodes panel onto the data flow canvas. See Browsing for nodes for more details.
- To connect nodes, drag from the output pin of one node to the input pin of another. Note: An input pin can only receive data from one output pin.
If a node requires a valid input connection to enable it to run, an orange circle is displayed within the required input pin until an enabled input is connected. If a node can receive additional optional inputs, these are indicated by a dashed circle. In the following example, the Cat node has a required input pin and an optional input pin:
See Required input pins for more details.
- To disconnect a node, drag from its input pin and release without choosing another pin to connect to.
- Continue to build your data flow by adding more nodes to the canvas and connecting them.
- As you add nodes, configure their properties to determine how they import, export or manipulate data. See Configuring nodes for more details.
- You can rename nodes directly from the canvas by selecting the node name text below the node and typing a new name.
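The connection rule noted in the steps above (an input pin can only receive data from one output pin) can be modeled with a small lookup keyed by input pin. This is an illustrative sketch, not how the product stores connections: the `Flow` class, its methods, and the pin names are hypothetical. The sketch also assumes that a second connection attempt is rejected, and that one output pin may feed several input pins. Neither behavior is described in this section.

```python
class Flow:
    """Toy model of a data flow's connections (hypothetical, for illustration)."""

    def __init__(self):
        # Maps (node name, input pin) -> (node name, output pin).
        # Keying by input pin enforces "one source per input pin".
        self.connections = {}

    def connect(self, source, target):
        """Connect an output pin (source) to an input pin (target)."""
        if target in self.connections:
            # Assumption: this sketch rejects the second connection.
            raise ValueError(f"Input pin {target} is already connected")
        self.connections[target] = source

    def disconnect(self, target):
        """Remove whatever is connected to the given input pin, if anything."""
        self.connections.pop(target, None)


flow = Flow()
flow.connect(("Import", "out1"), ("Cat", "in1"))
flow.connect(("Import", "out1"), ("Cat", "in2"))  # same output, different input
```

Disconnecting `("Cat", "in1")` frees that input pin so that a different output pin can be connected to it.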
Users with the Administrator role can grant or revoke permission for you to use certain nodes. Nodes for which you do not have Read permission are not shown in the Nodes panel. If instances of these nodes occur in any data flows that you have access to, they are displayed as unresolved nodes.
You can only run library nodes for which you have Execute permission. If you run a data flow that contains a library node for which you do not have Execute permission, the library node generates an error stating that you do not have the required permission to execute it.
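The effect of the Read and Execute permissions can be summarized in a short sketch. It only mirrors the behavior described above. The function names, the use of plain sets to hold permissions, and the error text are hypothetical and do not reflect how the product stores or checks permissions.

```python
def nodes_panel(all_nodes, read_permitted):
    """Nodes without Read permission are not listed in the Nodes panel."""
    return [node for node in all_nodes if node in read_permitted]


def display_in_flow(node, read_permitted):
    """An instance of a node without Read permission appears as unresolved."""
    return "normal" if node in read_permitted else "unresolved"


def run_library_node(node, execute_permitted):
    """Running a library node without Execute permission produces an error."""
    if node not in execute_permitted:
        raise PermissionError(
            f"You do not have permission to execute library node '{node}'"
        )
    return f"{node}: completed"


# Hypothetical node names and permission sets.
all_nodes = ["Cat", "Transform", "Custom Library Node"]
read_permitted = {"Cat", "Transform"}
execute_permitted = {"Cat"}

visible = nodes_panel(all_nodes, read_permitted)
```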
Required input pins
If a node requires a valid input connection to enable it to run, an orange circle is displayed within the required pin until an enabled input is connected. You will see the orange required pin status in the following scenarios:
- If a node has a required input pin that is missing a connection. For example, the third input pin on the Cat node needs to be connected to a valid input before the node can run:
To help you trace the missing input, all downstream nodes ("Transform" and "Transform 2" in this example) also display the required status until a valid input is connected.
- If a node has a required pin that is connected to a disabled node. For example, the Transform node has been disabled, resulting in the required pin status on the Cat node:
Even if the disabled node had previously completed a successful run, the required status is displayed until the input is enabled.
- If a node within a composite has a required pin that has a missing or disabled input, the composite node status will indicate that the node has partially completed its run. In this case, open the composite to locate and resolve the issue.
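The first two scenarios above can be sketched as a small recursive check: a node shows the required status if one of its required pins is unconnected or fed by a disabled node, or if any upstream node shows the status. This is a simplified model under stated assumptions, not the product's implementation. The `Node` class, the pin names, and the "Import" node names are hypothetical, and only required pins are modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    name: str
    enabled: bool = True
    # Required input pins: pin name -> upstream Node, or None if unconnected.
    required_inputs: Dict[str, Optional["Node"]] = field(default_factory=dict)


def shows_required_status(node: Node) -> bool:
    """Return True if the node would display the orange required status."""
    for upstream in node.required_inputs.values():
        # Scenario 1: missing connection. Scenario 2: disabled input.
        if upstream is None or not upstream.enabled:
            return True
        # The status is passed on to all downstream nodes.
        if shows_required_status(upstream):
            return True
    return False


import_a = Node("Import A")
import_b = Node("Import B")
# The third required pin on Cat has no connection yet.
cat = Node("Cat", required_inputs={"in1": import_a, "in2": import_b, "in3": None})
transform = Node("Transform", required_inputs={"in1": cat})
transform_2 = Node("Transform 2", required_inputs={"in1": transform})

flagged = [n.name for n in (cat, transform, transform_2) if shows_required_status(n)]
print(flagged)
```

Connecting the third pin on Cat to an enabled node clears the status on all three nodes. Disabling any node that feeds a required pin brings it back.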
You can use the Go to Linked Item menu option to navigate between nodes to locate a required pin. This can be particularly useful when working with large, complex data flows where nodes have been grouped within composites. See Navigating between nodes.