Nodes are the building blocks of your data flow. A node is an individual piece of functionality that typically does one of the following:
- Brings data into your flow from a file or another system.
- Transforms or performs analysis on the data at a particular point in the data flow.
- Publishes data out from your flow to a file or another system.
The Nodes panel is displayed on the left-hand side of the screen in the Data Flow Designer and presents the nodes that are available for building your data flow.
Nodes are organized into several categories, depending on the type of functionality they offer:
- Input Connectors - importing data from files or other systems.
- Aggregation and Transformation - data manipulations, such as sorting, filtering, removing duplicates and summation.
- Correlation - combining data sets on common keys.
- Interfaces and Adapters - programmatic interfaces to other systems, such as databases, ERP systems, HTTP or R.
- Logistics - control functions, such as switches, looping and dependency enforcement.
- Metadata and Structure - manipulations of fields/columns and data flow structuring tools.
- Profiling and Patterns - data analysis and predictive analytics tools.
- Output Connectors - exporting/publishing data to files or other systems.
Categories that you rarely or never use can be collapsed so that they take up almost no room in the panel.
By default, the panel displays a small subset of the available nodes. This subset is called Favorites, and consists of a few of the most commonly used nodes. It is ideal for new users who are just getting started, and substantial data flows can be created from this subset alone. Once you are more practiced at creating data flows, need to interact with more external systems to import or publish data, or are simply curious about what else is available, choose All Nodes from the drop-down list to see every node available for creating your data flow.
Searching for nodes
You can search for nodes by name or by keyword, using the Search field at the top of the Nodes panel.
If you know the name of the node you need, you can start typing the name into the Search field. As you type, the nodes that are displayed are filtered by your search term. For example, you can type 'mongo' to see which MongoDB nodes are available, or type 'sort' to quickly narrow down the list to locate the Sort node.
If you don't know the name of the node you are looking for, you can still use the Search field to find it by entering possible keywords. Each node can have multiple keywords associated with it, and the displayed nodes are filtered as you enter a search keyword. For example, if you type 'merge', the complete set of Correlation nodes is shown.
When the Nodes panel is set to display only Favorites and you type into the Search field, the results include a link to matches found in the All Nodes category.
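The search behavior described above can be pictured with a short Python sketch. It is only an illustration, not the product's implementation: the `Node` class, the `search` function, the substring matching rule, and every node name and keyword in the catalog other than Sort are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an entry in the Nodes panel."""
    name: str
    keywords: list = field(default_factory=list)
    favorite: bool = False


def matches(node, term):
    """True if the term occurs in the node name or in any of its keywords."""
    t = term.strip().lower()
    if not t:
        return True  # an empty search field filters nothing out
    return t in node.name.lower() or any(t in k.lower() for k in node.keywords)


def search(nodes, term, favorites_only=False):
    """Return (nodes to display, number of further matches in All Nodes)."""
    hits = [n for n in nodes if matches(n, term)]
    if not favorites_only:
        return hits, 0
    shown = [n for n in hits if n.favorite]
    return shown, len(hits) - len(shown)


# Illustrative catalog; names and keywords are made up for this sketch.
CATALOG = [
    Node("Sort", ["order", "rank"], favorite=True),
    Node("Filter", ["where", "select rows"], favorite=True),
    Node("Join", ["merge", "combine"]),
    Node("Lookup", ["merge", "match"]),
    Node("MongoDB Find", ["nosql"]),
    Node("MongoDB Insert", ["nosql"]),
]

shown, more = search(CATALOG, "mongo", favorites_only=True)
print([n.name for n in shown], more)
```

In this sketch, the second return value plays the role of the link to matches in All Nodes: when it is greater than zero, the Favorites view has matches it is not showing.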
Once you have found the node that you want to use in the data flow, drag it onto the data flow canvas to add it to your application. For more details, please see Placing and connecting nodes.
Showing experimental, deprecated and superseded nodes
From the Nodes panel menu you can choose whether to show:
- Experimental nodes - new nodes which are not yet fully supported and may be subject to change in future versions.
- Deprecated nodes - not recommended for use in new data flows, as support is likely to be withdrawn in the next major release of the software.
- Superseded nodes - provided for backwards compatibility, but not recommended for use with new data flows. See Superseded nodes.
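The effect of these three menu options can be sketched as a simple visibility filter. This is a hedged illustration only: the `Status` enum, the `visible_nodes` function, the sample node names, and the assumption that all three options start switched off are hypothetical and are not taken from the product.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical lifecycle states for a node."""
    SUPPORTED = "supported"
    EXPERIMENTAL = "experimental"
    DEPRECATED = "deprecated"
    SUPERSEDED = "superseded"


def visible_nodes(nodes, show_experimental=False, show_deprecated=False,
                  show_superseded=False):
    """Return the names of nodes that pass the current menu settings.

    Assumption: fully supported nodes are always listed, and each other
    group appears only when its option is switched on.
    """
    allowed = {Status.SUPPORTED}
    if show_experimental:
        allowed.add(Status.EXPERIMENTAL)
    if show_deprecated:
        allowed.add(Status.DEPRECATED)
    if show_superseded:
        allowed.add(Status.SUPERSEDED)
    return [name for name, status in nodes if status in allowed]


# Illustrative data; the node names are made up for this sketch.
NODES = [
    ("Sort", Status.SUPPORTED),
    ("Forecast Preview", Status.EXPERIMENTAL),
    ("Old Export", Status.DEPRECATED),
    ("Legacy Join", Status.SUPERSEDED),
]

print(visible_nodes(NODES))
print(visible_nodes(NODES, show_experimental=True))
```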