Executes data flows.
You can use this node to run another data flow from within your current data flow.
You can browse directly to the data flow that you want to run, or, if you want to run more than one data flow, you can connect an input node containing a list of paths to the data flows that you want to run.
When you run another data flow by using the Execute Data Flow node, a "Run State" is created. A run state contains the run properties, node run information, and data that are created when the Execute Data Flow node runs another data flow. By default, run states are saved in the Directory only if the executed data flow fails. If you also want to save run states in the Directory when a data flow executes successfully, modify the configuration of the CleanupOnSuccess property.
You can choose where to save temporary data files in the TemporaryExecutionDataLocation property. This allows you to choose a location that is accessible to other users who are collaborating with you on a data flow, so that they can also view the run data without needing to re-run the nodes.
You can open run states from the Directory for viewing in read-only mode, with an option to edit the underlying data flow. When you open a run state, the information is overlaid onto the most recent version of the data flow. This means that if you have edited the data flow since the run state was created, the execution information relates to the older version of the data flow. For example, if you have edited a node in the data flow since the run state was created, when you view the run state you will see the execution information and data for the node as it was before you edited it.
Run a single data flow
- In the DataFlow property, click the folder icon to browse to the data flow that you want to run.
- In the Choose a Data Flow dialog, select the data flow that you want to run and click Choose.
- If you want to configure run properties, you can reference a run property set that you have already defined on the system in the RunPropertySet property. For more information about run property sets, see Run property sets.
- By default, the run state that is generated by running the data flow will be saved into the same Directory folder as the current data flow. If you want to save the run state in a different location, specify where to save the run state in the OutputFolder property.
Run multiple data flows
In this example, you have an input node that contains a list of paths to the data flows that you want to run.
| Path:string |
| --- |
| //admin/Data flow 1#graph |
| //admin/Data flow 2#graph |
| //admin/Data flow 3#graph |
For more information, see Resource path examples.
- Connect your input data node to the Execute Data Flow node. For example, a Create Data node containing a list of paths to the data flows that you want to run.
- Select the Execute Data Flow node and choose the (from Field) variant of the DataFlow property.
- Type the name of the input field containing the list of paths to the data flows that you want to run, in this case "Path".
- If you want to configure run properties, you can reference a run property set that you have already defined on the system in the RunPropertySet property. For more information on run property sets, see Run property sets.
- By default, the run state that is generated by running the data flows will be saved into the same Directory folder as the current data flow. If you want to save the run states in a different location, specify where to save the run states in the OutputFolder property.
If the referenced data flows execute successfully, the Execute Data Flow node will run successfully. The node outputs run information, with one record for each data flow that was executed.
By default, the node will fail if any of the referenced data flows fail to run successfully. You can modify this behavior by using the FailedDataFlowBehavior property.
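The per-flow behavior described above can be sketched as follows. This is an illustrative Python sketch only; the function and property names (`run_data_flow`, the behavior strings) mirror the node's documented options but are not the product's real API:

```python
# Hypothetical sketch of how the node iterates over the input paths,
# emitting one output record per executed data flow. The callable
# run_data_flow is an assumed stand-in for the real execution engine.

def execute_data_flows(paths, run_data_flow,
                       failed_behavior="Error",       # FailedDataFlowBehavior
                       stop_at_first_failure=False):  # StopAtFirstFailure
    """Run each data flow and return one run-information record per flow."""
    records = []
    for path in paths:
        try:
            state = run_data_flow(path)
            records.append({"path": path, "status": "success", "state": state})
        except Exception as exc:
            records.append({"path": path, "status": "failed", "error": str(exc)})
            if failed_behavior == "Error":
                print(f"ERROR: {path} failed: {exc}")    # logged to Errors panel
            elif failed_behavior == "Log":
                print(f"WARNING: {path} failed: {exc}")  # warning only
            # "Ignore" logs nothing
            if stop_at_first_failure:
                break
    return records
```

Note how `StopAtFirstFailure` and `FailedDataFlowBehavior` interact in this sketch: the failure behavior only controls what is logged, while stopping early is a separate decision.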
Run with data driven run properties
When running multiple data flows via the Execute Data Flow node, rather than specifying a run property set to be used with each data flow, you can instead specify run properties in the incoming data set.
In the following example, the incoming data set contains two data flows to be run in the field "DataFlowName", and three other fields "myType", "myId" and "myName":
| DataFlowName | myType | myId | myName |
| --- | --- | --- | --- |
| //admin/testDataFlow#graph | tertiary | 8 | Bob |
| //admin/anotherDataFlow#graph | secondary | 5 | Bill |
When the RunPropertySet property is left blank, the node will pass each unused input field as a run property to the data flow when it is run:
- "testDataFlow" is passed the following run properties: myType=tertiary, myId=8, myName=Bob
- "anotherDataFlow" is passed the following run properties: myType=secondary, myId=5, myName=Bill
When the node processes the incoming data set, it determines whether any unused fields (fields not used in the node's properties) are classed as "unmapped fields". If any unmapped fields are found, the UnmappedFieldBehavior property specifies how the node behaves.
In this example, if "testDataFlow" uses "myType", and "anotherDataFlow" uses "myType" and "myId", but no data flows use the "myName" field, then "myName" is classed as an unmapped field.
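The mapping logic above can be sketched in Python. This is an assumed illustration of the documented behavior, not the product's implementation; the field names come from the example table:

```python
# Sketch of data-driven run properties: every input field other than the
# data flow path field becomes a run property, and a field that no
# executed data flow uses is classed as "unmapped".

def build_run_properties(record, dataflow_field="DataFlowName"):
    """Every field except the data flow path field becomes a run property."""
    return {k: v for k, v in record.items() if k != dataflow_field}

def find_unmapped_fields(records, used_by_flow, dataflow_field="DataFlowName"):
    """Return candidate run-property fields that no data flow actually uses."""
    candidates = set()
    for rec in records:
        candidates.update(build_run_properties(rec, dataflow_field))
    used = set().union(*used_by_flow.values()) if used_by_flow else set()
    return candidates - used
```

Running this against the example data set reproduces the result described above: "myName" is the only unmapped field, so UnmappedFieldBehavior decides what happens to it.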
Resource paths
On the Execute Data Flow node, you can specify resource paths as relative paths to the data flow that you are currently editing. For example:
The current data flow has the following path:
//public/myFolder/thisDataFlow
You want to reference a run property set which has this path:
//public/runPropSets/myRunPropSet
In this case, you can use a relative path, for example:
../runPropSets/myRunPropSet
This is useful, for example, when building data flows that you might want to move between a UAT and a Production system which have a similar folder structure.
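Relative path resolution of this kind can be sketched with standard POSIX path handling. This assumes resource paths resolve like POSIX paths relative to the folder containing the current data flow; the product's exact resolution rules may differ:

```python
import posixpath

# Sketch: resolve a relative resource path against the folder that
# contains the current data flow. Assumes POSIX-like path semantics.

def resolve_resource_path(current_flow_path, relative_path):
    """Resolve relative_path against the folder of current_flow_path."""
    base = posixpath.dirname(current_flow_path)
    return posixpath.normpath(posixpath.join(base, relative_path))
```

For example, resolving "../runPropSets/myRunPropSet" against "//public/myFolder/thisDataFlow" yields "//public/runPropSets/myRunPropSet", matching the example above.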
Properties
DataFlow
Specify the data flow that is to be executed by the node.
Choose the (from Field) variant of this property to specify the name of an input field containing the data flow.
RunPropertySet
Optionally specify the parent run property set to be used by the data flow.
Choose the (from Field) variant of this property to specify the name of an input field containing the run property set.
OutputFolder
Optionally specify the directory to output the execution run state to.
If you do not specify a value, the run state will be saved in the same location as the selected data flow.
Choose the (from Field) variant of this property to specify the name of an input field containing the output folder.
TemporaryExecutionDataLocation
Optionally specify the location to be used for storing temporary execution data.
If the default value is used, the data flow name and execution ID are appended to the default location as a suffix.
Choose the (from Field) variant of this property to specify the name of an input field containing the location.
GenerateSubdirectories
Optionally specify the behavior of the node when the TemporaryExecutionDataLocation property is populated.
If True is selected, each run writes its temporary data into a separate subdirectory.
If False is selected, the files from all runs are written directly into the folder specified in the TemporaryExecutionDataLocation property, without any separation.
The default value is False.
PassThroughFields
Optionally specify which input fields on the first input will "pass through" the node unchanged from the input to the output, assuming that the input exists. The input fields specified will appear on those output records which were produced as a result of the input fields. Choose from:
- All - Passes through all the input data fields to the output.
- None - Passes none of the input data fields to the output; as such, only the fields created by the node appear on the output.
- Used - Passes through all the fields that the node used to create the output. Used fields include any input field referenced by a property, whether explicitly (e.g. via a 'field1' reference) or via a field pattern (e.g. '1:foo*').
- Unused - Passes through all the fields that the node did not use to create the output.
The default value is Used.
If a naming conflict exists between a pass-through field and an explicitly named output field, an error will occur.
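The four PassThroughFields settings can be sketched as a simple field selection. This is an assumed illustration of the documented semantics, not the product's implementation:

```python
# Sketch: which input field names are copied through to the output
# record under each PassThroughFields setting.

def select_pass_through(input_fields, used_fields, mode="Used"):
    """Return the set of input field names passed through for a mode."""
    input_fields = set(input_fields)
    used = set(used_fields) & input_fields
    if mode == "All":
        return input_fields
    if mode == "None":
        return set()
    if mode == "Used":
        return used
    if mode == "Unused":
        return input_fields - used
    raise ValueError(f"unknown PassThroughFields mode: {mode}")
```

For instance, with input fields {DataFlowName, myType, myId, myName} where the node used DataFlowName and myType, "Unused" would pass through myId and myName.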
CleanupOnSuccess
Optionally specify what to clean up when the execution of a data flow succeeds. Choose from:
- None
- Temporary Data - Cleans up all the data produced by the nodes so you can't access any data from the input/output pins when viewing the run state.
- Temporary Data and Logs - Cleans up temporary data and any logs produced during the run.
- Node States - Cleans up temporary data and logs, and clears down the state of the nodes. The run state will still be available from the Directory, but no node state information will be available when viewing it.
- Run State - Clears the entire run state, so that you can't see it from the Directory.
The default value is Run State.
CleanupOnFailure
Optionally specify what to clean up when the execution of a data flow fails. Choose from:
- None
- Temporary Data - Cleans up all the data produced by the nodes so you can't access any data from the input/output pins when viewing the run state.
- Temporary Data and Logs - Cleans up temporary data and any logs produced during the run.
- Node States - Cleans up temporary data and logs, and clears down the state of the nodes. The run state will still be available from the Directory, but no node state information will be available when viewing it.
- Run State - Clears the entire run state, so that you can't see it from the Directory.
The default value is None.
FailedDataFlowBehavior
Optionally specify the behavior of the node if a failure occurs while executing a data flow. Choose from:
- Error - Logs an error to the Errors panel.
- Log - Logs a warning to the Errors panel.
- Ignore - Ignores the error.
The default value is Error.
UnmappedFieldBehavior
Optionally specify the behavior of the node if an input field is not used by the node. Choose from:
- Error - Logs an error to the Errors panel.
- Log - Logs a warning to the Errors panel.
- Ignore - Ignores the error.
The default value is Ignore.
StopAtFirstFailure
Optionally specify if the node should terminate execution on the first failure it encounters.
The default value is False.
Example data flows
A number of sample data flows are available from the Samples workspace, found on the Analyze Directory page.
In the Directory, under the /Data360 Samples/Node Examples/ folder, you will find "Running sub Data Flows with Execute Dataflow node", which shows examples of how to use this node.
Inputs and outputs
Inputs: 1 optional
Outputs: out and errors