Execute data stages - Data360_DQ+ - Latest

Data360 DQ+ Help

Product type: Software
Portfolio: Verify
Product family: Data360
Product: Data360 DQ+
Version: Latest
Language: English
Product name: Data360 DQ+
Title: Data360 DQ+ Help
Copyright: 2024
First publish date: 2016
Last updated: 2024-10-09
Published on: 2024-10-09T14:37:51.625264

An execution occurs any time a data stage acts on data. There are several ways to trigger an execution, and when you execute a data stage depends on the role that it plays in your data processing pipeline. The following example illustrates a chain of execution dependencies:

[Figure: Incoming Data -> 1) Data Store A Loads -> 2) Analysis Runs (input from Data Store A) -> 3) Data Store B Loads (output from the analysis) -> 4) Data View Runs (input from Data Store B)]

In this example, numbered items illustrate points of activity that must occur in sequential order:

1) Data Store A loads incoming data.

2) The analysis runs.

3) The analysis outputs data to Data Store B.

4) The data view runs and loads data from Data Store B.
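The ordering above can be sketched as a small dependency graph. This is an illustration only, not part of Data360 DQ+: the stage names are taken from the example, the `dependencies` mapping and the `execution_order` helper are hypothetical, and the sort uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical model of the example pipeline: each key lists the stages
# that must finish executing before it can execute.
dependencies = {
    "Data Store A": set(),            # 1) loads incoming data
    "Analysis": {"Data Store A"},     # 2) reads input from Data Store A
    "Data Store B": {"Analysis"},     # 3) receives the analysis output
    "Data View": {"Data Store B"},    # 4) reads input from Data Store B
}


def execution_order(deps):
    """Return stage names in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Because each stage in this chain depends on exactly one predecessor, only one valid order exists. In a pipeline with branches, several orders could satisfy the same dependencies.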

You can execute a data stage manually, or by creating a schedule.

What it means to "execute" varies by the type of data stage.

Manually executing a data stage

If you have Execute permission to an executable data stage, or if you are an administrator, you can run or rebuild it at any time.

For simple pipelines with only a small number of execution dependencies, you can manually execute data stages:

  1. Select Pipelines from the top of the screen.
  2. Click the menu button to the right of the data stage that you want to execute, then select Execute.
  3. Choose Run, Rebuild or Executions:
    • Run - Loads the data stage with new data based on its data load configurations.
    • Rebuild - Loads the data stage with all data, overriding its data load configurations. For example, you can rebuild a data stage that executes on a scheduled basis after you've modified it.
    • Executions - Opens the Executions History, which lists all executions of the selected data stage. The execution details that are shown depend on the type of stage. From this page, you can select a specific execution and use the following additional options:
      • Terminate - Stops a run or rebuild mid-process.
      • Rollback - Reverts a data stage to the state it was in just after its last run or rebuild.
      • Rerun - Runs a specific execution again.
      • Refresh - Updates the list of executions.
      • Filter - If you have a long list of executions, you can filter to search for a specific one.
      • Filter By Process Id - If you have selected a process model, you can filter by the process ID of the selected process model to see a list of data stages that were executed during a specific run of that process model.
      • Download Log - Allows you to download log information for a selected execution. You cannot download log information for process models, data store load triggers, or case store execute triggers. Note that you must have Execute permission to the associated data stage to download log information.
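The difference between Run and Rebuild can be sketched with a toy model. This is not a Data360 DQ+ API: the `ToyStage` class and its `source`, `loaded`, and `cursor` fields are hypothetical, and the sketch assumes a data load configuration that picks up only records that arrived since the previous load.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStage:
    """Hypothetical stand-in for a data stage; not a product API."""
    source: list                                  # every record available upstream
    loaded: list = field(default_factory=list)    # records currently in the stage
    cursor: int = 0                               # count of source records already loaded

    def run(self):
        """Load only the records that arrived since the previous load."""
        new_records = self.source[self.cursor:]
        self.loaded.extend(new_records)
        self.cursor = len(self.source)
        return len(new_records)

    def rebuild(self):
        """Ignore the cursor and reload every source record."""
        self.loaded = list(self.source)
        self.cursor = len(self.source)
        return len(self.loaded)


stage = ToyStage(source=["r1", "r2"])
first = stage.run()          # loads r1 and r2
stage.source.append("r3")
second = stage.run()         # loads r3 only
full = stage.rebuild()       # reloads r1, r2 and r3
print(first, second, full)
```

In this model a run is cheap because it touches only new records, while a rebuild is the safer choice after the stage definition changes, since data loaded earlier may no longer match the new definition.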

Scheduling executions

Administrators and users with Administer permission to an executable data stage can schedule it to load or execute at any time.

If you are working with data stages that must execute on a recurring basis, you can create an execution schedule to automate the process. This option is useful when working with data stores that load routinely and that are connected to data views or analyses.

Tip: When execution dependencies become complex, you can automate executions for multiple stages and pipelines using Process Models.
  1. Select Pipelines from the top of the screen.
  2. Click the menu button to the right of the data stage that you want to schedule, then select Edit > Edit Settings.
  3. Click the Schedule tab and select Enable Scheduling.
  4. Configure the execution settings to determine when you want the data stage to run.
    Tip: For increased flexibility, you have the option to enter a cron expression. This allows you to have multiple schedules on a data stage. For information on how to write a cron expression, see http://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html
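As a rough guide to the cron syntax described in the linked Quartz tutorial, the sketch below splits an expression into its named fields. It assumes that the Schedule tab accepts standard Quartz expressions: six space-separated fields starting with seconds, plus an optional seventh field for the year. The `split_quartz_cron` helper is hypothetical and checks only the number of fields, not their values.

```python
# Field order used by Quartz cron expressions; the year field is optional.
QUARTZ_FIELDS = [
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
]


def split_quartz_cron(expression):
    """Map each part of a Quartz cron expression to its field name."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    return dict(zip(QUARTZ_FIELDS, parts))


# Example expressions and their meaning in Quartz syntax.
examples = {
    "0 0 2 * * ?": "every day at 02:00",
    "0 0/15 * * * ?": "every 15 minutes, starting on the hour",
    "0 0 6 ? * MON-FRI": "at 06:00 on weekdays",
}

for expression, meaning in examples.items():
    print(f"{expression!r}: {meaning}")
    print("  ", split_quartz_cron(expression))
```

Note that Quartz expressions use `?` in either the day-of-month or the day-of-week field to mean "no specific value", which differs from the five-field cron format used on Unix systems.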