About Dynamic Data Sources - trillium_discovery - 17.1

Trillium Discovery Center

First publish date
2008

A dynamic data source points to your data residing in an external data source. Unlike profiled data sources, where data is fully loaded (imported) into the Trillium repository, dynamic data sources allow you to work directly with the external data.

With dynamic data sources, you access a limited but important set of metadata to help analyze your data. You can drill down to see all the data in the source file, then create and run business rules against dynamic data sources to view passing and failing row results.

A typical workflow with dynamic data sources is as follows:

  1. Create a dynamic data source that links directly to your external data source file or table.
  2. Examine data rows and drill down to details for further investigation.
  3. Create business rules and run them against the data. Drill down to see passing and failing rows.
  4. Determine which rows and attributes you might want to load into a repository for full analysis.
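The idea behind steps 1 to 3 can be pictured outside the product with a minimal Python sketch. This is not Trillium Discovery code and calls no Trillium API. The function `run_rule`, the sample columns, and the rule `email_present` are all hypothetical, chosen only to illustrate reading rows directly from an external delimited source and splitting them into passing and failing sets.

```python
import csv
import io

def run_rule(rows, rule):
    """Split rows into (passing, failing) lists according to a rule predicate."""
    passing, failing = [], []
    for row in rows:
        (passing if rule(row) else failing).append(row)
    return passing, failing

# Hypothetical external data; in practice this would be a file read in place.
external_file = io.StringIO(
    "id,email\n"
    "1,ann@example.com\n"
    "2,\n"
    "3,bob@example.com\n"
)

# Hypothetical business rule: the email attribute must not be empty.
def email_present(row):
    return row["email"].strip() != ""

passing, failing = run_rule(csv.DictReader(external_file), email_present)
print(len(passing), "passing;", len(failing), "failing")
```

The rows are streamed from the source and never copied into a separate store, which loosely mirrors how a dynamic data source leaves the external data where it is.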

    Dynamic data sources are useful in the planning phase of a data quality project, especially when you have thousands or millions of data rows. You can add a dynamic data source that contains only a few hundred rows and apply business rules to it. The sample then acts as a template: analyze its results to validate that your rules and standards meet your requirements.
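One way to prepare such a small sample before linking it is sketched below in Python. This is an illustration only, not a Trillium feature. The helper `sample_rows`, the column names, and the limit of 300 rows are assumptions made for the example.

```python
import csv
import io
from itertools import islice

def sample_rows(lines, limit):
    """Return the header and at most `limit` data rows from a delimited source."""
    reader = csv.reader(lines)
    header = next(reader)
    return header, list(islice(reader, limit))

# Hypothetical large source, generated in memory for illustration.
big_source = io.StringIO(
    "id,amount\n" + "".join(f"{i},{i * 10}\n" for i in range(1000))
)

header, sample = sample_rows(big_source, limit=300)
print(header, len(sample))
```

Taking the first rows is the simplest choice. If the source is sorted, a random or stratified sample may represent the data better.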

    Guidelines: Note the following when working with dynamic data sources:
    • HDFS delimited data sources cannot be dynamic; they must be profiled data sources.
    • Although you work directly with the physical data, your external data is never overwritten.
    • Dynamic data sources are indicated with a bolt icon.
    • Because dynamic data sources are linked to your external data source files, they are usually created much faster than profiled (loaded) data sources, especially when the source data is large (millions of rows). Conversely, because the data in a dynamic data source is not loaded (and therefore not fully analyzed or indexed), filtering, running business rules, and drilling down on data take longer than with a profiled data source.
    • If the external data file is deleted or renamed, the directory where the linked files are stored is changed, or the data connection or directory defined for the source file is redefined, the drill-down to view the data rows is no longer valid.
    • If the original data changes, ensure you rerun business rules associated with the dynamic data source to see current results.
    • Dynamic data sources cannot be used for Time Series Analysis (in Trillium Control Center) because they do not contain the historical data needed to trend changes over time.
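Two of the guidelines above, the broken link after a rename and the need to rerun rules after the data changes, can be illustrated with a short Python sketch. It does not describe how Trillium tracks linked files. The functions `link_is_valid` and `fingerprint`, the file names, and the use of a content hash to detect change are assumptions made for the example.

```python
import hashlib
import os
import tempfile

def link_is_valid(path):
    """A link is usable only while the file still exists at the recorded path."""
    return os.path.isfile(path)

def fingerprint(path):
    """Hash the file contents so a later change to the data can be detected."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical external file that a dynamic data source might link to.
    source = os.path.join(folder, "customers.csv")
    with open(source, "w", encoding="utf-8") as handle:
        handle.write("id,name\n1,Ann\n")
    recorded = fingerprint(source)

    # The original data changes: earlier rule results are now stale.
    with open(source, "a", encoding="utf-8") as handle:
        handle.write("2,Bob\n")
    needs_rerun = fingerprint(source) != recorded

    # The file is renamed: the recorded path no longer resolves.
    os.rename(source, os.path.join(folder, "clients.csv"))
    still_linked = link_is_valid(source)

print(needs_rerun, still_linked)
```

A check of this kind, run before a review session, could flag which linked sources need their business rules rerun and which need to be linked again.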