Configure Data Source - trillium_discovery - 17.1

Trillium Discovery Center

Product type: Software
Portfolio: Verify
Product family: Trillium
Product: Trillium > Trillium Discovery
Version: 17.1
Language: English
Product name: Trillium Discovery
Title: Trillium Discovery Center
Topic type: Overview, Administration, Configuration, Installation, Reference, How Do I
First publish date: 2008

Properties you select on the Configure tab structure your data in the new data source. The available data source properties vary depending on the type of connection you chose on the Select tab.

The Data Preview reflects the content and structure of the data source you are adding and establishes the configuration of its attributes (columns). If your original data file contains columns that you do not want in the data source, or if you want to change column positions, use the Edit Columns window to hide or move columns in the Data Preview.

You can also add a new attribute (column) to the data source to better view results for passing and failing business rules. For more information, see Adding and Editing a Custom Attribute/Column.
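As an illustration of what hiding and moving columns does to the previewed data, here is a minimal Python sketch. It is not how Trillium Discovery implements the Edit Columns window; the function name `apply_column_edits` and the sample rows are hypothetical and exist only for this example.

```python
from typing import Dict, List


def apply_column_edits(rows: List[Dict[str, str]], included: List[str]) -> List[Dict[str, str]]:
    """Keep only the columns named in `included`, in that order.

    Hypothetical helper: columns left out of `included` play the role of
    hidden columns, which are not added to the data source.
    """
    return [{name: row[name] for name in included} for row in rows]


# Made-up preview rows with three columns.
preview = [
    {"id": "1", "name": "Ada", "notes": "n/a"},
    {"id": "2", "name": "Lin", "notes": "n/a"},
]

# Hide "notes" and move "name" ahead of "id".
edited = apply_column_edits(preview, ["name", "id"])
print(edited[0])
```

The included list drives both visibility and position, which mirrors the idea that one window handles hiding and reordering together.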

Note the following guidelines:

  • If your data source is an RDBMS, there are no schema settings. However, your repository administrator can extract RDBMS data to a delimited file with a corresponding ANSI DDL schema file. (For more information, see the Trillium Repository Administrator's Guide.)
  • You cannot use a schema (DDL) file during delimited dynamic data source creation. If you do, an empty data source is created.
  • The Headings in first data row option is not available for HDFS Delimited data source configurations. If the first row of your delimited files in HDFS is a header row, that row is included as a data row.
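To make the header-row guideline concrete, the following Python sketch contrasts treating the first row as headings with leaving attributes undefined. It is an illustration only, not the product's parser: the function `name_columns`, the comma delimiter, and the sample data are assumptions, while the default names Attr 1, Attr 2 follow the defaults shown in the Data Preview.

```python
import csv
import io
from typing import List, Tuple


def name_columns(text: str, headings_in_first_row: bool,
                 delimiter: str = ",") -> Tuple[List[str], List[List[str]]]:
    """Return (column names, data rows) for delimited text.

    Hypothetical helper. With headings_in_first_row=False the first row
    stays in the data and default names are generated instead.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return [], []
    if headings_in_first_row:
        return rows[0], rows[1:]
    names = [f"Attr {i}" for i in range(1, len(rows[0]) + 1)]
    return names, rows


sample = "id,city\n1,Boston\n2,Austin\n"

# First row supplies the column names.
with_headings = name_columns(sample, headings_in_first_row=True)

# No headings option (as with HDFS delimited sources): the header row
# is kept as an ordinary data row.
not_defined = name_columns(sample, headings_in_first_row=False)
print(not_defined[0], len(not_defined[1]))
```

In the second case the data contains three rows, the first of which is the original header line, which is the behavior to expect for HDFS delimited files that begin with a header row.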

To configure the data source

  1. In the Add Data Source window, click the Configure tab.
  2. Click the Attributes Defined by drop-down list and select one of the following options. The Data Preview will update based on your selection.
    Option Description
    Headings in first data row

    Delimited data sources only. Select if the first data row contains column names.

    Note: Not available for HDFS data sources.

    Schema

    Fixed-length and delimited data sources only. Select if the data file has a corresponding schema file. For fixed-length data sources, this is the only available option. This option is not available if you imported the source file from your local system. For delimited data sources, a properly formatted SQL-style schema is supported. XML schemas formatted for use by Trillium Quality, including files generated by a Trillium Quality process, are not supported. If you attempt to use a schema in an unsupported format, a message prompts you to select a supported schema.

    The Data Preview is replaced with the Select File list:

    1. Select the schema that matches the data file. The Preview pane to the right shows the shape of your data using the schema.
    2. Click Select. The Data Preview panel opens with the schema's data structure applied to the data. The name of the schema file you selected displays above the Data Preview.
    3. To change your schema selection, click Change next to the schema name.

    For RDBMS data sources, the data preview displays default column headings returned by the external RDBMS data file. You can insert a structured query language (SQL) WHERE clause to filter the rows in the data source. See Inserting a SQL WHERE Clause. RDBMS data sources include ODBC, Oracle, and Db2.

    Not defined

    Delimited data sources only. Select if the first data row does not contain column names. The Data Preview displays default column headings (for example, Attr 1, Attr 2, and so on).

  3. If attributes are defined by a schema, set the schema file options for the data source. The options vary depending on the data source type.

    The Data Preview will update based on your selections.

    Note the following when you preview an HDFS delimited data source:

    • If you selected an HDFS directory as a source, the contents of the first file in the directory are displayed, although all files in the directory will be included in the data source.
    • HDFS file names that start with underscore (_) and files that are zero (0) bytes in size are ignored.
  4. Optional: Click Edit Columns to hide columns or change column order in the Data Preview. (Unavailable for some RDBMS data sources.) You can also add a new attribute to view expression results for other attributes in the data source. (For more information, see Adding and Editing a Custom Attribute/Column.)
  5. To edit the name of an existing column, select the name in the Included Columns list and click the edit icon. The Edit Attribute Name window opens. Update the name as needed and click Save.
  6. Click Done to save the changes and close the Edit Columns window. The Data Preview updates with the changes.
    Note: Hidden columns will not be added to the data source.
  7. Click Summary to see an overview of your selections from the Select and Configure steps. Click the edit icon of a step to make changes to your selections. Click Back to return to the Configure tab.
  8. Do one of the following:
    • Click Continue to save your settings and open the Add tab. See Add Data Source.
    • Click Cancel to close the Add Data Source window without saving your work.
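
The HDFS preview notes in step 3 can be sketched in Python as follows. This is a hedged illustration, not the product's logic: `HdfsEntry` and `files_for_data_source` are hypothetical names, the listing is made up, and sorting by file name to decide which file is "first" is an assumption, since the procedure does not state how the first file is determined.

```python
from typing import List, NamedTuple, Optional, Tuple


class HdfsEntry(NamedTuple):
    """Hypothetical record for one file in an HDFS directory listing."""
    name: str
    size: int  # size in bytes


def files_for_data_source(entries: List[HdfsEntry]) -> Tuple[Optional[str], List[str]]:
    """Return (file shown in the preview, all files included in the source).

    Files whose names start with an underscore and zero-byte files are
    ignored. Ordering by name is an assumption made for this sketch.
    """
    kept = sorted(
        e.name for e in entries
        if not e.name.startswith("_") and e.size > 0
    )
    preview = kept[0] if kept else None
    return preview, kept


listing = [
    HdfsEntry("_SUCCESS", 0),
    HdfsEntry("part-00001.csv", 2048),
    HdfsEntry("part-00000.csv", 4096),
    HdfsEntry("part-00002.csv", 0),
]

preview_file, included = files_for_data_source(listing)
print(preview_file, included)
```

Only one file feeds the preview, but every file that survives the filter contributes rows to the data source, so it is worth checking that all files in the directory share the same layout.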