Viewing Key Analysis - trillium_discovery - 17.1

Trillium Discovery Center

First publish date: 2008

Trillium performs key analysis automatically when you import data to a repository by creating a profiled data source. The analysis uses only a sample of your data (the first 10,000 rows of imported data) and attempts to find the keys that meet default criteria for uniqueness.
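The idea of sample-based key discovery can be illustrated with a short sketch. This is not Trillium's implementation: the function name `discover_keys`, the row layout (a list of dictionaries), the `max_width` limit, and the use of strict uniqueness as the criterion are all assumptions made for illustration. The sketch checks attribute combinations of increasing size against the first rows of the data and skips any combination that already contains a discovered key.

```python
from itertools import combinations

# The documentation states that keys are searched in the first 10,000 rows.
SAMPLE_ROWS = 10_000


def discover_keys(rows, attributes, max_width=3, sample_rows=SAMPLE_ROWS):
    """Return attribute combinations whose values are unique within the sample.

    rows: list of dicts mapping attribute name -> value (hypothetical layout).
    Strict uniqueness is an assumption; the product's default criteria may differ.
    """
    sample = rows[:sample_rows]
    found = []
    for width in range(1, max_width + 1):
        for combo in combinations(attributes, width):
            # Skip combinations that contain an already discovered key:
            # adding attributes to a key always yields another key.
            if any(set(key) <= set(combo) for key in found):
                continue
            values = [tuple(row[a] for a in combo) for row in sample]
            if len(set(values)) == len(values):
                found.append(combo)
    return found


rows = [
    {"id": 1, "first": "Ann", "last": "Lee", "city": "Oslo"},
    {"id": 2, "first": "Ann", "last": "Ray", "city": "Oslo"},
    {"id": 3, "first": "Bob", "last": "Lee", "city": "Oslo"},
]
keys = discover_keys(rows, ["id", "first", "last", "city"])
print(keys)
```

In this toy data, `id` is a single-attribute key and `first` plus `last` form a two-attribute key; no three-attribute combination is reported because each one contains a key that was already found.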

You can drill down to see duplicate keys and the data rows that contain the duplicate key values. You can also export rows of key analysis metadata as a .CSV file to your local or server system.
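An exported file can be post-processed outside the Discovery Center. The following is a minimal sketch, assuming the export carries a header row with the same column names that the Keys tab displays (such as LH Attributes and Quality %). The exact layout of the export is not documented here, so the header names, the sample content, and the function name `keys_below_full_quality` are hypothetical.

```python
import csv
import io


def keys_below_full_quality(csv_text):
    """Return (LH Attributes, quality) pairs for keys with Quality % under 100.

    Assumes a header row containing "LH Attributes" and "Quality %" columns;
    a real export may use different headers or number formatting.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        # Tolerate values written either as "98.5" or as "98.5%".
        quality = float(row["Quality %"].strip().rstrip("%"))
        if quality < 100.0:
            flagged.append((row["LH Attributes"], quality))
    return flagged


# Hypothetical export content used only to exercise the function.
sample_export = (
    "LH Attributes,Status,Quality %\n"
    "CUSTOMER_ID,Discovered,100\n"
    "EMAIL,Discovered,98.5\n"
)
flagged = keys_below_full_quality(sample_export)
print(flagged)
```

To process a file on disk, read its contents with `open(path, newline="")` and pass the text to the function.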

To view key results

  1. Open a data source.
  2. Click Summary. The Relationships table shows the number of permanent and discovered keys, dependencies, and joins in the data.
  3. If one or more keys are permanent or discovered, click the Keys column name to open the Keys tab and see key metadata. To open a tab showing just those keys that are permanent or discovered, click a number in the Keys column. If there are no keys available, a zero (0) is displayed and the drill-downs are unavailable.
  4. Review the keys that are found and verify that they are valid. The keys that the Discovery Center shows may not be the keys that you expect in your data.
  5. (Optional) Set relationship status to discovered or permanent.
  6. To see duplicate keys, from any Keys tab, double-click a key with a Quality % value that is less than 100%. The Duplicate: Selected Key tab opens showing the duplicate key values for each left-hand attribute. Duplicate keys indicate key values for the left-hand attribute(s) that are non-unique and therefore not good candidates for permanent keys.
  7. To see data rows associated with the duplicate keys, from the Duplicate: Selected Key tab, double-click a row. The Duplicates: Selected Key tab opens, showing the data rows that contain the duplicate key values.
    Note: Keys that are found by combining two attributes are not checked with a third attribute. Only attribute combinations that fail to be double-attribute keys are checked as triple-attribute keys, quadruple-attribute keys, and so on.
  8. To delete a key, see Deleting Relationships.
  9. Export key analysis results from any open Keys, Discovered Keys, Permanent Keys, Duplicate Keys, or Duplicates: Selected Key tab as needed. For more information, see Exporting Tab Rows.

    When you review and export keys, the following metadata is available:

    Column
        Description

    LH Attributes
        Attributes that comprise the primary key.
        Note: Combinations that add attributes to an existing key are not shown, because after a key is discovered, it remains a key no matter how many additional attributes are combined with it.

    Status
        Indicates the status of the key:
        Discovered: the key was found by the analysis but has not been examined or verified as a valid key.
        Permanent: the key has been verified as a valid key that you want to save.

    Verified
        Indicates whether the key was checked against every row in the imported data and, therefore, whether the list of duplicates is complete. Trillium searches for keys in only the first 10,000 rows of imported data. If the original data source has fewer than 10,000 rows, all rows are included in the key search, so any keys found are checked against every row and this column has a value of Yes.

    Ref
        Unique key analysis reference number.

    Quality %
        Indicates how well the attributes in the data source form a key.

    Keys
        Indicates the number of non-duplicated (distinct) key values.

    Duplicate Keys
        Indicates the number of distinct key values duplicated on more than one row.

    Duplicate Rows
        Indicates the number of rows with duplicate key values.

    Verified Date
        Date the key analysis was last verified.

    Verified By
        User who performed the key analysis.
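The duplicate counts described above can be illustrated with a small sketch. It is not how Trillium computes its metadata: the function name `key_metrics`, the row layout, and the result field names are assumptions. In particular, mapping `distinct_key_values` to the Keys column is only one possible reading of "non-duplicated (distinct)", and Quality % is left out because its formula is not stated here.

```python
from collections import Counter


def key_metrics(rows, lh_attributes):
    """Count distinct and duplicated key values for the given attributes.

    rows: list of dicts mapping attribute name -> value (hypothetical layout).
    """
    counts = Counter(tuple(row[a] for a in lh_attributes) for row in rows)
    # Key values that appear on more than one row.
    duplicated = {value: n for value, n in counts.items() if n > 1}
    return {
        "distinct_key_values": len(counts),          # one reading of "Keys"
        "duplicate_keys": len(duplicated),           # "Duplicate Keys"
        "duplicate_rows": sum(duplicated.values()),  # "Duplicate Rows"
    }


rows = [
    {"email": "a@example.com"},
    {"email": "a@example.com"},
    {"email": "b@example.com"},
    {"email": "c@example.com"},
    {"email": "c@example.com"},
    {"email": "c@example.com"},
]
metrics = key_metrics(rows, ["email"])
print(metrics)
```

Here two key values are duplicated (one on two rows, one on three), so the sketch reports two duplicate keys spread over five duplicate rows.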