Discovery Center Terminology - trillium_discovery - 17.1

Trillium Discovery Center

First publish date
2008

Discovery Center profiles data from a variety of sources, and each data source uses its own terminology. To make it easier to compare data and relationships across sources, Discovery Center uses the standard terms defined below. Familiarize yourself with these definitions before you start working with your data in the Discovery Center.

Term Description
associations A connection between a data source and the attributes and business rules it contains and a rule set in the library. Associations help maintain sets of centralized, sharable, and reusable standards.
attribute Depending on the structure of the data source, an attribute represents a column or a field in your data file. Attributes reside in data sources. Rule sets in the Library also contain library attributes.
business rule Data expressions you build specific to your data and metadata to help you analyze (test) your data in relation to your company's data quality standards. Use customized business rules to run business- and data-specific tests and tasks against your data at almost any step in the data profiling process.
Business Rule Library A central location where you create rule sets that contain library rules and library attributes. You can import and export rule sets for use in other repositories.
category A label you associate with business rules to help you organize and group your rules. You can also add a sub-category to further refine your rule groups.
cardinality How multiple instances of a data source or attribute relate to one instance of another data source or attribute.
connection A connection to an external data source on your network made prior to importing data to a repository. Also called loader connection or data connection.
data row See row.
data source File or table associated with an external data source. Contains the attributes with your data. See also profiled data source and dynamic data source.
data type The characteristic of an attribute that defines the kind of values it can take. In the Discovery Center, all source data is represented as integer, string, or decimal types.
delimiter A character used to specify the boundaries between areas in a data file. For example, a comma is used as a field delimiter in a sequence of comma-separated values.
dependency Data relationship in which one or more attributes determine the value of another attribute.
discovered join A join that was generated by the Add Discover Joins process or was manually set to Discovered.
documented metadata Statistics taken directly from the schema (DDL or copybook) during a data import. Documented data type is an example of documented metadata.
dynamic data source A data source (also known as an entity) that is linked to an external data source file and is not imported into a repository. See also profiled data source.
encoding The method of assigning numeric values to code positions which represent characters of data. It is also called a code page. The Discovery Center supports most encodings as input data when adding data sources.
Expression Builder A tool you use to build business rules that query your data by performing operations and running functions.
inferred metadata Statistics derived from the full volume of your data. Inferred data type is an example of inferred metadata.
join Intersection of identical or related data across two or more data sources.
key Attribute that uniquely identifies and associates data within a data source, binding the data together.
library See Business Rule Library.
mask A description of a word, phrase, or number that identifies characters as alphabetic, numeric, or as a special character (a character that is not a number or a letter).
mask recode A revision, or recoding, of a mask shape, so that the original mask is modified or replaced by a new mask. For example, you may choose to recode a mask when it includes special characters and spaces.
metadata Statistics and properties associated with a repository object.
metaphone Data values that share a phonetic pattern. For example, Carl and Karl. A metaphone is similar to a soundex, but provides more refined results by comparing values against a multi-character phonetic pattern. Unlike a soundex, a metaphone is not limited in the length of the phonetic representation. See also soundex.
mode count The number of data values that occur with the highest frequency in an attribute.
mode frequency The number of times the most frequently occurring data value (the mode) occurs in an attribute.
null value A blank (empty) field in an attribute.
pattern The shape of a data value described by coded values. There are three main types of patterns: default, rich, and long. Also called character patterns.
permanent join A join that was generated by the Add Permanent Join process or whose status was manually set to permanent.
priority Unique numeric ID values applied to business rules. This ID has no inherent value. Apply priorities to create a business rule hierarchy where each priority signifies a level of importance when associated with a rule. You weight priority values depending on the impact and importance of the rule.
profiled data source A data source (also known as an entity) in which the data is loaded into the repository. See also data source and dynamic data source.
repository Object that contains the data and metadata on which you perform data discovery, profiling, and data quality activities.
repository server Collection of one or more repositories. It has its own group of users, data connections, and security and performance settings.
row When Discovery Center imports data, it reads each data record and imports the records as repository objects called rows.
rule compliance A reflection of how well your data conforms to your company's standards for data quality. The percentage of data source and attribute rows and values that fail or pass business rule analysis determines the quality of your data based on the standards (thresholds) that you set. The higher the passing percentage, the better the rule compliance level.
rule set Collection of library rules and attributes. You associate rule sets with data sources and run the library business rules against your loaded data. You share rule sets by exporting them for use in other repositories.
schema A file that describes the shape of your data.
soundex A 4-character code (based on a phonetic algorithm) that identifies data values which TSS has analyzed as potentially related or the same because they sound similar. For example, Don, DON, Dan, DAN, and Donna share the same soundex. See also metaphone.
standard deviation A measure of how dispersed the values of a numeric attribute are from the attribute's average value.
target metadata Recommendations of how the target (or current) repository object should look based on your actual data. Target data type is an example of target metadata.
unique value Attribute metadata that helps determine how often a value is repeated within an attribute, which rows contain the value, and which attributes contain more distinct values than others.
value recode Each attribute contains a certain number of unique data (literal) values. A value recode is a user-defined revision, or recoding, of a data value that modifies or replaces all occurrences of the original value.
Venn Diagram A graphical representation of detailed join information and metrics.
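
To make the soundex entry more concrete, the following is a minimal sketch of the widely used American Soundex coding in Python. It is an illustration only: the source does not state which Soundex variant Discovery Center applies, and the function name soundex is hypothetical.

```python
def soundex(value: str) -> str:
    """Return a 4-character American Soundex code (illustrative sketch only)."""
    # Consonant groups and their digits in the standard American Soundex scheme.
    codes = {}
    for letters, digit in (
        ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
        ("L", "4"), ("MN", "5"), ("R", "6"),
    ):
        for ch in letters:
            codes[ch] = digit

    letters = [ch for ch in value.upper() if ch.isalpha()]
    if not letters:
        return ""

    result = letters[0]               # the first letter is kept as-is
    prev = codes.get(letters[0], "")  # its digit still suppresses an adjacent duplicate
    for ch in letters[1:]:
        if ch in "HW":
            continue                  # H and W do not separate duplicate codes
        digit = codes.get(ch, "")     # vowels (and Y) have no digit
        if digit and digit != prev:
            result += digit
        prev = digit                  # a vowel resets the duplicate check
    return (result + "000")[:4]       # pad with zeros, truncate to 4 characters


for name in ("Don", "DON", "Dan", "DAN", "Donna"):
    print(name, soundex(name))        # each name prints the code D500
```

Because every name in the glossary example maps to the same code, a profiling tool can group them as potentially related values.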
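
The mask and mask recode entries can be sketched in the same way. The symbols below ("a" for alphabetic, "n" for numeric, "s" for special) are assumptions chosen for illustration; the source does not list the characters Discovery Center actually uses in its masks, and both function names are hypothetical.

```python
def mask(value: str, alpha: str = "a", digit: str = "n", special: str = "s") -> str:
    """Describe each character of a value as alphabetic, numeric, or special.

    The mask symbols are hypothetical defaults, not Discovery Center's own.
    """
    out = []
    for ch in value:
        if ch.isalpha():
            out.append(alpha)
        elif ch.isdigit():
            out.append(digit)
        else:
            out.append(special)   # anything that is not a letter or a number
    return "".join(out)


def recode_mask(original: str, special: str = "s") -> str:
    """One possible mask recode: drop special characters and spaces from a mask."""
    return original.replace(special, "")


print(mask("AB-1234"))               # aasnnnn
print(recode_mask(mask("AB-1234")))  # aannnn
```

Recoding the mask this way lets values such as "AB-1234" and "AB 1234" fall under a single shape when the separator is not meaningful.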
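
Several of the statistical terms above (null value, unique value, mode count, mode frequency, and standard deviation) can be illustrated with a short Python sketch over a single numeric attribute. It rests on assumptions not confirmed by the source: mode frequency is read as the number of occurrences of the most common value, null values are excluded from the other statistics, and the population form of the standard deviation is used. The function name profile_attribute and the dictionary keys are hypothetical.

```python
from collections import Counter
from statistics import pstdev


def profile_attribute(values):
    """Compute a few profiling statistics for one numeric attribute (sketch)."""
    # Treat None and the empty string as null (blank) fields.
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)

    # Highest number of occurrences of any single value (0 if no values).
    mode_frequency = max(counts.values(), default=0)
    # How many distinct values occur with that highest frequency.
    mode_count = sum(1 for c in counts.values() if c == mode_frequency)

    return {
        "null_count": len(values) - len(non_null),
        "unique_values": len(counts),
        "mode_frequency": mode_frequency,
        "mode_count": mode_count,
        # Population standard deviation; the product's formula is an assumption.
        "standard_deviation": pstdev(non_null) if non_null else None,
    }


stats = profile_attribute([10, 20, 20, None, 30, 30, ""])
print(stats)
```

In this sample, 20 and 30 each occur twice, so the mode frequency is 2 and the mode count is 2; the two blank fields are counted as null values.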