Data Distribution - trillium_discovery

Trillium Discovery Center

Product type: Software
Portfolio: Verify
Product family: Trillium
Product: Trillium > Trillium Discovery
Version: 17.1
Language: English
Product name: Trillium Discovery
Title: Trillium Discovery Center
Topic type: How Do I, Installation, Reference, Configuration, Administration, Overview
First publish date: 2008

Knowing what your data looks like is important to a successful data quality project. Data distribution analysis occurs when you create a profiled (fully loaded) data source in the Discovery Center. You can make better-informed decisions about how to profile and cleanse your data, or how to plan a data integration project, when you:

Verify how complete (or incomplete) your data set is
Recognize whether your data falls within acceptable minimum and maximum ranges
Understand how frequently values occur in an attribute
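
The three checks above can be illustrated outside the product with a few lines of Python. This is a minimal sketch, not how the Discovery Center computes its results: the function name `summarize_attribute`, the sample `ages` list, and the choice to treat `None` and empty strings as missing are all assumptions made for illustration.

```python
from collections import Counter

def summarize_attribute(values):
    """Summarize completeness, range, and value frequency for one attribute.

    Assumption of this sketch: None and "" count as missing values.
    """
    present = [v for v in values if v is not None and v != ""]
    total = len(values)
    return {
        # Share of rows that actually hold a value (0.0 to 1.0).
        "completeness": len(present) / total if total else 0.0,
        # Smallest and largest values, to compare against acceptable limits.
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        # How often each distinct value occurs.
        "frequency": Counter(present),
    }

# Hypothetical sample attribute with two missing entries.
ages = [34, 29, None, 34, 51, "", 29, 34]
summary = summarize_attribute(ages)

print(summary["completeness"])              # 0.75
print(summary["min"], summary["max"])       # 29 51
print(summary["frequency"].most_common(1))  # [(34, 3)]
```

Comparing `min` and `max` against the limits you consider acceptable shows whether the attribute stays in range, and the frequency counts show which values dominate.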

Data distribution analysis includes:

Data Patterns. Patterns describe the shape of a data value in an attribute and help identify format deviations, misspellings, and duplications in your data.
Data Type Structure. Knowing the type of data you load into the repository helps you understand the structure of your data, including what percentage of the data consists of string, integer, decimal, and null values.
Data Values. Each attribute and data row contains a set of values. Important metadata about data values includes how complete or incomplete the values are, how frequently the values occur, and the range across which the values are distributed.
Standard Deviation. Standard deviation measures how dispersed the values of a numeric attribute are around the attribute's average value.
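
To make the pattern, data type, and standard deviation measures concrete, the following Python sketch computes each one on small sample lists. It is an illustration under stated assumptions, not the Discovery Center's algorithm: the helper names (`value_pattern`, `classify_type`, `type_percentages`), the pattern alphabet (`a` for letters, `9` for digits), and the sample data are hypothetical. Type detection relies on Python's own `int()` and `float()` parsing, which may accept inputs a profiling tool would classify differently, and the example uses the population standard deviation because the source does not say which variant the product reports.

```python
import statistics
from collections import Counter

def value_pattern(value):
    """Map a value to its shape: letters become 'a', digits '9', others are kept."""
    return "".join(
        "9" if ch.isdigit() else "a" if ch.isalpha() else ch
        for ch in value
    )

def classify_type(value):
    """Classify a raw value as 'null', 'integer', 'decimal', or 'string'."""
    if value is None or value.strip() == "":
        return "null"
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "decimal"
    except ValueError:
        return "string"

def type_percentages(values):
    """Return the percentage of values that fall into each detected type."""
    counts = Counter(classify_type(v) for v in values)
    return {name: 100 * n / len(values) for name, n in counts.items()}

# Data patterns: the minority shapes point at format deviations.
phones = ["555-1234", "555-9876", "5551234", "555-12a4"]
pattern_counts = Counter(value_pattern(p) for p in phones)
print(pattern_counts.most_common(1))  # [('999-9999', 2)]

# Data type structure: share of integer, decimal, string, and null values.
raw = ["12", "3.5", "abc", "", "7", None, "42", "x1"]
print(type_percentages(raw))

# Standard deviation: dispersion of numeric values around their average.
amounts = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.mean(amounts))    # 5
print(statistics.pstdev(amounts))  # 2.0
```

In the phone example, the two values whose shapes differ from `999-9999` (a missing separator and an embedded letter) are the kind of format deviations that pattern analysis is meant to surface.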