A dependency is a many-to-one data relationship in which one or more attributes determine the value of another attribute within a single data source. After data is imported into a repository, review the results by examining each potential dependency that was found. Each dependency has one of the following statuses:
- Discovered—Found as a potential dependency but not yet reviewed for validity.
- Permanent—Reviewed and verified.
A discovered dependency is considered a potential dependency until you validate it as permanent.
How Dependency Analysis Works
When you import data (by creating a data source), the Discovery Center automatically analyzes a sample of 10,000 data rows and identifies potential (discovered) dependencies. This process discovers single- and double-attribute dependencies that are at least 98% consistent, and it excludes attributes that are less than 2% unique.
To find more potential dependencies, you can run a dependency analysis with criteria that find dependencies that have:
- Less than 98% consistency. Consistency is reported in the Quality % metadata, which measures, as a percentage, how consistent the dependency is. A quality of 100% means the data contains no conflicts for that dependency; a quality of less than 100% means the data contains conflicts. For example, if a dependency between the attributes Order Number and Order Date has a quality of 99%, then 99% of the time the same Order Number has the same Order Date.
- More than two combined attributes (fields/columns).
- Attributes that are less than 2% unique.
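The following Python sketch illustrates how the three criteria above interact. It is not the Discovery Center's implementation. The function names (uniqueness_pct, quality_pct, discover) and their parameters are hypothetical, and the formulas are assumptions: uniqueness is taken as distinct values divided by rows, quality as the share of rows that agree with the most common dependent value for their determinant value(s), the sample as the first rows of the data, and the uniqueness filter is applied to both determining and dependent attributes.

```python
from collections import Counter, defaultdict
from itertools import combinations


def uniqueness_pct(rows, attr):
    """Assumed definition: distinct values of attr as a percentage of rows."""
    return 100.0 * len({row[attr] for row in rows}) / len(rows)


def quality_pct(rows, determinants, dependent):
    """Assumed definition: percentage of rows whose dependent value matches
    the most common dependent value for the same determinant value(s)."""
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in determinants)
        groups[key][row[dependent]] += 1
    consistent = sum(max(counts.values()) for counts in groups.values())
    return 100.0 * consistent / len(rows)


def discover(rows, min_quality=98.0, max_determinants=2,
             min_unique=2.0, sample_size=10_000):
    """Return (determinants, dependent, quality) for each candidate that
    passes the thresholds. Defaults mirror the automatic analysis."""
    sample = rows[:sample_size]  # assumption: the sample is the first rows
    attrs = [a for a in sample[0] if uniqueness_pct(sample, a) >= min_unique]
    found = []
    for size in range(1, max_determinants + 1):
        for determinants in combinations(attrs, size):
            for dependent in attrs:
                if dependent in determinants:
                    continue
                quality = quality_pct(sample, determinants, dependent)
                if quality >= min_quality:
                    found.append((determinants, dependent, quality))
    return found


# Made-up data: 50 order numbers with two rows each, and one conflicting date.
rows = []
for n in range(50):
    for _ in range(2):
        rows.append({"order_number": n,
                     "order_date": f"2024-01-{n % 28 + 1:02d}",
                     "region": "EU"})
rows[0]["order_date"] = "2024-02-15"  # the single conflict

print(quality_pct(rows, ("order_number",), "order_date"))  # 99.0
```

With the defaults, discover(rows) reports the Order Number to Order Date dependency at 99% and skips region, which is only 1% unique in this data. Passing min_unique=0 brings dependencies on region into the results, and raising max_determinants allows combinations of more than two attributes.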