Sampling Options - 17.1

Inline Quality and Discovery

Version: 17.1
Language: English
Product name: Trillium Quality and Discovery
Title: Inline Quality and Discovery

This table describes the sampling options for loading data into a repository.

Option Meaning
All

Load an entire file.

First

Load a specified number of rows from the beginning of the file.

Use this option to check the data characteristics of large files before loading the files (as entities) into a repository. Only a small number of rows is loaded. For example, you might load the first 1000 rows to assess the level of data quality, schema accuracy, and other relevant information.

Random

Randomly sample a percentage of rows from the file.

Use this option when you need to load large files but want to test the data on a small sample of rows first. For example, you might load a small row sample to understand the schema design before loading all rows.

Note: The random percentage is the chance that each individual row is included, not a fixed share of the file. The actual number of rows loaded can therefore differ from one load of the same file to the next, even if you specify the same percentage.
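
The difference between the First and Random options, and the reason the Random row count varies, can be illustrated with a minimal Python sketch. This is not the product's implementation. The function names sample_first and sample_random, the seed parameter, and the in-memory row list are all hypothetical and chosen only for illustration. The sketch assumes that Random includes each row independently with the given percentage chance, as the note describes.

```python
import random
from itertools import islice
from typing import Iterable, List, Optional


def sample_first(rows: Iterable[str], n: int) -> List[str]:
    """Hypothetical 'First' option: keep the first n rows of the file."""
    return list(islice(rows, n))


def sample_random(rows: Iterable[str], percent: float,
                  seed: Optional[int] = None) -> List[str]:
    """Hypothetical 'Random' option: keep each row with a percent chance.

    Each row is decided independently, so the number of rows kept
    is not fixed and can change from one run to the next.
    """
    rng = random.Random(seed)
    probability = percent / 100.0
    return [row for row in rows if rng.random() < probability]


# Stand-in for a large file: 10,000 rows held in memory.
rows = [f"row-{i}" for i in range(10_000)]

first_rows = sample_first(rows, 1000)         # always exactly 1000 rows
load_a = sample_random(rows, 10, seed=1)      # roughly 10% of the rows
load_b = sample_random(rows, 10, seed=2)      # roughly 10%, count may differ

print(len(first_rows), len(load_a), len(load_b))
```

With 10,000 rows and a 10% setting, each random load in this sketch returns about 1,000 rows on average, but the exact count depends on the random draws. The First sample always returns the same rows in the same order.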