This table describes the sampling options for loading data into a repository.
Option | Meaning |
All | Load the entire file. |
First | Load a specified number of rows from the beginning of the file. Use this option to check the data characteristics of a large file before loading it (as an entity) into a repository. For example, you might load the first 1000 rows to assess data quality, schema accuracy, and other relevant information. |
Random | Randomly sample a percentage of rows from the file. Use this option when you need to load a large file but want to test the data on a small sample of rows first. For example, you might load a small sample of rows to understand the schema design before loading all rows. Note: The random % is the chance that each individual row is included. Therefore, the actual number of rows loaded may differ between loads of the same file, even if you specify the same percentage. |
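
The behavior of the three options can be illustrated with a minimal Python sketch. This is not the product's implementation: the function name `sample_rows` and its parameters (`mode`, `first_n`, `percent`, `seed`) are hypothetical, chosen only for illustration. The sketch assumes that Random makes an independent include-or-skip decision for each row, as the note in the table describes.

```python
import itertools
import random
from typing import Iterable, Iterator, Optional


def sample_rows(
    rows: Iterable,
    mode: str = "all",
    first_n: int = 1000,
    percent: float = 10.0,
    seed: Optional[int] = None,
) -> Iterator:
    """Yield rows according to a sampling option (hypothetical helper)."""
    if mode == "all":
        # All: pass every row through unchanged.
        yield from rows
    elif mode == "first":
        # First: stop after the first `first_n` rows.
        yield from itertools.islice(rows, first_n)
    elif mode == "random":
        # Random: each row is kept with probability percent / 100,
        # independently of every other row (an assumption of this sketch).
        rng = random.Random(seed)
        for row in rows:
            if rng.random() * 100.0 < percent:
                yield row
    else:
        raise ValueError(f"unknown sampling mode: {mode!r}")


if __name__ == "__main__":
    data = range(10_000)
    print(len(list(sample_rows(data, "first", first_n=1000))))
    # Two Random loads of the same data at the same percentage
    # usually produce different row counts.
    print(len(list(sample_rows(data, "random", percent=10))))
    print(len(list(sample_rows(data, "random", percent=10))))
```

Because each row is decided independently, a 10% Random sample of 10,000 rows returns roughly, but rarely exactly, 1,000 rows. First always returns exactly the requested number of rows when the file contains at least that many.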