Command—loaddata for Delimited Files

Trillium DQ Repository Administrator Guide

Product type: Software
Portfolio: Verify
Product family: Trillium
Product: Trillium > Trillium Discovery
Version: Latest
Language: English
Product name: Trillium Quality and Discovery
Title: Trillium DQ Repository Administrator Guide
Copyright: 2024
First publish date: 2008
Last updated: 2024-10-18
Published on: 2024-10-18T15:31:00.219841

You can create an entity from a delimited file, with or without a companion schema Data Dictionary Language (DDL) file.

Before you issue the loaddata command, create a loader connection for your "Delimited" data source. The loader connection specifies where the data source files are located and allows the loaddata command to connect to the data source and initiate the data import process.

Required Syntax

loaddata <loader_connection> datafile <filename> 

where

<loader_connection>

Name assigned by the repository administrator to the loader connection.

<filename>

Name of the delimited file that contains the data.
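As an illustration of the required syntax, the following Python sketch assembles a loaddata command string from its parts. The helper name build_loaddata_command is hypothetical and is not part of the product. It only joins keywords and values with spaces, and it does not add any quoting that a value might need.

```python
def build_loaddata_command(loader_connection, filename, **options):
    """Assemble a loaddata command line as a plain string.

    Hypothetical helper for illustration only: it places the required
    arguments first, then appends each keyword/value pair in the order given.
    """
    parts = ["loaddata", loader_connection, "datafile", filename]
    for keyword, value in options.items():
        parts.extend([keyword, str(value)])
    return " ".join(parts)


# Required arguments only.
print(build_loaddata_command("delimconn", "testdel.txt"))
# Required arguments followed by keyword/value pairs.
print(build_loaddata_command("delimconn", "testdel.txt", terminator="crlf", skip=1))
```

The first call prints the minimal form, loaddata delimconn datafile testdel.txt, and the second appends the keyword/value pairs after it.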

Optional Parameters

Parameter

Description

username <user_name>

User ID required to validate the connection to the data source. Use this parameter only if a login name and password are required.

Do not use the username and password parameters if the mtb_admin user is the data file owner.

password <password>

Password required to validate the connection to the data source. Use this parameter only if a login name and password are required.

schemafile <filename>

Name of the schema file that corresponds to the delimited file you specified as the data file.

jobname <job_name>

Job ID or name of the data load job.

attr <header>

Indicates where the attribute names come from. The options are:

names: attribute names are on the first line of the data file

none: no attribute names are specified

ddl: attribute names are in the schema file

delimiter <character>

Indicates the character that is used as the data delimiter in the file. A field can be delimited by whitespace, tabs, commas (CSV), periods (.) or other characters.

Using Tab Characters

Enclose tab characters in double quotes, for example "\t". On Linux, to type a literal tab character, do the following:

  1. Press Ctrl+V.
  2. Press the Tab key.
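To see what a delimiter does to a record, the following Python sketch splits tab-delimited text into fields with the standard csv module. It only illustrates the concept and is not the loader's own parser. The sample data and the split_records helper are made up for the example.

```python
import csv
import io


def split_records(text, delimiter):
    """Split delimited text into a list of records, each a list of fields."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Made-up sample: a header line and one data row, separated by tab characters.
sample = "Ref Id\tSource\tAmount\n1001\tweb\t25.00\n"
for record in split_records(sample, "\t"):
    print(record)
```

A delimiter that also occurs inside unquoted values splits those values. With a period as the delimiter, for instance, an amount such as 25.00 would become two fields.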

terminator <value>

Indicates how records in the data file are terminated. The options are lf (linefeed), cr (carriage return), and crlf (carriage return followed by linefeed).

Typically, if the file resides on a Windows system, type crlf. If it resides on a UNIX system, type lf.
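If you are not sure how a file is terminated, you can inspect its bytes before you choose a value. The following Python sketch counts the three terminator styles in a byte sample and reports the most frequent one. The detect_terminator helper is hypothetical and is not part of the product.

```python
def detect_terminator(sample: bytes) -> str:
    """Return 'crlf', 'cr', or 'lf' for the most frequent record terminator."""
    crlf = sample.count(b"\r\n")
    # Subtract the pairs so that only standalone characters are counted.
    cr = sample.count(b"\r") - crlf
    lf = sample.count(b"\n") - crlf
    counts = {"crlf": crlf, "cr": cr, "lf": lf}
    best = max(counts, key=counts.get)
    if counts[best] == 0:
        raise ValueError("no record terminator found in sample")
    return best


print(detect_terminator(b"1001,web,25.00\r\n1002,store,12.50\r\n"))
```

To check a real file, read its first few kilobytes in binary mode, for example with open(path, "rb").read(4096), and pass the result to the function.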

encoding <name>

Character encoding used by the data file. The options are EBCDIC and ASCII.

This parameter controls the character set for the file. EBCDIC data is translated into a correct ASCII representation on load. Generally, UNIX COBOL files will be ASCII and IBM mainframe data will be EBCDIC.
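The following Python sketch shows what translating EBCDIC bytes into readable text looks like. It assumes the cp037 code page (EBCDIC US/Canada) purely for illustration. The guide does not state which EBCDIC code page the loader uses, and the helper name is hypothetical.

```python
# Assumption: cp037 is used only as an example EBCDIC code page.
EBCDIC_CODEC = "cp037"


def ebcdic_to_text(raw: bytes) -> str:
    """Decode EBCDIC-encoded bytes into a Python string."""
    return raw.decode(EBCDIC_CODEC)


# Build a made-up EBCDIC record by encoding ASCII-range text, then decode it.
record = "REF001,ACME,25.00".encode(EBCDIC_CODEC)
print(record != b"REF001,ACME,25.00")  # the raw bytes differ from ASCII
print(ebcdic_to_text(record))
```

The raw bytes of the same text differ between the two encodings, which is why a file loaded with the wrong encoding value appears garbled.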

columns <names>

Indicates the names of the columns from which to import data.

skip <number>

Number of rows to skip before starting to import data rows. All rows after the skipped rows are loaded to the repository. For example, if your file has 300 rows and you skip the first 99, the system loads 201 rows, starting with the 100th row.
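The row arithmetic for skip can be checked with a short Python sketch. It assumes that skip simply drops the leading rows and loads everything after them. The rows_after_skip helper is hypothetical.

```python
def rows_after_skip(rows, skip):
    """Return the rows that remain after dropping the first `skip` rows."""
    return rows[skip:]


rows = list(range(1, 301))           # row numbers 1 through 300
loaded = rows_after_skip(rows, 99)   # skip the first 99 rows
print(len(loaded), loaded[0])        # prints: 201 100
```

Skipping 99 of 300 rows leaves 201 rows, and the first loaded row is row 100.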

first <number>

Number of records from the beginning of the file (for example, the first 1000 records) to load.

random <percentage>

Percentage of records to sample randomly from the file.
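The difference between taking the first records and taking a random percentage can be sketched in Python. The guide does not describe the sampling algorithm, so the sketch assumes that each record is kept independently with the given probability. Both helper names are hypothetical.

```python
import random


def first_records(records, count):
    """Keep only the first `count` records."""
    return records[:count]


def random_sample(records, percentage, seed=None):
    """Keep each record with probability percentage/100.

    Assumption: independent per-record selection, so the sample size is only
    approximately the requested percentage.
    """
    rng = random.Random(seed)
    return [r for r in records if rng.random() * 100 < percentage]


records = list(range(5000))
print(len(first_records(records, 1000)))          # prints: 1000
print(len(random_sample(records, 10, seed=1)))    # roughly 500 under this assumption
```

Taking the first records always returns the same leading block of the file, whereas a random sample draws from the whole file.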

Example

This command loads data from three columns in the testdel.txt file and uses the delimconn loader connection to connect to the file.

loaddata delimconn datafile testdel.txt attr names delimiter . quote \" terminator crlf columns {{Ref Id} Source Amount}

When you specify multiple columns, separate the column names with spaces and enclose the whole list in braces ({}). If a column name contains whitespace, also enclose that column name in braces, as with {Ref Id} in the example.
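The brace rule can be expressed as a small Python sketch that builds the value for the columns parameter from a list of names. The function names are hypothetical, and the sketch covers only the multiple-column case described above.

```python
def brace(name):
    """Wrap a column name in braces if it contains whitespace."""
    if any(ch.isspace() for ch in name):
        return "{" + name + "}"
    return name


def columns_argument(names):
    """Join the column names with spaces and enclose the list in braces."""
    return "{" + " ".join(brace(n) for n in names) + "}"


print(columns_argument(["Ref Id", "Source", "Amount"]))
# prints: {{Ref Id} Source Amount}
```

The output matches the columns value used in the example command.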