Data Input/Output (I/O)

Trillium Beyond The Basics Guide

Product type: Software
Portfolio: Verify
Product family: Trillium™ software
Product: Trillium™ Quality, Trillium™ Discovery
Version: Latest
Product name: Trillium Quality and Discovery
Copyright: 2024
First publish date: 2008
Last publication: 2024-10-18

Generally, TSS batch processing is I/O intensive rather than CPU intensive because it uses the ‘one record in, one record out’ I/O method. A typical batch job consists of multiple TSS processes controlled by a script. Each TSS process reads and writes flat files: it reads a single record from its input, processes it, and writes it to its output. The next process reads that output, and so on. The intermediate files may be real, landed files or pipes. The size of a flat file on disk is simply the record length multiplied by the number of records.
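The record-at-a-time pattern and the file-size arithmetic can be sketched in Python. This is an illustration only, not TSS code: the names `process_stream` and `expected_file_size`, the 16-byte record length, and the two transforms are all hypothetical, and in-memory buffers stand in for landed files or pipes.

```python
import io

RECORD_LENGTH = 16  # hypothetical fixed record length, in bytes


def process_stream(src, dst, transform, record_length=RECORD_LENGTH):
    """Read one fixed-length record, transform it, write it, and repeat."""
    count = 0
    while True:
        record = src.read(record_length)
        if not record:
            break  # end of input
        if len(record) != record_length:
            raise ValueError("truncated record at end of input")
        dst.write(transform(record))
        count += 1
    return count


def expected_file_size(record_length, record_count):
    """Size in bytes of a flat file of fixed-length records."""
    return record_length * record_count


# Two chained steps: the output of the first is the input of the second.
source = io.BytesIO(b"alice smith".ljust(RECORD_LENGTH) +
                    b"bob jones".ljust(RECORD_LENGTH))
intermediate = io.BytesIO()
final = io.BytesIO()

records_read = process_stream(source, intermediate, bytes.upper)
intermediate.seek(0)  # rewind so the next step reads from the start
process_stream(intermediate, final,
               lambda r: r.strip().ljust(RECORD_LENGTH))

print(records_read, len(final.getvalue()),
      expected_file_size(RECORD_LENGTH, records_read))
```

Only one record is held in memory at any moment, which is why a step like this spends most of its time waiting on reads and writes rather than on computation.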

The one exception is the matching process (Relationship Linker/Reference Matcher), which is both I/O and CPU intensive. To compare records with one another, the matcher must read a whole group of records into memory, and it does so as quickly as possible. I/O then stops while the matcher processes the group; during this phase the matcher is CPU intensive. When matching is done, I/O resumes and each member of the group is written to the output.
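The alternation between an I/O phase and a CPU phase can be sketched as follows. This is a minimal illustration under stated assumptions, not the Relationship Linker's or Reference Matcher's logic: it assumes the input arrives already ordered by a grouping key, and the function `match_groups`, the grouping key, the comparison rule, and the match-id scheme are all hypothetical.

```python
from itertools import groupby


def match_groups(records, group_key, is_match):
    """Yield (record, match_id) pairs, processing one group at a time."""
    next_id = 1
    for _, group in groupby(records, key=group_key):
        # I/O phase: pull the whole group into memory.
        members = list(group)

        # CPU phase: compare each record with the earlier records of its
        # group, up to n * (n - 1) / 2 comparisons for a group of n.
        ids = []
        for i, record in enumerate(members):
            for j in range(i):
                if is_match(members[j], record):
                    ids.append(ids[j])  # join the earlier record's match set
                    break
            else:
                ids.append(next_id)  # no match found: start a new set
                next_id += 1

        # I/O phase resumes: write out every member of the group.
        yield from zip(members, ids)


# Hypothetical input: (group key, name), already ordered by group key.
records = [
    ("SMITH", "John Smith"),
    ("SMITH", "Jon Smith"),
    ("SMITH", "Mary Smith"),
    ("JONES", "Al Jones"),
]

# Toy comparison rule: two names match if they share a first initial.
matched = list(match_groups(
    records,
    group_key=lambda r: r[0],
    is_match=lambda a, b: a[1][0] == b[1][0],
))
match_ids = [match_id for _, match_id in matched]
print(match_ids)
```

Because the comparison count grows roughly with the square of the group size, large groups shift the cost from I/O to CPU.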

Another variation is the Sort process. The Sort Utility sorts a file in memory and writes temporary sortwork files when it can no longer hold all records in memory. Together, these sortwork files take up as much disk space as the full flat input or output file. Once all records have been sorted or written to sortwork files, the Sort writes the output file. Typically, TSS does not change the shape of a record during a sort, so the input to a sort and its sorted output are the same size on disk. Sortwork files are deleted upon successful completion of the process, and they are written whether or not the project is piped.
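The disk-space behavior described above matches that of a generic external merge sort, sketched below. This is not the Sort Utility's implementation: the function names, the `sortwork_` file prefix, and the `max_in_memory` limit are hypothetical, and for brevity the sketch always spills to sortwork files, even when the input would fit in memory.

```python
import heapq
import os
import tempfile


def read_records(path, record_length):
    """Yield fixed-length records from a flat file, one at a time."""
    with open(path, "rb") as f:
        while True:
            record = f.read(record_length)
            if not record:
                return
            yield record


def external_sort(in_path, out_path, record_length, max_in_memory, key=None):
    """Sort a flat file via sortwork files; return the sortwork bytes written."""
    work_paths = []
    chunk = []

    def spill():
        # Sort the records held in memory and land them as one sortwork file.
        chunk.sort(key=key)
        fd, path = tempfile.mkstemp(prefix="sortwork_")
        with os.fdopen(fd, "wb") as work:
            work.writelines(chunk)
        work_paths.append(path)
        chunk.clear()

    for record in read_records(in_path, record_length):
        chunk.append(record)
        if len(chunk) == max_in_memory:
            spill()
    if chunk:
        spill()

    work_bytes = sum(os.path.getsize(p) for p in work_paths)

    # Merge the sorted sortwork files into the output, one record at a time.
    streams = [read_records(p, record_length) for p in work_paths]
    with open(out_path, "wb") as dst:
        for record in heapq.merge(*streams, key=key):
            dst.write(record)

    # Reached only on success: remove the sortwork files.
    for path in work_paths:
        os.remove(path)
    return work_bytes


with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, "input.dat")
    out_path = os.path.join(tmp, "sorted.dat")
    with open(in_path, "wb") as f:
        f.write(b"ddddbbbbeeeeaaaacccc")  # five 4-byte records
    work_bytes = external_sort(in_path, out_path,
                               record_length=4, max_in_memory=2)
    in_size = os.path.getsize(in_path)
    out_size = os.path.getsize(out_path)
    with open(out_path, "rb") as f:
        sorted_bytes = f.read()

print(in_size, work_bytes, out_size)
```

While the sort runs, the input file and the sortwork files exist side by side, and the output is written before the sortwork files are removed, so peak disk usage in this sketch approaches three times the input size.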