Sortwork Compression - Connect_ETL - 9.13

Connect ETL for Big Data Sort User Guide

Product type: Software
Portfolio: Integrate
Product family: Connect
Product: Connect > Connect (ETL, Sort, AppMod, Big Data)
Version: 9.13
Language: English
Product name: Connect ETL
Title: Connect ETL for Big Data Sort User Guide
Copyright: 2023
First publish date: 2003
Last updated: 2023-09-11
Published on: 2023-09-11T19:03:59.237517

During the map and reduce sort stages, if there is not enough memory to complete the sort, Connect for Big Data may spill some data to disk. This spilled data is referred to as “sortwork”. To reduce the performance cost of the resulting disk reads and writes, sortwork data is compressed using gzip. The performance gain is more significant on the reduce side, where there is more data, than on the map side.
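The trade-off behind sortwork compression can be illustrated with a minimal sketch. It does not use Connect ETL itself: it assumes a hypothetical spill of sorted, newline-delimited text records and uses Python's standard gzip module to compare the raw and compressed sizes. The function name spill_sizes and the sample records are illustrative only.

```python
import gzip
import random


def spill_sizes(records, level=6):
    """Return (raw_bytes, compressed_bytes) for one sorted run of text records.

    Hypothetical stand-in for a sortwork spill; not Connect ETL's on-disk format.
    """
    payload = "\n".join(sorted(records)).encode("utf-8")
    compressed = gzip.compress(payload, compresslevel=level)
    # Reading the spill back must yield exactly the bytes that were written.
    assert gzip.decompress(compressed) == payload
    return len(payload), len(compressed)


random.seed(0)
# Illustrative records with repeated key prefixes, as sorted data often has.
records = [
    f"customer-{random.randrange(10_000):05d},order-{i:07d}"
    for i in range(50_000)
]
raw, packed = spill_sizes(records)
print(f"raw={raw} bytes, gzip={packed} bytes, ratio={packed / raw:.2f}")
```

Fewer bytes written to and read from disk is the benefit; the CPU time spent in compression and decompression is the cost, which is why the setting matters for CPU-bound jobs.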

Sortwork compression is controlled by the dmx.sortwork.compress option, as described in Connect for Big Data Sort Accelerator Properties. Set it to off when the job is CPU-bound, because compression adds CPU overhead. Alternatively, set it to dynamic to let Connect ETL balance the performance trade-offs.
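As a rough sketch of applying this guidance, the helper below picks a value for dmx.sortwork.compress and formats it as a Hadoop generic -D option. Two things are assumptions rather than documented behavior: that the property is passed on the job command line as a -D option (see Connect for Big Data Sort Accelerator Properties for the supported mechanism), and the helper names choose_setting and sortwork_option, which are hypothetical. Only the values off and dynamic, which are named above, are accepted; other values may exist.

```python
SORTWORK_PROPERTY = "dmx.sortwork.compress"
# Only the values named in this section; the product may accept others.
DOCUMENTED_VALUES = {"off", "dynamic"}


def choose_setting(cpu_bound):
    """Pick 'off' for a CPU-bound job, otherwise defer to 'dynamic'."""
    return "off" if cpu_bound else "dynamic"


def sortwork_option(value):
    """Format the setting as a Hadoop-style generic option (assumed mechanism)."""
    if value not in DOCUMENTED_VALUES:
        raise ValueError(f"unsupported value for {SORTWORK_PROPERTY}: {value!r}")
    return ["-D", f"{SORTWORK_PROPERTY}={value}"]


# Example: a job known to be CPU-bound.
print(" ".join(sortwork_option(choose_setting(cpu_bound=True))))
```

When it is unclear whether a job is CPU-bound or I/O-bound, dynamic is the safer starting point, since it leaves the decision to Connect ETL.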