Connect for Big Data Sort is the Hadoop sort acceleration component of Connect for Big Data, the Hadoop-enabled edition of Connect ETL. It replaces the native sort in Hadoop MapReduce processing and improves performance without requiring programming changes to existing MapReduce jobs.
When Hadoop is configured to use Connect for Big Data Sort, it automatically invokes the Connect ETL runtime sort engine. The engine runs in parallel on all nodes as an integral part of the Hadoop framework, which increases the performance, scalability, and throughput of the Hadoop cluster.
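As a rough illustration of what "configured to use" a replacement sort can look like, the sketch below generates a Hadoop configuration fragment that sets the two pluggable sort and shuffle properties defined in Hadoop 2.x (`mapreduce.job.map.output.collector.class` and `mapreduce.job.reduce.shuffle.consumer.plugin.class`). It is an assumption that Connect for Big Data Sort is enabled through these properties, and the class names are hypothetical placeholders, not the product's actual classes; the product documentation gives the real settings.

```python
import xml.etree.ElementTree as ET

# Standard Hadoop 2.x property names for plugging in a custom sort/shuffle.
MAP_COLLECTOR_KEY = "mapreduce.job.map.output.collector.class"
REDUCE_SHUFFLE_KEY = "mapreduce.job.reduce.shuffle.consumer.plugin.class"

# HYPOTHETICAL placeholder class names, used only for illustration.
HYPOTHETICAL_COLLECTOR = "com.example.sort.ExampleMapOutputCollector"
HYPOTHETICAL_SHUFFLE = "com.example.sort.ExampleShuffleConsumerPlugin"


def build_sort_override(collector_class: str, shuffle_class: str) -> str:
    """Return a <configuration> XML fragment that overrides the native sort."""
    root = ET.Element("configuration")
    settings = (
        (MAP_COLLECTOR_KEY, collector_class),
        (REDUCE_SHUFFLE_KEY, shuffle_class),
    )
    for name, value in settings:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_sort_override(HYPOTHETICAL_COLLECTOR, HYPOTHETICAL_SHUFFLE))
```

Because the override lives entirely in configuration, the mapper and reducer code of an existing job stays unchanged, which is the point the paragraph above makes.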
Because it has a very small footprint and no dependencies on third-party applications at design time or run time, Connect for Big Data is easy to install and deploy on every node in a Hadoop cluster.