Sample Hadoop Configuration File - Connect_ETL - 9.13

Connect ETL for Big Data Sort User Guide

Product type
Software
Portfolio
Integrate
Product family
Connect
Product
Connect > Connect (ETL, Sort, AppMod, Big Data)
Version
9.13
Language
English
Product name
Connect ETL
Copyright
2023
First publish date
2003
Last updated
2023-09-11
Published on
2023-09-11T19:03:59.237517

When you invoke the hadoop command with the -conf option to specify configuration parameters, as described in Chapter 4, "Using Connect for Big Data Sort", you must provide an XML configuration file that conforms to the Hadoop configuration file schema.

The following sample file includes the required Connect for Big Data Sort properties as well as some optional ones that override default settings. You can also specify other Hadoop properties in this file to override the site-wide settings. Replace connect_install_dir with the actual directory in which Connect for Big Data was installed.
<?xml version="1.0"?>
<configuration>
  <!-- Required properties for Connect for Big Data Sort -->
  <property>
    <name>mapreduce.job.map.output.collector.class</name>
    <value>com.syncsort.dmexpress.hadoop.DMXMapOutputCollector</value>
  </property>
  <property>
    <name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
    <value>com.syncsort.dmexpress.hadoop.DMXShuffleConsumerPlugin</value>
  </property>
  <property>
    <name>dmx.home.dir</name>
    <value>connect_install_dir</value>
  </property>
  <!-- Override dynamically set memory values -->
  <property>
    <name>dmx.map.memory</name>
    <value>1024</value> <!-- 1GB -->
  </property>
  <property>
    <name>dmx.reduce.memory</name>
    <value>4096</value> <!-- 4GB -->
  </property>
</configuration>
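
Before passing a file like the sample above to the hadoop command, it can help to confirm that the file is well-formed XML, that the three required properties are present, and that the connect_install_dir placeholder has been replaced. The following Python script is a minimal sketch of such a check and is not part of the product. The function names read_properties and check_config are hypothetical, and the script assumes that the required properties are exactly the three listed under "Required properties" in the sample.

```python
import xml.etree.ElementTree as ET

# Property names copied from the "Required properties" part of the sample
# file. Treating exactly these three as required is an assumption.
REQUIRED = (
    "mapreduce.job.map.output.collector.class",
    "mapreduce.job.reduce.shuffle.consumer.plugin.class",
    "dmx.home.dir",
)

# Placeholder text that the guide says must be replaced.
PLACEHOLDER = "connect_install_dir"


def read_properties(xml_text):
    """Return {name: value} for each <property> directly under <configuration>."""
    # ET.fromstring raises xml.etree.ElementTree.ParseError on malformed XML.
    root = ET.fromstring(xml_text)
    if root.tag != "configuration":
        raise ValueError("root element must be <configuration>")
    props = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def check_config(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    props = read_properties(xml_text)
    problems = [
        f"missing required property: {name}"
        for name in REQUIRED
        if name not in props
    ]
    if props.get("dmx.home.dir") == PLACEHOLDER:
        problems.append("dmx.home.dir still contains the placeholder value")
    return problems
```

To use the sketch, read the configuration file into a string and pass it to check_config, for example check_config(open("my_sort_conf.xml").read()), where my_sort_conf.xml is a hypothetical file name. The check covers only what this topic describes. It does not validate property values or any other Hadoop settings in the file.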