Connect for Big Data Component Setup and Operation

Connect ETL Installation Guide

Product type: Software
Portfolio: Integrate
Product family: Connect
Product: Connect > Connect (ETL, Sort, AppMod, Big Data)
Version: 9.13
Language: English
Product name: Connect ETL
Title: Connect ETL Installation Guide
Copyright: 2024
First publish date: 2003
Last updated: 2024-11-08
Published on: 2024-11-08T16:36:35.232000

A Connect for Big Data setup consists of the following:

  • Windows workstation
      • Connect must be installed as described in Step-by-Step Installation, Windows Systems.
      • The Connect Job and Task Editors are used for MapReduce job development.
      • MapReduce jobs are submitted to Hadoop from the Job Editor via the ETL server.
  • Linux server (edge node)
      • Connect must be installed as described in Step-by-Step Installation, UNIX Systems.
      • The Hadoop client must be installed and configured to connect to the Hadoop cluster.
      • The Editor Runtime Service, dmxd, must be running to respond to jobs run from the Windows workstation. It calls dmxjob with the /HADOOP option, which ultimately calls hadoop to submit jobs to the cluster.
  • Hadoop cluster
      • Connect must be installed without dmxd on all nodes in the Hadoop cluster as described in Step-by-Step Installation, Hadoop Cluster.
      • Each mapper runs the map-side task(s), and each reducer runs the reduce-side task(s).
      • All file descriptors for sources, targets, and intermediate files are connected so that they fit into the Hadoop MapReduce flow.
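
The edge node requirements above can be spot-checked with a short script. The following is a minimal sketch, not part of the product: it assumes that dmxd, dmxjob, and hadoop are reachable through the PATH of the account that runs it, which may not hold for every installation. The function names find_tools and missing_tools are hypothetical helpers introduced for illustration. The script only reports whether the executables can be located; it does not verify that dmxd is actually running or that the Hadoop client is configured for the cluster.

```python
import shutil
from typing import Callable, Dict, Iterable, List, Optional

# Executable names taken from the component list above.
# Assumption: all three are on PATH on the edge node; adjust if your
# installation places them elsewhere.
EDGE_NODE_TOOLS = ("dmxd", "dmxjob", "hadoop")


def find_tools(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path, or None if not found."""
    return {name: which(name) for name in names}


def missing_tools(found: Dict[str, Optional[str]]) -> List[str]:
    """Return the names that could not be located, in their original order."""
    return [name for name, path in found.items() if path is None]


if __name__ == "__main__":
    found = find_tools(EDGE_NODE_TOOLS)
    for name, path in found.items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
    missing = missing_tools(found)
    if missing:
        print("Missing on this node: " + ", ".join(missing))
```

Run on the edge node, all three names would be expected to resolve. On a Hadoop cluster node, where Connect is installed without dmxd, a missing dmxd is consistent with the setup described above.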