Distributing Reference Data Using S3

Spectrum Routing Installation: Cloudera

Product type: Software
Portfolio: Locate
Product family: Spectrum
Product: Spatial Big Data > Routing for Big Data
Version: 5.1
Language: English
Product name: Spectrum Routing for Big Data
Title: Spectrum Routing Installation: Cloudera
Copyright: 2024
First publish date: 2017
Last updated: 2024-10-18
Published on: 2024-10-18T09:54:26.541418

Now that the SDK is installed and the routing reference data is configured, the reference data must be distributed across the cluster. Your cluster must already be set up for S3 access.

Use the s3a URI scheme on Cloudera distributions.
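
Before uploading, it can help to confirm that a target URI uses the s3a scheme and that the cluster can actually reach the bucket. The following is a minimal sketch of such a check, not part of the product: the function names `is_s3a_uri` and `check_s3_access` and the `HADOOP_CMD` variable are hypothetical helpers introduced here for illustration. It assumes the `hadoop` client is on the PATH and that S3 credentials are already configured for the cluster.

```shell
#!/usr/bin/env bash
# Illustrative helpers only; these names are assumptions, not part of the SDK.

# Command used to reach the Hadoop client. Override it to point at a
# different client (or a stub) when no cluster is available.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Succeeds only if the argument is an s3a:// URI with something after the scheme.
is_s3a_uri() {
  case "$1" in
    s3a://?*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Lists the bucket root through the s3a connector. A non-zero exit status
# suggests that S3 access is not set up correctly for this cluster.
check_s3_access() {
  local bucket="$1"
  "$HADOOP_CMD" fs -ls "s3a://${bucket}/" >/dev/null
}
```

For example, after sourcing these definitions, `check_s3_access <your-bucket> && echo "S3 reachable"` prints the message only if the listing succeeds.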

For the purpose of this guide, we will install the reference data into s3a://<your-bucket>/precisely/routing/data.

  1. Upload the reference data into S3.
    hadoop fs -mkdir -p s3a://<your-bucket>/precisely/routing
    hadoop fs -copyFromLocal /precisely/routing/data s3a://<your-bucket>/precisely/routing/data
  2. When a data node performs routing tasks, it downloads the reference data from S3 onto its local file system. This means a local download directory must exist on every data node. Run the following commands on all data nodes and HiveServer nodes.
    sudo mkdir -p /precisely/downloads
    sudo chown sdkuser:hadoop /precisely/downloads
    sudo chmod 775 /precisely/downloads
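
The two steps above can be sanity-checked with a short script. This is a sketch under stated assumptions rather than a documented procedure: `verify_upload`, `check_download_dir`, and `HADOOP_CMD` are hypothetical names introduced for illustration, and the directory check relies on GNU `stat`, which Linux data nodes normally provide. It checks only that the S3 directory exists and that the local directory has the expected mode. It does not check ownership or the contents of the data.

```shell
#!/usr/bin/env bash
# Illustrative checks only; function names and defaults are assumptions.

# Command used to reach the Hadoop client; override to use a stub.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Succeeds if the uploaded reference data directory exists in S3.
# Example target: s3a://<your-bucket>/precisely/routing/data
verify_upload() {
  local target="$1"
  "$HADOOP_CMD" fs -test -d "$target"
}

# Succeeds if the local download directory exists with the expected mode.
# Defaults match the directory and permissions used in this guide.
check_download_dir() {
  local dir="${1:-/precisely/downloads}"
  local want_mode="${2:-775}"
  local mode

  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi

  # GNU stat: %a prints the permission bits in octal (for example 775).
  mode="$(stat -c '%a' "$dir")" || return 1
  if [ "$mode" != "$want_mode" ]; then
    echo "mode of $dir is $mode, expected $want_mode" >&2
    return 1
  fi
}
```

Run `verify_upload` once from any node with the Hadoop client, and run `check_download_dir` on each data node and HiveServer node.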