Before you can use the Routing function, you must install routing data following the instructions in the Release Notes that accompany the data. See the following sections for specific data installation requirements for Hive and Spark.
Data Installation for Hive
A data resource file (dbList.json) that contains the datasets and database configurations is required. A sample dbList.json file is provided in the distribution at spectrum-bigdata-routing-version.zip/resources/config.
The paths element of each datasets entry must point to the extracted dataset, or to its .spd file if the data is stored in a remote location such as HDFS or S3.
{
  "defaultDatabase": "US",
  "datasets": [
    {
      "id": "US",
      "paths": ["hdfs:///precisely/routing/data/US_Driving.spd"]
    }
  ],
  "databases": [
    {
      "name": "US",
      "datasets": ["US"]
    }
  ]
}
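Before deploying a dbList.json file, it can help to check that its entries reference one another consistently. The following Python sketch is not part of the product. The function name check_db_list is hypothetical, and the rules it applies (every dataset lists at least one path, every database refers only to defined dataset ids, and defaultDatabase names a defined database) are assumptions inferred from the sample above rather than documented requirements.

```python
import json

# The sample configuration shown above, embedded so the sketch is self-contained.
SAMPLE = """
{
  "defaultDatabase": "US",
  "datasets": [
    {"id": "US", "paths": ["hdfs:///precisely/routing/data/US_Driving.spd"]}
  ],
  "databases": [
    {"name": "US", "datasets": ["US"]}
  ]
}
"""


def check_db_list(config):
    """Return a list of problems found in a parsed dbList.json structure.

    The rules checked here are assumptions inferred from the sample file,
    not requirements taken from the product documentation.
    """
    problems = []

    # Collect dataset ids and flag datasets that list no paths.
    dataset_ids = set()
    for dataset in config.get("datasets", []):
        dataset_ids.add(dataset.get("id"))
        if not dataset.get("paths"):
            problems.append(f"dataset {dataset.get('id')!r} has no paths")

    # Collect database names and flag references to undefined datasets.
    database_names = set()
    for database in config.get("databases", []):
        database_names.add(database.get("name"))
        for ref in database.get("datasets", []):
            if ref not in dataset_ids:
                problems.append(
                    f"database {database.get('name')!r} references "
                    f"unknown dataset {ref!r}"
                )

    # The default database should be one of the defined databases.
    default = config.get("defaultDatabase")
    if default not in database_names:
        problems.append(f"defaultDatabase {default!r} is not a defined database")

    return problems


if __name__ == "__main__":
    print(check_db_list(json.loads(SAMPLE)))  # prints []
```

An empty list means no inconsistencies were found under these assumed rules. The check does not verify that the paths exist on HDFS or S3.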
Data Installation for Spark
The dbList.json data resource file is not required for the Spark API. You only need to install the routing data and note its path, which you will use later when setting up the distribution of the data in your cluster.