This Spark job generates the hexagons within a bounding box (for example, the bounding box of the continental USA). Hexagon output can be used for map display.
- Modify the command options according to the hexagons to be generated. Change the bounding box coordinates and hexagon level to suit your needs. See Hexagons to learn about hexagon levels.
- Deploy the jar and configuration to the Hadoop cluster.
- Start the Spark job using the following command:
spark-submit --class com.precisely.bigdata.li.spark.app.hexgen.HexGenDriver --master local <dir-on-server>\location-intelligence-bigdata-spark3drivers_2.12-0-SNAPSHOT-all.jar --output <output-path> --min-longitude -73.728200 --min-latitude 40.979800 --max-longitude -71.787480 --max-latitude 42.050496 --hex-level 3 --container-level 2 --number-of-partitions 1 --max-number-of-rows 5 --csv header=true --overwrite
The output of the HexGenerator is a list of WKT strings that represent the hexagons. See Consuming Results for how to use the output.
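
Because the output is WKT, each hexagon can be turned back into coordinate pairs with a few lines of code. The following is a minimal sketch, not part of the SDK: the function name `parse_wkt_polygon` is hypothetical, the sample string uses made-up coordinates rather than real job output, and only a single-ring `POLYGON` is handled.

```python
import re

def parse_wkt_polygon(wkt: str) -> list[tuple[float, float]]:
    """Return the ring of a single-ring WKT POLYGON as (x, y) pairs.

    Hypothetical helper for illustration; polygons with holes are not supported.
    """
    match = re.fullmatch(r"\s*POLYGON\s*\(\(\s*(.*?)\s*\)\)\s*", wkt, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"Not a single-ring WKT polygon: {wkt!r}")
    ring = []
    for pair in match.group(1).split(","):
        x, y = pair.split()          # WKT separates x and y with whitespace
        ring.append((float(x), float(y)))
    return ring

# Made-up hexagon: six vertices plus the repeated first vertex that closes the ring.
sample = "POLYGON ((0 0, 1 0, 1.5 1, 1 2, 0 2, -0.5 1, 0 0))"
ring = parse_wkt_polygon(sample)
```

A closed hexagon ring repeats its first vertex, so the parsed list has seven entries.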
Executing the Job
To run the Spark job, you must use the spark-submit script in Spark’s bin directory. Make sure to use the appropriate Spark3 jar for your installed distribution of Spark and Scala.
Driver class:
com.precisely.bigdata.li.spark.app.hexgen.HexGenDriver
Scala 2.12:
/precisely/li/software/spark3/sdk/lib/location-intelligence-bigdata-spark3drivers_2.12-sdk_version-all.jar
For example:
spark-submit
--class com.precisely.bigdata.li.spark.app.hexgen.HexGenDriver
--master local
C:\python\jars\location-intelligence-bigdata-spark3drivers_2.12-sdk_version-all.jar
--output /user/sdkuser/hexgen_output
--min-longitude -73.728200
--max-longitude -71.787480
--min-latitude 40.979800
--max-latitude 42.050496
--hex-level 3
--container-level 2
--number-of-partitions 5
--max-number-of-rows 100
--csv header=true
--overwrite
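
When the job is launched from a script, it can help to assemble the same invocation as an argument list rather than one long string. The sketch below assumes Python is available on the submitting machine; the function `build_hexgen_command` and its argument names are hypothetical, and the values are copied from the example above (including the `sdk_version` placeholder in the jar name).

```python
import shlex

def build_hexgen_command(jar_path, output_dir, min_lon, max_lon, min_lat, max_lat,
                         hex_level, container_level, partitions, max_rows):
    """Return the spark-submit invocation as a list of arguments (hypothetical helper)."""
    return [
        "spark-submit",
        "--class", "com.precisely.bigdata.li.spark.app.hexgen.HexGenDriver",
        "--master", "local",
        jar_path,
        "--output", output_dir,
        "--min-longitude", str(min_lon),
        "--max-longitude", str(max_lon),
        "--min-latitude", str(min_lat),
        "--max-latitude", str(max_lat),
        "--hex-level", str(hex_level),
        "--container-level", str(container_level),
        "--number-of-partitions", str(partitions),
        "--max-number-of-rows", str(max_rows),
        "--csv", "header=true",
        "--overwrite",  # flag without a value
    ]

# Coordinates are passed as strings so they appear exactly as written in the example.
command = build_hexgen_command(
    jar_path=r"C:\python\jars\location-intelligence-bigdata-spark3drivers_2.12-sdk_version-all.jar",
    output_dir="/user/sdkuser/hexgen_output",
    min_lon="-73.728200", max_lon="-71.787480",
    min_lat="40.979800", max_lat="42.050496",
    hex_level=3, container_level=2, partitions=5, max_rows=100,
)
print(shlex.join(command))  # shell-quoted form, useful for logging
```

The list can be passed to `subprocess.run(command, check=True)` to start the job, provided `spark-submit` is on the `PATH`.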
Job Parameters
Parameter | Description | Example |
---|---|---|
--min-longitude | Minimum longitude value of the bounding box for which you want to generate hexagons. | --min-longitude -73.728200 |
--max-longitude | Maximum longitude value of the bounding box for which you want to generate hexagons. | --max-longitude -71.787480 |
--min-latitude | Minimum latitude value of the bounding box for which you want to generate hexagons. | --min-latitude 40.979800 |
--max-latitude | Maximum latitude value of the bounding box for which you want to generate hexagons. | --max-latitude 42.050496 |
--hex-level | The level to generate hexagons for. Must be between 1 and 11. | --hex-level 3 |
--container-level | A hint for providing some parallel hexagon generation. Must be less than the hex-level property. | --container-level 2 |
--number-of-partitions | Number of partitions. | --number-of-partitions 5 |
--max-number-of-rows | Maximum number of rows per partition. | --max-number-of-rows 100 |
--output | The location of the directory for the output. | --output /user/sdkuser/hexgen_output |
--output-format | The output format. Valid values: csv or parquet. If not specified, the default is csv. | --output-format=csv |
--csv | Specify the options to be used when reading and writing CSV input and output files. | --csv header=true |
--parquet | Specify the options to be used when reading and writing parquet input and output files. | --parquet compression=gzip |
--overwrite | Including this parameter will tell the job to overwrite the output directory. Otherwise, the job will fail if this directory already has content. This parameter does not have a value. | --overwrite |
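
The constraints in the table can be checked before the job is submitted. The following sketch is an illustration only and is not part of the SDK: the function `validate_hexgen_args` is hypothetical, the hex-level range is assumed to be inclusive, and the check that each minimum is smaller than its maximum is an assumption about what makes a valid bounding box.

```python
def validate_hexgen_args(min_lon, max_lon, min_lat, max_lat, hex_level, container_level):
    """Return a list of problems found in the job parameters (empty if none)."""
    errors = []
    # From the table: --hex-level must be between 1 and 11 (assumed inclusive).
    if not 1 <= hex_level <= 11:
        errors.append("--hex-level must be between 1 and 11")
    # From the table: --container-level must be less than --hex-level.
    if container_level >= hex_level:
        errors.append("--container-level must be less than --hex-level")
    # Assumption: a usable bounding box has min strictly below max on both axes.
    if min_lon >= max_lon:
        errors.append("--min-longitude must be less than --max-longitude")
    if min_lat >= max_lat:
        errors.append("--min-latitude must be less than --max-latitude")
    return errors

# Values from the example command in this section.
problems = validate_hexgen_args(-73.728200, -71.787480, 40.979800, 42.050496,
                                hex_level=3, container_level=2)
```

For the example values the returned list is empty; a `--hex-level` of 12 or a `--container-level` equal to the hex level would each add one message.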