Install Sedona Scala/Java
Before starting the Sedona journey, you need to make sure your Apache Spark cluster is ready.
There are two ways to use a Scala or Java library with Apache Spark. You can use either one to run Sedona.
- Spark interactive Scala or SQL shell: easy to start, good for new learners to try simple functions
- Self-contained Scala / Java project: a steeper learning curve because of package management, but a good fit for large projects
Spark Scala shell
Download Sedona jar automatically
- Have your Spark cluster ready.
- Run Spark shell with the --packages option. This command will automatically download the Sedona jars from Maven Central. Please refer to Sedona Maven Central coordinates to select the Sedona packages matching your Spark version. A registration sketch follows this list.
  ./bin/spark-shell --packages MavenCoordinates
  - Local mode: test Sedona without setting up a cluster
    ./bin/spark-shell --packages org.apache.sedona:sedona-spark-shaded-3.0_2.12:1.4.1,org.apache.sedona:sedona-viz-3.0_2.12:1.4.1,org.datasyslab:geotools-wrapper:1.4.0-28.2
  - Cluster mode: you need to specify the Spark master IP
    ./bin/spark-shell --master spark://localhost:7077 --packages org.apache.sedona:sedona-spark-shaded-3.0_2.12:1.4.1,org.apache.sedona:sedona-viz-3.0_2.12:1.4.1,org.datasyslab:geotools-wrapper:1.4.0-28.2
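Once the shell is up, Sedona's SQL functions still need to be registered on the session before you can call them. A minimal sketch of that step, assuming the 1.4.1 packages above (SedonaSQLRegistrator is Sedona's registration helper; the ST_Point query is only a smoke test):

    // Inside the spark-shell REPL, where `spark` is the prebuilt SparkSession
    import org.apache.sedona.sql.utils.SedonaSQLRegistrator

    // Register all ST_* SQL functions on this session
    SedonaSQLRegistrator.registerAll(spark)

    // Smoke test: if the jars loaded correctly, this prints one point geometry
    spark.sql("SELECT ST_Point(1.0, 2.0) AS geom").show()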
Download Sedona jar manually
- Have your Spark cluster ready.
- Download the Sedona jars:
  - Download the pre-compiled jars from Sedona Releases
  - Download / Git clone the Sedona source code and compile it yourself (see Compile Sedona)
- Run Spark shell with the --jars option. If you are using Spark 3.0 to 3.3, use jars whose filenames contain 3.0, such as sedona-spark-shaded-3.0_2.12-1.4.1; if you are using Spark 3.4 or higher, use jars with the Spark major.minor version in the filename, such as sedona-spark-shaded-3.4_2.12-1.4.1. A classpath check follows this list.
  ./bin/spark-shell --jars /Path/To/SedonaJars.jar
  - Local mode: test Sedona without setting up a cluster
    ./bin/spark-shell --jars /path/to/sedona-spark-shaded-3.0_2.12-1.4.1.jar,/path/to/sedona-viz-3.0_2.12-1.4.1.jar,/path/to/geotools-wrapper-1.4.0-28.2.jar
  - Cluster mode: you need to specify the Spark master IP
    ./bin/spark-shell --master spark://localhost:7077 --jars /path/to/sedona-spark-shaded-3.0_2.12-1.4.1.jar,/path/to/sedona-viz-3.0_2.12-1.4.1.jar,/path/to/geotools-wrapper-1.4.0-28.2.jar
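To confirm that the manually supplied jars actually reached the classpath, a quick check from the shell may help. This is a sketch: listJars is a standard SparkContext method, and the "sedona" substring filter is only illustrative:

    // Print the jars Spark distributed, keeping the Sedona ones
    spark.sparkContext.listJars().filter(_.contains("sedona")).foreach(println)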
Spark SQL shell
Please see Use Sedona in a pure SQL environment.
Self-contained Spark projects
A self-contained project allows you to create multiple Scala / Java files and write complex logic in one place. To use Sedona in your self-contained Spark project, you just need to add Sedona as a dependency in your pom.xml or build.sbt.
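For example, with sbt the dependency section of build.sbt might look like the sketch below. It reuses the 1.4.1 coordinates from the shell examples above; the Scala and Spark versions are assumptions, so verify everything against Sedona Maven Central coordinates:

    // build.sbt -- a sketch, not the canonical template
    scalaVersion := "2.12.17" // assumed; must match the _2.12 artifact suffix

    libraryDependencies ++= Seq(
      // Spark itself is "provided": the cluster supplies it at runtime
      "org.apache.spark"  %% "spark-sql"               % "3.3.2" % "provided", // assumed Spark version
      "org.apache.sedona" %% "sedona-spark-shaded-3.0" % "1.4.1",
      "org.datasyslab"    %  "geotools-wrapper"        % "1.4.0-28.2"
    )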
- To add Sedona as a dependency, please read Sedona Maven Central coordinates
- Use the Sedona template project to get started: Sedona Template Project
- Compile your project using SBT. Make sure you obtain the fat jar that packages all dependencies (see the note after this list).
- Submit your compiled fat jar to the Spark cluster. Make sure you are in the root folder of the Spark distribution, then run the following command:
./bin/spark-submit --master spark://YOUR-IP:7077 /Path/To/YourJar.jar
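If your build uses the common sbt-assembly plugin (an assumption; the template project may already configure it), the fat jar from the compile step is typically produced with:

    sbt assembly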
Note
The detailed explanation of spark-submit is available on the Spark website.
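For reference, here is a minimal self-contained entry point that the spark-submit command above could run. It is a sketch rather than the official template: the SedonaApp object name is made up, and the Kryo settings follow the serializer pattern Sedona's documentation recommends for geometry types:

    import org.apache.sedona.core.serde.SedonaKryoRegistrator
    import org.apache.sedona.sql.utils.SedonaSQLRegistrator
    import org.apache.spark.serializer.KryoSerializer
    import org.apache.spark.sql.SparkSession

    // Hypothetical application entry point for spark-submit
    object SedonaApp {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("SedonaApp")
          // Kryo serialization for Sedona geometry types
          .config("spark.serializer", classOf[KryoSerializer].getName)
          .config("spark.kryo.registrator", classOf[SedonaKryoRegistrator].getName)
          .getOrCreate()

        // Register the ST_* SQL functions before first use
        SedonaSQLRegistrator.registerAll(spark)

        spark.sql("SELECT ST_Point(1.0, 2.0) AS geom").show()
        spark.stop()
      }
    }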