Install Sedona Scala/Java
Before starting the Sedona journey, you need to make sure your Apache Spark cluster is ready.
There are two ways to use a Scala or Java library with Apache Spark. You can use either one to run Sedona.
- Spark interactive Scala or SQL shell: easy to start, good for new learners to try simple functions
- Self-contained Scala / Java project: a steeper learning curve because of package management, but good for large projects
Spark Scala shell
Download Sedona jar automatically
- Have your Spark cluster ready.
- Run Spark shell with the --packages option. This command will automatically download Sedona jars from Maven Central.
./bin/spark-shell --packages MavenCoordinates
Please refer to Sedona Maven Central coordinates to select the corresponding Sedona packages for your Spark version.
* Local mode: test Sedona without setting up a cluster
```
./bin/spark-shell --packages org.apache.sedona:sedona-spark-shaded-3.3_2.12:1.7.0,org.datasyslab:geotools-wrapper:1.7.0-28.5
```
* Cluster mode: you need to specify Spark Master IP
```
./bin/spark-shell --master spark://localhost:7077 --packages org.apache.sedona:sedona-spark-shaded-3.3_2.12:1.7.0,org.datasyslab:geotools-wrapper:1.7.0-28.5
```
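The coordinates passed to --packages follow a pattern: the Sedona artifact name carries the Spark major.minor version and the Scala version, followed by the Sedona version. As a minimal sketch, assuming the versions used in the examples above (the variable names are illustrative, not part of Sedona or Spark), the string can be assembled like this:

```shell
# Versions copied from the examples above; adjust them to match your cluster.
SPARK_VERSION="3.3"                      # Spark major.minor
SCALA_VERSION="2.12"
SEDONA_VERSION="1.7.0"
GEOTOOLS_WRAPPER_VERSION="1.7.0-28.5"

# Comma-separated Maven coordinates, with no spaces between them.
SEDONA_PACKAGES="org.apache.sedona:sedona-spark-shaded-${SPARK_VERSION}_${SCALA_VERSION}:${SEDONA_VERSION},org.datasyslab:geotools-wrapper:${GEOTOOLS_WRAPPER_VERSION}"

# Print the command instead of running it, so it can be reviewed first.
echo "./bin/spark-shell --packages ${SEDONA_PACKAGES}"
```

Always confirm the resulting coordinates against Sedona Maven Central coordinates, since not every version combination is necessarily published.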
Download Sedona jar manually
- Have your Spark cluster ready.
- Download Sedona jars, using either of these options:
  - Download the pre-compiled jars from Sedona Releases
  - Download / Git clone Sedona source code and compile the code by yourself (see Compile Sedona)
- Run Spark shell with the --jars option.
./bin/spark-shell --jars /Path/To/SedonaJars.jar
Please use jars with Spark major.minor versions in the filename, such as sedona-spark-shaded-3.3_2.12-1.7.0.
* Local mode: test Sedona without setting up a cluster
```
./bin/spark-shell --jars /path/to/sedona-spark-shaded-3.3_2.12-1.7.0.jar,/path/to/geotools-wrapper-1.7.0-28.5.jar
```
* Cluster mode: you need to specify Spark Master IP
```
./bin/spark-shell --master spark://localhost:7077 --jars /path/to/sedona-spark-shaded-3.3_2.12-1.7.0.jar,/path/to/geotools-wrapper-1.7.0-28.5.jar
```
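Because --jars expects a comma-separated list of paths, a typo in either path is easy to miss. The following is a small sketch, assuming both jars were downloaded into one directory (JAR_DIR is a hypothetical variable, and /path/to is the placeholder from the commands above), that warns about missing files and prints the command to run:

```shell
# Hypothetical directory holding the downloaded jars; change it to your location.
JAR_DIR="/path/to"
SEDONA_JAR="${JAR_DIR}/sedona-spark-shaded-3.3_2.12-1.7.0.jar"
GEOTOOLS_JAR="${JAR_DIR}/geotools-wrapper-1.7.0-28.5.jar"

# Warn on stderr about any jar that does not exist at the expected path.
for jar in "$SEDONA_JAR" "$GEOTOOLS_JAR"; do
  [ -f "$jar" ] || echo "warning: jar not found: $jar" >&2
done

# --jars takes a comma-separated list with no spaces.
JARS="${SEDONA_JAR},${GEOTOOLS_JAR}"

# Print the command instead of running it, so it can be reviewed first.
echo "./bin/spark-shell --jars ${JARS}"
```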
Spark SQL shell
Please see Use Sedona in a pure SQL environment
Self-contained Spark projects
A self-contained project allows you to create multiple Scala / Java files and write complex logic in one place. To use Sedona in your self-contained Spark project, you just need to add Sedona as a dependency in your pom.xml or build.sbt.
- To add Sedona as a dependency, please read Sedona Maven Central coordinates
- To get started quickly, use the Sedona Template Project
- Compile your project using SBT. Make sure you obtain the fat jar which packages all dependencies.
- Submit your compiled fat jar to the Spark cluster. Make sure you are in the root folder of the Spark distribution. Then run the following command:
./bin/spark-submit --master spark://YOUR-IP:7077 /Path/To/YourJar.jar
Note
A detailed explanation of spark-submit is available on the Spark website.
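The build and submit steps above can be sketched as a dry run that only prints the commands. This assumes the project builds its fat jar with the sbt-assembly plugin (the source does not say which plugin is used), and that the jar may need an explicit entry point passed through spark-submit's --class option. MAIN_CLASS is a hypothetical class name; the master URL and jar path are the placeholders from the command above.

```shell
SPARK_MASTER="spark://YOUR-IP:7077"   # placeholder from the command above
APP_JAR="/Path/To/YourJar.jar"        # placeholder from the command above
MAIN_CLASS="com.example.SedonaApp"    # hypothetical entry point; replace with your own

# Assumption: the sbt-assembly plugin is configured, so "assembly" builds the fat jar.
BUILD_CMD="sbt clean assembly"

# --class names the entry point; it can be dropped if the jar manifest already declares one.
SUBMIT_CMD="./bin/spark-submit --master ${SPARK_MASTER} --class ${MAIN_CLASS} ${APP_JAR}"

# Print both commands for review; run the first in the project root, the second in the Spark root.
echo "$BUILD_CMD"
echo "$SUBMIT_CMD"
```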