
Install Sedona Python

Click the Binder badge to launch the interactive Sedona Python Jupyter Notebook immediately!

Apache Sedona extends PySpark and depends on the following libraries:

  • pyspark
  • shapely
  • attrs

You need to install these packages if your system does not already have them. See "packages" in our Pipfile.
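As a quick sanity check, a short Python snippet (a sketch, not part of Sedona itself) can report which of these dependencies are missing from your environment before you start:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Sedona Python's runtime dependencies
required = ["pyspark", "shapely", "attrs"]
print(missing_packages(required))  # an empty list means you are ready
```

Any names printed by this check need to be installed with pip before Sedona will import cleanly.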

Install sedona

pip install apache-sedona
  • Since Sedona v1.1.0, pyspark is an optional dependency of Sedona Python because Spark comes pre-installed on many Spark platforms. To install pyspark along with Sedona Python in one go, use the spark extra:
pip install apache-sedona[spark]
  • Installing from Sedona Python source

Clone Sedona GitHub source code and run the following command

cd python
python3 setup.py install

Prepare sedona-spark-shaded jar

Sedona Python needs one additional jar file called sedona-spark-shaded to work properly. Please make sure you use the correct version for Spark and Scala. For Spark 3.0 + Scala 2.12, it is called sedona-spark-shaded-3.0_2.12-1.4.0.jar

You can get it using one of the following methods:

  1. Compile from source in the main project directory and copy the jar (in the spark-shaded/target folder) to the SPARK_HOME/jars/ folder (more details)

  2. Download from GitHub release and copy it to SPARK_HOME/jars/ folder

  3. Call the Maven Central coordinate in your python program. For example, in PySparkSQL
    spark = SparkSession. \
        builder. \
        appName('appName'). \
        config("spark.serializer", KryoSerializer.getName). \
        config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
        config('spark.jars.packages',
               'org.apache.sedona:sedona-spark-shaded-3.0_2.12:1.4.0,'
               'org.datasyslab:geotools-wrapper:1.4.0-28.2'). \
        getOrCreate()


If you are going to use Sedona CRS transformation and ShapefileReader functions, you must use Method 1 or 3, because these functions internally use GeoTools libraries, which are under the LGPL license and therefore cannot be included in the Apache Sedona binary release.
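If you chose Method 1 or 2 and copied the jar manually, a small helper like the following (a hypothetical check, not part of the Sedona API) can confirm that a sedona-spark-shaded jar actually landed in SPARK_HOME/jars/:

```python
import glob
import os

def find_sedona_jar(spark_home):
    """Look for any sedona-spark-shaded jar under SPARK_HOME/jars/."""
    pattern = os.path.join(spark_home, "jars", "sedona-spark-shaded-*.jar")
    return sorted(glob.glob(pattern))

spark_home = os.environ.get("SPARK_HOME", "")
print(find_sedona_jar(spark_home) or "sedona-spark-shaded jar not found")
```

An empty result means Spark will not see the shaded jar on its classpath and Sedona functions will fail at runtime.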

Setup environment variables

If you manually copy the sedona-spark-shaded jar to the SPARK_HOME/jars/ folder, you need to set up two environment variables:

  • SPARK_HOME. For example, run the command in your terminal
export SPARK_HOME=~/Downloads/spark-3.0.1-bin-hadoop2.7
  • PYTHONPATH. For example, run the command in your terminal
export PYTHONPATH=$SPARK_HOME/python

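Before launching the notebook, you can confirm both variables are visible from Python. This is a minimal sketch using only the standard library:

```python
import os

# Report the two environment variables the setup above relies on
for var in ("SPARK_HOME", "PYTHONPATH"):
    value = os.environ.get(var)
    print(f"{var} = {value if value else '(not set)'}")
```

If either prints "(not set)", re-run the export commands in the same shell session you use to start Jupyter.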
You can then play with the Sedona Python Jupyter notebook.

Last update: March 16, 2023 00:00:53