Install Sedona Python

To try Sedona Python right away, open the interactive Sedona Python Jupyter notebook on Binder.

Apache Sedona extends PySpark functions and depends on the following libraries:

  • pyspark
  • shapely
  • attrs

Install any of these packages that your system does not already have. See "packages" in our Pipfile.
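One way to see which of these packages are already present is to query the installed distributions from Python. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_versions` is hypothetical and not part of Sedona, and the package list is simply the one given above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the dependency list above.
REQUIRED = ("pyspark", "shapely", "attrs")


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```

Any package reported as not installed can then be added with pip before installing Sedona.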

Install sedona

  • Installing from PyPI
pip install apache-sedona
  • Since Sedona v1.1.0, pyspark is an optional dependency of Sedona Python because Spark comes pre-installed on many Spark platforms. To install pyspark along with Sedona Python in one go, use the spark extra:
pip install apache-sedona[spark]
  • Installing from Sedona Python source

Clone the Sedona source code from GitHub and run the following commands:

cd python
python3 setup.py install

Prepare python-adapter jar

Sedona Python needs one additional jar file, sedona-python-adapter, to work properly. Make sure you use the version that matches your Spark and Scala versions. For Spark 3.0 + Scala 2.12, the file is sedona-python-adapter-3.0_2.12-1.2.1-incubating.jar.
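The file name in this example appears to combine the Spark version, the Scala version, and the Sedona version. The sketch below builds a name from those three parts. The pattern is inferred from the single example above, so it is an assumption, and the function name `adapter_jar_name` is hypothetical. Check the result against the actual release files before relying on it.

```python
def adapter_jar_name(spark_version: str, scala_version: str, sedona_version: str) -> str:
    """Build an adapter jar file name from its version components.

    Assumed pattern (inferred from one documented example):
    sedona-python-adapter-<spark>_<scala>-<sedona>.jar
    """
    return f"sedona-python-adapter-{spark_version}_{scala_version}-{sedona_version}.jar"


if __name__ == "__main__":
    # Reproduces the file name quoted in the text for Spark 3.0 + Scala 2.12.
    print(adapter_jar_name("3.0", "2.12", "1.2.1-incubating"))
```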

You can get it using one of the following methods:

  1. Compile from source within the main project directory and copy the resulting jar (in the python-adapter/target folder) to the SPARK_HOME/jars/ folder.

  2. Download it from the GitHub release and copy it to the SPARK_HOME/jars/ folder.

  3. Reference the Maven Central coordinates in your Python program. For example, in PySparkSQL:
    from pyspark.sql import SparkSession
    from sedona.utils import KryoSerializer, SedonaKryoRegistrator

    # Fetch the adapter and GeoTools wrapper jars from Maven Central at startup
    spark = SparkSession. \
        builder. \
        appName('appName'). \
        config("spark.serializer", KryoSerializer.getName). \
        config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
        config('spark.jars.packages',
               'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.2.1-incubating,'
               'org.datasyslab:geotools-wrapper:1.3.0-27.2'). \
        getOrCreate()
    

Warning

If you plan to use the Sedona CRS transformation and ShapefileReader functions, you have to use Method 1 or 3. These functions internally use GeoTools libraries, which are under the LGPL license, so the Apache Sedona binary release cannot include them.

Setup environment variables

If you manually copy the python-adapter jar to the SPARK_HOME/jars/ folder, you need to set up two environment variables:

  • SPARK_HOME. For example, run this command in your terminal:
export SPARK_HOME=~/Downloads/spark-3.0.1-bin-hadoop2.7
  • PYTHONPATH. For example, run this command in your terminal:
export PYTHONPATH=$SPARK_HOME/python
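Before starting a notebook, it can help to confirm that both variables are set and that an adapter jar is actually present. The following is a minimal sketch using only the standard library. The function name `check_spark_env` is hypothetical, and the `sedona-python-adapter-*.jar` glob is an assumption based on the file name shown earlier.

```python
import os
from pathlib import Path


def check_spark_env(env=os.environ):
    """Return a list of problems found; an empty list means the setup looks fine."""
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        return ["SPARK_HOME is not set"]

    problems = []
    home = Path(spark_home).expanduser()

    # The adapter jar is expected in SPARK_HOME/jars (assumed file name pattern).
    if not any((home / "jars").glob("sedona-python-adapter-*.jar")):
        problems.append(f"no sedona-python-adapter jar found in {home / 'jars'}")

    # PYTHONPATH should include SPARK_HOME/python.
    expected = str(home / "python")
    if expected not in env.get("PYTHONPATH", "").split(os.pathsep):
        problems.append(f"PYTHONPATH does not contain {expected}")

    return problems


if __name__ == "__main__":
    for problem in check_spark_env():
        print("Problem:", problem)
```

The check compares paths as plain strings, so a trailing slash or a symlinked path in PYTHONPATH would be reported as a mismatch even though it may work in practice.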

You can then try out the Sedona Python Jupyter notebook.


Last update: April 22, 2022 08:55:00