Install Sedona Python
Click to launch the interactive Sedona Python Jupyter Notebook and start playing with it immediately!
Apache Sedona extends PySpark with functions that depend on the following libraries:
- pyspark
- shapely
- attrs
You need to install these packages if your system does not already have them. See "packages" in our Pipfile.
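As a quick check before installing, the dependencies above can be probed with Python's standard importlib.metadata; a minimal sketch (the helper name is ours, not part of Sedona):

```python
# Report which of Sedona's Python dependencies are already installed.
from importlib import metadata

def dependency_versions(pkgs=("pyspark", "shapely", "attrs")):
    """Return a mapping of package name -> installed version (or None)."""
    versions = {}
    for pkg in pkgs:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = None  # not installed yet
    return versions

print(dependency_versions())
```

Any entry reported as None must be installed (e.g. via pip) before Sedona Python will work.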
Install Sedona¶
- Installing from PyPI repositories. You can find the latest Sedona Python on PyPI. There is a known issue in Sedona v1.0.1 and earlier versions.
pip install apache-sedona
- Since Sedona v1.1.0, pyspark is an optional dependency of Sedona Python because Spark comes pre-installed on many Spark platforms. To install pyspark along with Sedona Python in one go, use the spark extra:
pip install apache-sedona[spark]
- Installing from the Sedona Python source
Clone the Sedona GitHub repository and run the following commands
cd python
python3 setup.py install
Prepare python-adapter jar¶
Sedona Python needs one additional jar file, called sedona-python-adapter, to work properly. Please make sure you use the version that matches your Spark and Scala versions. For Spark 3.0 + Scala 2.12, it is called sedona-python-adapter-3.0_2.12-1.3.1-incubating.jar
You can get it using one of the following methods:
- Compile it from the source within the main project directory and copy it (in the python-adapter/target folder) to the SPARK_HOME/jars/ folder (more details)
- Download it from a GitHub release and copy it to the SPARK_HOME/jars/ folder
- Call the Maven Central coordinate in your Python program. For example, in PySpark SQL:
from pyspark.sql import SparkSession
from sedona.utils import KryoSerializer, SedonaKryoRegistrator

spark = SparkSession. \
    builder. \
    appName('appName'). \
    config("spark.serializer", KryoSerializer.getName). \
    config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
    config('spark.jars.packages',
           'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.3.1-incubating,'
           'org.datasyslab:geotools-wrapper:1.3.0-27.2'). \
    getOrCreate()
Warning
If you are going to use the Sedona CRS transformation and ShapefileReader functions, you have to use Method 1 or 3. Because these functions internally use GeoTools libraries, which are under the LGPL license, the Apache Sedona binary release cannot include them.
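The adapter jar name described above follows the pattern sedona-python-adapter-&lt;spark&gt;_&lt;scala&gt;-&lt;sedona-version&gt;.jar. A small helper to assemble it for your version combination (this helper is our own illustration, not part of the Sedona API):

```python
# Assemble the python-adapter jar name from a Spark/Scala/Sedona version triple.
# Illustrative only; not part of the Sedona API.
def adapter_jar_name(spark="3.0", scala="2.12", sedona="1.3.1-incubating"):
    return f"sedona-python-adapter-{spark}_{scala}-{sedona}.jar"

print(adapter_jar_name())  # sedona-python-adapter-3.0_2.12-1.3.1-incubating.jar
```

The default arguments reproduce the Spark 3.0 + Scala 2.12 example from this page.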
Set up environment variables¶
If you manually copy the python-adapter jar to the SPARK_HOME/jars/ folder, you need to set up two environment variables
- SPARK_HOME. For example, run the command in your terminal
export SPARK_HOME=~/Downloads/spark-3.0.1-bin-hadoop2.7
- PYTHONPATH. For example, run the command in your terminal
export PYTHONPATH=$SPARK_HOME/python
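If you prefer not to export shell variables, the same setup can be done from inside Python before pyspark is imported; a sketch assuming the example Spark path above:

```python
# Set SPARK_HOME and extend the module search path from inside Python.
# The Spark path below is the example from this page; adjust it to your install.
import os
import sys

spark_home = os.path.expanduser("~/Downloads/spark-3.0.1-bin-hadoop2.7")
os.environ["SPARK_HOME"] = spark_home
sys.path.insert(0, os.path.join(spark_home, "python"))  # same effect as PYTHONPATH
```

This must run before the first import of pyspark, since Python resolves modules against sys.path at import time.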
You can then play with Sedona Python Jupyter notebook.