Release notes

Sedona 1.7.2¶

Sedona 1.7.2 is compiled against Spark 3.3 / Spark 3.4 / Spark 3.5, Flink 1.19, Snowflake 7+, Java 8.

This release is a minor release that only includes bug fixes. No API-breaking changes or behavior changes are expected.

New Contributors¶

@MadhuriRathod30 made their first contribution in https://github.com/apache/sedona/pull/1906
@jgoday made their first contribution in https://github.com/apache/sedona/pull/1923
@oliverbeagley made their first contribution in https://github.com/apache/sedona/pull/1929
@cgauvi made their first contribution in https://github.com/apache/sedona/pull/1962

Bug¶

SEDONA-722: Fix precision loss problems caused by casting world coordinates from double to float
SEDONA-704: Optimize STAC reader and fix a few issues
SEDONA-715: Unify Zeppelin and Jupyter setting in Docker
GH-1868: Fix spark sql extension load failure when parser failed to load
SEDONA-726: Fix ST_Force_2D and add ST_Force2D
SEDONA-704: Add the grid extension to the stac reader
SEDONA-724: Fix RS_ZonalStats and RS_ZonalStatsAll edge case bug
SEDONA-728: Fix Rasterization clamping bug
SEDONA-690: Set default metric to use Haversine for KNN join and code refactoring
SEDONA-731: Add osm nodes parser
GH-1922: ST_X/Y/Z ON null geometries
GH-1761: Error when invalid ST_Subdivide maxVertices argument
GH-1910: Import geopandas only when type checking
GH-1931: Move packaging module import to geopandas try-except block
SEDONA-734: Fix relation parsing in OSM reader
SEDONA-735: Fix RS_Clip bug caused by AOI geometries smaller than pixel size
GH-1945: Shade Jiffle and its dependencies
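
SEDONA-690 above makes Haversine the default distance metric for KNN joins. As a minimal sketch of what that metric computes (this is not Sedona's implementation), the great-circle distance between two longitude/latitude points can be written in plain Python. The names `haversine_m` and `knn`, and the mean Earth radius of 6,371,008.8 m, are assumptions chosen for illustration; Sedona may use a different radius and certainly uses a distributed join rather than a brute-force sort.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # assumed mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Great-circle distance in meters between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 so rounding error near antipodal points cannot break asin.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


def knn(query, candidates, k):
    """Brute-force k nearest neighbors of `query` (lon, lat) by Haversine distance."""
    return sorted(candidates, key=lambda p: haversine_m(*query, *p))[:k]
```

Because the inputs are in longitude/latitude order, the result is in meters regardless of how far the points are from the equator, which is why this metric suits KNN queries on geographic coordinates.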

Sedona 1.7.1¶

Sedona 1.7.1 is compiled against Spark 3.3 / Spark 3.4 / Spark 3.5, Flink 1.19, Snowflake 7+, Java 8.

This release is a minor release that includes new features, improvements, and bug fixes. No API-breaking changes or behavior changes are expected.

New Contributors¶

@MrPowers made their first contribution in https://github.com/apache/sedona/pull/1735
@paleolimbot made their first contribution in https://github.com/apache/sedona/pull/1748
@kadolor made their first contribution in https://github.com/apache/sedona/pull/1762
@BaseMax made their first contribution in https://github.com/apache/sedona/pull/1808
@ruanqizhen made their first contribution in https://github.com/apache/sedona/pull/1822
@sshiv012 made their first contribution in https://github.com/apache/sedona/pull/1826

Highlights¶

[SEDONA-689] SQL interface for GeoStats including ST_DBSCAN, ST_GLocal, and ST_LocalOutlierFactor
[SEDONA-690] Broadcast join support for distributed KNN Join
GeoArrow-enhanced GeoPandas input and output of Sedona DataFrame
[SEDONA-695] Expose spatial partitioning structure from SpatialRDD to enable better partitioning for GeoParquet
[SEDONA-704] STAC catalog reader
[SEDONA-713] OpenStreetMap (OSM) PBF reader
[SEDONA-707] Significant performance improvement on geometry rasterization such as RS_AsRaster
Several cool ST functions such as ST_RemoveRepeatedPoints

Bug¶

[SEDONA-688] - running ST_KNN() error : java.lang.ArithmeticException: / by zero
[SEDONA-690] - Optimize query side broadcast KNN join
[SEDONA-694] - Error message for optional includes refers to old pypi package name
[SEDONA-696] - Fix issue with geopackage datasource on databricks.
[SEDONA-698] - Fix ST_RemoveRepeatedPoints
[SEDONA-699] - When loading metadata for relatively huge geoparquet files Sedona application stops responding.
[SEDONA-700] - ST_KNN fails on null and empty geometries
[SEDONA-706] - Python DataFrame API has problems working in a multi-threaded environment
[SEDONA-716] - MERGE INTO TABLE Doesn't Work with Sedona 1.7.0
[SEDONA-718] - Auto Detect geometry column in GeoJSON writer

New Feature¶

[SEDONA-693] - Add ST_Perimeter2D
[SEDONA-704] - Add the STAC datasource reader
[SEDONA-707] - Add allTouched parameter for RS_functions that perform rasterization (Geometry to Raster)

Improvement¶

[SEDONA-685] - R – Switch shapefile and geojson readers to DataFrame API sources
[SEDONA-689] - Geostats in SQL
[SEDONA-695] - Expose spatial partitioning structure from SpatialRDD
[SEDONA-705] - Add spatial partitioners that don't introduce duplicates
[SEDONA-708] - SedonaPython should use PyArrow to get GeoPandas DataFrame
[SEDONA-711] - Add Geography user-defined type
[SEDONA-715] - Add Zeppelin Notebook support along with visualization plugin for the docker image
[SEDONA-717] - Fix dataframe_to_arrow() for zero-row case
[SEDONA-719] - Support reading Shapefile with Z/M ordinates

Task¶

[SEDONA-25] - Change Scala Seq type in Adapter
[SEDONA-46] - Add Postgis equivalent ST_ClusterDBSCAN to Apache Sedona.
[SEDONA-713] - Create OSM reader to Apache Sedona
[SEDONA-714] - Add conversion from geopandas to Sedona using arrow.

Sedona 1.7.0¶

Sedona 1.7.0 is compiled against Spark 3.3 / Spark 3.4 / Spark 3.5, Flink 1.19, Snowflake 7+, Java 8.

This release is a major release that includes new features, improvements, bug fixes, API breaking changes, and behavior changes.

New Contributors¶

@mvaaltola made their first contribution in https://github.com/apache/sedona/pull/1574
@emmanuel-ferdman made their first contribution in https://github.com/apache/sedona/pull/1658
@MohammadLotfiA made their first contribution in https://github.com/apache/sedona/pull/1659
@golfalot made their first contribution in https://github.com/apache/sedona/pull/1673
@AmirTallap made their first contribution in https://github.com/apache/sedona/pull/1675
@freamdx made their first contribution in https://github.com/apache/sedona/pull/1704

Highlights¶

Add a new join algorithm for distributed K Nearest Neighbor Join and a corresponding ST_KNN function
Add new spatial statistics algorithms DBSCAN, Local Outlier Factor, and Getis Ord Hot Spot Analysis
Add new DataFrame-based readers for Shapefile and GeoPackage
Add 10 new ST functions

API breaking changes¶

Support for Spark 3.0, 3.1, and 3.2 is dropped. Sedona is now only compatible with Spark 3.3, 3.4, and 3.5.
Rasterio is no longer a mandatory dependency. You can still use Sedona Raster without rasterio. If you need to write a rasterio UDF in Sedona, you can install rasterio separately.

Behavior changes¶

JTS version is upgraded to 1.20.0. This may cause some behavior changes in ST functions that rely on JTS.
ST_Length, ST_Length2D, and ST_LengthSpheroid now only return the length for line objects. They now return 0 for polygon objects.
ST_Perimeter now only returns the perimeter for polygon objects. It now returns 0 for line objects.
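
To make the ST_Length and ST_Perimeter change concrete, here is a minimal pure-Python sketch of the new dispatch rule. It is not Sedona's implementation: the `("LineString", coords)` and `("Polygon", ring)` tuples and the helper names are hypothetical stand-ins for real geometry objects, and only planar distances and exterior rings are handled.

```python
import math


def _path_length(coords):
    """Sum of planar segment lengths along a coordinate sequence."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def st_length(geom):
    """Length of line geometries; 0 for polygons (the behavior since 1.7.0)."""
    kind, coords = geom
    return _path_length(coords) if kind == "LineString" else 0.0


def st_perimeter(geom):
    """Perimeter of polygon geometries; 0 for lines (the behavior since 1.7.0)."""
    kind, coords = geom
    return _path_length(coords) if kind == "Polygon" else 0.0


line = ("LineString", [(0, 0), (3, 4)])
square = ("Polygon", [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])  # closed ring
```

Queries that previously relied on ST_Length to measure polygon boundaries would, under this rule, need to call ST_Perimeter instead.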

Bug¶

[SEDONA-650] - Fiona-Geopandas Compatibility Issue in Python 3.8
[SEDONA-665] - Docker build failed at ubuntu 22 with rasterio 1.4.0+
[SEDONA-669] - GeoParquet format should handle timestamp_ntz columns properly
[SEDONA-670] - GeoJSON reader does not work properly on DBR
[SEDONA-672] - Bug fix for ST_LengthSpheroid
[SEDONA-673] - Cannot load GeoParquet without bbox metadata when spatial filter is applied
[SEDONA-677] - Kryo deserialization for null envelopes results in unit envelopes
[SEDONA-682] - Sedona Spark 3.3 does not compile on Scala 2.13

New Feature¶

[SEDONA-646] - Shapefile data source for DataFrame API
[SEDONA-647] - Add ST_RemoveRepeatedPoints
[SEDONA-648] - Implement Distributed K Nearest Neighbor Join
[SEDONA-652] - Add ST_MakeEnvelope
[SEDONA-654] - Add ST_RotateY
[SEDONA-655] - DBSCAN
[SEDONA-656] - Add ST_Project
[SEDONA-658] - Add ST_Simplify
[SEDONA-659] - Upgrade jts version to 1.20.0
[SEDONA-661] - Local Outlier Factor
[SEDONA-664] - Add native GeoPackage reader
[SEDONA-666] - Add ST_Scale and ST_ScaleGeom
[SEDONA-667] - Getis Ord G Local
[SEDONA-671] - Spider random spatial data generator
[SEDONA-675] - Add ST_InterpolatePoint
[SEDONA-676] - Add ST_Perimeter

Improvement¶

[SEDONA-636] - datatype geometry is not supported when 'create table xxx (geom geometry)'
[SEDONA-640] - Refactor support for multiple spark versions in the build
[SEDONA-642] - R – Adapt R package for split version of jars
[SEDONA-644] - R – Update for SedonaContext
[SEDONA-649] - Fix spelling in Java files
[SEDONA-653] - Add lenient mode for RS_Clip
[SEDONA-663] - Support spark connect in dataframe api
[SEDONA-678] - Fix ST_Length and ST_Length2D behavior
[SEDONA-679] - Fix ST_LengthSpheroid behavior

Task¶

[SEDONA-651] - Add spark prefix to all sedona spark config
[SEDONA-662] - Clean Up Dead Code from DBSCAN
[SEDONA-668] - Drop the support of Spark 3.0, 3.1, 3.2
[SEDONA-674] - Make the rasterio binding for sedona-python work with GDAL 3.10
[SEDONA-680] - Remove rasterio from mandatory dependency
[SEDONA-681] - Bump GeoTools version from 28.2 to 28.5
[SEDONA-683] - Exclude some repetitive dependencies

Sedona 1.6.1¶

Sedona 1.6.1 is compiled against Spark 3.3 / Spark 3.4 / Spark 3.5, Flink 1.19, Snowflake 7+, Java 8.

This release is a maintenance release that includes bug fixes and minor improvements.

New Contributors¶

@zhangfengcdt made their first contribution in https://github.com/apache/sedona/pull/1431
@james-willis made their first contribution in https://github.com/apache/sedona/pull/1453

Highlights¶

Add native DataFrame based GeoJSON reader and writer
48 new ST functions added
GeoParquet reader and writer support the GeoParquet 1.1.0 covering column
Improve the error handling of ST functions so that the error message includes the geometry that caused the error

API breaking changes¶

The following raster functions now return struct type outputs instead of array types.
RS_Metadata
RS_SummaryStatsAll
RS_ZonalStatsAll
RS_GeoTransform

Bug¶

[SEDONA-560] - Spatial join involving dataframe containing 0 partition throws exception
[SEDONA-561] - Failed to run examples in the core.showcase package
[SEDONA-580] - New instances of RasterUDT object is not equal to the RasterUDT case object
[SEDONA-581] - SedonaKepler fails to reload if a raster column exists
[SEDONA-605] - RS_AsRaster(useGeometryExtent=false) does not work with reference rasters with scaleX/Y < 1
[SEDONA-608] - Fix ST_IsPolygonCW, ST_IsPolygonCCW, ST_ForcePolygonCW and ST_ForcePolygonCCW
[SEDONA-609] - Fix python 3.12 build issue caused by binary compatibility issues with numpy 2.0.0
[SEDONA-611] - Cannot write rasters to S3 on EMR
[SEDONA-618] - Maven build failed with javadoc classes and package list files missing
[SEDONA-624] - Distance join throws java.lang.reflect.InvocationTargetException when working with aggregation functions
[SEDONA-626] - SRID of geometries returned by many ST functions are incorrect
[SEDONA-628] - Python DataFrame Functions Cannot Be Imported As Documented
[SEDONA-639] - ST_Split may produce inaccurate results when splitting linestrings

New Feature¶

[SEDONA-462] - ST_IsValidDetail
[SEDONA-486] - Implement ST_MMin
[SEDONA-487] - Implement ST_MMax
[SEDONA-562] - Add native DataFrame based GeoJSON reader and writer
[SEDONA-563] - Add ST_GeomFromEWKB
[SEDONA-564] - Add ST_NumInteriorRing
[SEDONA-565] - Add ST_ForceRHR
[SEDONA-566] - Add ST_TriangulatePolygon
[SEDONA-567] - Add ST_M
[SEDONA-569] - Add ST_PointZM
[SEDONA-570] - Add ST_PointM
[SEDONA-571] - Add ST_MMin
[SEDONA-572] - Add ST_PointFromWKB
[SEDONA-573] - Add ST_HasM
[SEDONA-574] - Add ST_MMax
[SEDONA-575] - Add ST_LineFromWKB
[SEDONA-576] - Add ST_HasZ
[SEDONA-577] - Add ST_GeometryFromText
[SEDONA-578] - Add ST_Points
[SEDONA-579] - Add ST_AsHEXEWKB
[SEDONA-582] - Add ST_PointFromGeoHash
[SEDONA-583] - Add ST_Length2D
[SEDONA-584] - Add ST_Zmflag
[SEDONA-585] - Add ST_ForceCollection
[SEDONA-586] - Add ST_Force3DZ
[SEDONA-587] - Add ST_Force3DM
[SEDONA-588] - Add ST_Force4D
[SEDONA-589] - Add ST_LongestLine
[SEDONA-590] - Add ST_GeomColFromText
[SEDONA-591] - Add ST_MaxDistance
[SEDONA-592] - Add ST_MPointFromText
[SEDONA-593] - Add ST_Relate
[SEDONA-594] - Add ST_RelateMatch
[SEDONA-595] - Add ST_LineStringFromWKB
[SEDONA-596] - Add ST_SimplifyVW
[SEDONA-597] - Add ST_SimplifyPolygonHull
[SEDONA-598] - Add ST_UnaryUnion
[SEDONA-599] - Add ST_MinimumClearance
[SEDONA-600] - Add ST_MinimumClearanceLine
[SEDONA-601] - Add ST_DelaunayTriangles
[SEDONA-602] - Add ST_LocateAlong
[SEDONA-603] - Add ST_MakePointM
[SEDONA-604] - Add ST_AddMeasure
[SEDONA-606] - Add ST_IsValidDetail
[SEDONA-607] - Include Geometry in ST Function Exceptions
[SEDONA-610] - Add ST_IsValidTrajectory
[SEDONA-615] - Add ST_MaximumInscribedCircle
[SEDONA-617] - Add ST_Rotate
[SEDONA-625] - Add ST_GeneratePoints
[SEDONA-627] - Writing covering column metadata to GeoParquet files
[SEDONA-631] - Add ST_Expand
[SEDONA-643] - Fix Flink constructor functions signatures
[SEDONA-645] - Add ST_RotateX

Improvement¶

[SEDONA-558] - Fix and improve SedonaPyDeck behavior
[SEDONA-559] - Make the flink example work
[SEDONA-568] - Refactor TestBaseScala to use method instead of a class-level variable for sparkSession
[SEDONA-616] - Apply spotless to snowflake module
[SEDONA-620] - Simplify Java if statements
[SEDONA-621] - Remove redundant call to `toString()`
[SEDONA-622] - Improve SedonaPyDeck behavior
[SEDONA-623] - Simplify Java `if` statements
[SEDONA-629] - Return Structs for RS_ Functions
[SEDONA-632] - Don't use a conventional output committer when writing raster files using df.write.format("raster")
[SEDONA-633] - Add tileWidth and tileHeight fields to the result of RS_Metadata
[SEDONA-634] - Support omitting tileWidth and tileHeight parameters when calling RS_Tile or RS_TileExplode on rasters with decent tiling scheme
[SEDONA-635] - Allow feature and feature collection format in ST_AsGeoJSON
[SEDONA-637] - Show spatial filters pushed to GeoParquet scans in the query plan
[SEDONA-638] - Send telemetry data asynchronously to avoid blocking the initialization of SedonaContext

Task¶

[SEDONA-101] - Add Scala Formatter to MVN
[SEDONA-102] - Java Code Formatting using formatter plugin
[SEDONA-553] - Update Sedona docker to use newer GeoPandas

Sedona 1.6.0¶

Sedona 1.6.0 is compiled against Spark 3.3 / Spark 3.4 / Spark 3.5, Flink 1.19, Snowflake 7+, Java 8.

New Contributors¶

@mpetazzoni made their first contribution in https://github.com/apache/sedona/pull/1216
@sebdiem made their first contribution in https://github.com/apache/sedona/pull/1217
@guilhem-dvr made their first contribution in https://github.com/apache/sedona/pull/1229
@niklas-petersen made their first contribution in https://github.com/apache/sedona/pull/1252
@mebrein made their first contribution in https://github.com/apache/sedona/pull/1334
@docete made their first contribution in https://github.com/apache/sedona/pull/1409

Highlights¶

Sedona is now compatible with Shapely 2.0 and GeoPandas 0.11.1+.
Sedona added enhanced support for geography data. This includes:
- ST_Buffer with spheroid distance
- ST_BestSRID to find the best SRID for a geometry
- ST_ShiftLongitude to shift the longitude of a geometry to mitigate the issue of crossing the date line
- ST_CrossesDateLine to check if a geometry crosses the date line
- ST_DWithin now supports spheroid distance
Sedona Spark Sedona Raster allows RS_ReprojectMatch to wrap the extent of one raster to another raster, similar to the RasterArray.reproject_match function in rioxarray
Sedona Spark Sedona Raster now supports rasterio and NumPy UDFs through raster.as_numpy, raster.as_numpy_masked, and raster.as_rasterio. You can apply any native rasterio or NumPy function and run it in parallel. See the example below.

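from pyspark.sql.functions import expr
from pyspark.sql.types import DoubleType


def mean_udf(raster):
    # Convert the raster to a NumPy array and average all pixel values.
    return float(raster.as_numpy().mean())


# `sedona` is the session returned by SedonaContext.create;
# `df_raster` is a DataFrame with a raster column named `rast`.
sedona.udf.register("mean_udf", mean_udf, DoubleType())
df_raster.withColumn("mean", expr("mean_udf(rast)")).show()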
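
For the date-line helpers listed in the highlights above (ST_ShiftLongitude and ST_CrossesDateLine), the underlying idea can be sketched in plain Python. This is an illustration only: the function names are hypothetical, geometries are reduced to lists of `(lon, lat)` tuples, and the rules shown (shift negative longitudes by +360 and longitudes above 180 by -360; flag any consecutive vertices more than 180 degrees apart) follow the PostGIS-style convention, which is an assumption about Sedona's exact edge-case handling.

```python
def shift_longitude(coords):
    """Map longitudes in [-180, 0) to [180, 360) and (180, 360] to (-180, 0]."""
    shifted = []
    for lon, lat in coords:
        if lon < 0:
            lon += 360
        elif lon > 180:
            lon -= 360
        shifted.append((lon, lat))
    return shifted


def crosses_date_line(coords):
    """True if any two consecutive vertices are more than 180 degrees apart in longitude."""
    return any(abs(b[0] - a[0]) > 180 for a, b in zip(coords, coords[1:]))


# A short segment that straddles the antimeridian.
line = [(179.0, 10.0), (-179.0, 11.0)]
```

After shifting, the two vertices sit at longitudes 179 and 181, so the segment no longer appears to span almost the whole globe.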

Bug¶

[SEDONA-532] - Sedona Spark SQL optimizer cannot optimize joins with complex conditions
[SEDONA-543] - RS_Union_aggr gives referenceRaster is null error when run on cluster

New Feature¶

[SEDONA-467] - Add optimized join support for ST_DWithin
[SEDONA-468] - Add provision to use spheroid distance in ST_DWithin
[SEDONA-475] - Add RS_NormalizeAll to normalize all bands of a raster
[SEDONA-480] - Implement ST_S2ToGeom
[SEDONA-481] - Implements ST_Snap
[SEDONA-484] - Implement ST_IsPolygonCW
[SEDONA-488] - ST_Buffer with spheroid distance
[SEDONA-498] - Add ST_BestSRID
[SEDONA-499] - Add Spheroidal ST_Buffer
[SEDONA-504] - Add ST_ShiftLongitude
[SEDONA-508] - Add ST_CrossesDateLine
[SEDONA-509] - Add Single Statistic RS_SummaryStats
[SEDONA-514] - Add RS_SetPixelType
[SEDONA-516] - Add RS_Interpolate
[SEDONA-517] - Add RS_MakeRaster for constructing a new raster using given array data as band data
[SEDONA-518] - Add RS_ReprojectMatch for wrapping the extent of one raster to another raster
[SEDONA-522] - Add ST_Union with array of Geometry as input
[SEDONA-533] - Implement ST_Polygonize
[SEDONA-539] - Support Snowflake geography type

Improvement¶

[SEDONA-483] - Implements ST_IsPolygonCCW
[SEDONA-493] - Update default behavior of RS_NormalizeAll
[SEDONA-503] - Support Shapely 2.0 in PySpark binding
[SEDONA-521] - Change ST_H3ToGeom Behavior
[SEDONA-549] - RS_Union_aggr should support combining all bands in multi-band rasters

Task¶

[SEDONA-540] - Fix failed ST_Buffer and ST_Snap Snowflake tests
[SEDONA-550] - Remove the version upper bound of Pandas, GeoPandas
[SEDONA-557] - Bump Flink from 1.14.x to 1.19.0

Sedona 1.5.3¶

Sedona 1.5.3 is compiled against Spark 3.3 / Spark 3.4 / Spark 3.5, Flink 1.12, Snowflake 7+, Java 8.

This release is a maintenance release that includes bug fixes on top of Sedona 1.5.2. No new features or major changes are added in this release.

Bug¶

[SEDONA-556] - Hidden requirement for geopandas in apache-sedona 1.5.2
[SEDONA-555] - Snowflake Native App should not always create a new role

Sedona 1.5.2¶

Sedona 1.5.2 is compiled against Spark 3.3 / Spark 3.4 / Spark 3.5, Flink 1.12, Snowflake 7+, Java 8.

This release is a maintenance release that includes bug fixes and minor improvements. No new features or major changes are added in this release.

New Contributors¶

@mpetazzoni made their first contribution in https://github.com/apache/sedona/pull/1216
@sebdiem made their first contribution in https://github.com/apache/sedona/pull/1217
@guilhem-dvr made their first contribution in https://github.com/apache/sedona/pull/1229
@niklas-petersen made their first contribution in https://github.com/apache/sedona/pull/1252
@mebrein made their first contribution in https://github.com/apache/sedona/pull/1334

Bug¶

[SEDONA-470] - Cannot distinguish between missing or null crs from the result of geoparquet.metadata
[SEDONA-471] - SedonaKepler cannot work with Uber H3 hex since 1.5.1
[SEDONA-472] - Adapter API no longer works with unshaded jar
[SEDONA-473] - cdm-core mistakenly becomes a compile dependency for sedona-spark-shaded
[SEDONA-477] - Avoid producing rasters with images having non-zero origins
[SEDONA-478] - Sedona 1.5.1 context initialization fails without GeoTools coverage
[SEDONA-479] - Fix RS_Normalize: Incorrect behavior for double arrays
[SEDONA-494] - Raster data source cannot write to HDFS
[SEDONA-495] - Raster data source uses shared FileSystem connections which lead to race condition
[SEDONA-497] - SpatialRDD read from multiple Shapefiles has incorrect fieldName property
[SEDONA-500] - Cannot correctly read data from directories containing multiple shapefiles
[SEDONA-501] - ST_Split maps to wrong Java-call
[SEDONA-505] - Treat geometry with SRID=0 as if it was in EPSG:4326 in various raster functions
[SEDONA-507] - RS_AsImage cannot visualize rasters with non-integral band data
[SEDONA-510] - geometry columns with snake_case names in GeoParquet files cannot be recognized as geometry column
[SEDONA-511] - geometry columns with snake_case names in GeoParquet files cannot be recognized as geometry column
[SEDONA-519] - ST_SubDivide (Snowflake) fails even on documentation example
[SEDONA-520] - Missing dependencies in Snowflake JAR
[SEDONA-531] - RDD spatial join in Python throws Not available error
[SEDONA-534] - Disable Python warning message of finding jars
[SEDONA-545] - Sedona Python DataFrame API fail due to missing commas
[SEDONA-548] - Fix Python Dataframe API Constructor registrations

Improvement¶

[SEDONA-474] - Remove manipulation of warnings config
[SEDONA-506] - Add lenient mode for RS_ZonalStats and RS_ZonalStatsAll
[SEDONA-512] - Python serializer should report the object type in the error message
[SEDONA-515] - Add handling for noDataValues in RS_Resample
[SEDONA-529] - Add basic `EditorConfig` file
[SEDONA-535] - Add the pull request labeler
[SEDONA-536] - Add CODEOWNERS file
[SEDONA-541] - Allow concurrent snowflake testers

Test¶

[SEDONA-513] - Add pre-commit hook `mixed-line-ending`
[SEDONA-523] - Add pre-commit hook `fix-byte-order-marker`
[SEDONA-524] - Clean up the `pre-commit` config
[SEDONA-525] - Add two more pre-commit hooks
[SEDONA-528] - Add `pre-commit` hook `check-yaml`
[SEDONA-530] - Add `pre-commit` hook `debug-statements`
[SEDONA-537] - Add pre-commit hook `requirements-txt-fixer`
[SEDONA-538] - Add four more pre-commit hooks
[SEDONA-542] - Add `pre-commit` hook `check-executables-have-shebangs`
[SEDONA-544] - Add `ruff-pre-commit` for `Python` linting
[SEDONA-546] - Python linting enable rule `E712`

Task¶

[SEDONA-469] - Update Sedona docker and binder to use 1.5.1
[SEDONA-496] - Dependabot: reduce the open pull requests limit to 2
[SEDONA-526] - Upgrade `actions/setup-java` to `v4`

Sedona 1.5.1¶

Sedona 1.5.1 is compiled against Spark 3.3 / Spark 3.4 / Spark 3.5, Flink 1.12, Snowflake 7+, Java 8.

Highlights¶

Sedona Snowflake Add support for Snowflake
Sedona Spark Support Spark 3.5
Sedona Spark Support Snowflake 7+
Sedona Spark Added 20+ raster functions (or variants)
Sedona Spark/Flink/Snowflake Added 7 vector functions (or variants)
Sedona Spark GeoParquet reader and writer supports projjson in metadata
Sedona Spark GeoParquet reader and writer conform to GeoParquet spec 1.0.0 instead of 1.0.0-beta1
Sedona Spark Added a legacyMode in GeoParquet reader for 1.5.1+ users to read Parquet files written by Sedona 1.3.1 and earlier
Sedona Spark Fixed a bug in GeoParquet writer so 1.3.1 and earlier users can read Parquet files written by 1.5.1+

Behavior change¶

All raster functions that take a geometry will implicitly transform the CRS of the geometry if needed.
The default CRS for these functions is 4326 for raster and geometry involved in raster functions, if not specified.
KeplerGL and DeckGL become optional dependencies for Sedona Spark Python.

New Contributors¶

@hongbo-miao made their first contribution in https://github.com/apache/sedona/pull/1063
@prantogg made their first contribution in https://github.com/apache/sedona/pull/1122
@MyEnthusiastic made their first contribution in https://github.com/apache/sedona/pull/1130
@duhaode520 made their first contribution in https://github.com/apache/sedona/pull/1193

Bug¶

[SEDONA-414] - ST_MakeLine in sedona-spark does not work with array inputs
[SEDONA-417] - Fix SedonaUtils.display_image
[SEDONA-419] - SedonaKepler and SedonaPyDeck should not be in `sedona.spark`
[SEDONA-420] - Make SedonaKepler and SedonaPydeck optional dependencies
[SEDONA-424] - Specify jt-jiffle as a provided dependency
[SEDONA-426] - Change cloning of rasters to be able to include metadata.
[SEDONA-440] - GeoParquet reader should support filter pushdown on nested fields
[SEDONA-443] - Upload-artifact leads to 503 error
[SEDONA-453] - Performance degrade when indexing points using Quadtree
[SEDONA-456] - SedonaKepler cannot work with geopandas >= 0.13.0 correctly

New Feature¶

[SEDONA-369] - Add ST_DWITHIN
[SEDONA-411] - Add RS_Rotation
[SEDONA-413] - Add buffer parameters to ST_Buffer
[SEDONA-415] - Add optional parameter to ST_Transform
[SEDONA-421] - Add RS_Clip
[SEDONA-422] - Add a feature in RS_SetBandNoDataValue and fix NoDataValue in RS_Clip
[SEDONA-427] - Add RS_RasterToWorldCoord
[SEDONA-428] - Add RS_ZonalStats & RS_ZonalStatsAll
[SEDONA-430] - geoparquet writer should have an option called `writeToCrs`
[SEDONA-431] - Add RS_PixelAsPoints
[SEDONA-432] - Add RS_PixelAsCentroids
[SEDONA-433] - Improve RS_SummaryStats performance
[SEDONA-435] - Add RS_PixelAsPolygons
[SEDONA-438] - Add NetCDF reader to Sedona
[SEDONA-439] - Add RS_Union_Aggr
[SEDONA-441] - Implement ST_LineLocatePoint
[SEDONA-449] - Add two raster column support to RS_MapAlgebra
[SEDONA-455] - Add a new data source namely geoparquet.metadata
[SEDONA-459] - Add Snowflake support
[SEDONA-460] - RS_Tile and RS_TileExplode
[SEDONA-461] - ST_IsValidReason
[SEDONA-465] - Support reading legacy parquet files written by Apache Sedona <= 1.3.1-incubating

Improvement¶

[SEDONA-339] - Skip irrelevant GitHub actions
[SEDONA-416] - importing SedonaContext, kepler.gl is not found.
[SEDONA-429] - geoparquet reader/writer should print "1.0.0" in its version
[SEDONA-434] - Improve reliability by resolving the nondeterministic order of the Map
[SEDONA-436] - Fix RS_SetValues bug
[SEDONA-437] - Add implicit CRS transformation
[SEDONA-446] - Add floating point datatype support in RS_AsBase64
[SEDONA-448] - RS_SetBandNoDataValue should have `replace` option
[SEDONA-454] - Change the default value of sedona.global.indextype from quadtree to rtree
[SEDONA-457] - Don't write GeometryUDT into org.apache.spark.sql.parquet.row.metadata when writing GeoParquet files
[SEDONA-464] - ST_Valid should have integer flags
[SEDONA-466] - RS_AsRaster does not use the width and height of the raster in its parameters.

Test¶

[SEDONA-410] - pre-commit: check that scripts with shebangs are executable
[SEDONA-412] - pre-commit: add hook `end-of-file-fixer`
[SEDONA-423] - pre-commit: apply hook `end-of-file-fixer` to more files
[SEDONA-442] - pre-commit: add hook markdown-lint
[SEDONA-444] - pre-commit: add hook to trim trailing whitespace
[SEDONA-445] - pre-commit: apply hook end-of-file-fixer to more files
[SEDONA-447] - pre-commit: apply end-of-file-fixer to more files
[SEDONA-463] - Add a Makefile for convenience

Task¶

[SEDONA-450] - Support Spark 3.5
[SEDONA-458] - The docs should have examples for UDF

Sedona 1.5.0¶

Sedona 1.5.0 is compiled against Spark 3.3 / Spark 3.4, Flink 1.12, Java 8.

Highlights¶

API breaking changes:

The following functions in Sedona require input data in longitude/latitude order; otherwise they might throw errors. You can use ST_FlipCoordinates to swap X and Y.
- ST_Transform
- ST_DistanceSphere
- ST_DistanceSpheroid
- ST_GeoHash
- All ST_H3 functions
- All ST_S2 functions
- All RS constructors
- All RS predicates
- Spark RDD: CRStransform
Rename RS_Count to RS_CountValue
Drop RS_HTML
Unshaded Sedona Spark code is all merged into a single jar, sedona-spark
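
The longitude/latitude ordering requirement above means data stored as (latitude, longitude) needs its first two ordinates swapped before it reaches the affected functions. A minimal sketch of that swap in plain Python follows; `flip_coordinates` and `looks_like_lon_lat` are hypothetical helpers operating on coordinate tuples, not Sedona APIs, and the range check is only a heuristic.

```python
def flip_coordinates(coords):
    """Swap the first two ordinates of every vertex, e.g. (lat, lon) -> (lon, lat)."""
    return [(y, x, *rest) for x, y, *rest in coords]


def looks_like_lon_lat(coords):
    """Heuristic check: longitude within [-180, 180] and latitude within [-90, 90]."""
    return all(-180 <= lon <= 180 and -90 <= lat <= 90 for lon, lat, *_ in coords)


# San Francisco given as (lat, lon), which is the wrong order for these functions.
lat_lon = [(37.77, -122.42)]
lon_lat = flip_coordinates(lat_lon)
```

The heuristic only catches points whose swapped latitude falls outside the valid range, so it can flag some mis-ordered data but cannot prove that the order is correct.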

New features

Add 18 more ST functions for vector data processing in Sedona Spark and Sedona Flink
Add 36 more RS functions in Sedona Spark to support comprehensive raster data ETL and analytics
- You can now directly join vector and raster datasets together
- Flexible map algebra equations: SELECT RS_MapAlgebra(rast, 'D', 'out = (rast[3] - rast[0]) / (rast[3] + rast[0]);') as ndvi FROM raster_table
Add native support of Uber H3 functions in Sedona Spark and Sedona Flink.
Add SedonaKepler and SedonaPyDeck for interactive map visualization on Sedona Spark.
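
The RS_MapAlgebra script in the list above, `out = (rast[3] - rast[0]) / (rast[3] + rast[0]);`, is evaluated once per pixel. The following plain-Python sketch shows the same arithmetic on flat pixel lists. It assumes band 0 holds red and band 3 holds near-infrared, as the `ndvi` alias suggests; the function name and the NaN fill for a zero denominator are choices made for this sketch, not documented Sedona behavior.

```python
def ndvi(bands):
    """Apply out = (rast[3] - rast[0]) / (rast[3] + rast[0]) to every pixel.

    `bands` is a list of equally sized flat pixel lists, indexed from 0
    like `rast[...]` in the map algebra script.
    """
    red, nir = bands[0], bands[3]
    out = []
    for r, n in zip(red, nir):
        total = n + r
        # Guard against division by zero; the NaN fill is this sketch's choice.
        out.append((n - r) / total if total != 0 else float("nan"))
    return out


# Four bands of three pixels each; only bands 0 and 3 are used.
bands = [[1.0, 2.0, 0.0], [0.0] * 3, [0.0] * 3, [3.0, 2.0, 0.0]]
result = ndvi(bands)
```

The `'D'` argument in the SQL call requests a double-precision output band, which matches the floating-point values produced here.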

Bug¶

[SEDONA-318] - SerDe for RasterUDT performs poorly
[SEDONA-319] - RS_AddBandFromArray does not always produce serializable rasters
[SEDONA-322] - The "Scala and Java build" CI job occasionally fail
[SEDONA-325] - RS_FromGeoTiff is leaking file descriptors
[SEDONA-329] - Remove geometry_col parameter from SedonaKepler APIs
[SEDONA-330] - Fix bugs in SedonaPyDeck
[SEDONA-332] - RS_Value and RS_Values don't need to fetch all the pixel data
[SEDONA-337] - Failure falling back to pure python implementation when geomserde_speedup is unavailable
[SEDONA-338] - Refactor Raster construction in sedona to use AffineTransform instead of envelope
[SEDONA-358] - Refactor Functions to remove geotools dependency for most vector functions
[SEDONA-362] - RS_BandAsArray truncates the decimal part of float/double pixel values.
[SEDONA-373] - Move RasterPredicates to correct raster package to prevent redundant imports
[SEDONA-394] - fix RS_Band data type bug
[SEDONA-401] - Handle null values in RS_AsMatrix
[SEDONA-402] - Floor grid coordinates received from geotools
[SEDONA-403] - Add Null tolerance to RS_AddBandFromArray
[SEDONA-405] - Sedona driver Out of Memory on 1.4.1

New Feature¶

[SEDONA-200] - Add ST_CoordDim to Sedona
[SEDONA-213] - Add ST_BoundingDiagonal to Sedona
[SEDONA-237] - Implement ST_Dimension
[SEDONA-238] - Implement OGC GeometryType
[SEDONA-293] - Implement ST_IsCollection
[SEDONA-294] - Implement ST_Angle
[SEDONA-295] - Implement ST_LineInterpolatePoint in Flink
[SEDONA-296] - Implement ST_Multi in Sedona Flink
[SEDONA-298] - Implement ST_ClosestPoint
[SEDONA-299] - Implement ST_FrechetDistance
[SEDONA-300] - Implement ST_HausdorffDistance
[SEDONA-301] - Implement ST_Affine
[SEDONA-303] - Port all Sedona Spark functions to Sedona Flink
[SEDONA-310] - Add ST_Degrees to sedona
[SEDONA-314] - Support Optimized join on ST_HausdorffDistance
[SEDONA-315] - Support Optimized join on ST_FrechetDistance
[SEDONA-321] - Implement RS_Intersects(raster, geom)
[SEDONA-323] - Add wrapper for KeplerGl visualization in sedona
[SEDONA-328] - Add wrapper for pydeck visualizations in sedona
[SEDONA-331] - Add RS_Height and RS_Width
[SEDONA-334] - Add ScaleX and ScaleY
[SEDONA-335] - Add RS_PixelAsPoint
[SEDONA-336] - Add RS_UpperLeftX and RS_UpperLeftY
[SEDONA-340] - Add RS_ConvexHull
[SEDONA-343] - Add raster predicates: Contains and Within
[SEDONA-344] - Add RS_RasterToWorldCoordX, RS_RasterToWorldCoordY
[SEDONA-346] - Add RS_WorldToRaster APIs
[SEDONA-353] - Add RS_BandNoDataValue
[SEDONA-354] - Add RS_SkewX and RS_SkewY
[SEDONA-355] - Add RS_BandPixelType
[SEDONA-357] - Implement ST_VoronoiPolygons
[SEDONA-359] - Add RS_GeoReference
[SEDONA-361] - Add RS_MapAlgebra for performing map algebra operations using simple expressions
[SEDONA-363] - Add RS_PixelAsPolygon
[SEDONA-364] - Add RS_MinConvexHull
[SEDONA-366] - Add RS_Count
[SEDONA-367] - Add RS_PixelAsCentroid
[SEDONA-368] - Add RS_SummaryStats
[SEDONA-371] - Add optimized join support for raster-vector and raster-raster(if any) joins
[SEDONA-372] - Add RS_SetGeoReference
[SEDONA-375] - Add RS_SetBandNoDataValue
[SEDONA-376] - Add RS_SetValues
[SEDONA-378] - Add RS_SetValue
[SEDONA-379] - Add RS_AsBase64
[SEDONA-383] - Add RS_Band
[SEDONA-387] - Add RS_BandIsNoData
[SEDONA-388] - Add RS_AsRaster
[SEDONA-391] - Add RS_AsMatrix
[SEDONA-393] - Add RS_AsPNG
[SEDONA-395] - Add RS_AsImage
[SEDONA-396] - Add RS_SetValues Geometry variant
[SEDONA-398] - Add RS_AddBand
[SEDONA-404] - Add RS_Resample

Improvement¶

[SEDONA-39] - Fix the Lon/lat order issue in Sedona
[SEDONA-114] - Add ST_MakeLine to Apache Sedona
[SEDONA-142] - Add ST_Collect to Flink Catalog
[SEDONA-311] - Refactor InferredExpression to handle functions with arbitrary arity
[SEDONA-313] - Refactor ST_Affine to support signature like PostGIS
[SEDONA-324] - R – Fix failing tests
[SEDONA-326] - Improve raster band algebra functions for easier preprocessing of raster data
[SEDONA-327] - Refactor InferredExpression to handle GridCoverage2D
[SEDONA-333] - Support EWKT parser in ST_GeomFromWKT
[SEDONA-347] - Centralize usages of transform()
[SEDONA-350] - Refactor RS_AddBandFromArray to allow adding a custom noDataValue
[SEDONA-352] - Refactor MakeEmptyRaster to allow setting custom datatype for the raster
[SEDONA-360] - Handle nodata values of raster bands in a more concise way
[SEDONA-365] - Refactor RS_Count to RS_CountValue
[SEDONA-374] - RS predicates should support (geom, rast) and (rast, rast) as arguments, and use the convex hull of rasters for spatial relationship testing
[SEDONA-385] - Set the Maven Central to be the first repository to check
[SEDONA-386] - Speed up GridCoverage2D serialization
[SEDONA-392] - Add five more pre-commit hooks
[SEDONA-399] - Support Uber H3 cells
[SEDONA-400] - pre-commit add hook to ensure that links to vcs websites are permalinks
[SEDONA-408] - Set a reasonable default size for RasterUDT

Task¶

[SEDONA-316] - Refactor Sedona Jupyter notebook examples with unified SedonaContext entrypoint
[SEDONA-317] - Change map visualization in Jupyter notebooks with KeplerGL
[SEDONA-341] - Move RS_Envelope to GeometryFunctions
[SEDONA-356] - Change CRS transformation from lat/lon to lon/lat order
[SEDONA-370] - Completely drop the old GeoTiff reader and writer
[SEDONA-377] - Change sphere/spheroid functions to work with coordinates in lon/lat order
[SEDONA-380] - Merge all Sedona Spark module to a single module
[SEDONA-381] - Merge python-adapter to sql module
[SEDONA-382] - Merge SQL and Core module to a single Spark module
[SEDONA-384] - Merge viz module to the spark module
[SEDONA-397] - Move Map Algebra functions

Sedona 1.4.1¶

Sedona 1.4.1 is compiled against Spark 3.3 / Spark 3.4 / Flink 1.12, Java 8.

Highlights¶

Sedona Spark Added more raster functions and bridged RasterUDT with Map Algebra operators. See Raster based operators and Raster to Map Algebra operators.
Sedona Spark & Flink Added geodesic / geography functions:
- ST_DistanceSphere
- ST_DistanceSpheroid
- ST_AreaSpheroid
- ST_LengthSpheroid
Sedona Spark & Flink Introduced SedonaContext to unify Sedona entry points.
Sedona Spark Support Spark 3.4.
Sedona Spark Added a number of new ST functions.
Zeppelin Zeppelin helium plugin supports plotting geometries like linestring, polygon.

API change¶

Sedona Spark & Flink Introduced a new entry point called SedonaContext to unify all Sedona entry points across compute engines and deprecate the old Sedona register entry points. Users no longer have to register the Sedona Kryo serializer or import many Python classes.

Sedona Spark:

Scala:

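import org.apache.sedona.spark.SedonaContext
// Build (or reuse) a Spark session, then register Sedona's types and functions on it
val config = SedonaContext.builder().master("local[*]").getOrCreate()
val sedona = SedonaContext.create(config)
// 'POINT (1 2)' is an example WKT literal; substitute your own geometry or column
sedona.sql("SELECT ST_GeomFromWKT('POINT (1 2)') AS geom").show()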

Python:

from sedona.spark import *

config = (
    SedonaContext.builder()
    .config(
        "spark.jars.packages",
        "org.apache.sedona:sedona-spark-shaded-3.3_2.12:1.4.1,"
        "org.datasyslab:geotools-wrapper:1.4.0-28.2",
    )
    .getOrCreate()
)
sedona = SedonaContext.create(config)
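# 'POINT (1 2)' is an example WKT literal; substitute your own geometry or column
sedona.sql("SELECT ST_GeomFromWKT('POINT (1 2)') AS geom").show()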

Sedona Flink:

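import org.apache.sedona.flink.SedonaContext;
// env and tableEnv are your existing StreamExecutionEnvironment and StreamTableEnvironment
StreamTableEnvironment sedona = SedonaContext.create(env, tableEnv);
// 'POINT (1 2)' is an example WKT literal; substitute your own geometry or column
sedona.sqlQuery("SELECT ST_GeomFromWKT('POINT (1 2)') AS geom");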

Bug¶

[SEDONA-266] - RS_Values throws UnsupportedOperationException for shuffled point arrays
[SEDONA-267] - Cannot pip install apache-sedona 1.4.0 from source distribution
[SEDONA-273] - Set an upper bound for Shapely, Pandas and GeoPandas
[SEDONA-277] - Sedona spark artifacts for scala 2.13 do not have proper POMs
[SEDONA-283] - Artifacts were deployed twice when running mvn clean deploy
[SEDONA-284] - Property values in dependency deduced POMs for shaded modules were not substituted

New Feature¶

[SEDONA-196] - Add ST_Force3D to Sedona
[SEDONA-239] - Implement ST_NumPoints
[SEDONA-264] - zeppelin helium plugin supports plotting geometry like linestring, polygon
[SEDONA-280] - Add ST_GeometricMedian
[SEDONA-281] - Support geodesic / geography functions
[SEDONA-286] - Support optimized distance join on ST_DistanceSpheroid and ST_DistanceSphere
[SEDONA-287] - Use SedonaContext to unify Sedona entry points
[SEDONA-292] - Bridge Sedona Raster and Map Algebra operators
[SEDONA-297] - Implement ST_NRings
[SEDONA-302] - Implement ST_Translate

Improvement¶

[SEDONA-167] - Add __pycache__ to Python .gitignore
[SEDONA-265] - Migrate all ST functions to Sedona Inferred Expressions
[SEDONA-269] - Add data source for writing binary files
[SEDONA-270] - Remove redundant serialization for rasters
[SEDONA-271] - Add raster function RS_SRID
[SEDONA-274] - Move all ST function logics to Sedona common
[SEDONA-275] - Add raster function RS_SetSRID
[SEDONA-276] - Add support for Spark 3.4
[SEDONA-279] - Sedona-Flink should not depend on Sedona-Spark modules
[SEDONA-282] - R – Add raster write function
[SEDONA-290] - RDD Spatial Joins should follow the iterator model

Sedona 1.4.0¶

Sedona 1.4.0 is compiled against Spark 3.3 / Flink 1.12, Java 8.

Highlights¶

Sedona Spark & Flink Serialize and deserialize geometries 3 - 7X faster
Sedona Spark & Flink Google S2 based spatial join for fast approximate point-in-polygon join. See Join query in Spark and Join query in Flink
Sedona Spark Pushdown spatial predicate on GeoParquet to reduce memory consumption by 10X: see explanation
Sedona Spark Automatically use broadcast index spatial join for small datasets
Sedona Spark New RasterUDT added to Sedona GeoTiff reader.
Sedona Spark A number of bug fixes and improvements to the Sedona R module.

API change¶

Sedona Spark & Flink Packaging strategy changed. See Maven Coordinate. Please change your Sedona dependencies if needed. We recommend sedona-spark-shaded-3.0_2.12-1.4.0 and sedona-flink-shaded_2.12-1.4.0
Sedona Spark & Flink GeoTools-wrapper version upgraded. Please use geotools-wrapper-1.4.0-28.2.

Behavior change¶

Sedona Flink Sedona Flink no longer outputs any LinearRing type geometry. All LinearRing are changed to LineString.
Sedona Spark Join optimization strategy changed. Sedona no longer optimizes a spatial join when a spatial predicate is used together with an equi-join predicate. By default, it prefers the equi-join whenever possible. SedonaConf adds a config option called sedona.join.optimizationmode, which can be set to one of the following values:
- all: optimize all joins having spatial predicate in join conditions. This was the behavior of Apache Sedona prior to 1.4.0.
- none: disable spatial join optimization.
- nonequi: only enable spatial join optimization on non-equi joins. This is the default mode.

When sedona.join.optimizationmode is configured as nonequi, Sedona won't optimize join queries such as SELECT * FROM A, B WHERE A.x = B.x AND ST_Contains(A.geom, B.geom), since it is an equi-join with the equi-condition A.x = B.x. Sedona will still optimize SELECT * FROM A, B WHERE ST_Contains(A.geom, B.geom), which has no equi-condition.
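
The three modes can be summarized as a small decision function. The sketch below is a plain-Python model of the behavior described in this section, not Sedona's planner code; the function name is_spatial_join_optimized and its boolean parameters are hypothetical and exist only for illustration.

```python
def is_spatial_join_optimized(mode: str,
                              has_spatial_predicate: bool,
                              has_equi_condition: bool) -> bool:
    """Model of sedona.join.optimizationmode as described in these notes.

    Hypothetical helper: it mirrors the documented rules only, not Sedona internals.
    """
    if mode not in ("all", "none", "nonequi"):
        raise ValueError(f"unknown optimization mode: {mode!r}")
    # Without a spatial predicate there is nothing for Sedona to optimize.
    if not has_spatial_predicate or mode == "none":
        return False
    if mode == "all":
        return True
    # nonequi (the default): leave joins with an equi-condition to the engine's equi-join.
    return not has_equi_condition


# WHERE A.x = B.x AND ST_Contains(A.geom, B.geom)
print(is_spatial_join_optimized("nonequi", True, True))   # False
# WHERE ST_Contains(A.geom, B.geom)
print(is_spatial_join_optimized("nonequi", True, False))  # True
# Pre-1.4.0 behavior: every join with a spatial predicate is optimized
print(is_spatial_join_optimized("all", True, True))       # True
```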

Bug¶

[SEDONA-218] - Flaky test caused by improper handling of null struct values in Adapter.toDf
[SEDONA-221] - Outer join throws NPE for null geometries
[SEDONA-222] - GeoParquet reader does not work in non-local mode
[SEDONA-224] - java.lang.NoSuchMethodError when loading GeoParquet files using Spark 3.0.x ~ 3.2.x
[SEDONA-225] - Cannot count dataframes loaded from GeoParquet files
[SEDONA-227] - Python SerDe Performance Degradation
[SEDONA-230] - rdd.saveAsGeoJSON should generate feature properties with field names
[SEDONA-233] - Incorrect results for several joins in a single stage
[SEDONA-236] - Flaky Python tests in tests.serialization.test_[de]serializers
[SEDONA-242] - Update jars dependencies in Sedona R to Sedona 1.4.0 version
[SEDONA-250] - R – Deprecate use of Spark 2.4
[SEDONA-252] - Fix disabled RS_Base64 test
[SEDONA-255] - R – Translation issue for ST_Point and ST_PolygonFromEnvelope
[SEDONA-258] - Cannot directly assign raw spatial RDD to CircleRDD using Python binding
[SEDONA-259] - Adapter.toSpatialRdd in Python binding does not have valid implementation for specifying custom field names for user data
[SEDONA-261] - Cannot run distance join using broadcast index join when the distance expression references to attributes from the right-side relation

New Feature¶

[SEDONA-156] - predicate pushdown support for GeoParquet
[SEDONA-215] - Add ST_ConcaveHull
[SEDONA-216] - Upgrade jts version to 1.19.0
[SEDONA-235] - Create ST_S2CellIds in Sedona
[SEDONA-246] - R – GeoTiff read/write
[SEDONA-254] - R – Add raster type
[SEDONA-262] - Don't optimize equi-join by default, add an option to configure when to optimize spatial joins

Improvement¶

[SEDONA-205] - Use BinaryType in GeometryUDT in Sedona Spark
[SEDONA-207] - Faster serialization/deserialization of geometry objects
[SEDONA-212] - Move shading to separate maven modules
[SEDONA-217] - Automatically broadcast small datasets
[SEDONA-220] - Upgrade Ubuntu build image from 18.04 to 20.04
[SEDONA-226] - Support reading and writing GeoParquet file metadata
[SEDONA-228] - Standardize logging dependencies
[SEDONA-231] - Redundant Serde Removal
[SEDONA-234] - ST_Point inconsistencies
[SEDONA-243] - Improve Sedona R file readers: GeoParquet and Shapefile
[SEDONA-244] - Align R read/write functions with the Sparklyr framework
[SEDONA-249] - Add jvm flags for running tests on Java 17
[SEDONA-251] - Add raster type to Sedona
[SEDONA-253] - Upgrade geotools to version 28.2
[SEDONA-260] - More intuitive configuration of partition and index-build side of spatial joins in Sedona SQL

Sedona 1.3.1¶

This version is a minor release on the Sedona 1.3.0 line. It fixes a few critical bugs in 1.3.0. We suggest that all 1.3.0 users migrate to this version.

Bug fixes¶

SEDONA-204 - Init value in X/Y/Z max should be -Double.MAX
SEDONA-206 - Performance regression of ST_Transform in 1.3.0-incubating
SEDONA-210 - 1.3.0-incubating doesn't work with Scala 2.12 sbt projects
SEDONA-211 - Enforce release managers to use JDK 8
SEDONA-201 - Implement ST_MLineFromText and ST_MPolyFromText methods

New Feature¶

SEDONA-196 - Add ST_Force3D to Sedona
SEDONA-197 - Add ST_ZMin, ST_ZMax to Sedona
SEDONA-199 - Add ST_NDims to Sedona

Improvement¶

SEDONA-194 - Merge org.datasyslab.sernetcdf into Sedona
SEDONA-208 - Use Spark RuntimeConfig in SedonaConf

Note

Support for Spark 2.X and Scala 2.11 was removed in Sedona 1.3.0+, although some parts of the source code might still be compatible. Sedona 1.3.0+ releases binaries for both Scala 2.12 and 2.13.

Sedona 1.3.0¶

This version is a major release on the Sedona 1.3.0 line and consists of 50 PRs. It includes many new functions, optimizations, and bug fixes.

Highlights¶

GEOGCS["WGS 84",
  DATUM["WGS_1984",
    SPHEROID["WGS 84",6378137,298.257223563,
      AUTHORITY["EPSG","7030"]],
    AUTHORITY["EPSG","6326"]],
  PRIMEM["Greenwich",0,
    AUTHORITY["EPSG","8901"]],
  UNIT["degree",0.0174532925199433,
    AUTHORITY["EPSG","9122"]],
  AUTHORITY["EPSG","4326"]]

Bug fixes¶

SEDONA-119 - ST_Touches join query returns true for polygons whose interiors intersect
SEDONA-136 - Enable testAsEWKT for Flink
SEDONA-137 - Fix ST_Buffer for Flink to work
SEDONA-138 - Fix ST_GeoHash for Flink to work
SEDONA-153 - Python Serialization Fails with Nulls
SEDONA-158 - Fix wrong description about ST_GeometryN in the API docs
SEDONA-169 - Fix ST_RemovePoint in accordance with the API document
SEDONA-178 - Correctness issue in distance join queries
SEDONA-182 - ST_AsText should not return SRID
SEDONA-186 - collecting result rows of a spatial join query with SELECT * fails with serde error
SEDONA-188 - Python warns about missing jars even when some are found
SEDONA-193 - ST_AsBinary produces EWKB by mistake

New Features¶

SEDONA-94 - GeoParquet Support For Sedona
SEDONA-125 - Allows customized CRS in ST_Transform
SEDONA-166 - Provide Type-safe DataFrame Style API
SEDONA-168 - Add ST_Normalize to Apache Sedona
SEDONA-171 - Add ST_SetPoint to Apache Sedona

Improvement¶

SEDONA-121 - Add equivalent constructors left over from Spark to Flink
SEDONA-132 - Create common module for SQL functions
SEDONA-133 - Allow user-defined schemas in Adapter.toDf()
SEDONA-139 - Fix wrong argument order in Flink unit tests
SEDONA-140 - Update Sedona Dependencies in R Package
SEDONA-143 - Add missing unit tests for the Flink predicates
SEDONA-144 - Add ST_AsGeoJSON to the Flink API
SEDONA-145 - Fix ST_AsEWKT to preserve the Z coordinate
SEDONA-146 - Add missing output functions to the Flink API
SEDONA-147 - Add SRID functions to the Flink API
SEDONA-148 - Add boolean functions to the Flink API
SEDONA-149 - Add Python 3.10 support
SEDONA-151 - Add ST aggregators to Sedona Flink
SEDONA-152 - Add reader/writer functions for GML and KML
SEDONA-154 - Add measurement functions to the Flink API
SEDONA-157 - Add coordinate accessors to the Flink API
SEDONA-159 - Add Nth accessor functions to the Flink API
SEDONA-160 - Fix geoparquetIOTests.scala to cleanup after test
SEDONA-161 - Add ST_Boundary to the Flink API
SEDONA-162 - Add ST_Envelope to the Flink API
SEDONA-163 - Better handling of unsupported types in shapefile reader
SEDONA-164 - Add geometry count functions to the Flink API
SEDONA-165 - Upgrade Apache Rat to 0.14
SEDONA-170 - Add ST_AddPoint and ST_RemovePoint to the Flink API
SEDONA-172 - Add ST_LineFromMultiPoint to Apache Sedona
SEDONA-176 - Make ST_Contains conform with OGC standard, and add ST_Covers and ST_CoveredBy functions.
SEDONA-177 - Support spatial predicates other than INTERSECTS and COVERS/COVERED_BY in RangeQuery.SpatialRangeQuery and JoinQuery.SpatialJoinQuery
SEDONA-181 - Build fails with java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$
SEDONA-189 - Prepare geometries in broadcast join
SEDONA-192 - Null handling in predicates
SEDONA-195 - Add wkt validation and an optional srid to ST_GeomFromWKT/ST_GeomFromText

Task¶

SEDONA-150 - Drop Spark 2.4 and Scala 2.11 support

Sedona 1.2.1¶

This version is a maintenance release on the Sedona 1.2.0 line. It includes bug fixes.

Sedona on Spark is now compiled against Spark 3.3, instead of Spark 3.2.

SQL (for Spark)¶

Bug fixes:

SEDONA-104: Bug in reading band values of GeoTiff images
SEDONA-118: Fix the wrong result in ST_Within
SEDONA-123: Fix the check for invalid lat/lon in ST_GeoHash

Improvement:

SEDONA-96: Refactor ST_MakeValid to use GeometryFixer
SEDONA-108: Write support for GeoTiff images
SEDONA-122: Overload ST_GeomFromWKB for BYTES column
SEDONA-127: Add null safety to ST_GeomFromWKT/WKB/Text
SEDONA-129: Support Spark 3.3
SEDONA-135: Consolidate and upgrade hadoop dependency

New features:

SEDONA-107: Add ST_Reverse function
SEDONA-105: Add ST_PointOnSurface function
SEDONA-95: Add ST_Disjoint predicate
SEDONA-112: Add ST_AsEWKT
SEDONA-106: Add ST_LineFromText
SEDONA-117: Add RS_AppendNormalizedDifference
SEDONA-97: Add ST_Force_2D
SEDONA-98: Add ST_IsEmpty
SEDONA-116: Add ST_YMax and ST_YMin
SEDONA-115: Add ST_XMax and ST_XMin
SEDONA-120: Add ST_BuildArea
SEDONA-113: Add ST_PointN
SEDONA-124: Add ST_CollectionExtract
SEDONA-109: Add ST_OrderingEquals

Flink¶

New features:

SEDONA-107: Add ST_Reverse function
SEDONA-105: Add ST_PointOnSurface function
SEDONA-95: Add ST_Disjoint predicate
SEDONA-112: Add ST_AsEWKT
SEDONA-97: Add ST_Force_2D
SEDONA-98: Add ST_IsEmpty
SEDONA-116: Add ST_YMax and ST_YMin
SEDONA-115: Add ST_XMax and ST_XMin
SEDONA-120: Add ST_BuildArea
SEDONA-113: Add ST_PointN
SEDONA-110: Add ST_GeomFromGeoHash
SEDONA-121: More ST constructors to Flink
SEDONA-122: Overload ST_GeomFromWKB for BYTES column

Sedona 1.2.0¶

This version is a major release on the Sedona 1.2.0 line. It includes bug fixes and new features: Sedona with Apache Flink.

RDD¶

Bug fix:

SEDONA-18: Fix an error reading Shapefile
SEDONA-73: Exclude scala-library from scala-collection-compat

Improvement:

SEDONA-77: Refactor format readers and spatial partitioning functions into standalone libraries so they can be used by Flink and others.

SQL¶

New features:

SEDONA-4: Handle nulls in SQL functions
SEDONA-65: Create ST_Difference function
SEDONA-68: Add ST_Collect function
SEDONA-82: Create ST_SymDifference function
SEDONA-75: Add support for "3D" geometries: Preserve Z coordinates on geometries when serializing, ST_AsText, ST_Z, ST_3DDistance
SEDONA-86: Support empty geometries in ST_AsBinary and ST_AsEWKB
SEDONA-90: Add ST_Union
SEDONA-100: Add ST_Multi function

Bug fix:

SEDONA-89: GeometryUDT equals should test equivalence of the other object

Flink¶

Major update:

SEDONA-80: Geospatial stream processing support in Flink Table API
SEDONA-85: ST_Geohash function in Flink
SEDONA-87: Support Flink Table and DataStream conversion
SEDONA-93: Add ST_GeomFromGeoJSON

Sedona 1.1.1¶

This version is a maintenance release on the Sedona 1.1.X line. It includes bug fixes and a few new functions.

Global¶

New feature:

SEDONA-73: Scala source code supports Scala 2.13

SQL¶

Bug fix:

SEDONA-67: Support Spark 3.2

New features:

SEDONA-43: Add ST_GeoHash and ST_GeomFromGeoHash
SEDONA-45: Add ST_MakePolygon
SEDONA-71: Add ST_AsBinary, ST_AsEWKB, ST_SRID, ST_SetSRID

Sedona 1.1.0¶

This version is a major release on the Sedona 1.1.0 line. It includes bug fixes and new features: R language API, raster data, and Map Algebra support.

Global¶

Dependency upgrade:

SEDONA-30: Use Geotools-wrapper 1.1.0-24.1 to include geotools GeoTiff libraries.

Improvement on join queries in core and SQL:

SEDONA-63: Skip empty partitions in NestedLoopJudgement
SEDONA-64: Broadcast dedupParams to improve performance

Behavior change:

SEDONA-62: Ignore HDF test in order to avoid NASA copyright issue

Core¶

Bug fix:

SEDONA-41: Fix rangeFilter bug when the leftCoveredByRight parameter is false
SEDONA-53: Fix SpatialKnnQuery NullPointerException

SQL¶

Major update:

SEDONA-30: Add GeoTiff raster I/O and Map Algebra function

New function:

SEDONA-27: Add ST_Subdivide and ST_SubdivideExplode functions

Bug fix:

SEDONA-56: Fix broadcast join with Adaptive Query Execution (AQE) enabled
SEDONA-22, SEDONA-60: Fix join queries in SparkSQL when one side has no rows or only one row

Viz¶

N/A

Python¶

Improvement:

SEDONA-59: Make pyspark dependency of Sedona Python optional

Bug fix:

SEDONA-50: Remove problematic logging conf that leads to errors on Databricks
Fix the issue: Spark dependency in setup.py was configured to be < v3.1.0 by mistake.

R¶

Major update:

SEDONA-31: Add R interface for Sedona

Sedona 1.0.1¶

This version is a maintenance release on the Sedona 1.0.0 line. It includes bug fixes, some new features, and one API change.

Known issue¶

In Sedona v1.0.1 and earlier versions, the Spark dependency in setup.py was configured to be < v3.1.0 by mistake. When you install Sedona Python (apache-sedona v1.0.1) from PyPI, pip might uninstall PySpark 3.1.1 and install PySpark 3.0.2 on your machine.

Three ways to fix this:

After installing apache-sedona v1.0.1, uninstall PySpark 3.0.2 and reinstall PySpark 3.1.1
Ask pip not to install Sedona dependencies: pip install --no-deps apache-sedona
Install Sedona from the latest setup.py (on GitHub) manually.
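
Why pip downgrades PySpark can be seen with a simplified model of dependency resolution: among the available versions, the installer keeps only those that satisfy the < 3.1.0 bound and picks the newest one. The sketch below is plain Python and only approximates what pip does; parse_version and pick_version are hypothetical helpers, and the candidate list contains just the two PySpark versions mentioned above.

```python
def parse_version(text):
    """Turn '3.1.1' into (3, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def pick_version(candidates, upper_bound):
    """Return the newest candidate strictly below upper_bound, or None."""
    bound = parse_version(upper_bound)
    allowed = [v for v in candidates if parse_version(v) < bound]
    return max(allowed, key=parse_version) if allowed else None


available = ["3.0.2", "3.1.1"]
# 3.1.1 violates the mistaken "< 3.1.0" bound, so 3.0.2 is selected.
print(pick_version(available, "3.1.0"))  # 3.0.2
```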

Global¶

Dependency upgrade:

SEDONA-16: Use a GeoTools Maven Central wrapper to fix failed Jupyter notebook examples
SEDONA-29: upgrade to Spark 3.1.1
SEDONA-33: Upgrade jts2geojson from 0.14.3 to 0.16.1

Core¶

Bug fix:

SEDONA-35: Address user-data mutability issue with Adapter.toDf()

SQL¶

Bug fix:

SEDONA-14: Saving dataframe to CSV or Parquet fails due to unknown type
SEDONA-15: Add ST_MinimumBoundingRadius and ST_MinimumBoundingCircle functions
SEDONA-19: Global indexing does not work with SQL joins
SEDONA-20: Case object GeometryUDT and GeometryUDT instance not equal in Spark 3.0.2

New function:

SEDONA-21: allows Sedona to be used in pure SQL environment
SEDONA-24: Add ST_LineSubString and ST_LineInterpolatePoint
SEDONA-26: Add broadcast join support

Viz¶

Improvement:

SEDONA-32: Speed up ST_Render

API change:

SEDONA-29: Upgrade to Spark 3.1.1 and fix ST_Pixelize

Python¶

Bug fix:

SEDONA-19: Global indexing does not work with SQL joins

Sedona 1.0.0¶

This version is the first Sedona release since it joined the Apache Incubator. It includes new functions, bug fixes, and API changes.

Global¶

Key dependency upgrade:

SEDONA-1: upgrade to JTS 1.18
upgrade to GeoTools 24.0
upgrade to jts2geojson 0.14.3

Key dependency packaging strategy change:

JTS, GeoTools, jts2geojson are no longer packaged in Sedona jars. End users need to add them manually. See here.

Key compilation target change:

SEDONA-3: Paths and class names have been changed to Apache Sedona
SEDONA-7: build the source code for Spark 2.4, 3.0, Scala 2.11, 2.12, Python 3.7, 3.8, 3.9. See here.

Sedona-core¶

Bug fix:

PR 443: read multiple Shape Files by multiPartitions
PR 451 (API change): modify CRSTransform to ignore datum shift

New function:

SEDONA-8: spatialRDD.flipCoordinates()

API / behavior change:

PR 488: JoinQuery.SpatialJoinQuery/DistanceJoinQuery now returns <Geometry, List> instead of <Geometry, HashSet> because we can no longer use HashSet in Sedona for duplicates removal. All original duplicates in both input RDDs will be preserved in the output.
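
The practical effect of PR 488 is that equal matches are no longer collapsed. The following plain-Python analogy (not Sedona code; the tuples are made-up stand-ins for matched geometries) contrasts set-based grouping, which drops duplicates, with list-based grouping, which keeps them.

```python
# Matches for one query geometry; the same point appears twice in the input.
matches = [("POINT", 1.0, 2.0), ("POINT", 1.0, 2.0), ("POINT", 3.0, 4.0)]

as_set = set(matches)    # HashSet-style result: equal entries collapse into one
as_list = list(matches)  # List-style result: every input duplicate is preserved

print(len(as_set))   # 2
print(len(as_list))  # 3
```

Code that relied on the result being deduplicated needs to remove duplicates itself after the join.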

Sedona-sql¶

Bug fix:

SEDONA-8 (API change): ST_Transform slow due to lock contention.
PR 427: ST_Point and ST_PolygonFromEnvelope now allow Double type

New function:

PR 499: ST_Azimuth, ST_X, ST_Y, ST_StartPoint, ST_Boundary, ST_EndPoint, ST_ExteriorRing, ST_GeometryN, ST_InteriorRingN, ST_Dump, ST_DumpPoints, ST_IsClosed, ST_NumInteriorRings, ST_AddPoint, ST_RemovePoint, ST_IsRing
PR 459: ST_LineMerge
PR 460: ST_NumGeometries
PR 469: ST_AsGeoJSON
SEDONA-8: ST_FlipCoordinates

Behavior change:

PR 480: Aggregate functions rewritten for the new Aggregator API. The functions can be used as typed functions in code and enable compile-time type checking.

API change:

SEDONA-11: Adapter.toDf() will directly generate a geometry type column. ST_GeomFromWKT is no longer needed.

Sedona-viz¶

API change: Drop the function that generates SVG vector images, because the required library has an incompatible license and SVG images are not well suited to plotting big data.

Sedona Python¶

API/Behavior change:

The Python-to-Sedona adapter is moved to a separate module. To use Sedona Python, see here.

New function:

PR 448: Add support for partition number in the spatialPartitioning function: spatial_rdd.spatialPartitioning(grid_type, NUM_PARTITION)