Spatial Joins¶
You can perform spatial joins using standard SQL JOIN syntax. The join condition is defined in the ON clause using a spatial function that specifies the relationship between the geometries of the two tables.
General Spatial Join¶
Use functions like ST_Contains, ST_Intersects, or ST_Within to join tables based on their spatial relationship.
Example¶
Assign a country to each city by checking which country polygon contains each city point.
SELECT
cities.name as city,
countries.name as country
FROM
cities
INNER JOIN
countries
ON ST_Contains(countries.geometry, cities.geometry)
K-Nearest Neighbor (KNN) Join¶
Use the specialized ST_KNN function to find the k nearest neighbors from one table for each geometry in another. This is useful for proximity analysis.
Example¶
For each city, find the 5 other closest cities.
SELECT
cities_l.name AS city,
cities_r.name AS nearest_neighbor
FROM
cities AS cities_l
INNER JOIN
cities AS cities_r
ON ST_KNN(cities_l.geometry, cities_r.geometry, 5, false)
KNN Join Caveats¶
Filter Pushdown Behavior¶
In KNN joins, the query side is the first geometry argument to ST_KNN, and the object side is the second argument. For each query-side row, the join finds the k nearest object-side rows.
The optimizer automatically pushes query-side filters below the KNN join for inner joins. This is safe because filtering query rows before the join only reduces the number of probe points — each remaining query point still gets its full KNN search against all objects.
Object-side filters are never pushed below the KNN join automatically, because doing so would change which candidates are considered for the KNN search, altering the results. All object-side WHERE clause predicates are evaluated after the K nearest neighbor candidates have been selected.
For example, in the following query, r.rating > 4.0 is applied after finding the 3 nearest restaurants for each hotel — it does not reduce the set of candidate restaurants before the KNN search:
SELECT
h.name AS hotel,
r.name AS restaurant,
r.rating
FROM
hotels AS h
INNER JOIN
restaurants AS r
ON ST_KNN(h.geometry, r.geometry, 3, false)
WHERE
r.rating > 4.0
This means the result may contain fewer than 3 restaurants per hotel if some of the nearest neighbors do not pass the filter.
However, a query-side filter like h.stars >= 4 is automatically pushed below the join:
SELECT h.name AS hotel, r.name AS restaurant, r.rating
FROM
hotels AS h
INNER JOIN
restaurants AS r
ON ST_KNN(h.geometry, r.geometry, 3, false)
WHERE
h.stars >= 4
The physical plan shows the filter below the join, inside the query-side input:
SpatialJoinExec: join_type=Inner, on=ST_KNN(geometry, geometry, 3, false)
FilterExec: stars >= 4
...hotels... ← only luxury hotels are scanned
...restaurants...
Only hotels with stars >= 4 are used as query points, and the 3 nearest restaurants are found for each of those luxury hotels.
Pre-Filtering the Object Side¶
To filter the object side before the KNN search (e.g., only consider high-rated restaurants), use a subquery or CTE so the filter is applied before the join sees the data:
SELECT h.name AS hotel, r.name AS restaurant, r.rating
FROM
hotels AS h
INNER JOIN
(SELECT * FROM restaurants WHERE rating > 4.0) AS r
ON ST_KNN(h.geometry, r.geometry, 3, false)
Or equivalently, using a CTE:
WITH high_rated AS (
SELECT * FROM restaurants WHERE rating > 4.0
)
SELECT h.name AS hotel, r.name AS restaurant, r.rating
FROM
hotels AS h
INNER JOIN
high_rated AS r
ON ST_KNN(h.geometry, r.geometry, 3, false)
This answers "What are the 3 nearest high-rated restaurants to each hotel?" because the KNN search only considers restaurants with rating > 4.0.
ST_KNN Predicate Precedence¶
When ST_KNN is combined with other predicates via AND, ST_KNN always takes precedence. It is extracted first to determine the KNN candidates, and the remaining predicates are applied as post-filters on the join output.
For example, the following two queries produce the same results:
-- ST_KNN in ON clause combined with another predicate via AND
SELECT h.name AS hotel, r.name AS restaurant
FROM hotels AS h
JOIN restaurants AS r
ON ST_KNN(h.geometry, r.geometry, 3, false) AND r.rating > 4.0
-- Equivalent: ST_KNN in ON clause, other predicate in WHERE
SELECT h.name AS hotel, r.name AS restaurant
FROM hotels AS h
JOIN restaurants AS r
ON r.rating > 4.0 AND ST_KNN(h.geometry, r.geometry, 3, false)
In both cases, ST_KNN determines the 3 nearest restaurants first, then r.rating > 4.0 filters the results.