Perform a spatial join operation on two Sedona spatial RDDs.
Source: R/spatial_join_op.R
sedona_spatial_join.Rd
Given spatial_rdd and query_window_rdd, return a pair RDD containing all pairs of geometrical elements (p, q) such that p is an element of spatial_rdd, q is an element of query_window_rdd, and (p, q) satisfies the spatial relation specified by join_type.
Arguments
- spatial_rdd
Spatial RDD containing geometries to be queried.
- query_window_rdd
Spatial RDD containing the query window(s).
- join_type
Type of the join query (must be either "contain" or "intersect"). If join_type is "contain", then a geometry from spatial_rdd will match a geometry from query_window_rdd if and only if the former is fully contained in the latter. If join_type is "intersect", then a geometry from spatial_rdd will match a geometry from query_window_rdd if and only if the former intersects the latter.
- partitioner
Spatial partitioning to apply to both spatial_rdd and query_window_rdd to facilitate the join query. Can be either a grid type (currently "quadtree" and "kdbtree" are supported) or a custom spatial partitioner object. If partitioner is NULL, the function assumes the same spatial partitioner has already been applied to both spatial_rdd and query_window_rdd and skips the partitioning step.
- index_type
Controls how spatial_rdd and query_window_rdd will be indexed (unless they are indexed already). If "NONE", then no index will be constructed and matching geometries will be identified by a doubly nested loop iterating through all possible pairs of elements from spatial_rdd and query_window_rdd, which will be inefficient for large data sets.
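The difference between the two join_type values, and the cost of the unindexed nested loop, can be illustrated with a small base R sketch. This is not Sedona code and does not reflect how Sedona evaluates joins internally: the helpers rect_contains, rect_intersects and naive_join are hypothetical names introduced here, and geometries are simplified to axis-aligned rectangles stored as c(xmin, ymin, xmax, ymax).

```r
# Illustrative sketch only (base R, no Spark or Sedona required).
# Rectangles are numeric vectors c(xmin, ymin, xmax, ymax).
# All function names below are hypothetical, not part of apache.sedona.

# TRUE if `inner` lies entirely within `outer` (boundaries included).
rect_contains <- function(outer, inner) {
  outer[1] <= inner[1] && outer[2] <= inner[2] &&
    outer[3] >= inner[3] && outer[4] >= inner[4]
}

# TRUE if the two rectangles share at least one point.
rect_intersects <- function(a, b) {
  a[1] <= b[3] && b[1] <= a[3] && a[2] <= b[4] && b[2] <= a[4]
}

# Doubly nested loop over every (geometry, window) pair: the same
# all-pairs strategy described for index_type = "NONE", and the reason
# that setting scales poorly (length(geoms) * length(windows) checks).
naive_join <- function(geoms, windows, join_type = c("contain", "intersect")) {
  join_type <- match.arg(join_type)
  pairs <- list()
  for (i in seq_along(geoms)) {
    for (j in seq_along(windows)) {
      matched <- if (join_type == "contain") {
        # the geometry must be fully contained in the query window
        rect_contains(windows[[j]], geoms[[i]])
      } else {
        rect_intersects(geoms[[i]], windows[[j]])
      }
      if (matched) pairs[[length(pairs) + 1L]] <- c(p = i, q = j)
    }
  }
  pairs
}

geoms <- list(c(1, 1, 2, 2), c(4, 4, 6, 6), c(8, 8, 9, 9))
windows <- list(c(0, 0, 5, 5))

contain_pairs <- naive_join(geoms, windows, "contain")
intersect_pairs <- naive_join(geoms, windows, "intersect")
```

In this toy data set the second rectangle straddles the edge of the query window, so it appears in intersect_pairs but not in contain_pairs, while the third rectangle lies outside the window and matches under neither relation.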
See also
Other Sedona spatial join operator: sedona_spatial_join_count_by_key()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  # Read point geometries from a comma-delimited file
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point",
    first_spatial_col_index = 1L
  )

  # Read the polygon query windows from a shapefile
  query_rdd_input_location <- "/dev/null" # replace it with the path to your input file
  query_rdd <- sedona_read_shapefile_to_typed_rdd(
    sc,
    location = query_rdd_input_location,
    type = "polygon"
  )

  # Pair each point with every polygon it intersects,
  # applying quadtree partitioning to both RDDs first
  join_result_rdd <- sedona_spatial_join(
    rdd,
    query_rdd,
    join_type = "intersect",
    partitioner = "quadtree"
  )
}