Given a spatial RDD, a query object x, and an integer k, find the k nearest spatial objects within the RDD from x (the distance between x and another geometrical object is measured by the minimum possible length of any line segment connecting those 2 objects).
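The distance definition above can be illustrated with a minimal base R sketch, assuming 2D points and straight line segments only. The helpers `point_segment_distance()` and `knn_segments()` are hypothetical names introduced here for illustration; they are not part of apache.sedona, and this brute-force search is not how Sedona evaluates the query.

```r
# Hypothetical helper (not part of apache.sedona): minimum distance from
# point p to the segment a-b, where p, a, b are length-2 numeric vectors.
point_segment_distance <- function(p, a, b) {
  ab <- b - a
  len2 <- sum(ab^2)
  # Degenerate segment: fall back to the point-to-point distance.
  if (len2 == 0) {
    return(sqrt(sum((p - a)^2)))
  }
  # Project p onto the line through a and b, clamped to the segment.
  t <- max(0, min(1, sum((p - a) * ab) / len2))
  closest <- a + t * ab
  sqrt(sum((p - closest)^2))
}

# Hypothetical helper: brute-force KNN, returning the indices of the
# k segments closest to p, nearest first.
knn_segments <- function(p, segments, k) {
  d <- vapply(
    segments,
    function(s) point_segment_distance(p, s$a, s$b),
    numeric(1)
  )
  head(order(d), k)
}

# Made-up sample data for illustration only.
segments <- list(
  list(a = c(0, 0), b = c(2, 0)),
  list(a = c(5, 5), b = c(6, 6)),
  list(a = c(1, 3), b = c(1, 4))
)
nearest <- knn_segments(c(1, 1), segments, k = 2)
```

The sketch scans every object, which is the cost a spatial index is meant to reduce on large inputs.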
Arguments
- rdd
A Sedona spatial RDD.
- x
The query object.
- k
Number of nearest spatial objects to return.
- index_type
Index to use to facilitate the KNN query. If NULL, then do not build any additional spatial index on top of x. Supported index types are "quadtree" and "rtree".
- result_type
Type of result to return. If "rdd" (default), then the k nearest objects will be returned in a Sedona spatial RDD. If "sdf", then a Spark dataframe containing the k nearest objects will be returned. If "raw", then a list of k nearest objects will be returned. Each element within this list will be a JVM object of type org.locationtech.jts.geom.Geometry.
See also
Other Sedona spatial query:
sedona_range_query()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  knn_query_pt_x <- -84.01
  knn_query_pt_y <- 34.01
  knn_query_pt_tbl <- sdf_sql(
    sc,
    sprintf(
      "SELECT ST_GeomFromText(\"POINT(%f %f)\") AS `pt`",
      knn_query_pt_x,
      knn_query_pt_y
    )
  ) %>%
    collect()
  knn_query_pt <- knn_query_pt_tbl$pt[[1]]
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_geojson_to_typed_rdd(
    sc,
    location = input_location,
    type = "polygon"
  )
  knn_result_sdf <- sedona_knn_query(
    rdd,
    x = knn_query_pt, k = 3, index_type = "rtree", result_type = "sdf"
  )
}