Import data from a spatial RDD into a Spark DataFrame.
Source: R/sdf_interface.R
sdf_register.spatial_rdd.Rd
Import data from a spatial RDD (possibly with non-spatial attributes) into a Spark DataFrame.
- sdf_register: method for sparklyr's sdf_register to handle Spatial RDD
- as.spark.dataframe: lower level function with more fine-grained control on non-spatial columns
Usage
# S3 method for class 'spatial_rdd'
sdf_register(x, name = NULL)
as.spark.dataframe(x, non_spatial_cols = NULL, name = NULL)
Arguments
- x
A spatial RDD.
- name
Name to assign to the resulting Spark temporary view. If unspecified, then a random name will be assigned.
- non_spatial_cols
Column names for non-spatial attributes in the resulting Spark DataFrame. By default (NULL), all field names are imported if the spatial RDD carries that property, as is the case for shapefiles in particular.
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_geojson_to_typed_rdd(
    sc,
    location = input_location,
    type = "polygon"
  )
  sdf <- sdf_register(rdd)

  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point",
    first_spatial_col_index = 1L,
    repartition = 5
  )
  sdf <- as.spark.dataframe(rdd, non_spatial_cols = c("attr1", "attr2"))
}
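The examples above leave name unspecified, so the temporary view gets a random name. The following is a minimal sketch of passing an explicit name and then looking the view up by that name. It assumes a local Spark master, that dplyr is installed, and that the view can be retrieved with dplyr::tbl(); the view name "polygons_view" and the input path are hypothetical placeholders, not part of the package documentation.

```r
library(sparklyr)
library(apache.sedona)
library(dplyr)

# Assumption: a local Spark master is used here purely for illustration.
sc <- spark_connect(master = "local")

if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file

  rdd <- sedona_read_geojson_to_typed_rdd(
    sc,
    location = input_location,
    type = "polygon"
  )

  # "polygons_view" is a hypothetical name chosen for this sketch.
  sdf <- sdf_register(rdd, name = "polygons_view")

  # Assumption: the registered temporary view can be referenced by name
  # through dplyr's tbl() on the same connection.
  polygons <- tbl(sc, "polygons_view")
  print(head(polygons, 5))
}

spark_disconnect(sc)
```

Naming the view explicitly may be convenient when the same data is later queried by name from Spark SQL; when the name does not matter, leaving it as NULL avoids clashes with existing views.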