Import data from a spatial RDD into a Spark DataFrame. — sdf_register.spatial

Import data from a spatial RDD (possibly with non-spatial attributes) into a Spark DataFrame.

sdf_register: S3 method for sparklyr's sdf_register generic that handles spatial RDDs
as.spark.dataframe: lower-level function with finer-grained control over non-spatial columns

Usage

# S3 method for class 'spatial_rdd'
sdf_register(x, name = NULL)

as.spark.dataframe(x, non_spatial_cols = NULL, name = NULL)

Arguments

x: A spatial RDD.
name: Name to assign to the resulting Spark temporary view. If unspecified, a random name is assigned.
non_spatial_cols: Column names for the non-spatial attributes in the resulting Spark DataFrame. By default (NULL), all field names are imported when the spatial RDD has that property, in particular for RDDs read from shapefiles.

Value

A Spark DataFrame containing the imported spatial data.

Examples

library(sparklyr)
library(apache.sedona)

sc <- spark_connect(master = "spark://HOST:PORT")

if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_geojson_to_typed_rdd(
    sc,
    location = input_location,
    type = "polygon"
  )
  sdf <- sdf_register(rdd)
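
  # A minimal sketch of the `name` argument, assuming a fixed view name is
  # preferred over a randomly assigned one. "polygons" and the variable names
  # below are hypothetical, chosen for illustration only.
  sdf_named <- sdf_register(rdd, name = "polygons")
  # The temporary view can then be looked up by that name on the same
  # connection, for example through dplyr's tbl().
  polygons_tbl <- dplyr::tbl(sc, "polygons")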
  
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point",
    first_spatial_col_index = 1L,
    repartition = 5
  )
  sdf <- as.spark.dataframe(rdd, non_spatial_cols = c("attr1", "attr2"))
}