Parameter
Usage
SedonaSQL supports many parameters. You can set, inspect, and change them as follows:
- Set parameters through SparkConf when building the session:
import org.apache.spark.sql.SparkSession
val sparkSession = SparkSession.builder().
config("spark.serializer","org.apache.spark.serializer.KryoSerializer").
config("spark.kryo.registrator", "org.apache.sedona.core.serde.SedonaKryoRegistrator").
config("sedona.global.index","true").
master("local[*]").appName("mySedonaSQLdemo").getOrCreate()
- Check your current SedonaSQL configuration:
val sedonaConf = new SedonaConf(sparkSession.conf)
println(sedonaConf)
- Change Sedona parameters at runtime:
sparkSession.conf.set("sedona.global.index","false")
You can also add the spark prefix to the parameter name, for example:
sparkSession.conf.set("spark.sedona.global.index","false")
Any parameter set with the spark prefix is honored by Spark itself, which means you can set these parameters beforehand via spark-defaults.conf or the Spark on Kubernetes configuration.
If you set the same parameter through both the sedona and spark.sedona prefixes, the value set through the sedona prefix overrides the value set through the spark.sedona prefix.
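The precedence rule above can be illustrated with a plain Scala Map standing in for the Spark configuration. This is a minimal sketch, not Sedona's actual implementation: the `resolve` helper and the `SedonaPrefixPrecedence` object are hypothetical names introduced only for this example.

```scala
// Illustrative only: mimics the documented precedence rule with a plain Map.
// `resolve` is a hypothetical helper, not part of the Sedona API.
object SedonaPrefixPrecedence {

  /** Look up `name` (e.g. "global.index"): the `sedona.` key wins over the `spark.sedona.` key. */
  def resolve(conf: Map[String, String], name: String, default: String): String =
    conf.get(s"sedona.$name")
      .orElse(conf.get(s"spark.sedona.$name"))
      .getOrElse(default)

  def main(args: Array[String]): Unit = {
    val conf = Map(
      "spark.sedona.global.index"  -> "true",     // e.g. from spark-defaults.conf
      "sedona.global.index"        -> "false",    // e.g. set at runtime
      "spark.sedona.join.gridtype" -> "quadtree"  // only the spark.sedona form is set
    )
    println(resolve(conf, "global.index", "true"))      // prints false: the sedona prefix wins
    println(resolve(conf, "join.gridtype", "kdbtree"))  // prints quadtree: falls back to spark.sedona
    println(resolve(conf, "global.indextype", "rtree")) // prints rtree: neither set, default used
  }
}
```

In practice this means a cluster-wide value supplied through spark-defaults.conf with the spark.sedona prefix can still be overridden for a single session by setting the sedona-prefixed name.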
Explanation
- sedona.global.index
  - Use a spatial index (currently only supported in SQL range joins and SQL distance joins)
  - Default: true
  - Possible values: true, false
- sedona.global.indextype
  - Spatial index type, only valid when "sedona.global.index" is true
  - Default: rtree
  - Possible values: rtree, quadtree
- sedona.join.autoBroadcastJoinThreshold
  - Configures the maximum size in bytes for a table that will be broadcast to all worker nodes when performing a join. Setting this value to -1 disables automatic broadcasting.
  - Default: same as spark.sql.autoBroadcastJoinThreshold
  - Possible values: any integer with a byte suffix, e.g. 10MB or 512KB
- sedona.join.gridtype
  - Spatial partitioning grid type for join queries
  - Default: kdbtree
  - Possible values: quadtree, kdbtree
- spark.sedona.join.knn.includeTieBreakers
  - Whether a KNN join includes all ties in the result, possibly returning more than k results
  - Default: false
  - Possible values: true, false
- sedona.join.indexbuildside (Advanced users only!)
  - The side of the join on which Sedona builds spatial indices
  - Default: left
  - Possible values: left, right
- sedona.join.numpartition (Advanced users only!)
  - Number of partitions for both sides in a join query
  - Default: -1, which means use the existing partitions
  - Possible values: any integer
- sedona.join.spatitionside (Advanced users only!)
  - The dominant side in the spatial partitioning stage
  - Default: left
  - Possible values: left, right
- sedona.join.optimizationmode (Advanced users only!)
  - When Sedona should optimize spatial join SQL queries
  - Default: nonequi
  - Possible values:
    - all: Always optimize spatial join queries, even for equi-joins.
    - none: Disable optimization for spatial joins.
    - nonequi: Optimize spatial join queries that are not equi-joins.
- spark.sedona.enableParserExtension
  - Enable the parser extension to parse the GEOMETRY data type in SQL DDL statements
  - Default: true
  - Possible values: true, false
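
Because several parameters accept only a fixed set of values, it can be useful to check a setting before passing it to sparkSession.conf.set. The sketch below copies the possible values from the list above into a lookup table. The `SedonaParamCheck` object and its `validate` helper are hypothetical and not part of Sedona, and the exact, lowercase matching is an assumption of this example; Sedona itself may be more lenient.

```scala
// Allowed values are copied from the parameter list above.
// `validate` is a hypothetical helper, not part of the Sedona API.
object SedonaParamCheck {

  private val allowed: Map[String, Set[String]] = Map(
    "sedona.global.index"                      -> Set("true", "false"),
    "sedona.global.indextype"                  -> Set("rtree", "quadtree"),
    "sedona.join.gridtype"                     -> Set("quadtree", "kdbtree"),
    "spark.sedona.join.knn.includeTieBreakers" -> Set("true", "false"),
    "sedona.join.indexbuildside"               -> Set("left", "right"),
    "sedona.join.spatitionside"                -> Set("left", "right"),
    "sedona.join.optimizationmode"             -> Set("all", "none", "nonequi"),
    "spark.sedona.enableParserExtension"       -> Set("true", "false")
  )

  /** Right(value) if the value is accepted, Left(message) otherwise.
    * Keys without an enumerated value set (e.g. sedona.join.numpartition) pass through unchecked. */
  def validate(key: String, value: String): Either[String, String] =
    allowed.get(key) match {
      case Some(values) if !values.contains(value) =>
        Left(s"$key: '$value' is not one of ${values.toSeq.sorted.mkString(", ")}")
      case _ =>
        Right(value)
    }

  def main(args: Array[String]): Unit = {
    println(validate("sedona.join.gridtype", "kdbtree"))   // accepted value
    println(validate("sedona.global.indextype", "kdtree")) // rejected: not a documented index type
  }
}
```

A caller could apply only the `Right` results to the session and log the `Left` messages, which surfaces typos such as kdtree before a query runs.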