sedona.spark.geopandas.io

sedona.spark.geopandas.io.read_file(filename: str, format: str | None = None, **kwargs)

Alternate constructor to create a GeoDataFrame from a file.

Parameters:
  • filename (str) – File path or file handle to read from. If the path is a directory, Sedona will read all files in the directory into a dataframe.

  • format (str, default None) –

    The format of the file to read. If None, Sedona infers the format from the file extension. Inference from the extension is not supported for directories, so pass format explicitly when filename is a directory. Options:

    • "shapefile"

    • "geojson"

    • "geopackage"

    • "geoparquet"
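The inference rule described for the format parameter can be pictured with a small standalone sketch. This is not Sedona's implementation: the helper name infer_format, the extension table, and the error messages are all hypothetical, and only the four format names come from the list above.

```python
import os

# Hypothetical extension table. The extensions are assumptions made for
# illustration; only the format names are taken from the documented options.
_EXTENSION_TO_FORMAT = {
    ".shp": "shapefile",
    ".geojson": "geojson",
    ".json": "geojson",
    ".gpkg": "geopackage",
    ".parquet": "geoparquet",
}


def infer_format(filename, format=None):
    """Return the explicit format if given, else guess it from the extension."""
    if format is not None:
        # An explicit format always wins, including for directories.
        return format
    if os.path.isdir(filename):
        # Mirrors the documented restriction: no inference for directories.
        raise ValueError("format must be given explicitly when reading a directory")
    ext = os.path.splitext(filename)[1].lower()
    if ext not in _EXTENSION_TO_FORMAT:
        raise ValueError(f"cannot infer format from extension {ext!r}")
    return _EXTENSION_TO_FORMAT[ext]
```

Under these assumptions, infer_format("roads.shp") resolves to "shapefile", while a directory path needs an explicit format such as "geoparquet".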

See also

GeoDataFrame.to_file

Write a GeoDataFrame to a file.

sedona.spark.geopandas.io.read_parquet(path, columns=None, storage_options=None, bbox=None, to_pandas_kwargs=None, **kwargs)

Load a Parquet object from the file path, returning a GeoDataFrame.

  • If no geometry columns are read, this raises a ValueError. Use the pandas read_parquet method instead.

If the ‘crs’ key is not present in the GeoParquet metadata associated with the Parquet object, the CRS defaults to “OGC:CRS84”, as the specification requires.
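The CRS default can be illustrated with a small sketch that works on the GeoParquet "geo" metadata as a JSON string. It is not how Sedona reads the metadata: the helper name resolve_crs is hypothetical, and the layout assumed here ("primary_column" plus a "columns" mapping) follows the GeoParquet specification as we understand it. The sketch also assumes that an explicit null value means an unknown CRS, which is different from the key being absent.

```python
import json

DEFAULT_CRS = "OGC:CRS84"
_MISSING = object()  # sentinel to tell "key absent" apart from "key is null"


def resolve_crs(geo_metadata_json, column=None):
    """Return the CRS for a geometry column from GeoParquet 'geo' metadata.

    Hypothetical helper: returns "OGC:CRS84" when the 'crs' key is absent,
    and otherwise whatever the metadata holds (a PROJJSON dict or None).
    """
    meta = json.loads(geo_metadata_json)
    column = column or meta["primary_column"]
    column_meta = meta["columns"][column]
    crs = column_meta.get("crs", _MISSING)
    if crs is _MISSING:
        return DEFAULT_CRS
    return crs
```

Using a sentinel rather than a plain default keeps the absent-key case separate from an explicit null, so only the former falls back to "OGC:CRS84".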

Parameters:
  • path (str, path object)

  • columns (list-like of strings, default=None) – Not currently supported in Sedona

  • storage_options (dict, optional) – Not currently supported in Sedona

  • bbox (tuple, optional) – Not currently supported in Sedona

  • to_pandas_kwargs (dict, optional) – Not currently supported in Sedona

Return type:

GeoDataFrame

Examples

>>> from sedona.spark.geopandas import read_parquet
>>> df = read_parquet("data.parquet")  # doctest: +SKIP

The columns argument is not currently supported in Sedona, so pass only the path:

>>> df = read_parquet(
...     "data.parquet",
... )