xarray.Dataset.to_zarr

Dataset.to_zarr(store=None, chunk_store=None, mode=None, synchronizer=None, group=None, encoding=None, compute=True, consolidated=False, append_dim=None, region=None)

Write dataset contents to a zarr group.

Note

Experimental: the Zarr backend is new and experimental. Please report any unexpected behavior via GitHub issues.

Parameters
  • store (MutableMapping, str or Path, optional) – Store or path to directory in file system.

  • chunk_store (MutableMapping, str or Path, optional) – Store or path to directory in file system only for Zarr array chunks. Requires zarr-python v2.4.0 or later.

  • mode ({"w", "w-", "a", None}, optional) – Persistence mode: "w" means create (overwrite if the store exists); "w-" means create (fail if the store exists); "a" means override existing variables (create the store if it does not exist). If append_dim is set, mode can be omitted, as it is internally set to "a". Otherwise, mode defaults to "w-" if not set.

  • synchronizer (object, optional) – Zarr array synchronizer.

  • group (str, optional) – Group path (a.k.a. path in zarr terminology).

  • encoding (dict, optional) – Nested dictionary with variable names as keys and dictionaries of variable specific encodings as values, e.g., {"my_variable": {"dtype": "int16", "scale_factor": 0.1,}, ...}

  • compute (bool, optional) – If True write array data immediately, otherwise return a dask.delayed.Delayed object that can be computed to write array data later. Metadata is always updated eagerly.

  • consolidated (bool, optional) – If True, apply zarr’s consolidate_metadata function to the store after writing metadata.

  • append_dim (hashable, optional) – If set, the dimension along which the data will be appended. All other dimensions on overridden variables must remain the same size.

  • region (dict, optional) – Optional mapping from dimension names to integer slices along dataset dimensions to indicate the region of existing zarr array(s) in which to write this dataset’s data. For example, {'x': slice(0, 1000), 'y': slice(10000, 11000)} would indicate that values should be written to the region 0:1000 along x and 10000:11000 along y.

    Two restrictions apply to the use of region:

    • If region is set, all variables in a dataset must have at least one dimension in common with the region. Other variables should be written in a separate call to to_zarr().

    • Dimensions cannot be included in both region and append_dim at the same time. To create empty arrays to fill in with region, use a separate call to to_zarr() with compute=False. See “Appending to existing Zarr stores” in the reference documentation for full details.

References

https://zarr.readthedocs.io/

Notes

Zarr chunking behavior:

If chunks are found in the encoding argument or attribute corresponding to any DataArray, those chunks are used. If a DataArray is a dask array, it is written with its dask chunks. If no other chunks are found, Zarr uses its own heuristics to choose automatic chunk sizes.

See also

http://xarray.pydata.org/en/stable/io.html#zarr