xarray.Dataset.map_blocks

Dataset.map_blocks(self, func: 'Callable[..., T_DSorDA]', args: Sequence[Any] = (), kwargs: Mapping[str, Any] = None) → 'T_DSorDA'

Apply a function to each chunk of this Dataset. This method is experimental and its signature may change.

Parameters
  • func (callable) –

    User-provided function that accepts a Dataset as its first parameter. The function will receive a subset of this Dataset, corresponding to one chunk along each chunked dimension. func will be executed as func(obj_subset, *args, **kwargs).

    The function is first run on mocked-up data that looks like this Dataset but has size 0 along every dimension, in order to determine properties of the returned object such as dtype, variable names, new dimensions and new indexes (if any).

    This function must return either a single DataArray or a single Dataset.

    This function cannot change the size of existing dimensions or add new chunked dimensions.

  • args (Sequence) – Passed verbatim to func after unpacking, after the sliced Dataset. xarray objects, if any, will not be split by chunks. Passing dask collections is not allowed.

  • kwargs (Mapping) – Passed verbatim to func after unpacking. xarray objects, if any, will not be split by chunks. Passing dask collections is not allowed.
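The parameters above can be illustrated with a minimal sketch. It assumes that dask is installed, and the function scale_and_shift, the variable name "a", and the chunk sizes are hypothetical names chosen for illustration only.

```python
import numpy as np
import xarray as xr


def scale_and_shift(ds_chunk, offset, scale=1.0):
    # ds_chunk is an xarray.Dataset covering one chunk of the input.
    # The sizes of existing dimensions are left unchanged.
    return ds_chunk * scale + offset


values = np.arange(12.0).reshape(4, 3)
ds = xr.Dataset(
    {"a": (("x", "y"), values)},
    coords={"x": np.arange(4), "y": np.arange(3)},
).chunk({"x": 2})  # two chunks along "x"; requires dask

# args are passed positionally after the chunk, kwargs by keyword,
# i.e. the call is scale_and_shift(ds_chunk, 10.0, scale=2.0).
result = ds.map_blocks(scale_and_shift, args=(10.0,), kwargs={"scale": 2.0})

# The result is lazy; compute() evaluates the function on every chunk.
computed = result.compute()
```

Because the function may also be run on zero-sized mock data, it should not assume that its input contains any elements.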

Returns

  • A single DataArray or Dataset with dask backend, reassembled from the outputs of the function.

Notes

This method is designed for when one needs to manipulate a whole xarray object within each chunk. In the more common case where one can work on numpy arrays, it is recommended to use apply_ufunc.
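For comparison, the apply_ufunc route mentioned above might look like the following sketch when the per-chunk work only needs numpy arrays. It assumes dask is installed; clip_negative and the sample data are hypothetical and used only for illustration.

```python
import numpy as np
import xarray as xr


def clip_negative(arr):
    # Operates on a plain numpy array, not on an xarray object.
    return np.where(arr < 0, 0.0, arr)


ds = xr.Dataset(
    {"a": ("x", np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]))}
).chunk({"x": 3})

# dask="parallelized" applies the numpy function to each dask chunk.
clipped = xr.apply_ufunc(
    clip_negative, ds, dask="parallelized", output_dtypes=[float]
)
clipped_values = clipped["a"].values
```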

If none of the variables in this Dataset is backed by dask, calling this method is equivalent to calling func(self, *args, **kwargs).
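A small sketch of the equivalence described above, using a Dataset that holds only numpy arrays. The function double and the variable name "a" are hypothetical.

```python
import numpy as np
import xarray as xr


def double(ds_in):
    # With no dask-backed variables, this receives the whole Dataset.
    return ds_in * 2


ds_np = xr.Dataset({"a": ("x", np.arange(4.0))})  # not chunked

via_map_blocks = ds_np.map_blocks(double)
direct = double(ds_np)
same = via_map_blocks.identical(direct)
```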