xarray.combine_nested

xarray.combine_nested(datasets, concat_dim, compat='no_conflicts', data_vars='all', coords='different', fill_value=<NA>)

Explicitly combine an N-dimensional grid of datasets into one by using a succession of concat and merge operations along each dimension of the grid.

Does not sort the supplied datasets under any circumstances, so the datasets must be passed in the order you wish them to be concatenated. It does align coordinates, but different variables on datasets can cause it to fail under some scenarios. In complex cases, you may need to clean up your data and use concat/merge explicitly.
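
As a small illustration of the ordering rule above, the following sketch combines the same two datasets in both orders. The dataset names and values are made up for the example; only the order of the input list differs between the two calls.

```python
import xarray as xr

# Two small illustrative datasets (hypothetical data) sharing the dimension "x".
first = xr.Dataset({"v": ("x", [0, 1])})
second = xr.Dataset({"v": ("x", [2, 3])})

# The order of the list is the order of concatenation; nothing is sorted.
forward = xr.combine_nested([first, second], concat_dim="x")
reverse = xr.combine_nested([second, first], concat_dim="x")

print(forward["v"].values.tolist())
print(reverse["v"].values.tolist())
```

If the pieces need to be ordered by their coordinate values, they have to be arranged that way before being passed in.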

To concatenate along multiple dimensions, the datasets must be passed as a nested list-of-lists, with a depth equal to the length of concat_dim. combine_nested will concatenate along the top-level list first.

Useful for combining datasets from a set of nested directories, or for collecting the output of a simulation parallelized along multiple dimensions.

Parameters
datasets : list or nested list of xarray.Dataset objects

Dataset objects to combine. If concatenation or merging along more than one dimension is desired, then datasets must be supplied in a nested list-of-lists.

concat_dim : str, or list of str, DataArray, Index or None

Dimensions along which to concatenate variables, as used by xarray.concat(). Set concat_dim=[..., None, ...] explicitly to disable concatenation and merge instead along a particular dimension. The position of None in the list specifies the dimension of the nested-list input along which to merge. Must be the same length as the depth of the list passed to datasets.

compat : {‘identical’, ‘equals’, ‘broadcast_equals’, ‘no_conflicts’}, optional

String indicating how to compare variables of the same name for potential merge conflicts:

  • ‘broadcast_equals’: all values must be equal when variables are broadcast against each other to ensure common dimensions.

  • ‘equals’: all values and dimensions must be the same.

  • ‘identical’: all values, dimensions and attributes must be the same.

  • ‘no_conflicts’: only values which are not null in both datasets must be equal. The returned dataset then contains the combination of all non-null values.
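
The ‘no_conflicts’ behaviour can be sketched with two datasets that hold the same variable but have gaps in different places. The data below is invented for illustration, and merging is requested by passing None as the only entry of concat_dim.

```python
import numpy as np
import xarray as xr

# Hypothetical datasets: same variable, with NaN in complementary positions.
a = xr.Dataset({"temperature": ("x", [1.0, np.nan])})
b = xr.Dataset({"temperature": ("x", [np.nan, 2.0])})

# concat_dim=[None] asks for a merge along the (single) level of the list.
# With compat="no_conflicts", values only need to agree where both are non-null.
merged = xr.combine_nested([a, b], concat_dim=[None], compat="no_conflicts")

print(merged["temperature"].values.tolist())
```

The result holds the non-null values from both inputs, since the two datasets never disagree at a position where both have data.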

data_vars : {‘minimal’, ‘different’, ‘all’ or list of str}, optional

Details are in the documentation of concat

coords : {‘minimal’, ‘different’, ‘all’ or list of str}, optional

Details are in the documentation of concat

fill_value : scalar, optional

Value to use for newly missing values
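
A minimal sketch of where fill_value comes into play, assuming the outer-join alignment that concat applies to mismatched coordinates by default. The two datasets below are hypothetical and only partly overlap along x, so stacking them along a new dimension t leaves gaps that are filled with the supplied value.

```python
import xarray as xr

# Hypothetical datasets whose "x" coordinates only partly overlap.
ds1 = xr.Dataset({"temperature": ("x", [1.0, 2.0])}, coords={"x": [0, 1]})
ds2 = xr.Dataset({"temperature": ("x", [3.0, 4.0])}, coords={"x": [1, 2]})

# Stacking along a new dimension "t" aligns "x" to the union [0, 1, 2];
# positions missing from either input receive fill_value instead of NaN.
combined = xr.combine_nested([ds1, ds2], concat_dim="t", fill_value=-1.0)

print(combined["temperature"].isel(t=0).values.tolist())
print(combined["temperature"].isel(t=1).values.tolist())
```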

Returns
combined : xarray.Dataset

Examples

A common task is collecting data from a parallelized simulation in which each processor wrote out to a separate file. A domain which was decomposed into 4 parts, 2 each along both the x and y axes, requires organising the datasets into a doubly-nested list, e.g.:

>>> x1y1
<xarray.Dataset>
Dimensions:         (x: 2, y: 2)
Dimensions without coordinates: x, y
Data variables:
  temperature       (x, y) float64 11.04 23.57 20.77 ...
  precipitation     (x, y) float64 5.904 2.453 3.404 ...
>>> ds_grid = [[x1y1, x1y2], [x2y1, x2y2]]
>>> combined = xr.combine_nested(ds_grid, concat_dim=['x', 'y'])
>>> combined
<xarray.Dataset>
Dimensions:         (x: 4, y: 4)
Dimensions without coordinates: x, y
Data variables:
  temperature       (x, y) float64 11.04 23.57 20.77 ...
  precipitation     (x, y) float64 5.904 2.453 3.404 ...
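
The session above assumes the four tiles already exist. A self-contained sketch of the same 2 by 2 decomposition might look like the following, where make_tile is a hypothetical helper and the values are arbitrary. The outer list maps to the first entry of concat_dim (x) and the inner lists to the second (y).

```python
import numpy as np
import xarray as xr

def make_tile(offset):
    """Hypothetical helper: a 2x2 tile whose values are shifted by `offset`."""
    data = np.arange(4, dtype=float).reshape(2, 2) + offset
    return xr.Dataset({"temperature": (("x", "y"), data)})

x1y1, x1y2, x2y1, x2y2 = (make_tile(o) for o in (0, 10, 20, 30))

# Outer list position -> "x", inner list position -> "y".
ds_grid = [[x1y1, x1y2], [x2y1, x2y2]]
combined = xr.combine_nested(ds_grid, concat_dim=["x", "y"])

# The same layout expressed directly with NumPy, for comparison.
expected = np.block(
    [
        [x1y1["temperature"].values, x1y2["temperature"].values],
        [x2y1["temperature"].values, x2y2["temperature"].values],
    ]
)
print(dict(combined.sizes))
```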

combine_nested can also be used to explicitly merge datasets with different variables. For example, if we have 4 datasets, which are divided along two times and contain two different variables, we can pass None to concat_dim to specify the dimension of the nested list over which we wish to use merge instead of concat:

>>> t1temp
<xarray.Dataset>
Dimensions:         (t: 5)
Dimensions without coordinates: t
Data variables:
  temperature       (t) float64 11.04 23.57 20.77 ...
>>> t1precip
<xarray.Dataset>
Dimensions:         (t: 5)
Dimensions without coordinates: t
Data variables:
  precipitation     (t) float64 5.904 2.453 3.404 ...
>>> ds_grid = [[t1temp, t1precip], [t2temp, t2precip]]
>>> combined = xr.combine_nested(ds_grid, concat_dim=['t', None])
>>> combined
<xarray.Dataset>
Dimensions:         (t: 10)
Dimensions without coordinates: t
Data variables:
  temperature       (t) float64 11.04 23.57 20.77 ...
  precipitation     (t) float64 5.904 2.453 3.404 ...
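
A runnable version of the merge example can be sketched as follows. The four datasets are built with made-up values, and the variable names follow the session above.

```python
import numpy as np
import xarray as xr

# Hypothetical data: two time chunks of length 5 for each of two variables.
t1temp = xr.Dataset({"temperature": ("t", np.arange(0.0, 5.0))})
t2temp = xr.Dataset({"temperature": ("t", np.arange(5.0, 10.0))})
t1precip = xr.Dataset({"precipitation": ("t", np.arange(0.0, 5.0) * 0.1)})
t2precip = xr.Dataset({"precipitation": ("t", np.arange(5.0, 10.0) * 0.1)})

# Outer list: concatenate along "t". Inner lists: merge (concat_dim entry is None).
ds_grid = [[t1temp, t1precip], [t2temp, t2precip]]
combined = xr.combine_nested(ds_grid, concat_dim=["t", None])

print(sorted(combined.data_vars))
print(dict(combined.sizes))
```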