xarray.combine_by_coords

xarray.combine_by_coords(datasets, compat='no_conflicts', data_vars='all', coords='different', fill_value=<NA>, join='outer', combine_attrs='no_conflicts')[source]

Attempt to automatically combine the given datasets into one by using dimension coordinates.

This function attempts to combine a group of datasets along any number of dimensions into a single entity by inspecting coords and metadata and using a combination of concat and merge.

It attempts to order the datasets such that the values in their dimension coordinates are monotonic along all dimensions. If it cannot determine the order in which to concatenate the datasets, it raises a ValueError. Non-coordinate dimensions are ignored, as are any coordinate dimensions which do not vary between datasets.

The function aligns coordinates, but datasets that hold different variables can cause it to fail in some scenarios. In complex cases, you may need to clean up your data and use concat/merge explicitly (see also combine_nested).
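As a minimal sketch of the failure mode described above, the snippet below builds two datasets that share a dimension but carry no coordinate values for it, so there is nothing to order them by. The variable names (a, b, v) are made up for illustration. It then falls back to combine_nested, which takes the order from the list itself.

```python
import numpy as np
import xarray as xr

# Two pieces that share a dimension "x" but have no coordinate values for it,
# so combine_by_coords has nothing to order them by.
a = xr.Dataset({"v": ("x", np.array([1.0, 2.0]))})
b = xr.Dataset({"v": ("x", np.array([3.0, 4.0]))})

try:
    xr.combine_by_coords([a, b])
    raised = False
except ValueError:
    # No dimension coordinate is available to infer the concatenation order.
    raised = True

# combine_nested relies on the position in the list instead of coordinate values.
combined = xr.combine_nested([a, b], concat_dim="x")
```

Here the explicit concat_dim tells combine_nested which dimension to join along, which is the information combine_by_coords would otherwise have to infer.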

Works well if, for example, you have N years of data and M data variables, and each combination of a distinct time period and set of data variables is saved as its own dataset. It is also useful if you have a simulation which is parallelized in multiple dimensions, but has global coordinates saved in each file specifying the positions of points within the global domain.
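The "N years, M variables" case can be sketched as follows. The helper piece() and the sample values are hypothetical and exist only to keep the example short; the pieces are deliberately passed in a shuffled order.

```python
import numpy as np
import xarray as xr


def piece(name, years, offset):
    """Build a small one-variable dataset covering the given years (hypothetical helper)."""
    values = offset + np.arange(len(years), dtype=float)
    return xr.Dataset({name: ("year", values)}, coords={"year": years})


# Each (time period, variable) combination is its own dataset.
pieces = [
    piece("precipitation", [2002, 2003], 102.0),
    piece("temperature", [2000, 2001], 0.0),
    piece("precipitation", [2000, 2001], 100.0),
    piece("temperature", [2002, 2003], 2.0),
]

# Pieces holding the same variable are concatenated along "year" by coordinate
# value, and the per-variable results are then merged into one dataset.
combined = xr.combine_by_coords(pieces)
```

The result holds both variables on a single "year" coordinate running from 2000 to 2003, regardless of the order of the input list.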

Parameters
  • datasets (sequence of xarray.Dataset) – Dataset objects to combine.

  • compat ({"identical", "equals", "broadcast_equals", "no_conflicts", "override"}, optional) – String indicating how to compare variables of the same name for potential conflicts:

    • “broadcast_equals”: all values must be equal when variables are broadcast against each other to ensure common dimensions.

    • “equals”: all values and dimensions must be the same.

    • “identical”: all values, dimensions and attributes must be the same.

    • “no_conflicts”: only values which are not null in both datasets must be equal. The returned dataset then contains the combination of all non-null values.

    • “override”: skip comparing and pick the variable from the first dataset.

  • data_vars ({"minimal", "different", "all"} or list of str, optional) – These data variables will be concatenated together:

    • “minimal”: Only data variables in which the dimension already appears are included.

    • “different”: Data variables which are not equal (ignoring attributes) across all datasets are also concatenated, as well as all those in which the dimension already appears. Beware: this option may load the data payload of data variables into memory if they are not already loaded.

    • “all”: All data variables will be concatenated.

    • list of str: The listed data variables will be concatenated, in addition to the “minimal” data variables.

    If objects are DataArrays, data_vars must be “all”.

  • coords ({"minimal", "different", "all"} or list of str, optional) – As per the “data_vars” kwarg, but for coordinate variables.

  • fill_value (scalar or dict-like, optional) – Value to use for newly missing values. If a dict-like, maps variable names to fill values. Use a data array’s name to refer to its values. If None, raises a ValueError if the passed Datasets do not create a complete hypercube.

  • join ({"outer", "inner", "left", "right", "exact", "override"}, optional) – String indicating how to combine differing indexes in objects:

    • “outer”: use the union of object indexes

    • “inner”: use the intersection of object indexes

    • “left”: use indexes from the first object with each dimension

    • “right”: use indexes from the last object with each dimension

    • “exact”: instead of aligning, raise ValueError when indexes to be aligned are not equal

    • “override”: if indexes are of same size, rewrite indexes to be those of the first object with that dimension. Indexes for the same dimension must have the same size in all objects.

  • combine_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or callable, default: "no_conflicts") – A callable or a string indicating how to combine attrs of the objects being merged:

    • “drop”: empty attrs on returned Dataset.

    • “identical”: all attrs must be the same on every object.

    • “no_conflicts”: attrs from all objects are combined, any that have the same name must also have the same value.

    • “drop_conflicts”: attrs from all objects are combined, any that have the same name but different values are dropped.

    • “override”: skip comparing and copy attrs from the first dataset to the result.

    If a callable, it must expect a sequence of attrs dicts and a context object as its only parameters.
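As a minimal sketch of how combine_attrs shapes the attributes of the result, the snippet below combines two datasets whose attrs agree on one key and disagree on another. The dataset names and attribute values are made up for illustration.

```python
import xarray as xr

first = xr.Dataset(
    {"v": ("x", [1.0, 2.0])},
    coords={"x": [0, 1]},
    attrs={"units": "K", "source": "run_a"},
)
second = xr.Dataset(
    {"v": ("x", [3.0, 4.0])},
    coords={"x": [2, 3]},
    attrs={"units": "K", "source": "run_b"},
)

# "drop_conflicts" keeps attrs that agree and drops those that differ.
kept = xr.combine_by_coords([first, second], combine_attrs="drop_conflicts")

# "drop" returns a dataset with empty attrs.
dropped = xr.combine_by_coords([first, second], combine_attrs="drop")
```

With these inputs, kept.attrs retains only the shared "units" entry, while dropped.attrs is empty.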

Returns

combined (xarray.Dataset)

Examples

Combining two datasets using their common dimension coordinates. Notice they are concatenated based on the values in their dimension coordinates, not on their position in the list passed to combine_by_coords.

>>> import numpy as np
>>> import xarray as xr
>>> x1 = xr.Dataset(
...     {
...         "temperature": (("y", "x"), 20 * np.random.rand(6).reshape(2, 3)),
...         "precipitation": (("y", "x"), np.random.rand(6).reshape(2, 3)),
...     },
...     coords={"y": [0, 1], "x": [10, 20, 30]},
... )
>>> x2 = xr.Dataset(
...     {
...         "temperature": (("y", "x"), 20 * np.random.rand(6).reshape(2, 3)),
...         "precipitation": (("y", "x"), np.random.rand(6).reshape(2, 3)),
...     },
...     coords={"y": [2, 3], "x": [10, 20, 30]},
... )
>>> x3 = xr.Dataset(
...     {
...         "temperature": (("y", "x"), 20 * np.random.rand(6).reshape(2, 3)),
...         "precipitation": (("y", "x"), np.random.rand(6).reshape(2, 3)),
...     },
...     coords={"y": [2, 3], "x": [40, 50, 60]},
... )
>>> x1
<xarray.Dataset>
Dimensions:        (y: 2, x: 3)
Coordinates:
  * y              (y) int64 0 1
  * x              (x) int64 10 20 30
Data variables:
    temperature    (y, x) float64 10.98 14.3 12.06 10.9 8.473 12.92
    precipitation  (y, x) float64 0.4376 0.8918 0.9637 0.3834 0.7917 0.5289
>>> x2
<xarray.Dataset>
Dimensions:        (y: 2, x: 3)
Coordinates:
  * y              (y) int64 2 3
  * x              (x) int64 10 20 30
Data variables:
    temperature    (y, x) float64 11.36 18.51 1.421 1.743 0.4044 16.65
    precipitation  (y, x) float64 0.7782 0.87 0.9786 0.7992 0.4615 0.7805
>>> x3
<xarray.Dataset>
Dimensions:        (y: 2, x: 3)
Coordinates:
  * y              (y) int64 2 3
  * x              (x) int64 40 50 60
Data variables:
    temperature    (y, x) float64 2.365 12.8 2.867 18.89 10.44 8.293
    precipitation  (y, x) float64 0.2646 0.7742 0.4562 0.5684 0.01879 0.6176
>>> xr.combine_by_coords([x2, x1])
<xarray.Dataset>
Dimensions:        (y: 4, x: 3)
Coordinates:
  * y              (y) int64 0 1 2 3
  * x              (x) int64 10 20 30
Data variables:
    temperature    (y, x) float64 10.98 14.3 12.06 10.9 ... 1.743 0.4044 16.65
    precipitation  (y, x) float64 0.4376 0.8918 0.9637 ... 0.7992 0.4615 0.7805
>>> xr.combine_by_coords([x3, x1])
<xarray.Dataset>
Dimensions:        (y: 4, x: 6)
Coordinates:
  * y              (y) int64 0 1 2 3
  * x              (x) int64 10 20 30 40 50 60
Data variables:
    temperature    (y, x) float64 10.98 14.3 12.06 nan ... nan 18.89 10.44 8.293
    precipitation  (y, x) float64 0.4376 0.8918 0.9637 ... 0.5684 0.01879 0.6176
>>> xr.combine_by_coords([x3, x1], join="override")
<xarray.Dataset>
Dimensions:        (y: 2, x: 6)
Coordinates:
  * y              (y) int64 0 1
  * x              (x) int64 10 20 30 40 50 60
Data variables:
    temperature    (y, x) float64 10.98 14.3 12.06 2.365 ... 18.89 10.44 8.293
    precipitation  (y, x) float64 0.4376 0.8918 0.9637 ... 0.5684 0.01879 0.6176
>>> xr.combine_by_coords([x1, x2, x3])
<xarray.Dataset>
Dimensions:        (y: 4, x: 6)
Coordinates:
  * y              (y) int64 0 1 2 3
  * x              (x) int64 10 20 30 40 50 60
Data variables:
    temperature    (y, x) float64 10.98 14.3 12.06 nan ... 18.89 10.44 8.293
    precipitation  (y, x) float64 0.4376 0.8918 0.9637 ... 0.5684 0.01879 0.6176
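The effect of fill_value and of join="exact" on tiles that do not form a complete grid can be sketched with deterministic data. The helper tile() and the tile names are hypothetical; the layout mirrors the x1 and x3 datasets above.

```python
import numpy as np
import xarray as xr


def tile(y, x, start):
    """Build a 2x3 tile with known values (hypothetical helper)."""
    data = start + np.arange(6, dtype=float).reshape(2, 3)
    return xr.Dataset({"temperature": (("y", "x"), data)}, coords={"y": y, "x": x})


lower_left = tile([0, 1], [10, 20, 30], 0.0)
upper_right = tile([2, 3], [40, 50, 60], 100.0)

# Default fill_value: cells covered by neither tile become NaN.
with_nan = xr.combine_by_coords([upper_right, lower_left])

# Explicit fill_value: the same cells are set to -1.0 instead.
with_fill = xr.combine_by_coords([upper_right, lower_left], fill_value=-1.0)

# join="exact" refuses to align the differing indexes and raises instead.
try:
    xr.combine_by_coords([upper_right, lower_left], join="exact")
    exact_raised = False
except ValueError:
    exact_raised = True
```

Choosing a sentinel such as -1.0 only makes sense when it cannot be confused with real data; otherwise the NaN default is usually the safer choice.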