🍾 Xarray is now 10 years old! 🎉

xarray.DataArray.groupby

xarray.DataArray.groupby#

DataArray.groupby(group, squeeze=None, restore_coord_dims=False)[source]#

Returns a DataArrayGroupBy object for performing grouped operations.

Parameters:
  • group (Hashable, DataArray or IndexVariable) – Array whose unique values should be used to group this array. If a Hashable, must be the name of a coordinate contained in this dataarray.

  • squeeze (bool, default: True) – If “group” is a dimension of any arrays in this dataset, squeeze controls whether the subarrays have a dimension of length 1 along that dimension or if the dimension is squeezed out.

  • restore_coord_dims (bool, default: False) – If True, also restore the dimension order of multi-dimensional coordinates.

Returns:

grouped (DataArrayGroupBy) – A DataArrayGroupBy object patterned after pandas.GroupBy that can be iterated over in the form of (unique_value, grouped_array) pairs.

Examples

Calculate daily anomalies for daily data:

>>> da = xr.DataArray(
...     np.linspace(0, 1826, num=1827),
...     coords=[pd.date_range("2000-01-01", "2004-12-31", freq="D")],
...     dims="time",
... )
>>> da
<xarray.DataArray (time: 1827)> Size: 15kB
array([0.000e+00, 1.000e+00, 2.000e+00, ..., 1.824e+03, 1.825e+03,
       1.826e+03])
Coordinates:
  * time     (time) datetime64[ns] 15kB 2000-01-01 2000-01-02 ... 2004-12-31
>>> da.groupby("time.dayofyear") - da.groupby("time.dayofyear").mean("time")
<xarray.DataArray (time: 1827)> Size: 15kB
array([-730.8, -730.8, -730.8, ...,  730.2,  730.2,  730.5])
Coordinates:
  * time       (time) datetime64[ns] 15kB 2000-01-01 2000-01-02 ... 2004-12-31
    dayofyear  (time) int64 15kB 1 2 3 4 5 6 7 8 ... 360 361 362 363 364 365 366

See also

GroupBy: Group and Bin Data

Users guide explanation of how to group and bin data.

Computational Patterns

Tutorial on Groupby() for windowed computation

Grouped Computations

Tutorial on Groupby() demonstrating reductions, transformation and comparison with resample()

DataArray.groupby_bins Dataset.groupby core.groupby.DataArrayGroupBy DataArray.coarsen pandas.DataFrame.groupby Dataset.resample DataArray.resample