xarray.core.groupby.DatasetGroupBy.quantile

DatasetGroupBy.quantile(q, dim=None, method='linear', keep_attrs=None, skipna=None, interpolation=None)

Compute the qth quantile over each array in the groups and concatenate them together into a new array.

Parameters
  • q (float or sequence of float) – Quantile to compute, which must be between 0 and 1 inclusive.

  • dim (..., str or sequence of str, optional) – Dimension(s) over which to apply quantile. Defaults to the grouped dimension.

  • method (str, default: "linear") – Interpolation method to use when the desired quantile lies between two data points. The options, sorted by their R type as summarized in the Hyndman and Fan paper [1], are:

    1. “inverted_cdf” (*)

    2. “averaged_inverted_cdf” (*)

    3. “closest_observation” (*)

    4. “interpolated_inverted_cdf” (*)

    5. “hazen” (*)

    6. “weibull” (*)

    7. “linear” (default)

    8. “median_unbiased” (*)

    9. “normal_unbiased” (*)

    The first three methods are discontinuous. The following discontinuous variations of the default “linear” (7.) option are also available:

    • “lower”

    • “higher”

    • “midpoint”

    • “nearest”

    See numpy.quantile() or [1] for details. Methods marked with an asterisk require numpy version 1.22 or newer. The “method” argument was previously called “interpolation”; it was renamed in accordance with numpy version 1.22.0.
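To make the difference between “linear” and its discontinuous variations concrete, the following is a minimal pure-Python sketch of the idea. It is not the code xarray or numpy runs; the helper name `sketch_quantile` is hypothetical, and only the “linear”, “lower”, “higher” and “midpoint” options are covered. The sample values are the entries of `a` that fall in the `y == 1` group of the Examples section.

```python
import math


def sketch_quantile(values, q, method="linear"):
    """Illustrative quantile of a 1-D sequence (hypothetical helper, not library code)."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1 inclusive")
    data = sorted(values)
    if not data:
        raise ValueError("values must not be empty")
    # Fractional position of the quantile among the sorted points (R type 7).
    pos = (len(data) - 1) * q
    lo, hi = math.floor(pos), math.ceil(pos)
    if method == "lower":
        return data[lo]  # data point just below the position
    if method == "higher":
        return data[hi]  # data point just above the position
    if method == "midpoint":
        return (data[lo] + data[hi]) / 2  # average of the two neighbours
    if method == "linear":
        # Interpolate between the two neighbours by the fractional part.
        return data[lo] + (pos - lo) * (data[hi] - data[lo])
    raise ValueError(f"unsupported method in this sketch: {method}")


group = [1.3, 8.4, 0.7, 4.2, 6.5, 7.3]
for method in ("linear", "lower", "higher", "midpoint"):
    print(method, sketch_quantile(group, 0.5, method))
```

With six sorted points and q = 0.5 the position falls halfway between the third and fourth values (4.2 and 6.5), so “lower” and “higher” return those neighbours while “linear” and “midpoint” both land between them.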

  • skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
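The effect of skipna on float data can be illustrated with a small sketch. The dataset, the variable name `a`, the dimension `t` and the coordinate `label` are made up for this illustration, and the behaviour described in the comments is what the parameter description above implies rather than a guarantee for every xarray version.

```python
import numpy as np
import xarray as xr

# Hypothetical data: one float variable with a NaN in the first group.
ds = xr.Dataset(
    {"a": ("t", [1.0, np.nan, 3.0, 4.0])},
    coords={"label": ("t", [0, 0, 1, 1])},
)
grouped = ds.groupby("label")

# Default (skipna=None): NaN is expected to be skipped for float dtypes.
skipped = grouped.quantile(0.5)

# skipna=False: a NaN in a group is expected to propagate to that group's result.
propagated = grouped.quantile(0.5, skipna=False)

print(skipped["a"].values)
print(propagated["a"].values)
```

Groups without missing values should give the same result in both calls, so only the `label == 0` entry is expected to differ.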

Returns

quantiles (Variable) – If q is a single quantile, the result has no quantile dimension and the value of q is kept as a scalar quantile coordinate. If multiple quantiles are given, the result gains a quantile dimension, which appears as the last dimension in the examples below. The other dimensions are the dimensions that remain after the reduction of the array.
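A short sketch of how the result can be inspected, reusing the data from the Examples section. The variable names `single`, `multiple` and `median` are chosen for illustration only.

```python
import xarray as xr

da = xr.DataArray(
    [[1.3, 8.4, 0.7, 6.9], [0.7, 4.2, 9.4, 1.5], [6.5, 7.3, 2.6, 1.9]],
    coords={"x": [0, 0, 1], "y": [1, 1, 2, 2]},
    dims=("x", "y"),
)
ds = xr.Dataset({"a": da})

# A scalar q and a sequence of q values, both reducing over all dimensions.
single = ds.groupby("y").quantile(0.5, dim=...)
multiple = ds.groupby("y").quantile([0, 0.5, 1], dim=...)

# Check whether "quantile" shows up as a dimension or only as a coordinate.
print("quantile" in single.dims, "quantile" in single.coords)
print(multiple["a"].dims)

# Pick one quantile back out of the multi-quantile result by label.
median = multiple["a"].sel(quantile=0.5)
print(median.values)
```

Selecting by label with `.sel(quantile=...)` avoids relying on the position of the quantile axis in the result.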

See also

numpy.nanquantile, numpy.quantile, pandas.Series.quantile, Dataset.quantile, DataArray.quantile

Examples

>>> import xarray as xr
>>> da = xr.DataArray(
...     [[1.3, 8.4, 0.7, 6.9], [0.7, 4.2, 9.4, 1.5], [6.5, 7.3, 2.6, 1.9]],
...     coords={"x": [0, 0, 1], "y": [1, 1, 2, 2]},
...     dims=("x", "y"),
... )
>>> ds = xr.Dataset({"a": da})
>>> da.groupby("x").quantile(0)
<xarray.DataArray (x: 2, y: 4)>
array([[0.7, 4.2, 0.7, 1.5],
       [6.5, 7.3, 2.6, 1.9]])
Coordinates:
  * y         (y) int64 1 1 2 2
    quantile  float64 0.0
  * x         (x) int64 0 1
>>> ds.groupby("y").quantile(0, dim=...)
<xarray.Dataset>
Dimensions:   (y: 2)
Coordinates:
    quantile  float64 0.0
  * y         (y) int64 1 2
Data variables:
    a         (y) float64 0.7 0.7
>>> da.groupby("x").quantile([0, 0.5, 1])
<xarray.DataArray (x: 2, y: 4, quantile: 3)>
array([[[0.7 , 1.  , 1.3 ],
        [4.2 , 6.3 , 8.4 ],
        [0.7 , 5.05, 9.4 ],
        [1.5 , 4.2 , 6.9 ]],

       [[6.5 , 6.5 , 6.5 ],
        [7.3 , 7.3 , 7.3 ],
        [2.6 , 2.6 , 2.6 ],
        [1.9 , 1.9 , 1.9 ]]])
Coordinates:
  * y         (y) int64 1 1 2 2
  * quantile  (quantile) float64 0.0 0.5 1.0
  * x         (x) int64 0 1
>>> ds.groupby("y").quantile([0, 0.5, 1], dim=...)
<xarray.Dataset>
Dimensions:   (y: 2, quantile: 3)
Coordinates:
  * quantile  (quantile) float64 0.0 0.5 1.0
  * y         (y) int64 1 2
Data variables:
    a         (y, quantile) float64 0.7 5.35 8.4 0.7 2.25 9.4

References

[1] R. J. Hyndman and Y. Fan, “Sample quantiles in statistical packages,” The American Statistician, 50(4), pp. 361-365, 1996.