xarray.Dataset.quantile

Dataset.quantile(q, dim=None, *, method='linear', numeric_only=False, keep_attrs=None, skipna=None, interpolation=None)

Compute the qth quantile of the data along the specified dimension.

Returns the qth quantile(s) of the array elements for each variable in the Dataset.

Parameters:
  • q (float or array-like of float) – Quantile or quantiles to compute. Each value must be between 0 and 1 inclusive.

  • dim (str or Iterable of Hashable, optional) – Dimension(s) over which to apply quantile.

  • method (str, default: "linear") – Interpolation method to use when the desired quantile lies between two data points. The options, sorted by their R type as summarized in the H&F paper [1], are:

    1. “inverted_cdf”

    2. “averaged_inverted_cdf”

    3. “closest_observation”

    4. “interpolated_inverted_cdf”

    5. “hazen”

    6. “weibull”

    7. “linear” (default)

    8. “median_unbiased”

    9. “normal_unbiased”

    The first three methods are discontinuous. The following discontinuous variations of the default “linear” (7.) option are also available:

    • “lower”

    • “higher”

    • “midpoint”

    • “nearest”

    See numpy.quantile() or [1] for details. The “method” argument was previously called “interpolation”; it was renamed to match numpy version 1.22.0.

  • keep_attrs (bool, optional) – If True, the dataset’s attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.

  • numeric_only (bool, optional) – If True, only compute the quantile for variables with a numeric dtype.

  • skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
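The “linear” family of methods listed under method can be compared directly on a small input. The following is a minimal sketch, assuming a hypothetical one-variable dataset (ds_methods) chosen so that the requested quantile falls between two data points; it is not taken from the xarray documentation.

```python
import xarray as xr

# Hypothetical dataset: four sorted points, so q=0.4 falls between two of them.
ds_methods = xr.Dataset({"a": ("x", [1.0, 2.0, 3.0, 4.0])})

# With 4 points, q=0.4 maps to the fractional position 0.4 * (4 - 1) = 1.2,
# which lies between the values 2.0 (position 1) and 3.0 (position 2).
method_results = {
    name: ds_methods.quantile(0.4, method=name)["a"].item()
    for name in ("linear", "lower", "higher", "midpoint", "nearest")
}

for name, value in method_results.items():
    print(f"{name:>8}: {value}")
```

Here “linear” interpolates to roughly 2.2, “lower” and “higher” return the neighbouring data points 2.0 and 3.0, “midpoint” returns their average 2.5, and “nearest” returns 2.0.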

Returns:

quantiles (Dataset) – If q is a single quantile, no quantile dimension is added; when reducing over all dimensions (dim=None), the result is a scalar for each variable in data_vars. If multiple quantiles are given, the first axis of the result corresponds to the quantile and a quantile dimension is added to the returned Dataset. The other dimensions are those that remain after the reduction.
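The shape of the result can also be checked programmatically. The sketch below reuses the dataset from the Examples section; the variable names single, multiple and upper are introduced here only for illustration.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"a": (("x", "y"), [[0.7, 4.2, 9.4, 1.5], [6.5, 7.3, 2.6, 1.9]])},
    coords={"x": [7, 9], "y": [1, 1.5, 2, 2.5]},
)

# Scalar q: "quantile" is attached as a scalar coordinate, not a dimension.
single = ds.quantile(0.5, dim="x")
print(single["a"].dims)

# Array-like q: a leading "quantile" dimension is added.
multiple = ds.quantile([0.25, 0.75], dim="x")
print(multiple["a"].dims)

# One quantile can be selected by its coordinate value, which drops the dimension.
upper = multiple.sel(quantile=0.75)
print(upper["a"].dims)
```

Selecting with .sel(quantile=...) relies on an exact match against the float coordinate, so it is safest with values that were passed in q unchanged.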

Examples

>>> ds = xr.Dataset(
...     {"a": (("x", "y"), [[0.7, 4.2, 9.4, 1.5], [6.5, 7.3, 2.6, 1.9]])},
...     coords={"x": [7, 9], "y": [1, 1.5, 2, 2.5]},
... )
>>> ds.quantile(0)  # or ds.quantile(0, dim=...)
<xarray.Dataset> Size: 16B
Dimensions:   ()
Coordinates:
    quantile  float64 8B 0.0
Data variables:
    a         float64 8B 0.7
>>> ds.quantile(0, dim="x")
<xarray.Dataset> Size: 72B
Dimensions:   (y: 4)
Coordinates:
  * y         (y) float64 32B 1.0 1.5 2.0 2.5
    quantile  float64 8B 0.0
Data variables:
    a         (y) float64 32B 0.7 4.2 2.6 1.5
>>> ds.quantile([0, 0.5, 1])
<xarray.Dataset> Size: 48B
Dimensions:   (quantile: 3)
Coordinates:
  * quantile  (quantile) float64 24B 0.0 0.5 1.0
Data variables:
    a         (quantile) float64 24B 0.7 3.4 9.4
>>> ds.quantile([0, 0.5, 1], dim="x")
<xarray.Dataset> Size: 152B
Dimensions:   (quantile: 3, y: 4)
Coordinates:
  * y         (y) float64 32B 1.0 1.5 2.0 2.5
  * quantile  (quantile) float64 24B 0.0 0.5 1.0
Data variables:
    a         (quantile, y) float64 96B 0.7 4.2 2.6 1.5 3.6 ... 6.5 7.3 9.4 1.9
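The skipna, keep_attrs and numeric_only options are not exercised by the examples above. The following is a minimal sketch of their effect, assuming a hypothetical dataset (ds_opts) with one float variable containing a NaN, one string variable, and a dataset-level attribute; keep_attrs is passed explicitly in both cases rather than relying on the default.

```python
import numpy as np
import xarray as xr

# Hypothetical dataset for illustration only.
ds_opts = xr.Dataset(
    {
        "temperature": ("x", [1.0, np.nan, 3.0]),
        "label": ("x", ["a", "b", "c"]),
    },
    attrs={"units": "m"},
)

# numeric_only=True leaves the string variable "label" out of the result.
numeric = ds_opts.quantile(0.5, numeric_only=True)
print(list(numeric.data_vars))

# For float data, NaN is skipped unless skipna=False is passed.
skipped = numeric["temperature"].item()
not_skipped = ds_opts.quantile(0.5, numeric_only=True, skipna=False)[
    "temperature"
].item()
print(skipped, not_skipped)

# keep_attrs controls whether the dataset-level attrs are carried over.
with_attrs = ds_opts.quantile(0.5, numeric_only=True, keep_attrs=True).attrs
without_attrs = ds_opts.quantile(0.5, numeric_only=True, keep_attrs=False).attrs
print(with_attrs, without_attrs)
```

With the NaN skipped, the median of the remaining values 1.0 and 3.0 is 2.0; with skipna=False the NaN propagates into the result.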

References