xarray.Dataset.interpolate_na

Dataset.interpolate_na(dim=None, method='linear', limit=None, use_coordinate=True, max_gap=None, **kwargs)

Fill in NaNs by interpolating according to different methods.

Parameters
  • dim (str) – Specifies the dimension along which to interpolate.

  • method (str, optional) – String indicating which method to use for interpolation:

    • ‘linear’: linear interpolation (default). Additional keyword arguments are passed to numpy.interp().

    • ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘polynomial’: passed to scipy.interpolate.interp1d(). If method='polynomial', the order keyword argument must also be provided.

    • ‘barycentric’, ‘krog’, ‘pchip’, ‘spline’, ‘akima’: use their respective scipy.interpolate classes.

  • use_coordinate (bool, str, default: True) – Specifies which index to use as the x values in the interpolation, formulated as y = f(x). If False, values are treated as if equally spaced along dim. If True, the IndexVariable dim is used. If use_coordinate is a string, it specifies the name of a coordinate variable to use as the index.

  • limit (int, default: None) – Maximum number of consecutive NaNs to fill. Must be greater than 0 or None for no limit. This filling is done regardless of the size of the gap in the data. To only interpolate over gaps less than a given length, see max_gap.

  • max_gap (int, float, str, pandas.Timedelta, numpy.timedelta64, datetime.timedelta, default: None) – Maximum size of gap, a continuous sequence of NaNs, that will be filled. Use None for no limit. When interpolating along a datetime64 dimension and use_coordinate=True, max_gap can be one of the following:

    • a string that is valid input for pandas.to_timedelta

    • a numpy.timedelta64 object

    • a pandas.Timedelta object

    • a datetime.timedelta object

    Otherwise, max_gap must be an int or a float. Use of max_gap with unlabeled dimensions has not been implemented yet. Gap length is defined as the difference between coordinate values at the first data point after a gap and the last value before a gap. For gaps at the beginning (end), gap length is defined as the difference between coordinate values at the first (last) valid data point and the first (last) NaN. For example, consider:

    <xarray.DataArray (x: 9)>
    array([nan, nan, nan,  1., nan, nan,  4., nan, nan])
    Coordinates:
      * x        (x) int64 0 1 2 3 4 5 6 7 8
    

    The gap lengths are 3-0 = 3; 6-3 = 3; and 8-6 = 2, respectively.

  • **kwargs (dict, optional) – Parameters passed verbatim to the underlying interpolation function.
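The gap-length definition given for max_gap can be checked by hand with a small pure-Python sketch. The helper gap_lengths below is hypothetical and written only for illustration; it is not part of xarray and does not reflect how xarray computes gap lengths internally. It assumes x and y are equal-length sequences and that y contains at least one valid value.

```python
import math


def gap_lengths(x, y):
    """Return the length of each run of NaNs in y, measured in units of x.

    Hypothetical helper for illustration only; not part of xarray.
    """
    n = len(y)
    lengths = []
    i = 0
    while i < n:
        if not math.isnan(y[i]):
            i += 1
            continue
        start = i  # first NaN of the run
        while i < n and math.isnan(y[i]):
            i += 1
        stop = i - 1  # last NaN of the run
        # Interior gap: measure between the valid neighbours on both sides.
        # Edge gap: there is no valid neighbour on that side, so measure
        # from (or to) the outermost NaN itself.
        left = start - 1 if start > 0 else start
        right = stop + 1 if stop < n - 1 else stop
        lengths.append(x[right] - x[left])
    return lengths


nan = float("nan")
x = list(range(9))
y = [nan, nan, nan, 1.0, nan, nan, 4.0, nan, nan]
print(gap_lengths(x, y))  # [3, 3, 2]
```

Because the lengths are differences of coordinate values rather than counts of NaNs, the same run of NaNs gives a larger gap length when the coordinate is more widely spaced.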

Returns

interpolated (Dataset) – Filled in Dataset.

Examples

>>> ds = xr.Dataset(
...     {
...         "A": ("x", [np.nan, 2, 3, np.nan, 0]),
...         "B": ("x", [3, 4, np.nan, 1, 7]),
...         "C": ("x", [np.nan, np.nan, np.nan, 5, 0]),
...         "D": ("x", [np.nan, 3, np.nan, -1, 4]),
...     },
...     coords={"x": [0, 1, 2, 3, 4]},
... )
>>> ds
<xarray.Dataset>
Dimensions:  (x: 5)
Coordinates:
  * x        (x) int64 0 1 2 3 4
Data variables:
    A        (x) float64 nan 2.0 3.0 nan 0.0
    B        (x) float64 3.0 4.0 nan 1.0 7.0
    C        (x) float64 nan nan nan 5.0 0.0
    D        (x) float64 nan 3.0 nan -1.0 4.0
>>> ds.interpolate_na(dim="x", method="linear")
<xarray.Dataset>
Dimensions:  (x: 5)
Coordinates:
  * x        (x) int64 0 1 2 3 4
Data variables:
    A        (x) float64 nan 2.0 3.0 1.5 0.0
    B        (x) float64 3.0 4.0 2.5 1.0 7.0
    C        (x) float64 nan nan nan 5.0 0.0
    D        (x) float64 nan 3.0 1.0 -1.0 4.0
>>> ds.interpolate_na(dim="x", method="linear", fill_value="extrapolate")
<xarray.Dataset>
Dimensions:  (x: 5)
Coordinates:
  * x        (x) int64 0 1 2 3 4
Data variables:
    A        (x) float64 1.0 2.0 3.0 1.5 0.0
    B        (x) float64 3.0 4.0 2.5 1.0 7.0
    C        (x) float64 20.0 15.0 10.0 5.0 0.0
    D        (x) float64 5.0 3.0 1.0 -1.0 4.0
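
The effect of use_coordinate and limit can be seen on two small datasets. This is a minimal sketch, assuming numpy and xarray are installed; the variable names (uneven, run, by_coord, by_position, limited) are made up for the illustration.

```python
import numpy as np
import xarray as xr

# Unevenly spaced coordinate: the NaN sits at x=1, between valid
# points at x=0 and x=10.
uneven = xr.Dataset(
    {"A": ("x", [0.0, np.nan, 10.0])},
    coords={"x": [0, 1, 10]},
)

# use_coordinate=True (the default) interpolates against the x values.
by_coord = uneven.interpolate_na(dim="x", use_coordinate=True)
# use_coordinate=False treats the points as equally spaced.
by_position = uneven.interpolate_na(dim="x", use_coordinate=False)

print(by_coord["A"].values)     # filled value is 1.0
print(by_position["A"].values)  # filled value is 5.0 (the midpoint)

# A run of three consecutive NaNs between two valid points.
run = xr.Dataset(
    {"A": ("x", [1.0, np.nan, np.nan, np.nan, 5.0])},
    coords={"x": [0, 1, 2, 3, 4]},
)

# limit=1 fills at most one NaN per run, even though the run is longer.
limited = run.interpolate_na(dim="x", limit=1)
print(limited["A"].values)
```

With limit=1 only the first NaN of the run is filled (with 2.0) and the remaining two stay NaN, which is the "regardless of the size of the gap" behaviour described for limit. To leave long gaps untouched entirely, max_gap is the relevant parameter.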