What’s New

v0.3.2 (in development)

This release focused on bug-fixes, speedups and resolving some niggling inconsistencies.

There are a few cases where the behavior of xray differs from the previous version. However, I expect that in almost all cases your code will continue to run unmodified.


xray now requires pandas v0.15.0 or later. This was necessary for supporting TimedeltaIndex without too many painful hacks.

Backwards incompatible changes

  • Arrays of datetime.datetime objects are now automatically cast to datetime64[ns] arrays when stored in an xray object, using machinery borrowed from pandas:

    In [1]: from datetime import datetime
    In [2]: xray.Dataset({'t': [datetime(2000, 1, 1)]})
    Dimensions:  (t: 1)
      * t        (t) datetime64[ns] 2000-01-01
  • xray now has support (including serialization to netCDF) for TimedeltaIndex. datetime.timedelta objects are thus accordingly cast to timedelta64[ns] objects when appropriate.

  • Masked arrays are now properly coerced to use NaN as a sentinel value (GH259).


  • Due to popular demand, we have added experimental attribute style access as a shortcut for dataset variables, coordinates and attributes:

    In [3]: ds = xray.Dataset({'tmin': ([], 25, {'units': 'celcius'})})
    In [4]: ds.tmin.units
    Out[4]: 'celcius'

    Tab-completion for these variables should work in editors such as IPython. However, setting variables or attributes in this fashion is not yet supported because there are some unresolved ambiguities (GH300).

  • You can now use a dictionary for indexing with labeled dimensions. This provides a safe way to do assignment with labeled dimensions:

    In [5]: array = xray.DataArray(np.zeros(5), dims=['x'])
    In [6]: array[dict(x=slice(3))] = 1
    In [7]: array
    <xray.DataArray (x: 5)>
    array([ 1.,  1.,  1.,  0.,  0.])
      * x        (x) int64 0 1 2 3 4
  • Non-index coordinates can now be faithfully written to and restored from netCDF files. This is done according to CF conventions when possible by using the coordinates attribute on a data variable. When not possible, xray defines a global coordinates attribute.

  • Preliminary support for converting xray.DataArray objects to and from CDAT cdms2 variables.

  • We sped up any operation that involves creating a new Dataset or DataArray (e.g., indexing, aggregation, arithmetic) by a factor of 30 to 50%. The full speed up requires cyordereddict to be installed.

Bug fixes

  • Fix for to_dataframe() with 0d string/object coordinates (GH287)
  • Fix for to_netcdf with 0d string variable (GH284)
  • Fix writing datetime64 arrays to netcdf if NaT is present (GH270)
  • Fix align silently upcasts data arrays when NaNs are inserted (GH264)

Future plans

  • I am contemplating switching to the terms “coordinate variables” and “data variables” instead of the (currently used) “coordinates” and “variables”, following their use in CF Conventions (GH293). This would mostly have implications for the documentation, but I would also change the Dataset attribute vars to data.
  • I no longer certain that automatic label alignment for arithmetic would be a good idea for xray – it is a feature from pandas that I have not missed (GH186).
  • The main API breakage that I do anticipate in the next release is finally making all aggregation operations skip missing values by default (GH130). I’m pretty sick of writing ds.reduce(np.nanmean, 'time').
  • The next version of xray (0.4) will remove deprecated features and aliases whose use currently raises a warning.

If you have opinions about any of these anticipated changes, I would love to hear them – please add a note to any of the referenced GitHub issues.

v0.3.1 (22 October, 2014)

This is mostly a bug-fix release to make xray compatible with the latest release of pandas (v0.15).

We added several features to better support working with missing values and exporting xray objects to pandas. We also reorganized the internal API for serializing and deserializing datasets, but this change should be almost entirely transparent to users.

Other than breaking the experimental DataStore API, there should be no backwards incompatible changes.

New features

  • Added count() and dropna() methods, copied from pandas, for working with missing values (GH247, GH58).
  • Added DataArray.to_pandas for converting a data array into the pandas object with the same dimensionality (1D to Series, 2D to DataFrame, etc.) (GH255).
  • Support for reading gzipped netCDF3 files (GH239).
  • Reduced memory usage when writing netCDF files (GH251).
  • ‘missing_value’ is now supported as an alias for the ‘_FillValue’ attribute on netCDF variables (GH245).
  • Trivial indexes, equivalent to range(n) where n is the length of the dimension, are no longer written to disk (GH245).

Bug fixes

  • Compatibility fixes for pandas v0.15 (GH262).
  • Fixes for display and indexing of NaT (not-a-time) (GH238, GH240)
  • Fix slicing by label was an argument is a data array (GH250).
  • Test data is now shipped with the source distribution (GH253).
  • Ensure order does not matter when doing arithemtic with scalar data arrays (GH254).
  • Order of dimensions preserved with DataArray.to_dataframe (GH260).

v0.3.0 (21 September 2014)

New features

  • Revamped coordinates: “coordinates” now refer to all arrays that are not used to index a dimension. Coordinates are intended to allow for keeping track of arrays of metadata that describe the grid on which the points in “variable” arrays lie. They are preserved (when unambiguous) even though mathematical operations.
  • Dataset math Dataset objects now support all arithmetic operations directly. Dataset-array operations map across all dataset variables; dataset-dataset operations act on each pair of variables with the same name.
  • GroupBy math: This provides a convenient shortcut for normalizing by the average value of a group.
  • The dataset __repr__ method has been entirely overhauled; dataset objects now show their values when printed.
  • You can now index a dataset with a list of variables to return a new dataset: ds[['foo', 'bar']].

Backwards incompatible changes

  • Dataset.__eq__ and Dataset.__ne__ are now element-wise operations instead of comparing all values to obtain a single boolean. Use the method equals() instead.


  • Dataset.noncoords is deprecated: use Dataset.vars instead.
  • Dataset.select_vars deprecated: index a Dataset with a list of variable names instead.
  • DataArray.select_vars and DataArray.drop_vars deprecated: use reset_coords() instead.

v0.2.0 (14 August 2014)

This is major release that includes some new features and quite a few bug fixes. Here are the highlights:

  • There is now a direct constructor for DataArray objects, which makes it possible to create a DataArray without using a Dataset. This is highlighted in the refreshed tutorial.
  • You can perform aggregation operations like mean directly on Dataset objects, thanks to Joe Hamman. These aggregation methods also worked on grouped datasets.
  • xray now works on Python 2.6, thanks to Anna Kuznetsova.
  • A number of methods and attributes were given more sensible (usually shorter) names: labeled -> sel, indexed -> isel, select -> select_vars, unselect -> drop_vars, dimensions -> dims, coordinates -> coords, attributes -> attrs.
  • New load_data() and close() methods for datasets facilitate lower level of control of data loaded from disk.

v0.1.1 (20 May 2014)

xray 0.1.1 is a bug-fix release that includes changes that should be almost entirely backwards compatible with v0.1:

  • Python 3 support (GH53)
  • Required numpy version relaxed to 1.7 (GH129)
  • Return numpy.datetime64 arrays for non-standard calendars (GH126)
  • Support for opening datasets associated with NetCDF4 groups (GH127)
  • Bug-fixes for concatenating datetime arrays (GH134)

Special thanks to new contributors Thomas Kluyver, Joe Hamman and Alistair Miles.

v0.1 (2 May 2014)

Initial release.