N-D labeled arrays and datasets in Python
xarray (formerly xray) is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures.
Our goal is to provide a pandas-like and pandas-compatible toolkit for
analytics on multi-dimensional arrays, rather than the tabular data at which
pandas excels. Our approach adopts the Common Data Model for
self-describing scientific data in widespread use in the Earth sciences:
xarray.Dataset is an in-memory representation of a netCDF file.
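As a rough illustration of what this looks like in practice, the sketch below builds a small xarray.Dataset by hand and uses its dimension names and coordinate labels. The variable name `temperature`, the dimension names `time` and `lat`, and all values are made up for this example; they do not come from any real file.

```python
import numpy as np
import xarray as xr

# Hypothetical data: 2 time steps on a 3-point latitude grid.
temperature = np.array([[10.0, 12.0, 14.0],
                        [11.0, 13.0, 15.0]])

# A Dataset bundles named variables, their dimensions, coordinate labels,
# and free-form attributes, much like the contents of a netCDF file.
ds = xr.Dataset(
    data_vars={"temperature": (("time", "lat"), temperature)},
    coords={"time": [0, 1], "lat": [-30.0, 0.0, 30.0]},
    attrs={"title": "illustrative dataset"},
)

# Label-based selection, as in pandas, but along a named dimension.
equator = ds["temperature"].sel(lat=0.0)

# Reductions take a dimension name instead of an axis number.
time_mean = ds["temperature"].mean(dim="time")

# Conversion to a pandas DataFrame indexed by (time, lat).
df = ds.to_dataframe()
```

Referring to dimensions by name (`lat`, `time`) rather than by position is the main difference from working with a bare NumPy array.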
- Data Structures
- Indexing and selecting data
- GroupBy: split-apply-combine
- Reshaping and reorganizing data
- Combining data
- Time series data
- Working with pandas
- Serialization and IO
- Parallel computing with dask
Help & reference
- Stephan Hoyer and Joe Hamman’s Journal of Open Research Software paper describing the xarray project.
- The UW eScience Institute’s Geohackweek tutorial on xarray for geospatial data scientists.
- Stephan Hoyer’s SciPy 2015 talk introducing xarray to a general audience.
- Stephan Hoyer’s 2015 Unidata Users Workshop talk and tutorial (with answers) introducing xarray to users familiar with netCDF.
- Nicolas Fauchereau’s tutorial on xarray for netCDF users.