class xray.Dataset(variables=None, coords=None, attrs=None, compat='broadcast_equals')

A multi-dimensional, in-memory array database.

A dataset resembles an in-memory representation of a NetCDF file, and consists of variables, coordinates and attributes which together form a self-describing dataset.

Dataset implements the mapping interface with keys given by variable names and values given by DataArray objects for each variable name.

One-dimensional variables whose name equals their dimension are index coordinates, used for label-based indexing.
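The mapping interface and the role of index coordinates can be sketched as follows. This is an illustration only, and it assumes the successor package `xarray` (imported as `xr`), since `xray` was later renamed; the variable name `temperature` and the dimension name `station` are made up for the example.

```python
import xarray as xr  # assumption: the renamed successor of xray

# Hypothetical dataset: one data variable along a "station" dimension.
ds = xr.Dataset(
    {"temperature": ("station", [11.5, 14.0, 9.25])},
    coords={"station": ["a", "b", "c"]},
)

# Mapping interface: keys are variable names, values are DataArray objects.
has_temperature = "temperature" in ds
temperature = ds["temperature"]
is_data_array = isinstance(temperature, xr.DataArray)

# "station" is one-dimensional and named after its own dimension,
# so it serves as an index coordinate for label-based lookups.
station_index = ds.indexes["station"]
picked_value = ds.sel(station="b")["temperature"].item()
```

Here `sel` looks up the label `"b"` in the `station` index rather than using an integer position.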

__init__(variables=None, coords=None, attrs=None, compat='broadcast_equals')

To load data from a file or file-like object, use the open_dataset function.


variables : dict-like, optional

A mapping from variable names to DataArray objects, Variable objects or tuples of the form (dims, data[, attrs]) which can be used as arguments to create a new Variable. Each dimension must have the same length in all variables in which it appears.
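The rule that each dimension must have one length across all variables can be illustrated with a small pure-Python check. This is a toy sketch of the rule, not the library's implementation: `shape_of` and `dimension_sizes` are hypothetical helpers, data is given as nested lists, and dims are always written as tuples.

```python
def shape_of(data):
    """Return the shape of a rectangular nested list (toy helper)."""
    shape = []
    while isinstance(data, (list, tuple)):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)


def dimension_sizes(variables):
    """Map each dimension name to its length, raising on a conflict."""
    sizes = {}
    for name, (dims, data) in variables.items():
        shape = shape_of(data)
        if len(dims) != len(shape):
            raise ValueError(f"{name!r}: {len(dims)} dims for {len(shape)}-d data")
        for dim, length in zip(dims, shape):
            # setdefault records the first length seen for this dimension.
            if sizes.setdefault(dim, length) != length:
                raise ValueError(
                    f"dimension {dim!r} has length {sizes[dim]} and {length}"
                )
    return sizes


# Variables in the (dims, data) tuple form described above.
consistent = {
    "temperature": (("time", "x"), [[11.5, 14.0, 9.25], [12.0, 13.5, 10.0]]),
    "elevation": (("x",), [120, 340, 95]),
}
sizes = dimension_sizes(consistent)

# Adding a variable whose "x" length is 2 instead of 3 breaks the rule.
conflicting = dict(consistent, pressure=(("x",), [1010.0, 1008.5]))
try:
    dimension_sizes(conflicting)
    conflict_detected = False
except ValueError:
    conflict_detected = True
```

The library performs an equivalent check when the dataset is constructed; the exact exception message it raises is not reproduced here.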

coords : dict-like, optional

Another mapping in the same form as the variables argument, except that each item is saved on the dataset as a “coordinate”. These variables have an associated meaning: they describe constant/fixed/independent quantities, unlike the varying/measured/dependent quantities that belong in variables. Coordinate values may be given by 1-dimensional arrays or scalars, in which case dims do not need to be supplied: 1D arrays will be assumed to give index values along the dimension with the same name.
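A short sketch of passing coordinates without dims, again assuming the successor package `xarray` (imported as `xr`). The first argument is passed positionally because its keyword name differs between `xray` and `xarray`; the names `temperature` and `reference_height` are invented for illustration.

```python
import xarray as xr  # assumption: the renamed successor of xray

ds = xr.Dataset(
    {"temperature": (("time", "x"), [[11.5, 14.0, 9.25], [12.0, 13.5, 10.0]])},
    coords={
        "x": [10, 20, 30],        # 1D array, no dims given: indexes dimension "x"
        "reference_height": 2.0,  # scalar coordinate, no dims needed
    },
)

x_dims = ds.coords["x"].dims
height_dims = ds.coords["reference_height"].dims

# Coordinates and data variables are kept apart on the dataset.
is_coord = "reference_height" in ds.coords
is_data_var = "temperature" in ds.data_vars
coord_not_data_var = "reference_height" not in ds.data_vars
```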

attrs : dict-like, optional

Global attributes to save on this dataset.

compat : {‘broadcast_equals’, ‘equals’, ‘identical’}, optional

String indicating how to compare variables of the same name for potential conflicts:

  • ‘broadcast_equals’: all values must be equal when variables are broadcast against each other to ensure common dimensions.
  • ‘equals’: all values and dimensions must be the same.
  • ‘identical’: all values, dimensions and attributes must be the same.
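The three comparison modes correspond to the `broadcast_equals`, `equals` and `identical` methods listed below, so their differences can be sketched by calling those methods directly. The example assumes the successor package `xarray` (imported as `xr`); the variable name `v` and the `title` attribute are made up.

```python
import xarray as xr  # assumption: the renamed successor of xray

a = xr.Dataset({"v": ("x", [1, 1])})                           # 1D variable along "x"
b = xr.Dataset({"v": ((), 1)})                                 # scalar with the same value
c = xr.Dataset({"v": ("x", [1, 1])}, attrs={"title": "demo"})  # same data, extra attribute

# Broadcasting the scalar against "x" gives equal values ...
broadcast_match = a.broadcast_equals(b)
# ... but the dimensions differ, so the stricter test fails.
strict_match = a.equals(b)

# Values and dimensions agree, while the dataset attributes do not.
values_match = a.equals(c)
attrs_match = a.identical(c)
```

Each mode is stricter than the previous one: a pair that passes ‘identical’ also passes ‘equals’ and ‘broadcast_equals’.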


__init__([variables, coords, attrs, compat]) To load data from a file or file-like object, use the open_dataset function.
all([dim, keep_attrs]) Reduce this Dataset’s data by applying all along some dimension(s).
any([dim, keep_attrs]) Reduce this Dataset’s data by applying any along some dimension(s).
apply(func[, keep_attrs, args]) Apply a function over the variables in this dataset.
argmax([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying argmax along some dimension(s).
argmin([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying argmin along some dimension(s).
argsort([axis, kind, order]) Returns the indices that would sort this array.
astype(dtype[, order, casting, subok, copy]) Copy of the array, cast to a specified type.
broadcast_equals(other) Two Datasets are broadcast equal if they are equal after broadcasting all variables against each other.
clip(a_min, a_max[, out]) Return an array whose values are limited to [a_min, a_max].
close() Close any files linked to this dataset.
conj() Complex-conjugate all elements.
conjugate() Return the complex conjugate, element-wise.
copy([deep]) Returns a copy of this dataset.
count([dim, keep_attrs]) Reduce this Dataset’s data by applying count along some dimension(s).
drop(labels[, dim]) Drop variables or index labels from this dataset.
dropna(dim[, how, thresh, subset]) Returns a new dataset with dropped labels for missing values along the provided dimension.
dump(*args, **kwargs) Write dataset contents to a netCDF file.
dump_to_store(store[, encoder]) Store dataset contents to a backends.*DataStore object.
dumps(*args, **kwargs) Write dataset contents to a netCDF file.
equals(other) Two Datasets are equal if they have matching variables and coordinates, all of which are equal.
from_dataframe(dataframe) Convert a pandas.DataFrame into an xray.Dataset.
get(k[, d]) D[k] if k in D, ...
groupby(group[, squeeze]) Returns a GroupBy object for performing grouped operations.
identical(other) Like equals, but also checks all dataset attributes and the attributes on all variables and coordinates.
isel(**indexers) Returns a new dataset with each array indexed along the specified dimension(s).
isnull(*args, **kwargs) Detect missing values (NaN in numeric arrays, None/NaN in object arrays)
items() List of D’s (key, value) pairs, ...
iteritems() An iterator over the (key, ...
iterkeys() An iterator over the keys of D.
keys() List of D’s keys.
load_data() Manually trigger loading of this dataset’s data from disk or a remote source into memory and return this dataset.
load_store(store[, decoder]) Create a new dataset from the contents of a backends.*DataStore
max([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying max along some dimension(s).
mean([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying mean along some dimension(s).
median([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying median along some dimension(s).
merge(other[, inplace, overwrite_vars, ...]) Merge the arrays of two datasets into a single dataset.
min([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying min along some dimension(s).
notnull(*args, **kwargs) Replacement for numpy.isfinite / -numpy.isnan which is suitable for use on object arrays.
prod([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying prod along some dimension(s).
reduce(func[, dim, keep_attrs, numeric_only]) Reduce this dataset by applying func along some dimension(s).
reindex([indexers, method, copy]) Conform this object onto a new set of indexes, filling in missing values with NaN.
reindex_like(other[, method, copy]) Conform this object onto the indexes of another object, filling in missing values with NaN.
rename(name_dict[, inplace]) Returns a new object with renamed variables and dimensions.
reset_coords([names, drop, inplace]) Given names of coordinates, reset them to become variables.
round([decimals, out]) Return the array with each element rounded to the given number of decimals.
sel(**indexers) Returns a new dataset with each array indexed by tick labels along the specified dimension(s).
set_coords(names[, inplace]) Given names of one or more variables, set them as coordinates.
squeeze([dim]) Returns a new dataset with squeezed data.
std([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying std along some dimension(s).
sum([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying sum along some dimension(s).
to_dataframe() Convert this dataset into a pandas.DataFrame.
to_netcdf([path, mode, format, group]) Write dataset contents to a netCDF file.
transpose(*dims) Return a new Dataset object with all array dimensions transposed.
update(other[, inplace]) Update this dataset’s variables with those from another dataset.
values() List of D’s values.
var([dim, keep_attrs, skipna]) Reduce this Dataset’s data by applying var along some dimension(s).
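A few of the methods above in combination, as a hedged sketch. It assumes the successor package `xarray` (imported as `xr`) and an invented two-dimensional `temperature` variable; behaviour in the original `xray` releases may differ in detail.

```python
import xarray as xr  # assumption: the renamed successor of xray

ds = xr.Dataset(
    {"temperature": (("time", "x"), [[10.0, 20.0, 30.0], [20.0, 40.0, 60.0]])},
    coords={"time": [0, 1], "x": ["a", "b", "c"]},
)

first_step = ds.isel(time=0)     # positional indexing along "time"
column_b = ds.sel(x="b")         # label-based indexing along "x"
time_mean = ds.mean(dim="time")  # reduce over the "time" dimension
frame = ds.to_dataframe()        # one row per (time, x) combination

first_values = first_step["temperature"].values.tolist()
b_values = column_b["temperature"].values.tolist()
mean_values = time_mean["temperature"].values.tolist()
```

Each call returns a new object and leaves `ds` unchanged.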


attrs Dictionary of global attributes on this dataset
coords Dictionary of xray.DataArray objects corresponding to coordinate variables.
data_vars Dictionary of xray.DataArray objects corresponding to data variables
dims Mapping from dimension names to lengths.
indexes OrderedDict of pandas.Index objects used for label based indexing
loc Attribute for location based indexing.
variables Frozen dictionary of xray.Variable objects constituting this dataset.
virtual_variables A frozenset of names that don’t exist in this dataset but for which DataArrays could be created on demand.
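The attributes above can be inspected as in the following sketch, which assumes the successor package `xarray` (imported as `xr`) and made-up names. Only the dimension names are read from `dims`, because the way lengths are accessed has changed between releases.

```python
import xarray as xr  # assumption: the renamed successor of xray

ds = xr.Dataset(
    {"temperature": (("time", "x"), [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])},
    coords={"x": [10, 20]},
    attrs={"title": "toy example"},
)

title = ds.attrs["title"]               # global attribute
data_var_names = list(ds.data_vars)     # names of the data variables
coord_names = list(ds.coords)           # names of the coordinates
dim_names = sorted(ds.dims)             # dimension names
x_index_values = list(ds.indexes["x"])  # pandas.Index used for label lookups
all_present = "temperature" in ds.variables and "x" in ds.variables
```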