xray.Dataset

class xray.Dataset(variables=None, coords=None, attrs=None)

A multi-dimensional, in-memory array database.

A dataset resembles an in-memory representation of a NetCDF file, and consists of variables, coordinates and attributes which together form a self-describing dataset.

Dataset implements the mapping interface with keys given by variable names and values given by DataArray objects for each variable name.

One-dimensional variables whose name equals the name of their dimension are index coordinates, which are used for label-based indexing.

__init__(variables=None, coords=None, attrs=None)

To load data from a file or file-like object, use the open_dataset function.

Parameters:

variables : dict-like, optional

A mapping from variable names to DataArray objects, Variable objects or tuples of the form (dims, data[, attrs]) which can be used as arguments to create a new Variable. Each dimension must have the same length in all variables in which it appears.

coords : dict-like, optional

Another mapping in the same form as the variables argument, except that each item is saved on the dataset as a “coordinate”. These arrays have an associated meaning: they describe constant/fixed/independent quantities, unlike the varying/measured/dependent quantities that belong in variables. Coordinate values may be given by 1-dimensional arrays or scalars, in which case dims do not need to be supplied: 1D arrays will be assumed to give index values along the dimension with the same name.

attrs : dict-like, optional

Global attributes to save on this dataset.
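The rules above (every dimension has one length across all variables, and a bare 1D coordinate lies along the dimension that shares its name) can be illustrated with a small plain-Python sketch. This is not xray's implementation: the helper names shape_of and infer_dims are hypothetical, data is limited to nested lists and scalars, and only the (dims, data[, attrs]) tuple form and bare coordinate values are handled.

```python
def shape_of(data):
    """Shape of a (possibly nested) list; a scalar has shape ()."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)


def infer_dims(variables, coords=None):
    """Return {dimension name: length}; raise ValueError on a conflict.

    Hypothetical helper for illustration only. Entries are either
    (dims, data[, attrs]) tuples or, for coords, bare lists/scalars.
    """
    entries = list(variables.items())
    for name, value in (coords or {}).items():
        if not isinstance(value, tuple):
            # Bare coordinate: a scalar has no dimensions, and a 1D list
            # is assumed to lie along the dimension with the same name.
            ndim = len(shape_of(value))
            if ndim > 1:
                raise ValueError(f"{name!r}: dims are required for {ndim}D data")
            value = ((name,) if ndim == 1 else (), value)
        entries.append((name, value))

    sizes = {}
    for name, (dims, data, *_attrs) in entries:
        if isinstance(dims, str):
            dims = (dims,)  # allow a single dimension name as a string
        shape = shape_of(data)
        if len(dims) != len(shape):
            raise ValueError(f"{name!r}: {len(dims)} dims for {len(shape)}D data")
        for dim, length in zip(dims, shape):
            # The first variable to use a dimension fixes its length.
            if sizes.setdefault(dim, length) != length:
                raise ValueError(
                    f"{name!r}: dimension {dim!r} has length {length}, "
                    f"expected {sizes[dim]}"
                )
    return sizes


sizes = infer_dims(
    {"temperature": (("time", "x"), [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])},
    coords={"x": [10, 20, 30], "time": ("time", [0, 1]), "reference": 0},
)
print(sizes)  # {'time': 2, 'x': 3}
```

In this sketch a second variable declaring dimension x with any length other than 3 would raise ValueError, which mirrors the requirement that each dimension have the same length in all variables in which it appears.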

Methods

__init__([variables, coords, attrs]) To load data from a file or file-like object, use the open_dataset function.
all([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.all along some dimension(s).
any([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.any along some dimension(s).
apply(func[, keep_attrs, args]) Apply a function over the variables in this dataset.
argmax([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.argmax along some dimension(s).
argmin([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.argmin along some dimension(s).
argsort([axis, kind, order]) Returns the indices that would sort this array.
astype(dtype[, order, casting, subok, copy]) Copy of the array, cast to a specified type.
clip(a_min, a_max[, out]) Return an array whose values are limited to [a_min, a_max].
close() Close any files linked to this dataset
concat(*args, **kwargs) Deprecated; use xray.concat instead
conj() Complex-conjugate all elements.
conjugate() Return the complex conjugate, element-wise.
copy([deep]) Returns a copy of this dataset.
count([dim, keep_attrs]) Reduce this Dataset’s data by applying count along some dimension(s).
drop_vars(*names) Returns a new dataset without the named variables.
dropna(dim[, how, thresh, subset]) Returns a new dataset with dropped labels for missing values along the provided dimension.
dump(filepath, **kwdargs) Dump dataset contents to a location on disk using the netCDF4 package.
dump_to_store(store[, encoder]) Store dataset contents to a backends.*DataStore object.
dumps(**kwargs) Serialize dataset contents to a string.
equals(other) Two Datasets are equal if they have matching variables and coordinates, all of which are equal.
from_dataframe(dataframe) Convert a pandas.DataFrame into an xray.Dataset
get(k[, d]) D[k] if k in D, ...
groupby(group[, squeeze]) Returns a GroupBy object for performing grouped operations.
identical(other) Like equals, but also checks all dataset attributes and the attributes on all variables and coordinates.
indexed(*args, **kwargs) Returns a new dataset with each array indexed along the specified dimension(s).
isel(**indexers) Returns a new dataset with each array indexed along the specified dimension(s).
isnull(*args, **kwargs) Detect missing values (NaN in numeric arrays, None/NaN in object arrays)
items() List of D’s (key, value) pairs, ...
iteritems() An iterator over the (key, ...
iterkeys() An iterator over the keys of D.
itervalues(...)
keys() List of D’s keys.
labeled(*args, **kwargs) Returns a new dataset with each array indexed by tick labels along the specified dimension(s).
load_data() Manually trigger loading of this dataset’s data from disk or a remote source into memory and return this dataset.
load_store(store[, decoder]) Create a new dataset from the contents of a backends.*DataStore
max([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.max along some dimension(s).
mean([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.mean along some dimension(s).
merge(other[, inplace, overwrite_vars, compat]) Merge the arrays of two datasets into a single dataset.
min([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.min along some dimension(s).
notnull(*args, **kwargs) Replacement for numpy.isfinite / -numpy.isnan which is suitable for use on object arrays.
prod([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.prod along some dimension(s).
ptp([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.ptp along some dimension(s).
reduce(func[, dim, keep_attrs]) Reduce this dataset by applying func along some dimension(s).
reindex([copy]) Conform this object onto a new set of indexes, filling in missing values with NaN.
reindex_like(other[, copy]) Conform this object onto the indexes of another object, filling in missing values with NaN.
rename(name_dict[, inplace]) Returns a new object with renamed variables and dimensions.
reset_coords([names, drop, inplace]) Given names of coordinates, reset them to become variables
round([decimals, out]) Return the array with each element rounded to the given number of decimals.
sel(**indexers) Returns a new dataset with each array indexed by tick labels along the specified dimension(s).
select(*args, **kwargs) Deprecated.
select_vars(*names) Deprecated.
set_coords(names[, inplace]) Given names of one or more variables, set them as coordinates
squeeze([dim]) Returns a new dataset with squeezed data.
std([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.std along some dimension(s).
sum([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.sum along some dimension(s).
to_dataframe() Convert this dataset into a pandas.DataFrame.
to_netcdf(filepath, **kwdargs) Dump dataset contents to a location on disk using the netCDF4 package.
transpose(*dims) Return a new Dataset object with all array dimensions transposed.
unselect(*args, **kwargs) Returns a new dataset without the named variables.
update(other[, inplace]) Update this dataset’s variables and attributes with those from another dataset.
values() List of D’s values.
var([dim, keep_attrs]) Reduce this Dataset’s data by applying numpy.var along some dimension(s).

Attributes

T
attributes Deprecated; do not use
attrs Dictionary of global attributes on this dataset
coordinates
coords Dictionary of xray.Coordinate objects used for label-based indexing.
dimensions
dims Mapping from dimension names to lengths.
indexes OrderedDict of pandas.Index objects used for label-based indexing.
noncoordinates Dictionary of DataArrays whose names do not match dimensions.
noncoords Dictionary of DataArrays whose names do not match dimensions.
variables Deprecated; do not use
vars
virtual_variables A frozenset of names that don’t exist in this dataset but for which DataArrays could be created on demand.