xarray.open_mfdataset

xarray.open_mfdataset(paths, chunks=None, concat_dim='__infer_concat_dim__', compat='no_conflicts', preprocess=None, engine=None, lock=None, **kwargs)

Open multiple files as a single dataset.

Experimental. Requires dask to be installed.

Parameters:

paths : str or sequence

Either a string glob in the form “path/to/my/files/*.nc” or an explicit list of files to open.
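To preview which files a glob string would select, you can expand it into an explicit list yourself before opening anything. The sketch below uses only the standard library; `expand_paths` is a hypothetical helper, and sorting the matches is an assumption made here for a predictable order, not behavior documented on this page.

```python
import glob
import os
import tempfile


def expand_paths(paths):
    """Hypothetical helper: turn a glob string into an explicit, sorted file list.

    A sequence of paths is returned unchanged (as a list).
    """
    if isinstance(paths, str):
        # Sorting is an assumption of this sketch, chosen for a stable order.
        return sorted(glob.glob(paths))
    return list(paths)


# Demonstrate on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.nc", "a.nc", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    matched = [os.path.basename(p) for p in expand_paths(os.path.join(tmp, "*.nc"))]

print(matched)  # ['a.nc', 'b.nc']
```

The resulting list can then be passed as paths in place of the glob string.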

chunks : int or dict, optional

Chunk sizes to use. If a dict, keys are dimension names and values are chunk sizes; in general, each chunk size should evenly divide the corresponding dimension of each dataset. If an int, every dimension is chunked by that size. By default, chunks are chosen so that each input file is loaded into memory at once. The choice of chunks has a major impact on performance: please see the full documentation for more details.
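As a rough illustration of the "should divide the dimensions" guideline, the sketch below checks a proposed chunks value against known dimension sizes before any files are opened. `find_uneven_chunks` and `open_chunked` are hypothetical helpers, and the dimension names and sizes are made up for the example; xarray is imported lazily, so the check itself needs only the standard library.

```python
def find_uneven_chunks(dim_sizes, chunks):
    """Hypothetical helper: report chunk sizes that do not evenly divide a dimension.

    dim_sizes maps dimension names to lengths; chunks is an int or a dict,
    mirroring the two forms accepted by the chunks argument.
    Returns {dim: (dimension length, chunk size)} for each uneven split.
    """
    if isinstance(chunks, int):
        # An int applies the same chunk size to every dimension.
        chunks = {dim: chunks for dim in dim_sizes}
    return {
        dim: (dim_sizes[dim], size)
        for dim, size in chunks.items()
        if dim in dim_sizes and dim_sizes[dim] % size != 0
    }


def open_chunked(paths, chunks):
    """Open the files with explicit chunks (requires xarray and dask)."""
    import xarray as xr  # imported lazily so the check above has no dependencies

    return xr.open_mfdataset(paths, chunks=chunks)


# Illustrative sizes only; real values come from your own files.
dim_sizes = {"time": 365, "lat": 180, "lon": 360}
print(find_uneven_chunks(dim_sizes, {"time": 73, "lat": 90, "lon": 100}))
# {'lon': (360, 100)}
```

An uneven split is not necessarily an error, but it is a hint that the chunking may be worth revisiting before running a large computation.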

concat_dim : None, str, DataArray or Index, optional

Dimension to concatenate files along. This argument is passed on to xarray.auto_combine() along with the dataset objects. You only need to provide this argument if the dimension along which you want to concatenate is not a dimension in the original datasets, e.g., if you want to stack a collection of 2D arrays along a third dimension. By default, xarray attempts to infer this argument by examining component files. Set concat_dim=None explicitly to disable concatenation.

compat : {‘identical’, ‘equals’, ‘broadcast_equals’, ‘no_conflicts’}, optional

String indicating how to compare variables of the same name for potential conflicts when merging:

  • ‘broadcast_equals’: all values must be equal when variables are broadcast against each other to ensure common dimensions.
  • ‘equals’: all values and dimensions must be the same.
  • ‘identical’: all values, dimensions and attributes must be the same.
  • ‘no_conflicts’: values that are non-null in both datasets must be equal. The returned dataset then contains the combination of all non-null values.
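The ‘no_conflicts’ rule can be easier to follow on a toy example. The sketch below applies it to two plain Python lists, using None as the null marker. This is an illustration of the rule as described above, not xarray's implementation: xarray works on arrays and treats NaN as null, and `merge_no_conflicts` is a hypothetical name.

```python
def merge_no_conflicts(a, b):
    """Toy version of the 'no_conflicts' rule for two equal-length lists.

    Positions that are non-null (not None) in both inputs must hold equal
    values; the result combines the non-null values from either input.
    """
    merged = []
    for x, y in zip(a, b):
        if x is not None and y is not None and x != y:
            raise ValueError("conflicting values: %r != %r" % (x, y))
        # Take whichever side has a value; stays None if both are null.
        merged.append(y if x is None else x)
    return merged


print(merge_no_conflicts([1.0, None, 3.0, None], [1.0, 2.0, None, None]))
# [1.0, 2.0, 3.0, None]
```

Under this rule, two files that each hold part of a variable and agree wherever they overlap merge cleanly, while a genuine disagreement such as `[1.0]` versus `[2.0]` raises an error.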

preprocess : callable, optional

If provided, call this function on each dataset prior to concatenation.
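A minimal sketch of a preprocess callable follows. It assumes, purely for illustration, that every file has a dimension named "level" and that only its first index is wanted; `keep_first_level` and `open_first_level` are hypothetical names. xarray is imported lazily inside the function that opens the files.

```python
def keep_first_level(ds):
    """Hypothetical preprocess step: keep only the first index along 'level'.

    Receives one dataset per file and returns the dataset to be combined.
    The dimension name 'level' is an assumption of this example.
    """
    return ds.isel(level=0)


def open_first_level(paths):
    """Open all files, applying keep_first_level to each one before combining.

    Requires xarray and dask to be installed.
    """
    import xarray as xr  # imported lazily; only needed when files are opened

    return xr.open_mfdataset(paths, preprocess=keep_first_level)
```

Because the callable runs on each file's dataset separately, it is a natural place to drop variables or subset dimensions that would otherwise prevent the files from being combined.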

engine : {‘netcdf4’, ‘scipy’, ‘pydap’, ‘h5netcdf’, ‘pynio’}, optional

Engine to use when reading files. If not provided, the default engine is chosen based on available dependencies, with a preference for ‘netcdf4’.

lock : False, True or threading.Lock, optional

This argument is passed on to dask.array.from_array(). By default, a per-variable lock is used when reading data from netCDF files with the netcdf4 and h5netcdf engines to avoid issues with concurrent access when using dask’s multithreaded backend.

**kwargs : optional

Additional arguments passed on to xarray.open_dataset().

Returns:

xarray.Dataset