pyconform.datasets

DatasetDesc Interface Class

This file contains the interface classes to the input and output multi-file datasets.

Copyright 2017-2020, University Corporation for Atmospheric Research

LICENSE: See the LICENSE.rst file for details

class pyconform.datasets.DatasetDesc(name='dataset', files=())[source]

Bases: object

A class describing a self-consistent set of dimensions and variables in one or more files

In simplest terms, a single NetCDF file is a dataset. Hence, the DatasetDesc object can be viewed as a simple container for the header information of a NetCDF file. However, the DatasetDesc can span multiple files, as long as dimensions and variables are consistent across all of the files in the DatasetDesc.

Self-consistency is defined as:
  1. Dimensions with names that appear in multiple files must all have the same size and limited/unlimited status, and

  2. Variables with names that appear in multiple files must have the same datatype and dimensions, and they must refer to the same data.
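The two rules above can be sketched in plain Python. This is only an illustration, not pyconform's implementation: the layout of `headers` (one table of dimensions and one of variables per file) and the function name `check_self_consistency` are hypothetical. The requirement that variables "refer to the same data" cannot be decided from header information alone, so the sketch does not check it.

```python
def check_self_consistency(headers):
    """Return a list of inconsistencies found across file headers.

    `headers` is a hypothetical mapping of
    filename -> {"dimensions": {name: (size, unlimited)},
                 "variables": {name: (datatype, dimension_names)}}.
    """
    problems = []
    seen_dims = {}  # dimension name -> (descriptor, first file it appeared in)
    seen_vars = {}  # variable name -> (descriptor, first file it appeared in)
    for fname, header in headers.items():
        for kind, seen in (("dimensions", seen_dims), ("variables", seen_vars)):
            for name, desc in header[kind].items():
                if name not in seen:
                    seen[name] = (desc, fname)
                elif seen[name][0] != desc:
                    problems.append(
                        f"{kind[:-1]} {name!r} differs between "
                        f"{seen[name][1]!r} and {fname!r}"
                    )
    return problems


headers = {
    "a.nc": {
        "dimensions": {"time": (12, True), "lat": (96, False)},
        "variables": {"T": ("float", ("time", "lat"))},
    },
    "b.nc": {
        "dimensions": {"time": (12, True), "lat": (192, False)},
        "variables": {"T": ("float", ("time", "lat"))},
    },
}
problems = check_self_consistency(headers)
print(problems)
```

Here the two files disagree on the size of `lat`, so they could not belong to the same dataset under rule 1.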

property dimensions

Dictionary of dimension descriptors contained in the dataset

property files

Dictionary of file descriptors contained in the dataset

property name

Name of the dataset (optional)

property variables

Dictionary of variable descriptors contained in the dataset

exception pyconform.datasets.DefinitionWarning[source]

Bases: Warning

Warning to indicate that a variable definition might be bad

class pyconform.datasets.DimensionDesc(name, size=None, unlimited=False, stringlen=False)[source]

Bases: object

Descriptor for a dimension in a DatasetDesc

Contains the name of the dimension, its size, and whether the dimension is limited or unlimited.

is_set()[source]

Return True if the dimension size and unlimited status are set, False otherwise

property name

Name of the dimension

set(dd)[source]

Set the size and unlimited status from another DimensionDesc

Parameters

dd (DimensionDesc) – The DimensionDesc from which to set the size and unlimited status

property size

Numeric size of the dimension (if set)

property stringlen

Boolean indicating whether the dimension represents a string length or not

static unique(descs)[source]

Return a mapping of names to unique DimensionDescs

Parameters

descs – A list of DimensionDesc objects
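One plausible reading of "a mapping of names to unique DimensionDescs" is sketched below with a stand-in class. The names `Dim` and `unique_by_name` are hypothetical, and the handling of two descriptors that share a name but differ in content is an assumption (the sketch raises `ValueError`); the real method may behave differently.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dim:
    """Hypothetical stand-in for a dimension descriptor."""
    name: str
    size: Optional[int] = None
    unlimited: bool = False


def unique_by_name(descs):
    """Map each name to a single descriptor, rejecting conflicting duplicates."""
    result = OrderedDict()
    for desc in descs:
        if desc.name not in result:
            result[desc.name] = desc
        elif result[desc.name] != desc:
            # Assumption: same name with different contents is an error.
            raise ValueError(f"conflicting descriptors named {desc.name!r}")
    return result


dims = [Dim("time", 12, True), Dim("lat", 96), Dim("time", 12, True)]
unique_dims = unique_by_name(dims)
print(list(unique_dims))
```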

property unlimited

Boolean indicating whether the dimension is unlimited or not

unset()[source]

Unset the dimension’s size and unlimited status
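The interplay of is_set(), set(dd) and unset() can be pictured with a small stand-in class. `DimStandIn` is hypothetical and not the library's code. In particular, treating "set" as "has a size" and having unset() restore the constructor defaults are assumptions drawn from the signature shown above.

```python
class DimStandIn:
    """Hypothetical stand-in that mimics the documented set/unset behavior."""

    def __init__(self, name, size=None, unlimited=False):
        self.name = name
        self.size = size
        self.unlimited = unlimited

    def is_set(self):
        # Assumption: a dimension counts as "set" once it has a size.
        return self.size is not None

    def set(self, other):
        # Copy size and unlimited status from another descriptor.
        self.size = other.size
        self.unlimited = other.unlimited

    def unset(self):
        # Assumption: unsetting restores the constructor defaults.
        self.size = None
        self.unlimited = False


declared = DimStandIn("time")             # known by name only
from_file = DimStandIn("time", 12, True)  # size as read from a file header
before = declared.is_set()
declared.set(from_file)
after = declared.is_set()
declared.unset()
print(before, after, declared.is_set())
```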

class pyconform.datasets.FileDesc(name, format='NETCDF4_CLASSIC', deflate=2, variables=(), attributes={}, autoparse_time_variable=None)[source]

Bases: object

A class describing the contents of a single dataset file

In simplest terms, the FileDesc contains the header information for a single NetCDF file. It contains the name of the file, the type of the file, a dictionary of global attributes in the file, a dict of DimensionDesc objects, and a dict of VariableDesc objects.

property attributes

Dictionary of global attributes of the file

property deflate

Deflate level for variables in the file

property dimensions

Dictionary of dimension descriptors associated with the file

exists()[source]

Whether the file exists or not

property format

Format of the file

property name

Name of the file

static unique(descs)[source]

Return a mapping of names to unique FileDescs

Parameters

descs – A list of FileDesc objects

property variables

Dictionary of variable descriptors associated with the file

class pyconform.datasets.InputDatasetDesc(name='input', filenames=())[source]

Bases: pyconform.datasets.DatasetDesc

DatasetDesc that can be used as input (i.e., can be read from file)

The InputDatasetDesc is a kind of DatasetDesc where all of the DatasetDesc information is read from the headers of existing NetCDF files. The files must be self-consistent according to the standard DatasetDesc definition.

Variables in an InputDatasetDesc must have unset “definition” parameters, and the “filenames” parameter will contain the names of files from which the variable data can be read.

class pyconform.datasets.OutputDatasetDesc(name='output', dsdict={})[source]

Bases: pyconform.datasets.DatasetDesc

DatasetDesc that can be used for output (i.e., to be written to files)

The OutputDatasetDesc contains all of the header information needed to write a DatasetDesc to files. Unlike the InputDatasetDesc, it is not assumed that all of the variable and dimension information can be found in existing files. Instead, the OutputDatasetDesc contains a minimal subset of the output file headers, and information about how to construct the variable data and dimensions by using the ‘definition’ parameter of the variables.

The information to define an OutputDatasetDesc must be specified as a nested dictionary, where the keys of the first level are the unique names of the variables in the dataset. Each variable name maps to another nested dictionary.

Each ‘variable’ dictionary is assumed to contain the following:
  1. ‘attributes’: A dictionary of the variable’s attributes

  2. ‘datatype’: A string specifying the type of the variable’s data

  3. ‘dimensions’: A tuple of names of dimensions upon which the variable depends

  4. ‘definition’: Either a string mathematical expression representing how to construct the variable’s data from input variables or functions, or an array declaring the actual data from which to construct the variable

  5. ‘file’: A dictionary containing a string ‘filename’, a string ‘format’ (which can be one of ‘NETCDF4’, ‘NETCDF4_CLASSIC’, ‘NETCDF3_CLASSIC’, ‘NETCDF3_64BIT_OFFSET’ or ‘NETCDF3_64BIT_DATA’), a dictionary of ‘attributes’, and a list of ‘metavars’ specifying the names of other variables that should be added to the file, in addition to obvious metadata variables and the variable containing the ‘file’ section.
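As an illustration of the structure listed above, here is a hypothetical specification for a single variable. The variable name, attribute values, datatype spelling, definition string and file name are all invented for the example. The `missing_keys` helper is not part of pyconform.

```python
# The five keys named in the list above.
DOCUMENTED_KEYS = ("attributes", "datatype", "dimensions", "definition", "file")

dsdict = {
    "tas": {
        "attributes": {"units": "K", "long_name": "near-surface air temperature"},
        "datatype": "float",  # assumed spelling of the datatype string
        "dimensions": ("time", "lat", "lon"),
        # A string expression over input variables (TREFHT is illustrative).
        "definition": "TREFHT",
        "file": {
            "filename": "tas_output.nc",
            "format": "NETCDF4_CLASSIC",
            "attributes": {"title": "example output"},
            "metavars": ["lat_bnds", "lon_bnds"],
        },
    },
}


def missing_keys(spec):
    """Report, per variable, which of the documented keys are absent."""
    return {
        name: [key for key in DOCUMENTED_KEYS if key not in var]
        for name, var in spec.items()
    }


print(missing_keys(dsdict))
```

A dictionary of this shape is what the `dsdict` argument in the class signature above is described as expecting.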

class pyconform.datasets.VariableDesc(name, datatype=None, dimensions=(), definition=None, attributes={})[source]

Bases: object

Descriptor for a variable in a dataset

Contains the variable name, string datatype, dimensions tuple, attributes dictionary, and a string definition (how to construct the data for the variable) or data array (if the data is contained in the variable declaration).

property attributes

Variable attributes dictionary

calendar()[source]

Retrieve the calendar attribute, if it exists, otherwise None

cfunits()[source]

Construct a cf_units.Unit object from the units/calendar attributes

property datatype

String datatype of the variable

property definition

property dimensions

Dictionary of dimension descriptors for dimensions on which the variable depends

property dtype

NumPy dtype of the variable data

property files

Dictionary of file descriptors for files containing this variable

property name

Name of the variable

refdatetime()[source]

Retrieve the reference datetime string, if it exists, otherwise None
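For context, CF-style time units take the form "<unit> since <reference datetime>". The sketch below shows one way such a reference string could be pulled out of a units attribute. The function `reference_datetime` is hypothetical and is not how pyconform necessarily implements refdatetime().

```python
def reference_datetime(units):
    """Return the text after 'since' in a CF-style units string, else None."""
    if not units or " since " not in units:
        return None
    return units.split(" since ", 1)[1].strip()


print(reference_datetime("days since 1850-01-01 00:00:00"))
print(reference_datetime("K"))
```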

static unique(descs)[source]

Return a mapping of names to unique VariableDescs

Parameters

descs – A list of VariableDesc objects

units()[source]

Retrieve the units string, if it exists, otherwise None

units_attr()[source]