The VAPOR data analysis environment targets rectilinear or structured gridded data sets that are time-varying, multivariate, and of very high spatial resolution. Aggregate data sets generated from a single experiment are often terabytes in size. To accommodate the unique needs of these large data sets, VAPOR defines its own mechanism for storing sampled data and their associated attributes (metadata). In the VAPOR environment, a collection of related data, typically produced by a single numerical simulation, is known as a VAPOR Data Collection (VDC).
A VDC is composed of two components: metadata and field data. Metadata are data that describe the field data. Examples of metadata include the grid type, the spatial resolution, the names of the field variables, the number of time steps, and possibly user-defined attributes. Field data are the numerical outputs produced by the simulation (sampled 2D or 3D functions). Examples include the components of a velocity field or a temperature field.
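The split between metadata and field data can be illustrated with a minimal Python sketch. The class names, attribute names, and the `raw_size_bytes` helper are hypothetical and chosen only for illustration; they are not part of VAPOR's API and do not reflect how a VDC is laid out on disk. The example also assumes 4-byte samples to show how quickly a modest experiment reaches terabyte scale.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Metadata:
    """Describes the field data (hypothetical layout, not VAPOR's format)."""
    grid_type: str
    dimensions: Tuple[int, int, int]  # grid points along x, y, z
    variable_names: List[str]
    num_time_steps: int
    user_attributes: Dict[str, str] = field(default_factory=dict)

    def raw_size_bytes(self, bytes_per_sample: int = 4) -> int:
        """Uncompressed size if every variable is stored at every time step."""
        nx, ny, nz = self.dimensions
        samples_per_volume = nx * ny * nz
        volumes = len(self.variable_names) * self.num_time_steps
        return samples_per_volume * volumes * bytes_per_sample


@dataclass
class DataCollection:
    """Pairs one metadata record with the sampled field data it describes."""
    metadata: Metadata
    # Keyed by (variable name, time step); each value is a flat list of samples.
    field_data: Dict[Tuple[str, int], List[float]] = field(default_factory=dict)


# Assumed example: a 1024^3 grid, four variables, 100 time steps.
meta = Metadata(
    grid_type="rectilinear",
    dimensions=(1024, 1024, 1024),
    variable_names=["u", "v", "w", "temperature"],
    num_time_steps=100,
)
collection = DataCollection(metadata=meta)

size_tib = meta.raw_size_bytes() / 2**40
print(f"raw field data: {size_tib} TiB")  # prints "raw field data: 1.5625 TiB"
```

In this sketch the metadata record is tiny and can be read without touching the field data, while the field data account for essentially all of the storage.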
The VDC model differs from more traditional scientific data representations, such as netCDF and HDF, in two important ways:
Before analyzing your gridded data with VAPOR, you must first convert the data to a VDC. VAPOR provides a number of tools for performing this conversion. The remainder of this document discusses your options.
Note: VAPOR also supports the direct import of some data formats without prior conversion to a VDC. However, VAPOR's progressive data access capabilities are not available when data are directly imported.