2 General Information About Data Files

2.1 Units and Abbreviations

This report uses the SI system of units, but with many exceptions. Among them are the following:

1. The millibar (mb), equal to one hectopascal (hPa), was used for pressure with some older variables.
2. Many variables are presented in the units most often used for that variable, even when they involve CGS units or mixed CGS-MKS units, for example [g m^-3] for liquid water content or [cm^-3] for droplet concentration.
3. Flow rates are often quoted in liters per minute (LPM) or standard liters per minute (SLPM) because those terms are linked to properties of commercially available instruments with flow control. One liter is 10^-3 m^3. Standard temperature and pressure are respectively 273.15 K and 1013.25 hPa. However, there is considerable ambiguity in the definition of “standard” conditions (mostly regarding the choice of the reference temperature) because some flow controllers and flowmeters specify a different “standard” temperature, so the particular usage will be documented when this term is used. Mass flow meters provide a measure of the flow of mass but usually report the measurement in terms of the volume flow that would be present under standard conditions (e.g., SLPM). If the fluid density is \(\rho\) and the mass flow rate in units of mass per time is denoted by \(\dot{m}^\prime\), the volumetric flow is \(Q=\dot{m}^\prime/\rho\), and the mass flow rate in units of standard volume per time is \(\dot{m}=\dot{m}^\prime/\rho_s\), where \(\rho_s\) is the density of the fluid under standard conditions. Because the density of an ideal gas is proportional to \(p/T\), the volumetric flow under other conditions is \(Q(p,T)=\dot{m}^\prime/\rho=\dot{m}\rho_s/\rho=\dot{m}p_{s}T/(p\thinspace T_{s})\), where \(p\) and \(T\) are the pressure and absolute temperature for the desired measurement and \(p_s\) and \(T_s\) are the corresponding values for standard conditions.
4. The International Bureau of Weights and Measures recommends against use of units like percent or parts per million, but these are in common use in atmospheric chemistry and elsewhere, so data files continue to use those units for relative humidity or the concentration of chemical species. Proper SI units for a volumetric mixing ratio would be, e.g., \(\mu\)mol mol^-1, nmol mol^-1, or pmol mol^-1, but variables are instead often assigned the respective units of ppmv, ppbv or pptv for parts per million, billion or trillion by volume. Care must be taken to interpret ppbv especially, because “billion” has different meanings in different languages and different countries; herein, 1 ppbv means a volumetric ratio of 1:10⁹. Many measurements produce native results in terms of a mass ratio, often described as a mixing ratio \(r_m\) that specifies the mass of the measured gas per unit mass of “air” (where the mass of the “air” does not include the variable constituents, usually only significant for water vapor). The ideal gas law relates the density ratio of two gases \((\rho_1:\rho_2)\) to the ratio of their partial pressures \((p_1:p_2)\) or number densities \((n_1:n_2)\) as follows:
\[\begin{align} r_{m}=\frac{\rho_{1}}{\rho_{2}}=\frac{p_{1}M_{1}}{p_{2}M_{2}}=\frac{n_{1}M_{1}}{n_{2}M_{2}} \tag{2.1} \end{align}\] where \(M_1\) and \(M_2\) are respective molecular weights for the two gases. The ratio of number densities or, equivalently, partial pressures, denoted here as \(r_v\) because it is also the volumetric mixing ratio, is related to the mass mixing ratio as follows:
\[\begin{equation} r_{v}=\frac{n_{1}}{n_{2}}=\left(\frac{M_{2}}{M_{1}}\right)r_{m} \tag{2.2} \end{equation}\] When concentrations are recorded with units of “ppmv”, “ppbv” or “pptv”, these units refer respectively to \(10^6r_v\), \(10^9r_v\), and \(10^{12}r_v\) with \(r_v\) given by the above equation.
5. The unit “hertz” (abbreviation Hz) is the proper unit for a periodic sampling frequency and will be used here in place of the more awkward “samples per second.”
6. In some cases, particularly for older data files, speed has been recorded in units of knots (≈ 0.514444 m/s) and distance in nautical miles (≡ 1852 m).
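
The flow-rate conversion described above, \(Q(p,T)=\dot{m}p_{s}T/(p\thinspace T_{s})\), can be illustrated with a short sketch. This is only an illustration in Python, not the processing code in use; the function name and the example values are hypothetical, and the default standard conditions are those quoted in the text, which may differ from the “standard” used by a particular instrument.

```python
# Standard conditions quoted in the text. Some flow controllers use a different
# "standard" temperature, so treat these defaults as assumptions to verify.
P_STD_HPA = 1013.25
T_STD_K = 273.15


def standard_to_volumetric_flow(flow_std, p_hpa, t_k,
                                p_std_hpa=P_STD_HPA, t_std_k=T_STD_K):
    """Convert a flow in standard volume per time (e.g., SLPM) to the
    volumetric flow (e.g., LPM) at pressure p_hpa and absolute temperature
    t_k, using Q = m_dot * p_s * T / (p * T_s) for an ideal gas."""
    if p_hpa <= 0 or t_k <= 0:
        raise ValueError("pressure and absolute temperature must be positive")
    return flow_std * (p_std_hpa / p_hpa) * (t_k / t_std_k)


# Hypothetical example: 2 SLPM at half the standard pressure and at the
# standard temperature occupies twice the volume per unit time.
flow_lpm = standard_to_volumetric_flow(2.0, p_hpa=506.625, t_k=273.15)
```

Passing the standard conditions as arguments, rather than fixing them, accommodates instruments that specify a different reference temperature.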

Appendix A contains a list of symbols.⁴ The following table defines some abbreviations and additional symbols used for units in this report, in addition to the standard abbreviations for the MKS system of units:

abbreviation or symbol	definition\(^a\)
°	degree, angle measurement ≡ \(\pi/180\) rad
ft	foot ≡ 0.3048 m
mb	millibar ≡ 100 Pa ≡ 1 hPa
ppmv	parts per million by volume (see item 4 above)
ppbv	parts per billion (\(10^9\)) by volume (see item 4 above)
pptv	parts per trillion by volume (see item 4 above)
n mi	nautical mile ≡ 1852 m
kt	knot (n mi/hour) ≡ (1852/3600) m/s = 0.514444… m/s
^a where ≡ is used, the relation is exact by definition
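
The ppmv, ppbv, and pptv units in the table follow from Eq. (2.2). As a hedged illustration, the sketch below converts a mass mixing ratio to a volumetric mixing ratio and scales it to ppbv. The function names are hypothetical, and the molecular weights (dry air about 28.9647 g/mol, ozone about 47.998 g/mol) are assumed values chosen for the example, not constants taken from this report.

```python
M_DRY_AIR = 28.9647   # g/mol, assumed molecular weight of dry air (M_2)
M_OZONE = 47.998      # g/mol, assumed molecular weight of ozone (M_1)


def mass_to_volume_mixing_ratio(r_m, m_gas, m_air=M_DRY_AIR):
    """Eq. (2.2): r_v = (M_2 / M_1) * r_m, with M_1 the molecular weight of
    the measured gas and M_2 that of the reference gas (here dry air)."""
    if m_gas <= 0 or m_air <= 0:
        raise ValueError("molecular weights must be positive")
    return (m_air / m_gas) * r_m


def to_ppbv(r_v):
    """Express a volumetric mixing ratio in parts per 10^9 by volume."""
    return 1.0e9 * r_v


# Hypothetical example: an ozone mass mixing ratio of 1e-7 kg per kg of dry air.
ozone_ppbv = to_ppbv(mass_to_volume_mixing_ratio(1.0e-7, M_OZONE))
```

With these assumed molecular weights, the example corresponds to roughly 60 ppbv.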

2.2 Variables Used to Denote Time

Although there are some exceptions in old archived data files, the data in all modern output files are referenced to Coordinated Universal Time (UTC). The time and date of the data acquisition system are synchronized to time from the Global Positioning System (GPS) at the beginning of each flight, and for data acquired by the present ADS-3 (NIDAS) data acquisition system, time is synchronized continuously with GPS time. Time variables differ among older archived data files; some of the following are obsolete but are included here for reference because they are important to those who want to use those archives.

Time (s): Time

The reference-time counter for the output data files, used by data system versions beginning with ADS-3. It is an integer output at 1 Hz and has an initial value of zero at the start of the flight. Add this value to the reference time given in the “Time:units” attribute found in the netCDF header section to obtain the UTC time.

Example attribute:
Time:units = "seconds since 2006-04-26 12:55:00 +0000" ;

For code examples that show how to use “Time” see:
http://www.eol.ucar.edu/raf/Software/TimeExamp.html
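
As a minimal sketch of the procedure just described, the following Python fragment parses a “Time:units” string of the form shown in the example attribute and adds the “Time” values to it. It uses only the standard library; the function names are hypothetical, and the parser assumes exactly the “seconds since YYYY-MM-DD hh:mm:ss +0000” form, so other forms would need additional handling. Reading the attribute and variable from the file is not shown.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_time_units(units):
    """Return the UTC reference time encoded in a Time:units string of the
    assumed form 'seconds since YYYY-MM-DD hh:mm:ss +0000'."""
    match = re.fullmatch(
        r"seconds since (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \+0000",
        units.strip())
    if match is None:
        raise ValueError("unrecognized Time:units string: %r" % units)
    reference = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return reference.replace(tzinfo=timezone.utc)


def time_to_utc(time_values, units):
    """Add each integer Time value (seconds) to the reference time."""
    reference = parse_time_units(units)
    return [reference + timedelta(seconds=int(t)) for t in time_values]


# Example using the attribute shown above and three hypothetical Time values.
units_attr = "seconds since 2006-04-26 12:55:00 +0000"
utc_times = time_to_utc([0, 1, 3600], units_attr)
```

In this example the last entry of `utc_times` is one hour after the reference time.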

Reference Start Time (s): base_time (Obsolete; versions before ADS-3 only)

The reference time for the netCDF output data files for data system versions before ADS-3. It represents the time of the first data record. Its format is Unix time (elapsed seconds since 00:00:00 UTC on 1 January 1970). Add time_offset (below) to obtain the time for each data record. (Note: base_time is a single scalar, not a “record” variable, so it occurs just once in the output file.)

Time Offset from Reference Start Time (s): time_offset (Obsolete)

The time offset from base_time of each data record used for the netCDF output files produced by data system versions before ADS-3. It starts at zero (0) and increments each second, so it can also be thought of as a record counter. Add base_time to this value to obtain the time for each data record.
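
For older files, the combination of base_time and time_offset can be sketched in the same way. The function name below is hypothetical and the numerical values are chosen only for illustration; reading the two variables from a file is not shown.

```python
from datetime import datetime, timezone


def record_times_utc(base_time, time_offset):
    """Combine the scalar base_time (Unix seconds) with the per-record
    time_offset values (seconds) to obtain a UTC time for each record."""
    return [datetime.fromtimestamp(base_time + offset, tz=timezone.utc)
            for offset in time_offset]


# Hypothetical values: base_time of 31536000 s is 1971-01-01 00:00:00 UTC.
record_times = record_times_utc(31536000, [0, 1, 2])
```

Passing `tz=timezone.utc` keeps the result independent of the local time zone of the computer doing the conversion.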

Raw Tape Time (h, min, s): HOUR, MINUTE, SECOND (Obsolete)

These three time variables are recorded directly from the aircraft’s data system. Since ADS-3, this information is replaced by the “Time” variable and the “Time:units” attribute of that variable.

Date (m, d, y): MONTH, DAY, YEAR (Obsolete)

These three variables represent the date when the aircraft’s data system began recording data. They are repeated as 1 Hz variables but are NOT incremented if the time rolls over to the next day. Use base_time and time_offset for reference timing. Since ADS-3, this information is replaced by the “Time” variable and the “Time:units” attribute of that variable.
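
Because the date variables are not incremented at a day rollover, anyone reconstructing times from HOUR, MINUTE, and SECOND alone has to detect the rollover. The following is a sketch of one possible approach, not a description of any existing processing code. The function name is hypothetical, and it assumes a four-digit year and records in time order with no backward jumps other than the midnight rollover.

```python
from datetime import datetime, timedelta, timezone


def tape_times_utc(year, month, day, hours, minutes, seconds):
    """Build UTC times from the (fixed) start date and the per-record
    HOUR, MINUTE, SECOND values, adding a day whenever the time of day
    decreases from one record to the next."""
    start_of_day = datetime(year, month, day, tzinfo=timezone.utc)
    times = []
    day_offset = 0
    previous = None
    for h, m, s in zip(hours, minutes, seconds):
        second_of_day = 3600 * h + 60 * m + s
        if previous is not None and second_of_day < previous:
            day_offset += 1          # time of day wrapped past midnight
        previous = second_of_day
        times.append(start_of_day
                     + timedelta(days=day_offset, seconds=second_of_day))
    return times


# Hypothetical flight that crosses midnight on 31 December 2001.
rollover_times = tape_times_utc(2001, 12, 31, [23, 23, 0], [59, 59, 0],
                                [58, 59, 0])
```

Where base_time and time_offset are available, they remain the preferred reference, as noted above.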

2.3 Synchronization of Measurements

Measurements sampled under control of the “NIDAS” sampling system are acquired at 50 Hz. However, the standard archive files are produced at a rate of 1 Hz, with each reported value being the average of 50 samples. The value reported at 1 Hz is therefore an average over the specified second, so the effective reference time for the averaged measurement is 0.5 s past the reported time. Analogous offsets apply to variables reported at rates other than 50 Hz. Where applicable, electronic filters with a cutoff frequency of 25 Hz are used with analog measurements. Higher-rate files are sometimes produced, usually at a standard rate of 25 Hz but occasionally at other frequencies.
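
The averaging and the resulting half-second offset can be illustrated with a short sketch. The function name and the test signal are hypothetical, and the sketch assumes that the 50 samples in each second are stamped at the start of their 20-ms intervals.

```python
def average_to_1hz(samples, rate_hz=50):
    """Average consecutive blocks of rate_hz samples to one value per second.
    Any incomplete block at the end is ignored."""
    n_seconds = len(samples) // rate_hz
    return [sum(samples[i * rate_hz:(i + 1) * rate_hz]) / rate_hz
            for i in range(n_seconds)]


# Hypothetical signal equal to the elapsed time in seconds, sampled at 50 Hz
# for two seconds. Its 1 Hz averages show where in each second the averaged
# value is effectively located.
ramp = [k / 50 for k in range(100)]
one_hz = average_to_1hz(ramp)
```

For this ramp the averages are about 0.49 and 1.49, that is, close to the values of the signal 0.5 s past the reported times of 0 s and 1 s.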

There are time shifts inherent in many of the measurements, and in some cases (e.g., those produced by inertial reference units) these time shifts arise because the information is transmitted from the measuring system at a time later than when it was sampled. In these cases, shifts (“lags”) are applied to the measurements. The lags may be either static or dynamic. Static lags are specified in a configuration file, saved for each project; dynamic lags provided as part of data sampling by specific instruments are recorded by NIDAS for use in processing. Dynamic lags are usually the difference in time between a gridded time value and the time at which the measurement was actually acquired. For example, for a 5-Hz parameter the expected or gridded millisecond offsets into each second would be 0, 200, 400, 600, and 800. If the data were actually sampled or acquired at 50, 250, 450, 650, and 850 ms, then the dynamic lag for this particular second would be -50 ms. Corrections for time lags are applied to measurements before conversion to one of the standard data rates.
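
The 5-Hz example can be reproduced with a small sketch. The function names are hypothetical, and the sign convention (gridded time minus acquisition time) is inferred from the example in the text; how NIDAS actually records and applies dynamic lags is not shown here.

```python
def gridded_offsets_ms(rate_hz):
    """Expected millisecond offsets into each second for a given sample rate."""
    return [round(1000 * i / rate_hz) for i in range(rate_hz)]


def dynamic_lag_ms(actual_offsets_ms, rate_hz):
    """Mean of (gridded offset - actual offset) over one second of samples."""
    expected = gridded_offsets_ms(rate_hz)
    if len(actual_offsets_ms) != len(expected):
        raise ValueError("expected %d samples in one second" % len(expected))
    lags = [g - a for g, a in zip(expected, actual_offsets_ms)]
    return sum(lags) / len(lags)


# The 5-Hz case from the text: samples acquired 50 ms after each grid point.
lag = dynamic_lag_ms([50, 250, 450, 650, 850], rate_hz=5)
```

Here `gridded_offsets_ms(5)` gives 0, 200, 400, 600, and 800 ms, and `lag` is -50 ms, matching the example.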

Where data rates for particular measurements do not match the basic 50 Hz sampling rate, linear interpolation is used to obtain higher-rate values. For 1 Hz data files, measurements are then averaged within each second. For 25 Hz files, 50 Hz measurements are digitally filtered using a finite impulse response (FIR) filter, while data acquired at less than 25 Hz are linearly interpolated to 25 Hz and then FIR-filtered for smoothing.
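
The linear-interpolation step can be sketched as follows. This is an illustration only: the function name is hypothetical, holding the end values outside the sampled interval is an assumption, and the FIR filtering mentioned above is not shown because the filter coefficients are not given here.

```python
from bisect import bisect_right


def interpolate_linear(times, values, new_times):
    """Linearly interpolate (times, values) onto new_times. The times must be
    strictly increasing; points outside the sampled interval take the nearest
    end value (an assumption made for this sketch)."""
    result = []
    for t in new_times:
        if t <= times[0]:
            result.append(values[0])
        elif t >= times[-1]:
            result.append(values[-1])
        else:
            j = bisect_right(times, t)      # times[j-1] <= t < times[j]
            t0, t1 = times[j - 1], times[j]
            v0, v1 = values[j - 1], values[j]
            result.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return result


# Hypothetical 5-Hz measurements interpolated to a few intermediate times.
upsampled = interpolate_linear([0.0, 0.2, 0.4], [0.0, 2.0, 4.0],
                               [0.0, 0.1, 0.3, 0.4])
```

Once all measurements are on a common higher-rate time base, averaging within each second yields the 1 Hz values.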

2.4 Other Comments on Terminology

2.4.1 Variable Names in Equations

This report often uses variable names in equations, and sometimes there is potential for confusion because the variable names consist of multiple characters. In most cases, to denote that the variable name is the variable in the equation (as opposed to each of the letters in the variable name representing quantities to be multiplied together), the variable name has been enclosed in braces, as in {TASX}. In addition, variable names are displayed with upright Roman character sets, while other symbols in equations are shown using slanted (italic) character sets, as is conventional for mathematical equations. In cases where code segments (usually expressed in C code) are included to document how calculations are performed, monospaced character sets indicate that the segment is a representation of how the processing could be coded. Such a code segment is not always a direct copy of the code in use, but such code is sometimes the most convenient way to express the algorithm in use.

2.4.2 Distinction Between Original Measurements and Derived Variables

Many of the variables in the data files and in this report are derived from combinations of measurements. The terms “raw” or “original” measurement are sometimes used for a minimally processed output received directly from a sensor or instrument. Such measurements may be converted to engineering units via calibration coefficients, but otherwise they are a direct representation of the output from a sensor.⁵ In contrast, derived variables (e.g., potential temperature) depend on one or more “raw” measurements and are not direct results of output from an instrument. For most derived measurements, a box that follows an introductory comment is used in this report to document the processing algorithm. The box has two parts; in the top are definitions used and explanations regarding variables that enter the calculation, while the bottom portion contains the equation, algorithm, or code segment that documents how the variable is calculated.

2.4.3 Dimensions in Equations

An effort has been made to avoid dimensions in equations except where it would be awkward otherwise. Some scale factors are introduced for only this purpose (e.g., to avoid dimensions in arguments to logarithmic or exponential functions), and some effort was made to isolate dimensions to defined constants rather than requiring that variables in equations be used with specific units. However, some exceptions remain to be consistent with historical usage.

⁴ Some symbols used only once and defined where they are used are omitted from this list.
⁵ Calibration coefficients, e.g., those used to convert from voltage output from an analog sensor to a measured quantity with physical units like °C, are not included or discussed in this report. They are normally included in project reports and, in recent years, many are included in the header of the netCDF file.