Statistics#

Observation space diagnostics are a set of metrics used to assess the quality of a data assimilation system in observation space. To produce diagnostics in observation space, forward operators (also known as forward models) are used to map the model state to the observation. The forward operator is applied to each model state in the ensemble, giving an ensemble of ‘predicted’ observations. This ensemble of predicted observations, together with the observation error, is used to calculate the statistics for observation space diagnostics.

Observation space diagnostics are calculated per observation type. The statistics are calculated for each observation, then averaged over all observations of that type in the region and time period of interest. The region of interest may be a vertical layer, a horizontal region, or a combination of both.

The statistics are calculated for a DataFrame of observations using the stats module. The squared error, total variance, and bias are calculated for each observation (row) in the DataFrame by the stats.diag_stats() function. RMSE, spread, total spread, and bias can then be calculated by aggregating over any number of rows to produce diagnostics for the region and time period of interest. If the observation sequence contains posterior information, posterior statistics are calculated as well; otherwise only prior statistics are calculated.
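As a rough illustration of the aggregation step, the sketch below groups a small table of per-observation quantities by observation type and vertical layer using plain pandas. It does not call the stats module. The column names (`type`, `pressure_hpa`, `sq_err`, `variance`, `obs_err_var`, `bias`), the layer edges, and all values are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical per-observation table; column names and values are illustrative only.
df = pd.DataFrame({
    "type": ["TEMP", "TEMP", "TEMP", "WIND", "WIND"],
    "pressure_hpa": [950.0, 900.0, 500.0, 925.0, 480.0],
    "sq_err": [1.0, 4.0, 9.0, 0.25, 2.25],      # (ensemble mean - observation)^2
    "variance": [0.5, 1.5, 2.0, 0.5, 1.0],      # ensemble variance
    "obs_err_var": [0.5, 0.5, 2.0, 0.5, 1.0],   # observation error variance
    "bias": [1.0, -2.0, 3.0, 0.5, -1.5],        # ensemble mean - observation
})

# Per-row total variance: ensemble variance plus observation error variance.
df["total_var"] = df["variance"] + df["obs_err_var"]

# Assign each row to a vertical layer (the bin edges are arbitrary).
df["layer"] = pd.cut(
    df["pressure_hpa"],
    bins=[400.0, 700.0, 1000.0],
    labels=["400-700 hPa", "700-1000 hPa"],
).astype(str)

# Average the per-row quantities within each (type, layer) group,
# then take square roots where the definition calls for one.
means = df.groupby(["type", "layer"])[["sq_err", "variance", "total_var", "bias"]].mean()
summary = pd.DataFrame({
    "rmse": means["sq_err"] ** 0.5,
    "spread": means["variance"] ** 0.5,
    "totalspread": means["total_var"] ** 0.5,
    "bias": means["bias"],
})
print(summary)
```

Averaging the squared quantities first and taking the square root afterwards matters: the mean of per-row square roots would give a different number.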

Definitions#

Some symbols used throughout:

$y$ : observation
$ϵ$ : observation error estimate
$N$ : number of observations in the group, i.e. in the region and time period of interest
$n$ : index of an observation (member) within the group
$μ$ : ensemble mean
$σ$ : ensemble spread (standard deviation)
$σ^{2}$ : ensemble variance

RMSE#

The root mean square error of the ensemble mean with respect to the observations is defined as:

RMSE \equiv \sqrt{\frac{1}{N} \sum_{n = 1}^{N} (μ_{n} - y_{n})^{2}}

Spread#

The spread is the variability among ensemble members for a given observation. Over a group of observations, it is the square root of the mean ensemble variance:

σ \equiv \sqrt{\frac{1}{N} \sum_{n = 1}^{N} (σ_{n})^{2}}

Total spread#

Total spread ($σ_{T}$) is a measure of the combined uncertainty in an ensemble of estimated observations. It accounts for both:

Ensemble spread ($σ_{n}$): the variability among ensemble members for a given observation.
Observation error estimate ($ϵ_{n}$): the expected measurement error associated with each observation.

σ_{T} \equiv \sqrt{\frac{1}{N} \sum_{n = 1}^{N} (σ_{n}^{2} + ϵ_{n}^{2})}

Bias#

Bias is the average difference between the ensemble mean and the observation, taken over the group:

bias \equiv \frac{1}{N} \sum_{n = 1}^{N} (μ_{n} - y_{n})
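The definitions of RMSE, spread, total spread, and bias can be checked with a small worked example. This is a minimal sketch in plain Python, not the stats module's implementation; the function names and the input values are made up for illustration.

```python
import math

def rmse(mu, y):
    """Square root of the mean squared difference (mu_n - y_n)."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(mu, y)) / len(y))

def spread(sigma):
    """Square root of the mean ensemble variance sigma_n^2."""
    return math.sqrt(sum(s ** 2 for s in sigma) / len(sigma))

def total_spread(sigma, eps):
    """Square root of the mean of (sigma_n^2 + eps_n^2)."""
    return math.sqrt(sum(s ** 2 + e ** 2 for s, e in zip(sigma, eps)) / len(sigma))

def bias(mu, y):
    """Mean of the signed difference (mu_n - y_n)."""
    return sum(m - o for m, o in zip(mu, y)) / len(y)

# Made-up values for a group of N = 3 observations.
y = [10.0, 12.0, 9.0]      # observations
mu = [11.0, 10.0, 12.0]    # ensemble means
sigma = [1.0, 2.0, 2.0]    # ensemble spreads
eps = [1.0, 1.0, 2.0]      # observation error estimates

print("RMSE:        ", rmse(mu, y))               # sqrt(14 / 3)
print("spread:      ", spread(sigma))             # sqrt(3)
print("total spread:", total_spread(sigma, eps))  # sqrt(5)
print("bias:        ", bias(mu, y))               # 2 / 3
```

In this example the errors of opposite sign partly cancel in the bias but not in the RMSE, which is why the RMSE is the larger of the two.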

Multi-component Observations#

Some observations are multi-component, such as wind. These observations are combined from two (or more) scalar observations. For example, a horizontal wind observation $s$ is made up of two components, $u$ velocity and $v$ velocity.

The velocity components $u$ and $v$ are handled individually, as above. In addition, statistics for the magnitude of the wind vector, $s$, are calculated as follows, where $s_{y}$ is the observed wind speed and $s_{e}$ is the wind speed of the ensemble mean:

s_{y}^{n} \equiv \sqrt{(y_{u}^{n})^{2} + (y_{v}^{n})^{2}}

s_{e}^{n} \equiv \sqrt{(μ_{u}^{n})^{2} + (μ_{v}^{n})^{2}}

bias \equiv \frac{1}{N} \sum_{n = 1}^{N} (s_{e}^{n} - s_{y}^{n})

σ_{s} \equiv \sqrt{\frac{1}{N} \sum_{n = 1}^{N} ((σ_{u}^{n})^{2} + (σ_{v}^{n})^{2})}

σ_{T, s} \equiv \sqrt{\frac{1}{N} \sum_{n = 1}^{N} ((σ_{u}^{n})^{2} + (σ_{v}^{n})^{2} + (ϵ_{u}^{n})^{2} + (ϵ_{v}^{n})^{2})}
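The wind-speed formulas above can be sketched in a few lines of plain Python. The function name `wind_speed_stats`, its argument layout, and the sample values are hypothetical and chosen only to make the arithmetic easy to follow; this is not the library's implementation.

```python
import math

def wind_speed_stats(y_u, y_v, mu_u, mu_v, sigma_u, sigma_v, eps_u, eps_v):
    """Bias, spread, and total spread for wind speed over a group of observations."""
    n = len(y_u)
    # Observed speed and speed of the ensemble mean, per observation.
    s_y = [math.hypot(u, v) for u, v in zip(y_u, y_v)]
    s_e = [math.hypot(u, v) for u, v in zip(mu_u, mu_v)]
    bias = sum(e - o for e, o in zip(s_e, s_y)) / n
    # Component variances are summed before averaging over the group.
    var_sum = sum(su ** 2 + sv ** 2 for su, sv in zip(sigma_u, sigma_v))
    err_sum = sum(eu ** 2 + ev ** 2 for eu, ev in zip(eps_u, eps_v))
    return {
        "bias": bias,
        "spread": math.sqrt(var_sum / n),
        "totalspread": math.sqrt((var_sum + err_sum) / n),
    }

# Two made-up wind observations with 3-4-5 style components.
result = wind_speed_stats(
    y_u=[3.0, 6.0], y_v=[4.0, 8.0],        # observed speeds: 5 and 10
    mu_u=[6.0, 5.0], mu_v=[8.0, 12.0],     # ensemble-mean speeds: 10 and 13
    sigma_u=[1.0, 1.0], sigma_v=[1.0, 1.0],
    eps_u=[1.0, 1.0], eps_v=[1.0, 1.0],
)
print(result)
```

Note that the bias compares speeds, not components, so it is not the vector sum of the $u$ and $v$ biases.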

Rank Histogram#

The rank histogram requires the full ensemble of forward operator values for each observation. Sampling noise is added to each member of the forward operator ensemble:

X_{i} = f_{i} + N (μ, σ)

where:

$i$ is the ensemble member index
$f_{i}$ is the forward operator value of the $i$-th ensemble member
$N (μ, σ)$ is a random number drawn from a normal distribution with the mean $μ$ and standard deviation $σ$ of the ensemble.

The rank of the observation is one plus the number of perturbed ensemble members $X_{i}$ whose value is less than the observation value.

R = \sum_{i = 1}^{M} 1 (X_{i} < y) + 1

where:

$R$ is the rank of the observation $y$,
$M$ is the number of ensemble members,
$X_{i}$ is the perturbed value of the $i$-th ensemble member,
$y$ is the observation value,
$1 (\cdot)$ is the indicator function, which is 1 if $X_{i} < y$ and 0 otherwise,
the $+ 1$ ensures a 1-based rank (i.e., the observation is ranked among the ensemble members).

Because $R$ ranges from 1 to $M + 1$, the number of bins is one more than the number of ensemble members. The count $H$ of observations in each bin is:

H (k) = \sum_{i = 1}^{N} 1 (R_{i} = k)

where:

$H (k)$ is the count of observations whose rank falls into bin $k$,
$N$ is the total number of observations,
$R_{i}$ is the rank of observation $i$ within its ensemble,
$1 (\cdot)$ is the indicator function, which is 1 if the condition is true and 0 otherwise.
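The rank and bin-count formulas can be sketched as follows. This is a minimal illustration in plain Python, not the library's implementation: the function names are hypothetical, the noise mean and standard deviation are passed in explicitly as assumed parameters, and the ensemble values are made up. Because $R$ as defined ranges from 1 to $M + 1$, the sketch allocates $M + 1$ counters.

```python
import random

def observation_rank(forward_values, y, noise_mean, noise_std, rng):
    """Return the 1-based rank R of observation y among the perturbed members X_i."""
    # X_i = f_i + noise, with one independent draw per ensemble member.
    perturbed = [f + rng.gauss(noise_mean, noise_std) for f in forward_values]
    # R = (number of X_i strictly below y) + 1
    return sum(1 for x in perturbed if x < y) + 1

def rank_histogram(ranks, n_members):
    """Return H(k) for k = 1 .. n_members + 1 as a list (index k - 1)."""
    counts = [0] * (n_members + 1)
    for r in ranks:
        counts[r - 1] += 1
    return counts

# Made-up data: three observations, each with a 4-member forward operator ensemble.
rng = random.Random(42)  # seeded so the sketch is repeatable
ensembles = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 1.5, 2.5, 3.5],
    [2.0, 2.2, 2.4, 2.6],
]
observations = [2.5, 0.0, 3.0]

ranks = [
    observation_rank(f, y, noise_mean=0.0, noise_std=0.5, rng=rng)
    for f, y in zip(ensembles, observations)
]
counts = rank_histogram(ranks, n_members=4)
print("ranks:", ranks)
print("H(k): ", counts)
```

Every observation lands in exactly one bin, so the counts always sum to the number of observations.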

Trusted Observations#

The DART quality control (DART_QC) values indicate, for each observation, whether the observation was used in the assimilation and, if not, why. DART_QC 0 indicates that the observation was assimilated. You may choose to include trusted observations in your observation space diagnostics, in which case include both DART_QC 0 and DART_QC 7 observations in the calculation of the statistics.
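Selecting rows by QC value before computing statistics might look like the sketch below. It uses plain pandas; the column name `DART_QC`, the helper `select_for_diagnostics`, and the sample values are assumptions for illustration and may not match the names used in your DataFrame.

```python
import pandas as pd

# Hypothetical observations; "DART_QC" is an assumed column name.
df = pd.DataFrame({
    "observation": [280.1, 281.4, 279.9, 283.0, 282.2],
    "DART_QC": [0, 7, 4, 0, 1],
})

def select_for_diagnostics(df, include_trusted=False):
    """Keep assimilated rows (QC 0), plus QC 7 rows when trusted observations are included."""
    qc_values = [0, 7] if include_trusted else [0]
    return df[df["DART_QC"].isin(qc_values)]

assimilated = select_for_diagnostics(df)
with_trusted = select_for_diagnostics(df, include_trusted=True)
print(len(assimilated), len(with_trusted))  # 2 3
```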

For reference, here are the DART QC values and their meanings.

DART Quality Control (DART_QC) Values#
QC Value	Description
0	Observation was assimilated successfully
1	Observation was evaluated only so not used in the assimilation
2	The observation was used but one or more of the posterior forward observation operators failed
3	The observation was evaluated only so not used AND one or more of the posterior forward observation operators failed
4	One or more prior forward observation operators failed so the observation was not used
5	The observation was not used because it was not selected in the namelist to be assimilated or evaluated
6	The incoming quality control value was larger than the threshold so the observation was not used
7	Outlier threshold test failed
8	The location conversion to the vertical localization unit failed so the observation was not used

For more detail on the DART QC values refer to the DART documentation.