Statistics
Observation space diagnostics are a set of metrics used to assess the quality of a data assimilation system in observation space. To produce diagnostics in observation space, forward operators (also known as forward models) are used to map the model state to the observation. The forward operator is applied to each of the model states in the ensemble to give an ensemble of 'predicted' observations. The ensemble of predicted observations, together with the observation error, is used to calculate the statistics for observation space diagnostics.
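To make the idea of a forward operator concrete, here is a minimal sketch that assumes a one-dimensional model state on a regular grid and uses linear interpolation as the operator. The function name `interp_forward_operator`, the grid, and all values are hypothetical and chosen only for illustration; real forward operators depend on the model and the observation type.

```python
import numpy as np

def interp_forward_operator(state, grid, obs_location):
    """Hypothetical forward operator: linearly interpolate a 1-D model
    state to the location of the observation."""
    return np.interp(obs_location, grid, state)

# Model grid and a three-member ensemble of model states (made-up values).
grid = np.array([0.0, 1.0, 2.0, 3.0])
ensemble = np.array([
    [10.0, 12.0, 14.0, 16.0],
    [11.0, 13.0, 15.0, 17.0],
    [9.0, 11.0, 13.0, 15.0],
])
obs_location = 1.5

# Apply the operator to each member to get the ensemble of predicted observations.
predicted = np.array(
    [interp_forward_operator(member, grid, obs_location) for member in ensemble]
)

# Ensemble mean and spread (sample standard deviation) of the predicted observations.
ensemble_mean = predicted.mean()
ensemble_spread = predicted.std(ddof=1)
```

The ensemble mean and spread computed this way are the per-observation quantities that the diagnostics are built from.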
Observation space diagnostics are calculated per observation type. The statistics are calculated for each observation, then averaged over all observations of that type in the region and time period of interest. The region of interest may be a vertical layer, a horizontal region, or a combination of both.
The statistics are calculated for a DataFrame of observations using the stats module. The squared error, total variance, and bias are calculated for each observation (row) in the DataFrame by the stats.diag_stats() function.
RMSE, spread, total spread, and bias can be calculated by aggregating over any number of rows to produce diagnostics for the region and time period of interest.
If the observation sequence contains posterior information, posterior statistics
will be calculated, otherwise only prior statistics will be calculated.
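The two-step workflow described above (per-row quantities first, then aggregation over a group of rows) can be sketched in plain pandas. This is not the stats module's code: the column names (`type`, `observation`, `obs_err_var`, `prior_ensemble_mean`, `prior_ensemble_spread`), the helper names, and the data are all assumptions made for illustration, and `obs_err_var` is assumed to hold an observation error variance.

```python
import numpy as np
import pandas as pd

# Hypothetical observations; every column name here is an assumption.
df = pd.DataFrame({
    "type": ["TEMPERATURE", "TEMPERATURE", "PRESSURE", "PRESSURE"],
    "observation": [1.0, 5.0, 10.0, 12.0],
    "obs_err_var": [0.25, 0.25, 1.0, 1.0],
    "prior_ensemble_mean": [2.0, 4.0, 11.0, 13.0],
    "prior_ensemble_spread": [0.6, 0.8, 1.0, 1.0],
})

def add_row_stats(df):
    """Add per-row bias, squared error, variance and total variance
    for each phase (prior, posterior) that is present."""
    out = df.copy()
    for phase in ("prior", "posterior"):
        mean_col = f"{phase}_ensemble_mean"
        if mean_col not in out.columns:
            continue  # no posterior columns: only prior statistics are added
        error = out[mean_col] - out["observation"]
        out[f"{phase}_bias"] = error
        out[f"{phase}_sq_err"] = error ** 2
        out[f"{phase}_variance"] = out[f"{phase}_ensemble_spread"] ** 2
        out[f"{phase}_tot_var"] = out[f"{phase}_variance"] + out["obs_err_var"]
    return out

def aggregate(rows, phase="prior"):
    """Aggregate per-row quantities over a group of rows."""
    return pd.Series({
        "rmse": np.sqrt(rows[f"{phase}_sq_err"].mean()),
        "spread": np.sqrt(rows[f"{phase}_variance"].mean()),
        "totalspread": np.sqrt(rows[f"{phase}_tot_var"].mean()),
        "bias": rows[f"{phase}_bias"].mean(),
    })

rows = add_row_stats(df)

# One row of diagnostics per observation type.
summary = pd.DataFrame(
    {name: aggregate(group) for name, group in rows.groupby("type")}
).T
```

Because the per-row columns are kept, the same rows can be filtered by vertical layer, horizontal region, or time window before aggregating.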
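Definitions
Some symbols used throughout:
- $y_i$: observation
- $\epsilon_i$: observation error estimate
- $N$: number of observations in the group, i.e. in the region and time period of interest
- $i$: group member
- $\bar{x}_i$: ensemble mean
- $\sigma_i$: ensemble spread (standard deviation)
- $\sigma_i^2$: ensemble variance
RMSE
The root mean square error is defined as
$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\bar{x}_i - y_i\right)^2}$$
Spread
The spread is the variability among ensemble members for a given observation. For a group of observations it is
$$\text{Spread} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \sigma_i^2}$$
Total spread
Total spread ($\sigma_{\text{total}}$) combines two sources of uncertainty:
- Ensemble spread ($\sigma_i$): the variability among ensemble members for a given observation.
- Observation error estimate ($\epsilon_i$): the expected measurement error associated with each observation.
$$\sigma_{\text{total}} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\sigma_i^2 + \epsilon_i^2\right)}$$
Bias
Bias is the average difference between the ensemble mean and the observations:
$$\text{Bias} = \frac{1}{N} \sum_{i=1}^{N} \left(\bar{x}_i - y_i\right)$$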
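As a small worked example with made-up numbers, take RMSE as the root of the mean squared difference between ensemble mean and observation, and bias as the mean of that difference. For two observations with ensemble means 3 and 7 and observed values 5 and 5:

$$
\text{RMSE} = \sqrt{\tfrac{1}{2}\left[(3 - 5)^2 + (7 - 5)^2\right]} = 2,
\qquad
\text{Bias} = \tfrac{1}{2}\left[(3 - 5) + (7 - 5)\right] = 0
$$

The two errors cancel in the bias but not in the RMSE, so a bias near zero does not by itself indicate small errors.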
Multi-component Observations
Some observations, such as wind, are multi-component: they are combined from two (or more) scalar observations. For example, a horizontal wind observation is combined from its two velocity components, $u$ and $v$.
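One common way to combine component statistics, assumed here purely for illustration and not necessarily what the stats module does, is to treat the squared error of the wind vector as the sum of the squared errors of its two components. The variable names and values below are hypothetical.

```python
import numpy as np

# Hypothetical ensemble means and observed values for the two wind components.
u_mean = np.array([3.0, 4.0])
u_obs = np.array([0.0, 4.0])
v_mean = np.array([4.0, 1.0])
v_obs = np.array([0.0, 1.0])

# Assumed convention: per-observation squared error of the wind vector is
# the sum of the squared errors of the u and v components.
sq_err = (u_mean - u_obs) ** 2 + (v_mean - v_obs) ** 2

# Aggregate over the group in the same way as for a scalar observation.
wind_rmse = np.sqrt(sq_err.mean())
```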
Rank Histogram
The rank histogram requires the full ensemble of forward operator values for each observation. Sampling noise is added to each member of the forward operator ensemble:
$$\tilde{x}_j = x_j + \eta_j$$
where:
- $j$ is the ensemble member index,
- $x_j$ is the forward operator value of the $j$-th ensemble member,
- $\eta_j$ is a random number drawn from a normal distribution with the mean and standard deviation of the ensemble.
The rank of the observation is the number of ensemble members with a value less than the observation, plus one:
$$r = 1 + \sum_{j=1}^{M} \mathbb{1}\left(\tilde{x}_j < y\right)$$
where:
- $r$ is the rank of the observation $y$,
- $M$ is the number of ensemble members,
- $\tilde{x}_j$ represents the (perturbed) value of the $j$-th ensemble member,
- $y$ is the observation value,
- $\mathbb{1}(\cdot)$ is the indicator function, which is 1 if $\tilde{x}_j < y$ and 0 otherwise,
- the $+1$ ensures a 1-based rank (i.e., the observation is ranked among the ensemble members).
The number of bins is equal to the number of ensemble members plus one, because the rank takes values from 1 to $M + 1$.
The count $c_k$ for rank bin $k$ is
$$c_k = \sum_{i=1}^{N} \mathbb{1}\left(r_i = k\right)$$
where:
- $c_k$ is the count of forecasts that fall into rank bin $k$,
- $N$ is the total number of observations,
- $r_i$ is the rank of the observation within the ensemble for case $i$,
- $\mathbb{1}(\cdot)$ is the indicator function, which is 1 if the condition is true and 0 otherwise.
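The ranking and counting steps can be sketched with NumPy as follows. The function name `rank_histogram` and its arguments are hypothetical. As a simplifying assumption, the sampling noise is zero-mean normal noise with a standard deviation supplied by the caller, so the noise model is a parameter of the sketch, not a statement about how the noise is actually generated.

```python
import numpy as np

def rank_histogram(ensembles, observations, noise_std, seed=0):
    """Hypothetical helper.

    ensembles:    (N, M) forward operator values, N observations, M members
    observations: (N,) observation values
    noise_std:    (N,) standard deviation of the sampling noise (assumption)
    """
    rng = np.random.default_rng(seed)
    ensembles = np.asarray(ensembles, dtype=float)
    observations = np.asarray(observations, dtype=float)
    noise_std = np.asarray(noise_std, dtype=float)
    n_members = ensembles.shape[1]

    # Assumed noise model: zero-mean normal noise, one draw per member.
    noise = rng.normal(0.0, 1.0, size=ensembles.shape) * noise_std[:, None]
    perturbed = ensembles + noise

    # 1-based rank: number of members below the observation, plus one.
    ranks = 1 + (perturbed < observations[:, None]).sum(axis=1)

    # One bin for each possible rank, 1 .. M + 1.
    counts = np.bincount(ranks, minlength=n_members + 2)[1:]
    return ranks, counts

# Three observations, three members; the noise is switched off here so that
# the ranks can be checked by hand.
ensembles = [[1.0, 2.0, 3.0]] * 3
observations = [0.0, 2.5, 10.0]
ranks, counts = rank_histogram(ensembles, observations, noise_std=[0.0, 0.0, 0.0])
```

An observation below every member gets rank 1 and one above every member gets rank M + 1, which is why the counts array has one more entry than there are ensemble members.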
Trusted Observations
The DART quality control (DART_QC) values indicate, for each observation, whether the observation was used in the assimilation and, if not, why. DART_QC 0 indicates that the observation was assimilated. You may choose to include trusted observations in your observation space diagnostics, in which case include both DART_QC 0 and DART_QC 7 observations in the calculation of the statistics.
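Selecting the rows that enter the statistics could look like the sketch below. The column name `DART_quality_control`, the helper `select_for_diagnostics`, and the data are assumptions for illustration; use whatever column holds the DART QC value in your DataFrame.

```python
import pandas as pd

# Hypothetical observations; the QC column name is an assumption.
df = pd.DataFrame({
    "observation": [1.0, 2.0, 3.0, 4.0],
    "DART_quality_control": [0, 7, 4, 0],
})

def select_for_diagnostics(df, include_trusted=False, qc_col="DART_quality_control"):
    """Keep assimilated observations (QC 0), plus QC 7 observations
    when trusted observations are to be included."""
    allowed = [0, 7] if include_trusted else [0]
    return df[df[qc_col].isin(allowed)]

assimilated = select_for_diagnostics(df)
with_trusted = select_for_diagnostics(df, include_trusted=True)
```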
For reference, here are the DART QC values and their meanings.
| QC Value | Description |
|---|---|
| 0 | Observation was assimilated successfully |
| 1 | Observation was evaluated only so not used in the assimilation |
| 2 | The observation was used but one or more of the posterior forward observation operators failed |
| 3 | The observation was evaluated only so not used AND one or more of the posterior forward observation operators failed |
| 4 | One or more prior forward observation operators failed so the observation was not used |
| 5 | The observation was not used because it was not selected in the namelist to be assimilated or evaluated |
| 6 | The incoming quality control value was larger than the threshold so the observation was not used |
| 7 | Outlier threshold test failed (as described above) |
| 8 | The location conversion to the vertical localization unit failed so the observation was not used |
For more detail on the DART QC values refer to the DART documentation.