Quality Control and Validation of Observations, Analyses, and Models
As discussed in Chapter 1, the application of data assimilation includes a structured, open-ended learning process that improves our understanding of both the factors governing the time-dependent evolution of a geophysical fluid system and the characteristics of the data. This learning process occurs through the many ways in which different data streams are systematically compared with each other and with short-range forecasts from the assimilating model (Hollingsworth, 1989). Inconsistencies between the observations and predictions are easily documented and demand explanation, providing the basis for quality control and validation of observations, analyses, and the models themselves.
Significant improvements over the last decade in the use of data assimilation for numerical weather prediction (NWP) have resulted in the development of new techniques for global validation and quality assurance of both in situ and remotely sensed data (Hollingsworth et al., 1986).
Modern data assimilation systems use the relevant prior data and a state-of-the-art computer model of the atmosphere to provide an accurate quantitative estimate of the current state of the atmosphere. For every observation of a variable that is made, a background value derived from the model forecast is also available for comparison. In data-sparse areas, systematic differences between the forecast values and the observations are frequently indicative of systematic errors in the observations. In data-rich areas the observations and the forecast values have similar accuracies on the scales resolved by the forecast model, and the accuracy of the forecast is usually good enough to be very useful in detecting uncharacteristically large errors in the observations. However, there are instances in which the model has rejected valid observations representing a significant change, resulting in an incorrect subsequent forecast. Methods have been developed to recognize these critical events, but it cannot be said that state-of-the-art models have entirely resolved this problem.
The relatively high accuracy of short-range forecasts is used by NWP centers to monitor systematically many types of observations. Such monitoring has successfully identified long-standing errors in radiosondes at remote stations, in reports from ships plying remote routes, in aircraft reports over the oceans, in cloud-track winds from geostationary satellites, and in temperature soundings from polar-orbiting satellites. As the NWP systems have improved, they have exposed significant defects in current satellite ground-processing schemes and limitations of current instrumental technology. Serious problems of bias and noise have been uncovered in retrievals of wind and temperature (Andersson et al., 1991; Kelly et al., 1991).
In many cases the analyzed fields provided by the assimilation system can be verified directly against in situ measurements. Thus, the model provides the guidance to identify problems, and carefully chosen in situ measurements provide conclusive proof.
EPISODIC MODEL REJECTION OF DATA
The relationship between data that must be quality checked and the background field that the data are used to correct is somewhat paradoxical. That is, the background field generated by the model must be sufficiently accurate to allow detection of erroneous data. However, these model-generated background fields contain significant errors, especially in data-sparse regions—if these errors did not develop in the forecast background field, there would be no need for data to correct them. Thus, it is difficult to decide whether a current datum should be used to correct the background field or be rejected because it differs too greatly from the prior estimate.
This problem is complicated by the fact that even correct data should sometimes be rejected if they reflect processes that cannot be resolved on the scale of the grid system used in the analysis. For example, an observation may correctly reflect the existence of a local thermal or velocity field anomaly associated with a thunderstorm outflow boundary; however, it would be undesirable if this datum were allowed to modify a background field on
a grid with, say, grid boxes of 100 km on a side, because the datum is not representative of the scales resolved by the model grid and its influence would be distributed over an erroneously large area. Thus, sometimes even good data need to be rejected by the assimilation system.
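At its core, the reject-or-correct decision described above is a background check on the observation-minus-background departure. A minimal sketch, assuming Gaussian, mutually independent observation and background errors and an illustrative three-sigma rejection threshold (all numerical values here are hypothetical, not taken from any operational system):

```python
import math

def background_check(obs, background, sigma_o, sigma_b, threshold=3.0):
    """Flag an observation whose departure from the model background is
    implausibly large given the expected error statistics.

    sigma_o: assumed observation-error standard deviation
    sigma_b: assumed background (short-range forecast) error standard deviation
    threshold: reject when the normalized departure exceeds this many
               standard deviations (an illustrative, not operational, choice)
    """
    # Expected standard deviation of (obs - background), assuming
    # independent observation and background errors
    sigma_d = math.sqrt(sigma_o**2 + sigma_b**2)
    departure = obs - background
    return abs(departure) > threshold * sigma_d  # True => reject

# A 500 hPa height observation differing by 120 m from the background,
# with assumed errors of 15 m (obs) and 25 m (background), is rejected:
print(background_check(5640.0, 5520.0, sigma_o=15.0, sigma_b=25.0))  # prints True
```

Note that such a check cannot, by itself, distinguish a bad observation from a correct but unrepresentative one; both produce large normalized departures, which is why human follow-up remains necessary.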
The uncertainty that stems from the inability of automated quality control to deal with all situations encountered requires follow-up human evaluation and intervention. Human intervention at times includes the reintroduction of previously rejected data and other information into the analysis by more subjective procedures. Ideally, an experienced analyst would recognize a poorly observed atmospheric process that is not represented well by the background field and subjectively modify the analysis to ensure consistency with a conceptual understanding of the process or phenomenon on the scale of the model. This is an important area of research, particularly for mesoscale data assimilation, where sometimes few data points are available to define important mesoscale structures.
VALIDATION OF REMOTELY SENSED EARTH OBSERVING SYSTEM DATA
Data from any new or existing satellite observing system will not be really useful unless the errors in the data are comparable with or smaller than the errors in prior information, as represented by the model-produced background values. This is just as important in the use of satellite data for climate studies as for NWP. Sustained and intensive monitoring, quality assurance, and global validation of the algorithms and data from a new system such as the Earth Observing System (EOS) are essential to ensure that the observations from the system are of sufficient quality to be useful for both climate and NWP analyses.
The importance of a global approach to validation of remotely sensed satellite data has been demonstrated by operational experience with cloud-track winds and temperature soundings. For example, wind data from the satellite Seasat-A scatterometer had many anomalies and biases. Seven years after launch, data assimilation studies of the scatterometer data confirmed the already-known biases and identified others. Both sets of biases were documented conclusively using the Seasat-minus-forecast and Seasat-minus-ship comparisons generated by a few days of data assimilation (Anderson et al., 1991). The biases would have been detected much sooner if the Seasat data had been critically evaluated in a data assimilation system.
The distinction between research satellite missions and operational satellite missions is losing its significance. To produce research-quality data from a new satellite system, the observed data should be subjected to a critical evaluation by an assimilation system in order to identify error characteristics of the instruments and the algorithms. If an operational NWP
system has predictive capability for the remotely sensed quantity, the real-time assimilation system can provide a basis for validating and assimilating the new data. The data assimilation system then provides a powerful and systematic means for comparing the new remote measurements with all earlier and current in situ and remotely sensed measurements. Thus, the real-time operational assimilation system can provide quality assurance and validation of the new satellite observations.
Experience with many satellite systems shows that real-time assimilation systems at NWP centers provide rapid identification and diagnosis of problems that would otherwise pass undetected for long periods, thereby corrupting irreplaceable data. Real-time indications of sudden problems can be provided within a few hours, as erroneous data are rejected by the assimilation system's quality control programs. In these checks, "toss-out" criteria are applied at two stages in the comparison of background field and observed values.
In addition, the real-time assimilation system provides a comprehensive, quantitative "quick-look" synthesis on a regular grid of all current and past observations. Oceanographers, for example, will require most EOS atmospheric data in assimilated form on a regular grid as input to ocean wave and ocean circulation models. Early availability of real-time operational analyses will stimulate research on new satellite observations and demand for more rapid production of delayed-mode analyses.
The use of EOS data in operational data assimilation will therefore be the first iteration of the research use of the data. The operational centers can thus contribute a great deal to the success of the overall research goals, provided they are tasked and funded to produce research-quality assimilation data sets during the daily cycle.
VALIDATION OF MODEL-ASSIMILATED ANALYSES
Many methods have been used to validate the description of the atmosphere provided by model-assimilated data sets. The simplest method is to calculate the mean and root mean square differences between the analyses and the observations used in the analyses. Any analysis scheme ought to be able to fit the data used to within reasonable bounds (Hollingsworth et al., 1985).
A somewhat more stringent method, which can be applied in data-dense regions, is to withhold some of the observational data from the analysis procedure and to validate the analyses against the withheld data.
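Both checks reduce to simple departure statistics between the analysis (interpolated to the observation locations) and the observations. A minimal sketch, where the withheld-data case is the same calculation applied to observations the analysis never saw (the height values below are invented for illustration):

```python
import math
from statistics import mean

def fit_statistics(analysis_at_obs, observations):
    """Mean (bias) and root-mean-square difference between analysis
    values interpolated to the observation locations and the
    observations themselves."""
    d = [a - o for a, o in zip(analysis_at_obs, observations)]
    return mean(d), math.sqrt(mean(x * x for x in d))

# Illustrative numbers only (500 hPa heights in metres).  Computing the
# same statistics against withheld observations gives the more
# stringent test described above.
used_obs         = [5520.0, 5528.0, 5535.0, 5541.0]
analysis_at_used = [5521.0, 5527.0, 5536.0, 5540.0]
bias, rms = fit_statistics(analysis_at_used, used_obs)
print(f"bias = {bias:+.2f} m, rms = {rms:.2f} m")  # bias = +0.00 m, rms = 1.00 m
```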
A method much favored by NWP researchers is to validate the analysis through observational verification of very short range forecasts made from the analyses. The random component of observation error is independent of the forecast, so an estimate of the forecast error, which is an upper bound
on the analysis error, can be obtained. Since the short-range forecast usually provides the background field for the next analysis, this approach is equivalent to observational verification of the background field (Hollingsworth and Lönnberg, 1986).
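The reasoning can be written compactly (the sigma notation here is ours, not the report's). If the observation error and the forecast error at the observation point are uncorrelated, then

```latex
% o = observation, f = short-range forecast interpolated to the observation
\operatorname{var}(o - f) \;=\; \sigma_o^2 + \sigma_f^2
\qquad\Longrightarrow\qquad
\sigma_f^2 \;=\; \operatorname{var}(o - f) - \sigma_o^2 ,
% and because the analysis uses the observations in addition to the
% forecast, its error cannot exceed the forecast error:
\sigma_a \;\le\; \sigma_f .
```

so the observed statistics of observation-minus-forecast differences, together with an estimate of the observation error, bound the analysis error from above.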
All of the approaches thus far involve calculations at single points. A more revealing approach is to examine two-point correlations of the departure between analyses and observations. This is a very efficient method to determine if the analyses have in fact extracted all the available information from the data. If the calculations are multivariate (involving, say, wind or wind shear at one point and geopotential or thickness at another), one can readily determine if the analyses reflect the balance of the observations (Hollingsworth and Lönnberg, 1989).
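A sketch of such a two-point calculation in the univariate case (the station layout, departure series, and distance binning are all invented for illustration): correlations of observation-minus-background departures between station pairs, binned by separation, reveal whether spatially correlated information remains unextracted by the analysis.

```python
import math
from statistics import mean, pstdev

def departure_correlation(stations, max_sep_km, nbins):
    """Correlation of departure time series between pairs of stations,
    binned by separation distance.

    stations: list of (x_km, y_km, departures), where departures is the
              series of observation-minus-background values at that
              station over many analysis times.
    Returns one mean correlation per distance bin (None if a bin is
    empty).  Spatially correlated background error keeps the correlation
    high at short separations; uncorrelated observation error pulls the
    zero-separation intercept below one.
    """
    bins = [[] for _ in range(nbins)]
    width = max_sep_km / nbins
    for i in range(len(stations)):
        for j in range(i + 1, len(stations)):
            xi, yi, di = stations[i]
            xj, yj, dj = stations[j]
            sep = math.hypot(xi - xj, yi - yj)
            if sep >= max_sep_km:
                continue
            mi, mj = mean(di), mean(dj)
            cov = mean((a - mi) * (b - mj) for a, b in zip(di, dj))
            bins[int(sep // width)].append(cov / (pstdev(di) * pstdev(dj)))
    return [mean(b) if b else None for b in bins]
```

The multivariate version described in the text replaces the two like time series with, say, wind departures at one station and geopotential departures at another.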
The vertically integrated latent and sensible heating of the atmosphere estimated from model-assimilated data sets can be validated in a number of ways. Measurements of rainfall can be used to validate the vertically integrated latent heat release. The sum of the net radiation at the top of the atmosphere and the atmospheric heating gives the net flux into the underlying land or ocean surface, which can be measured or estimated in various ways. In the tropics, outgoing long-wave radiation (OLR) data (still not used operationally) are useful for validating inferred vertical velocity and diabatic heating fields (Arpe, 1991). Ultimately, satellite-borne active and passive microwave measurements will provide more accurate measurements of moist diabatic processes.
Diagnostic studies have deduced many highly derived quantities from model-assimilated data sets. Intercomparisons of these results for Global Weather Experiment (GWE) data showed large discrepancies, indicating that some or all of the analyses were unreliable. Over the last 7 years, successive reanalyses of GWE data have produced a marked convergence in the results, although there is still some way to go. The atmospheric energy budget is a prime concern of several component programs of the World Meteorological Organization's (WMO) World Climate Research Program (WCRP), especially of the Global Energy and Water Cycle Experiment (GEWEX).
A validation method of increasing importance is the use of atmospheric model-assimilated data sets to drive ocean circulation or ocean wave models. Both models are controlled by atmospheric forcing and are extremely sensitive to it (Harrison et al., 1989; Janssen et al., 1989). The responses of ocean models to differing atmospheric forcing are large enough to be easily verified against oceanographic observations. Such ocean models provide useful tests of atmospheric model-assimilated data sets.
The most comprehensive validation of model-assimilated data sets is the skill of daily forecasts. Forecast skill has improved markedly over the last decade, largely as a result of increased analysis accuracy.
VALIDATION OF FORECAST MODELS
An important application of model-assimilated data sets is validation of the physical parameterizations used in general circulation models. By tracking energy, momentum, and other balance requirements for the atmosphere, model-assimilated data sets reveal the close ties between model parameterization schemes, the initial tendencies of mean errors, and the fully developed mean errors after many days of integration.
The "balance requirement" approach has been used for decades, especially with the GWE data, to estimate atmospheric diabatic forcing by exploiting the fact that the time-mean tendency of the atmosphere is zero on time scales of a month, apart from a small and easily calculated seasonal trend. The observed balance of the atmosphere is used to infer the mean diabatic forcing of the atmosphere through tendency calculations that apply the adiabatic equations of motion to the analyzed data. From the principle of balance, the average diabatic forcing is the negative of the average adiabatic tendency. New applications of this principle have recently been found in the validation of the parameterizations used in models.
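Stated symbolically (the notation is ours, not the report's): for any analyzed field X whose diabatic forcing is Q, and with overbars denoting monthly means,

```latex
% The time-mean tendency vanishes over a month
% (up to a small, easily calculated seasonal trend):
\overline{\frac{\partial X}{\partial t}}
  \;=\; \overline{\left(\frac{\partial X}{\partial t}\right)_{\!\text{adiabatic}}}
        \;+\; \overline{Q} \;\approx\; 0
\quad\Longrightarrow\quad
\overline{Q} \;\approx\;
  -\,\overline{\left(\frac{\partial X}{\partial t}\right)_{\!\text{adiabatic}}}
```

so evaluating the adiabatic tendency from the analyzed data yields the mean diabatic forcing as a residual.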
Calculations of the monthly average of a model's initial adiabatic tendency and diabatic tendency then provide three-dimensional fields of the "true" diabatic forcing and of the errors in the model's diabatic forcing. Recent applications of this methodology to model-assimilated data sets and to 1-day and 10-day forecasts have shown that there are close similarities between the mean errors (sometimes referred to in jargon as climate drift) evident in 1-hour, 1-day, and 10-day forecasts.
Comparisons of such results for data sets assimilated with models differing only in their physical parameterizations have been of great value in documenting differences in performance of the parameterizations and in linking the differences in parameterization to differences in mean error evolution.
This new approach to validation of parameterization schemes is important for their successful development. Hitherto, parameterization schemes have been based on conceptual understanding of physical processes supported by field experiments and detailed modeling. The field data for process studies are frequently combined into a time-evolving single-column composite description representing mean values of an areal observing network. As a result, parameterization schemes generally have been developed in a one-column atmospheric context and validated on one-dimensional field data.
The balance requirement approach to validation of parameterizations uses a "top-down," rather than a "bottom-up," approach to model validation. Using copious quantities of operationally available data that describe synoptic-scale phenomena, the method asks, what do the synoptic-scale and
large-scale motions need from the parameterization schemes in order to maintain the observed balance? The approach has proved valuable in diagnosing problems in model formulations of gravity wave drag near mountains; in radiative, convective, and planetary boundary layer schemes used in models; and even in analysis methods.
Of course, this new approach only indicates that a problem exists in certain terms in the equations; it does not of itself solve the problem. Improvement of a parameterization scheme within a forecast model is a delicate task, since parameterization schemes can produce complex dynamical effects on the evolution of the flow in a forecast model. However, examination of the very short range forecast errors is an effective first step in diagnosing errors in the physical parameterizations of a model. Deficiencies in the interactions between the dynamics and the parameterization schemes can be difficult to identify, so a thorough understanding of how each physical process operates is required in order to suggest possible improvements. This new approach to model validation provides useful insight into the sources of errors in the physical parameterizations, and it provides a valuable additional method for systematic assessment of the performance of model physics. Its effectiveness depends entirely on the availability of the model-assimilated data sets.