by Judith Curry
How do you estimate the state of the global atmosphere and ocean when observational data sets are incomplete, imperfect and noisy?
Because the state of a large, complex dynamical system is only partially accessible to measurement, most quantities must be determined via model-based state estimation. This is accomplished by inverse modelling.
Wikipedia describes inverse problems in the following way:
The inverse problem can be conceptually formulated as follows:
- Data → Model parameters
The inverse problem is considered the “inverse” to the forward problem which relates the model parameters to the data that we observe:
- Model parameters → Data
The transformation from data to model parameters (or vice versa) is a result of the interaction of a physical system with the object that we wish to infer properties about. In other words, the transformation is the physics that relates the physical quantity (i.e. the model parameters) to the observed data.
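The data → model-parameters mapping can be illustrated with a toy example. The sketch below, which is purely illustrative and not drawn from any actual reanalysis system, poses a linear forward problem and inverts noisy observations by least squares; the matrix `G` and the parameter values are assumptions chosen for the demonstration.

```python
import numpy as np

# Toy inverse problem: the forward model d = G @ m maps model parameters
# to observable data; the inverse step recovers m from noisy observations.
rng = np.random.default_rng(0)

G = np.array([[1.0, 0.5],
              [0.3, 2.0],
              [1.5, 1.0]])          # forward operator (3 observations, 2 parameters)
m_true = np.array([2.0, -1.0])      # "true" model parameters (unknown in practice)
d_obs = G @ m_true + 0.01 * rng.standard_normal(3)  # noisy observed data

# Inverse step: Data -> Model parameters, via least squares
m_est, *_ = np.linalg.lstsq(G, d_obs, rcond=None)
print(m_est)  # close to [2.0, -1.0]
```

Real atmospheric state estimation is of course vastly larger (tens of millions of variables) and nonlinear, but the structure is the same: a physics-based forward operator, noisy data, and an inversion that weighs the two.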
The mathematics of inverse modelling for large-scale state estimation problems is rather hairy, drawing on various methodologies including variational analysis, the Kalman filter, the maximum likelihood ensemble filter, and other ensemble methods. The Wikipedia article on data assimilation provides a reasonable introduction to the basic technical application of this to weather forecasting.
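The common building block of these methods is the analysis step, which blends a model forecast (the "background") with an observation according to their respective error covariances. Here is a minimal Kalman-filter analysis update; the state, covariances, and observation values are invented for illustration, not taken from any operational system.

```python
import numpy as np

# Minimal Kalman-filter analysis step: correct a two-variable forecast
# using a single observation of the first variable.
x_b = np.array([15.0, 10.0])       # background state (model forecast)
B = np.array([[1.0, 0.5],
              [0.5, 2.0]])         # background error covariance
H = np.array([[1.0, 0.0]])         # observation operator (observe variable 1 only)
R = np.array([[0.25]])             # observation error covariance
y = np.array([16.0])               # the observation

# Kalman gain: weighs the observation against the forecast by uncertainty
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
x_a = x_b + K @ (y - H @ x_b)      # analysis: corrected state estimate
print(x_a)
```

Note that the unobserved second variable is adjusted too, through the off-diagonal covariance in `B`; this is how assimilation spreads information from sparse observations across the full model state.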
So skipping over the technical details of all this, let's discuss climate reanalysis products, which are products of model-based state estimation.
Reanalysis is a climate or weather model simulation of the past that includes data assimilation of historical observations. The rationale for climate reanalysis is given by reanalyses.org:
Reanalysis is a scientific method for developing a comprehensive record of how weather and climate are changing over time. In it, observations and a numerical model that simulates one or more aspects of the Earth system are combined objectively to generate a synthesized estimate of the state of the system. A reanalysis typically extends over several decades or longer, and covers the entire globe from the Earth’s surface to well above the stratosphere. Reanalysis products are used extensively in climate research and services, including for monitoring and comparing current climate conditions with those of the past, identifying the causes of climate variations and change, and preparing climate predictions. Information derived from reanalyses is also being used increasingly in commercial and business applications in sectors such as energy, agriculture, water resources, and insurance.
Overviews of climate reanalysis are found at this page, under Meeting Presentations. I refer specifically to a useful presentation by Kevin Trenberth on atmospheric reanalyses. Some excerpts from the text:
Data Assimilation merges observations & model predictions to provide a superior state estimate. It provides a dynamically-consistent estimate of the state of the system using the best blend of past, current, and perhaps future observations. Experience mainly in atmosphere; developing in ocean, land surface, sea ice.
[Using a weather prediction model] The observations are used to correct errors in the short forecast from the previous analysis time. Every 12 hours ECMWF assimilates 7–9,000,000 observations to correct the 80,000,000 variables that define the model’s virtual atmosphere. This is done by a careful 4-dimensional interpolation in space and time of the available observations; this operation takes as much computer power as the 10-day forecast.
Operational four dimensional data assimilation continually changes as methods and assimilating models improve, creating huge discontinuities in the implied climate record. Reanalysis is the retrospective analysis onto global grids using a multivariate physically consistent approach with a constant analysis system.
Reanalysis has been applied to atmospheric data covering the past five decades. Although the resulting products have proven very useful, considerable effort is needed to ensure that reanalysis products are suitable for climate monitoring applications.
At reanalyses.org, the second generation of reanalysis products is described; here are the main products that cover the longest periods of time:
ECMWF Interim Reanalysis (ERA-Interim): 1979-present. ERA-Interim was originally planned as an ‘interim’ reanalysis in preparation for the next-generation extended reanalysis to replace ERA-40. It uses a December 2006 version of the ECMWF Integrated Forecast Model (IFS Cy31r2). It originally covered dates from 1 Jan 1989 but an additional decade, from 1 January 1979, was added later. ERA-Interim is being continued in real time. The spectral resolution is T255 (about 80 km) and there are 60 vertical levels, with the model top at 0.1 hPa (about 64 km). The data assimilation is based on a 12-hourly four-dimensional variational analysis (4D-Var) with adaptive estimation of biases in satellite radiance data (VarBC). With some exceptions, ERA-Interim uses input observations prepared for ERA-40 until 2002, and data from ECMWF’s operational archive thereafter. See Dee et al. (2011) in the references below for a full description of the ERA-Interim system.
NASA Modern Era Reanalysis for Research and Applications (MERRA): 1979-present. MERRA is a NASA reanalysis for the satellite era using a major new version of the Goddard Earth Observing System Data Assimilation System Version 5 (GEOS-5) produced by the NASA GSFC Global Modeling and Assimilation Office (GMAO). The Project focuses on historical analyses of the hydrological cycle on a broad range of weather and climate time scales and places the NASA EOS suite of observations in a climate context.
NCEP Climate Forecast System Reanalysis (CFSR): 1979-Jan 2010. The National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR) was completed over the 31-year period of 1979 to 2009 in January 2010. The CFSR was designed and executed as a global, high resolution, coupled atmosphere-ocean-land surface-sea ice system to provide the best estimate of the state of these coupled domains over this period. The current CFSR will be extended as an operational, real time product into the future.
NOAA CIRES 20th Century Reanalysis V2 (20CR): 1871-2008. The 20th Century Reanalysis version 2 (20CRv2) dataset contains global weather conditions and their uncertainty in six hour intervals from the year 1871 to 2008. Surface and sea level pressure observations are combined with a short-term forecast from an ensemble of integrations of an NCEP numerical weather prediction model using the Ensemble Kalman Filter technique to produce an estimate of the complete state of the atmosphere, and the uncertainty in that estimate. Additional observations and a newer version of the NCEP model that includes time-varying CO2 concentrations, solar variability, and volcanic aerosols are used in version 2. The long time range of this dataset allows scientists to examine better long time scale climate processes such as the Pacific Decadal Oscillation and the Atlantic Multidecadal Oscillation as well as looking at the dynamics of historical climate and weather events. Verification tests have shown that using only pressure creates reasonable atmospheric fields up to the tropopause. Additional tests suggest some correspondence with observed variations in the lower stratosphere.
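The Ensemble Kalman Filter approach used by 20CR can be sketched in miniature: an ensemble of short-term forecasts is nudged toward an observation, with the ensemble spread supplying the flow-dependent uncertainty estimate. The sketch below uses a single pressure-like scalar and invented numbers; it is a caricature of the technique, not the 20CR implementation.

```python
import numpy as np

# Toy Ensemble Kalman Filter update: assimilate one "surface pressure"
# observation into an ensemble of forecasts. Values are illustrative.
rng = np.random.default_rng(1)
n_ens = 100
obs, obs_var = 1012.0, 1.0                # observation (hPa) and its error variance

# Ensemble of short-term forecasts (background), mean near 1008 hPa
ensemble = 1008.0 + 2.0 * rng.standard_normal(n_ens)

# Gain computed from the ensemble's own variance; perturbed observations
# keep the analysis ensemble's spread statistically consistent
P_b = ensemble.var(ddof=1)
K = P_b / (P_b + obs_var)
obs_perturbed = obs + np.sqrt(obs_var) * rng.standard_normal(n_ens)
analysis = ensemble + K * (obs_perturbed - ensemble)

print(analysis.mean(), analysis.std(ddof=1))  # mean pulled toward 1012, spread reduced
```

The two outputs correspond to the two things 20CR reports at each grid point and time: the estimated state and the uncertainty in that estimate.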
Atmospheric reanalyses comparison table is found [here].
Reanalyses intercomparison and plotting tools are found [here].
JC comments. I’ve used the first-generation reanalysis products from ECMWF and NCEP quite extensively; they are invaluable tools, albeit with significant limitations. I’ve also started using the ECMWF Interim Reanalysis, which is a substantial improvement, and am also looking at MERRA, which looks quite good, especially for clouds and the hydrological cycle. These products are tremendously useful for a variety of applications, but the new products shouldn’t be used for trend analysis without more assessment of their capabilities.
Those of you who don’t like models probably won’t like reanalysis products. But this is the best alternative in the face of incomplete and inconsistent data sets. Such state estimation using inverse modeling and data assimilation is far preferable to statistical “homogenization” and use of EOFs to fill in for missing data.