by Judith Curry
Arguably the most poorly documented aspect of climate models is how they are calibrated, or ‘tuned’.
I have raised a number of concerns in my Uncertainty Monster paper and also in previous blog posts. A recent paper from the German climate modeling group at MPI sheds some light on this issue.
Tuning the climate of a global model
Thorsten Mauritsen, Bjorn Stevens, Erich Roeckner, Traute Crueger, Monika Esch, Marco Giorgetta, Helmuth Haak, Johann Jungclaus, Daniel Klocke, Daniela Matei, Uwe Mikolajewicz, Dirk Notz, Robert Pincus, Hauke Schmidt, and Lorenzo Tomassini
Abstract. During a development stage global climate models have their properties adjusted or tuned in various ways to best match the known state of the Earth’s climate system. These desired properties are observables, such as the radiation balance at the top of the atmosphere, the global mean temperature, sea ice, clouds and wind fields. The tuning is typically performed by adjusting uncertain, or even non-observable, parameters related to processes not explicitly represented at the model grid resolution. The practice of climate model tuning has seen an increasing level of attention because key model properties, such as climate sensitivity, have been shown to depend on frequently used tuning parameters. Here we provide insights into how climate model tuning is practically done in the case of closing the radiation balance and adjusting the global mean temperature for the Max Planck Institute Earth System Model (MPI-ESM). We demonstrate that considerable ambiguity exists in the choice of parameters, and present and compare three alternatively tuned, yet plausible configurations of the climate model. The impact of parameter tuning on climate sensitivity was less than anticipated.
Published in the Journal of Advances in Modeling Earth Systems. The excerpts below provide some context; interested readers are encouraged to read the entire paper.
Although climate models and their configuration are well-documented, the process through which a particular model configuration comes into being is not, and as a result, the process of selecting a model configuration is shrouded in mystery.
Model tuning is not a well-defined term. Often, model calibration or model tuning is associated with the last step of a broader model development cycle, after structural enhancements, improved parameterizations and refined boundary conditions have been implemented, wherein selected parameters are adjusted so as to better match the model results with some targeted features of the climate system. The idea that models need to be harmonized with observations is of course applicable to the model development process as a whole, as parameterizations and grid configurations are usually selected based on their ability to improve the representation of some aspect of the climate system.
The need to tune models became apparent in the early days of coupled climate modeling, when the top of the atmosphere (TOA) radiative imbalance was so large that models would quickly drift away from the observed state. Initially, a practice to input or extract heat and freshwater from the model, by applying flux-corrections, was invented to address this problem. As models gradually improved to a point when flux-corrections were no longer necessary, this practice is now less accepted in the climate modeling community. Instead, the radiation balance is controlled primarily by tuning cloud-related parameters at most climate modeling centers, while others adjust the ocean surface albedo or scale the natural aerosol climatology to achieve radiation balance. Tuning cloud parameters partly masks the deficiencies in the simulated climate, as there is considerable uncertainty in the representation of cloud processes. But just like adding flux-corrections, adjusting cloud parameters involves a process of error compensation, as it is well appreciated that climate models poorly represent clouds and convective processes.
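To give a feel for why small parameter adjustments matter for the radiation balance, here is a minimal zero-dimensional sketch. It is not how any climate model closes its budget; the solar constant (about 1361 W/m^2), the outgoing longwave value (240 W/m^2), and the function names are assumptions chosen for illustration.

```python
# Toy planetary energy budget, NOT a climate model.
# Assumed values: solar constant ~1361 W/m^2, outgoing longwave ~240 W/m^2.
SOLAR_CONSTANT = 1361.0  # W/m^2 (assumed round value)


def toa_imbalance(albedo, olr_wm2):
    """Absorbed shortwave minus outgoing longwave, in W/m^2."""
    absorbed = SOLAR_CONSTANT / 4.0 * (1.0 - albedo)
    return absorbed - olr_wm2


def albedo_for_balance(olr_wm2):
    """Planetary albedo that makes the toy budget close exactly."""
    return 1.0 - 4.0 * olr_wm2 / SOLAR_CONSTANT


olr = 240.0  # W/m^2, assumed
balanced = albedo_for_balance(olr)
print(f"albedo that closes the budget: {balanced:.4f}")

# A change of 0.01 in planetary albedo shifts the imbalance by S0/4 * 0.01.
shift = toa_imbalance(0.29, olr) - toa_imbalance(0.30, olr)
print(f"imbalance shift for 0.01 albedo change: {shift:.2f} W/m^2")
```

In this toy budget a change of one hundredth in planetary albedo moves the imbalance by a few W/m^2, which suggests why modest adjustments to cloud or albedo parameters are enough to close or open a sizeable radiative gap.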
Arguably, the most basic physical property that we expect global climate models to predict is how the global mean surface air temperature varies naturally, and responds to changes in atmospheric composition and solar insolation. We usually focus on temperature anomalies, rather than the absolute temperature that the models produce, and for many purposes this is sufficient.
Figure 1 instead shows the absolute temperature evolution from 1850 till present in realizations of the coupled climate models obtained from the CMIP3 and CMIP5 multimodel datasets. There is considerable coherence between the model realizations and the observations; models are generally able to reproduce the observed 20th century warming of about 0.7 K, and details such as the years of cooling following the volcanic eruptions.
Yet, the span between the coldest and the warmest model is almost 3 K, distributed equally far above and below the best observational estimates, while the majority of models are cold-biased. Although the inter-model span is only one percent relative to absolute zero, that argument fails to be reassuring. Relative to the 20th century warming the span is a factor of four larger, while it is about the same as our best estimate of the climate response to a doubling of CO2, and about half the difference between the last glacial maximum and present.
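The ratios quoted above can be checked with a few lines of arithmetic. The span and the 20th century warming are the round numbers from the text; the global mean temperature of 287 K is an assumed reference value.

```python
# Back-of-envelope check of the quoted ratios.
span_k = 3.0           # inter-model span in absolute temperature (K), from the text
warming_20c_k = 0.7    # observed 20th century warming (K), from the text
global_mean_k = 287.0  # assumed global mean surface temperature (K)

span_relative_to_absolute = span_k / global_mean_k
span_relative_to_warming = span_k / warming_20c_k

print(f"span / absolute temperature: {span_relative_to_absolute:.1%}")  # 1.0%
print(f"span / 20th century warming: {span_relative_to_warming:.1f}")   # 4.3
```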
For parameterized processes that depend non-linearly on the absolute temperature, it is a prerequisite that they be exposed to realistic temperatures if they are to act as intended. Prime examples are processes involving phase transitions of water: Evaporation and precipitation depend non-linearly on temperature through the Clausius-Clapeyron relation, while snow, sea-ice, tundra and glacier melt are critical to freezing temperatures in certain regions. The models in CMIP3 were frequently criticized for not being able to capture the timing of the observed rapid Arctic sea-ice decline.
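As a rough illustration of that non-linearity, the sketch below evaluates saturation vapor pressure at the cold and warm ends of a 3 K span. It uses the empirical Magnus approximation (Alduchov and Eskridge coefficients) as a stand-in for the Clausius-Clapeyron relation, and the 13.7 °C center is simply the pre-industrial target mentioned later in the excerpts; neither choice comes from the paper's own calculations.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Magnus approximation over liquid water (an empirical fit, assumed here)."""
    return 6.1094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


center_c = 13.7    # deg C, illustrative global mean
half_span_k = 1.5  # half of the ~3 K inter-model span

cold = saturation_vapor_pressure_hpa(center_c - half_span_k)
warm = saturation_vapor_pressure_hpa(center_c + half_span_k)
ratio = warm / cold

print(f"cold end: {cold:.2f} hPa, warm end: {warm:.2f} hPa")
print(f"warm / cold ratio: {ratio:.2f}")
```

Under these assumptions the warmest and coldest models would differ by roughly a fifth in saturation vapor pressure, consistent with the familiar rule of thumb of about 6 to 7 percent per kelvin.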
While this is unlikely to be the only reason, given that sea ice melt occurs at a specific absolute temperature, this model ensemble behavior seems not too surprising when the majority of models start out too cold.
In addition to targeting a TOA radiation balance and a global mean temperature, model tuning might strive to address additional objectives, such as a good representation of the atmospheric circulation, tropical variability or sea-ice seasonality. But in all these cases it is usually to be expected that improved performance arises not because uncertain or non-observable parameters match their intrinsic value – although this would clearly be desirable – rather that compensation among model errors is occurring. This raises the question as to whether tuning a model influences model-behavior, and places the burden on the model developers to articulate their tuning goals, as including quantities in model evaluation that were targeted by tuning is of little value. Evaluating models based on their ability to represent the TOA radiation balance usually reflects how closely the models were tuned to that particular target, rather than the models' intrinsic qualities.
These issues motivate our present contribution where we both document and reflect on the model tuning that accompanied the preparation of a new version of our model system for participation in CMIP5. As decisions were made, often in the interest of expediency, a nagging question remained unanswered: To what extent did our results depend on the decisions we had just made?
Tuning the Model Climate
Formulating and prioritizing our goals is challenging. To us, a global mean temperature in close absolute agreement with observations is of highest priority because it sets the stage for temperature-dependent processes to act. For this, we target the 1850–1880 observed global mean temperature of about 13.7 °C. Beyond that, we prioritize having globally averaged TOA shortwave absorption and outgoing longwave radiation in good agreement with satellite observations, along with a representation of important climate variability modes. We would accept a model if the global mean cloud cover is above 60 percent in present-day climate, even if satellite estimates are generally higher, while the bulk of observational estimates would allow a broader range.
Within the foreseeable future climate model tuning will continue to be necessary as the prospects of constraining the relevant unresolved processes with sufficient precision are not good.
Climate model tuning has developed well beyond just controlling global mean temperature drift. Today, we tune several aspects of the models, including the extratropical wind- and pressure fields, sea-ice volume and to some extent cloud-field properties. By doing so we clearly run the risk of building the models’ performance upon compensating errors, and the practice of tuning is partly masking these structural errors. As one continues to evaluate the models, sooner or later these compensating errors will become apparent, but the errors may prove tedious to rectify without jeopardizing other aspects of the model that have been adjusted to them. To aid the longterm development of our model we choose a tuning-strategy with only a small number of parameter changes between different model versions and resolutions, such that it will be easier to identify and understand how the model formulation can be improved.
The model tuning process at our institute is artisanal in character, in that both the adjustment of parameters at each tuning iteration and the evaluation of the resulting candidate models are done by hand, as is done at most other modeling centers. It is, however, at least conceptually possible to automate this process and find optimal sets of parameters with respect to certain targets. When considering model biases that appear on long time-scales (months to years), one option is to use the full model and search through parameter-space seeking areas in which errors are minimized. Alternatively, one can use a relatively small number of model runs to build a statistical model, or emulator, of the error as a function of parameter space to obtain parameter sets that minimize model error. Any such objective tuning algorithm requires a subjective choice of a cost function and this involves weighting trade-offs against one another, which is difficult to do ahead of time.
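The emulator idea can be sketched in a few lines. Everything here is hypothetical: `toy_model` stands in for an expensive climate model run, the two normalized parameters and their names are placeholders, and the quadratic least-squares fit is just one simple choice of emulator. The weights in `cost` are exactly the kind of subjective choice the text refers to.

```python
import numpy as np


def toy_model(entrainment, mass_flux):
    """Hypothetical stand-in for an expensive model run.

    Returns (TOA imbalance bias in W/m^2, temperature bias in K)
    for two normalized tuning parameters in [0, 1].
    """
    toa_bias = 4.0 * (entrainment - 0.4) - 3.0 * (mass_flux - 0.6)
    temp_bias = 2.0 * (entrainment - 0.5) + 1.5 * (mass_flux - 0.5) ** 2 - 0.1
    return toa_bias, temp_bias


def features(params):
    """Quadratic polynomial features of the two parameters."""
    e, m = params[:, 0], params[:, 1]
    return np.column_stack([np.ones_like(e), e, m, e * m, e**2, m**2])


def cost(toa_bias, temp_bias, w_toa=1.0, w_temp=1.0):
    """Weighted squared error; the weights are a subjective choice."""
    return w_toa * toa_bias**2 + w_temp * temp_bias**2


# 1) A small number of "model runs" at randomly sampled parameter values.
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=(12, 2))
outputs = np.column_stack(toy_model(samples[:, 0], samples[:, 1]))

# 2) Fit the emulator: least-squares regression of each bias on the features.
coef, *_ = np.linalg.lstsq(features(samples), outputs, rcond=None)

# 3) Search a dense grid using the cheap emulator instead of the model.
axis = np.linspace(0.0, 1.0, 101)
grid = np.array([(e, m) for e in axis for m in axis])
predicted = features(grid) @ coef
costs = cost(predicted[:, 0], predicted[:, 1])
best_index = int(np.argmin(costs))
best_params = grid[best_index]

print("best parameters (entrainment, mass_flux):", best_params)
print("emulated cost at best parameters:", costs[best_index])
```

Changing `w_toa` and `w_temp` moves the selected parameter set whenever the two targets cannot both be met, which is why the choice of cost function is hard to settle ahead of time. In this toy case the emulator is exact because the stand-in model is itself quadratic; a real model would offer no such guarantee.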
One of the few tests we can expose climate models to is whether they are able to represent the observed temperature record from the dawn of industrialization until present. Models are surprisingly skillful in this respect, considering the large range in climate sensitivities among models – an ensemble behavior that has been attributed to a compensation with 20th century anthropogenic forcing: Models that have a high climate sensitivity tend to have a weak total anthropogenic forcing, and vice-versa. A large part of the variability in inter-model spread in 20th century forcing was further found to originate in different aerosol forcings.
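The compensation can be illustrated with a deliberately crude scaling in which warming is proportional to sensitivity times forcing. This ignores ocean heat uptake and the difference between transient and equilibrium response, so the forcing values it prints are not realistic estimates; the 3.7 W/m^2 forcing for doubled CO2 and the three sensitivities are assumed round numbers.

```python
# Crude linear scaling: warming = sensitivity * forcing / F_2X.
# Illustrative only; it ignores ocean heat uptake and transient effects.
F_2X = 3.7            # W/m^2, assumed forcing for a doubling of CO2
TARGET_WARMING = 0.7  # K, observed 20th century warming from the text


def warming(sensitivity_k, forcing_wm2):
    """Warming (K) implied by a sensitivity per CO2 doubling and a forcing."""
    return sensitivity_k * forcing_wm2 / F_2X


def forcing_needed(sensitivity_k, target_k=TARGET_WARMING):
    """Total forcing that reproduces the target warming in this toy scaling."""
    return target_k * F_2X / sensitivity_k


sensitivities = [2.0, 3.0, 4.5]  # K per doubling, assumed spread
forcings = [forcing_needed(s) for s in sensitivities]

for s, f in zip(sensitivities, forcings):
    print(f"sensitivity {s:.1f} K -> forcing {f:.2f} W/m^2 "
          f"-> warming {warming(s, f):.2f} K")
```

Every pair reproduces the same warming, so matching the temperature record alone cannot distinguish a sensitive model with weak forcing from an insensitive model with strong forcing.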
It seems unlikely that the anti-correlation between forcing and sensitivity simply happened by chance. Rational explanations are that 1) modelers somehow changed their climate sensitivities, 2) they deliberately chose suitable forcings, or 3) there exists an intrinsic compensation such that models with strong aerosol forcing also have a high climate sensitivity. Support for the latter is found in studies showing that parametric model tuning can influence the aerosol forcing. Understanding this complex is well beyond our scope, but it seems appropriate to linger for a moment at the question of whether we deliberately changed our model to better agree with the 20th century temperature record.
The MPI-ESM was not tuned to better fit the 20th century. In fact, we only had the capability to run the full 20th Century simulation according to the CMIP5-protocol after the point in time when the model was frozen. Yet, we were in the fortunate situation that the MPI-ESM-LR performed acceptably in this respect, and we did have good reasons to believe this would be the case in advance because the predecessor was capable of doing so. During the development of MPI-ESM-LR we worked under the perception that two of our tuning parameters had an influence on the climate sensitivity, namely the convective cloud entrainment rate and the convective cloud mass flux above the level of nonbuoyancy, so we decided to minimize changes relative to the previous model. The results presented here show that this perception was not correct as these parameters had only small impacts on the climate sensitivity of our model.
Climate models' ability to simulate the 20th century temperature increase with fidelity has become something of a show-stopper as a model unable to reproduce the 20th century would probably not see publication, and as such it has effectively lost its purpose as a model quality measure. Most other observational datasets sooner or later meet the same destiny, at least beyond the first time they are applied for model evaluation.
That is not to say that climate models can be readily adapted to fit any dataset, but once aware of the data we will compare with model output and invariably make decisions in the model development on the basis of the results. Rather, our confidence in the results provided by climate models is gained through the development of a fundamental physical understanding of the basic processes that create climate change. More than a century ago it was first realized that increasing the atmospheric CO2 concentration leads to surface warming, and today the underlying physics and feedback mechanisms are reasonably understood (while quantitative uncertainty in climate sensitivity is still large).
JC comment: This paper is indeed a very welcome addition to the climate modeling literature. The existence of this paper highlights the failure of climate modeling groups to adequately document their tuning/calibration and to adequately confront the issues of introducing subjective bias into the models through the tuning process.
Tuning/calibration is unavoidable in a complex nonlinear coupled modeling system. The key is to document the tuning, both the goals and actual calibration process, in the manner in which the German climate modeling group has done.