Climate Etc.

Reflection on reliability of climate models

by Judith Curry

Failure to communicate the relevant ‘weak link’ is sometimes under-appreciated as a critical element of science-based policy-making.

Lenny Smith and Arthur Petersen have written a very interesting and insightful paper, Variations on Reliability: Connecting Climate Predictions to Climate Policy ([link] to the complete manuscript). Excerpts:

Our general claim is that methodological reflections on uncertainty in scientific practices should provide guidance on how their results can be used more responsibly in decision support. In the case of decisions that need to be made to adapt to climate change, societal actors, both public and private, are confronted with deep uncertainty. The notions of ‘reliability’ are examined critically, in particular the manner(s) in which the reliability of climate model findings pertaining to model-based high-resolution climate predictions is communicated.

Findings can be considered ‘reliable’ in many different ways. Often only a statistical notion of reliability is implied, but we consider wider variations in the meaning of ‘reliability’, some more relevant to decision support than the mere uncertainty in a particular calculation. We distinguish between three dimensions of ‘reliability’ – statistical reliability, methodological reliability and public reliability – and we furthermore understand reliability as reliability for a given purpose, which is why we refer to the reliability of particular findings and not to the reliability of a model, or set of models, per se.

At times, the statistical notion of reliability, or ‘statistical uncertainty’, dominates uncertainty communication. One must, however, seriously question whether the statistical uncertainty adequately captures the ‘relevant dominant uncertainty’ (RDU). The RDU can be thought of as the most likely known unknown limiting our ability to make a more informative scientific probability distribution on some outcome of interest; perhaps preventing even the provision of a robust statement of subjective probabilities altogether. Here we are particularly interested in the RDU in simulation studies, especially in cases where the phenomena contributing to that uncertainty are neither sampled explicitly nor reflected in the probability distributions provided to those who frame policy or those who make decisions. For the understanding, characterisation and communication of uncertainty to be ‘sufficient’ in the context of decision-making we argue that the RDU should be clearly noted. Ideally the probability that a given characterisation of uncertainty will prove misleading to decision-makers should be provided explicitly.
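As a purely illustrative sketch (not from the paper; all numbers are invented), the following shows how a tightly quantified statistical spread can understate the total uncertainty on an outcome when a structural term of RDU type is left unsampled:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical outcome of interest, e.g. warming by some date (degrees C).
# The well-quantified statistical spread actually sampled by the simulations:
quantified = rng.normal(loc=2.0, scale=0.2, size=n)

# An unsampled structural effect standing in for the RDU: assume a 10% chance
# that, if it acts, it shifts the outcome by a further +1.5 degrees
# (all numbers here are invented for illustration).
rdu_occurs = rng.random(n) < 0.10
total = quantified + np.where(rdu_occurs, 1.5, 0.0)

for name, x in [("quantified only", quantified), ("with RDU", total)]:
    lo, hi = np.percentile(x, [5, 95])
    print(f"{name:16s} 5-95% range: {lo:.2f} to {hi:.2f} C")

# The range reported from the sampled spread alone is misleadingly narrow
# once the unsampled structural term is taken into account.
```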

Science tends to focus on uncertainties that can be quantified today, ideally reduced today. But a detailed probability density function (PDF) of the likely amount of fuel an aircraft would require to cross the Atlantic is of limited value to the pilot if in fact there is a good chance that metal fatigue will result in the wings separating from the fuselage. Indeed, focusing on the ability to carry enough fuel is a distraction when the integrity of the plane is thought to be at risk. The RDU is the uncertainty most likely to alter the decision-maker’s conclusions given the evidence, while the scientist’s focus is understandably on some detailed component of the big picture. How can one motivate the scientist to communicate the extent to which her detailed contribution has both quantified the uncertainty under the assumption that the RDU is of no consequence, and also provided an idea of the timescales, impact and probability of the potential effects of the RDU? Failure to communicate the relevant ‘weak link’ is sometimes under-appreciated as a critical element of science-based policy-making.

While the IPCC has led the climate science community in codifying uncertainty characterisation, it has hardly paid attention to specifying the RDU. The focus is at times more on ensuring reproducibility of computation than on relevance (fidelity) to the Earth’s climate system; in fact, it is not always easy to distinguish which of the two is being discussed. Instead, the attention has mainly been on increasing the transparency of the IPCC’s characterisation of uncertainty. Being transparent, while certainly a good thing in itself, is not the same as communicating the RDU for the main findings.

A central insight is to note that when the level of scientific understanding is low, ruling out aspects of uncertainty in a phenomenon without commenting on less well understood aspects of the same phenomenon can ultimately undermine the general trust decision-makers place in scientists (and thus lower the public reliability of their findings). Often epistemic uncertainty or mathematical intractability means that there is no strong evidence that an impact will occur; simultaneously there may be good scientific reason to believe the probability of a significant impact is nontrivial, say, greater than 1 in 200. How do we stimulate insightful discussions of things we can neither rule out nor rule in with precision, but which would have significant impact were they to occur?

Statistical reliability (reliability1). A statistical uncertainty distribution, or statistical reliability (denoted by reliability1), can be given for findings when uncertainty can be adequately expressed in statistical terms, e.g., as a range with associated probability (for example, uncertainty associated with modelled internal climate variability). Statistical uncertainty ranges based on varying real numbers associated with models constitute a dominant mode of describing uncertainty in science. One cannot immediately assume that the model relations involved offer adequate descriptions of the real system under study (or even that one has the correct model class), or that the (calibration-)data employed are representative of the target situation.
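A minimal illustrative sketch (synthetic data, not from the paper) of a reliability1 statement read off an ensemble of realisations of modelled internal variability:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for an initial-condition ensemble: 50 realisations of a
# modelled quantity (say, a decadal-mean temperature anomaly in degrees C)
# that differ only through simulated internal variability.
ensemble = rng.normal(loc=1.0, scale=0.15, size=50)

# A reliability1 statement: a central estimate and a 90% range taken from the
# ensemble itself.
lo, mid, hi = np.percentile(ensemble, [5, 50, 95])
print(f"median {mid:.2f} C, 90% range {lo:.2f} to {hi:.2f} C")

# This range characterises the model's own variability; by itself it says
# nothing about whether the model equations or the calibration data describe
# the real system adequately (that is the methodological question below).
```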

Methodological reliability (reliability2). We know that models are not perfect and never will be perfect. Especially when extrapolating models into the unknown, we wish ‘both to use the most reliable model available and to have an idea of how reliable that model is’ but the reliability of a model as a forecast of the real world in extrapolation cannot be established. There is no statistical fix here; one should not confuse the range of diverse outcomes across an ensemble of model simulations (projections), such as used by the IPCC, with a statistical measure of uncertainty in the behaviour of the Earth. This does not remotely suggest that there is no information in the ensemble or that the model(s) is worthless, but it does imply that each dimension of ‘reliability’ needs to be assessed.
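A toy sketch of this caution (invented numbers, not from the paper): when every model shares a structural error that the ensemble does not sample, the ensemble range can be a perfectly good statistical summary of the models while saying little about the real-world outcome.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy multi-model "ensemble of opportunity": 20 models that share a common
# structural error which the ensemble never samples.  All numbers are
# invented purely to illustrate the point.
truth = 3.0                 # the (unknown) real-world outcome
shared_bias = -0.8          # structural error common to every model
model_spread = 0.3          # genuine disagreement between the models

models = truth + shared_bias + rng.normal(0.0, model_spread, size=20)

lo, hi = np.percentile(models, [5, 95])
covered = lo <= truth <= hi
print(f"ensemble 5-95% range: {lo:.2f} to {hi:.2f}")
print(f"real-world value:     {truth:.2f}  (inside the ensemble range: {covered})")

# The ensemble range is a legitimate statistical summary of the models
# (reliability1), but because the shared structural error is never sampled it
# need not behave like a probability distribution for the Earth itself.
```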

If the referent is the real world (and not a universe of mathematical models) and the purpose is to generate findings about properties of the climate system or prediction of particular quantities, then ‘reliability1’ is uninformative: one can have a reproducible, well-conditioned model distribution which is reliable1 without ‘reliable’ being read as informative regarding the real world. A methodological definition of reliability, denoted by reliability2, indicates the extent to which a given output of a simulation is expected to reflect its counterpart (target) in reality.

Public reliability (reliability3). In addition to the qualitative evaluation of the reliability of a model, the reliability of the modellers themselves is increasingly taken into account in the internal and external evaluation of model results in climate science. We therefore introduce the notion of reliability3 of findings based on climate models, which reflects the extent to which scientists in general, and the modellers in particular, are trusted by others.

As we argue below, climate scientists can indicate the shortcomings of the details of their modelling results, while making clear that solid basic science implies that significant risks exist. If climate scientists are seen as ‘hiding’ uncertainties, however, the public reliability (reliability3) of their findings may decrease, and with it the reliability3 of solid basic physical insight.

Information on the relevant dominant uncertainty is more useful when it is identified clearly as the Relevant Dominant Uncertainty; it is less useful when buried amongst information on other uncertainties that are well quantified, have small impacts, or are an inescapable fact of all scientific simulation. Extrapolation problems like climate change can benefit from new insights into how to better apply the current science, how to advertise its weaknesses and how to more clearly establish its limitations, all for the immediate improvement of decision support and the improvement of future studies. This might also aid the challenge of training not only the next generation of expert modellers, but also the next generation of scientists who can look at the physical system as a whole and successfully use the science to identify the likely candidates for future RDUs.

Climate policy on mitigation and decision-making on adaptation provide a rich field of evidence on the use and abuse of science and scientific language. We have a deep ignorance of what detailed weather the future will hold, even as we have a strong scientific basis for the belief that anthropogenic gases will warm the surface of the planet significantly. It seems rational to hold the probability that this is the case far in excess of the ‘1 in 200’ threshold which the financial sector is regulated to consider. Yet there is also an anti-science lobby which uses very scientific-sounding words and graphs to bash well-meaning science and state-of-the-art modelling. If the response to this onslaught is to ‘circle the wagons’ and lower the profile of discussion of scientific error in the current science, one places the very foundation of science-based policy at risk.

Failing to highlight the shortcomings of the current science will not only lead to poor decision-making, but is likely to generate a new generation of insightful academic sceptics, rightly sceptical of oversell, of any over-interpretation of statistical evidence, and of any unjustified faith in the relevance of model-based probabilities. Statisticians and physical scientists outside climate science might become scientifically sceptical, sometimes wrongly, of the basic climate science in the face of unchecked oversell of model simulations. This mistrust will lead to a low assessment by these actors of the reliability3 (public reliability) of findings from such simulations, even where the reliability1 and reliability2 are relatively high (e.g. with respect to the attribution of climate change to significant increases in atmospheric CO2). Learning to deal better with deep uncertainty (ambiguity) and with known model inadequacy can advance decision support significantly and foster the more effective use of model-based probabilities in the real world.

JC comments:  This paper articulates in a different (and probably better) way many of the issues I have been raising about climate model uncertainty.  I particularly like the idea of the Relevant Dominant Uncertainty, although the nature of the uncertainties is often such that it is unclear which among them is dominant.  What I find particularly important is the recognition that ever more precise estimates of statistical uncertainty can be completely meaningless for decision making.

Arthur Petersen is one of a new breed of philosophers of science who are working in the applied philosophy of climate science.  Lenny Smith is unique in the climate field in integrating nonlinear science, statistics, economics and, increasingly, applied philosophy of science to consider broad issues in climate modeling, climate science, and decision making under uncertainty.  Note: Lenny Smith will be giving one of the big invited talks in the Atmospheric Sciences Section at the forthcoming Fall AGU meeting.

