by Judith Curry
From Roger Pielke Jr.: A fundamental problem with climate science in the public realm, as conventionally practiced by the IPCC, is the essential ink blot nature of its presentation. By “ink blot” I mean that there is literally nothing that could occur in the real world that would allow those who are skeptical of scientific claims to revise their views due to unfolding experience. That is to say, anything that occurs with respect to the climate on planet earth is “consistent with” projections made by the climate science community.
Pielke Jr.’s original inkblot post generated substantial discussion with James Annan (here, here, and here), additional posts from Pielke Jr. (here, here, and here), and a synopsis from John Nielsen-Gammon (here).
There are two ways for the climate science community to move beyond an ink blot (if it wishes to do so). One would be to advance predictions that are in fact conventionally falsifiable (or otherwise able to be evaluated) based on experience. This would mean risking being wrong, as economists do all the time. The second would be to openly admit that uncertainties are so large that such predictions are not in the offing. This would diminish neither the case for action on climate change nor the standing of climate science; in fact, it may have just the opposite effect.
This series of posts is quite interesting, especially the comments. This sort of topic is right up my alley, although I am a bit late coming to this particular party. Roger’s second way is a better choice than the first: “uncertainties are so large that such predictions are not in the offing.” This issue is discussed at length in my uncertainty monster paper, section 2.3. However, the concept of prediction/model falsification, as Pielke Jr. seems to use it, is misleading.
Skill scores for probabilistic forecasts
An issue that arose in the Pielke-Annan posts is the validation of probabilistic forecasts. In a proposal that I just submitted, I have a section on skill scores for probabilistic and ensemble forecasts (for weather and seasonal climate forecasts):
In the meteorological literature there are several methods for assessing the value of probabilistic and ensemble forecasts, including the Brier Score, the Ranked Probability Score, Relative Operating Characteristics (ROC), the Bounding Box, and Rank Histograms. For anomalous, relatively rare events in particular, skill scores are needed that address the sensitivities of the target decision makers to event identification versus false alarms. For example, the Ignorance Score penalizes strongly for a missed event, but not so strongly for a false alarm. [Note: missed event versus false alarm relates to Type I and Type II errors]
I spotted this presentation by a scientist at ECMWF that gives some examples. In the weather/climate community, these skill scores are used to evaluate a large population of daily or monthly forecasts.
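To make two of these scores concrete, here is a minimal sketch in Python of how a Brier Score and an Ignorance Score might be computed for a set of binary-event probability forecasts. The forecast probabilities, outcomes, and function names below are made up for illustration; they are not taken from the proposal or from the ECMWF presentation.

```python
import numpy as np

def brier_score(p, o):
    """Mean squared difference between forecast probabilities p and
    binary outcomes o (1 = event occurred, 0 = it did not)."""
    p, o = np.asarray(p, float), np.asarray(o, float)
    return np.mean((p - o) ** 2)

def ignorance_score(p, o, eps=1e-6):
    """Mean negative log2 probability assigned to what actually happened.
    A missed event (low p when o = 1) is penalized very heavily."""
    p = np.clip(np.asarray(p, float), eps, 1 - eps)
    o = np.asarray(o, float)
    prob_of_outcome = np.where(o == 1, p, 1 - p)
    return -np.mean(np.log2(prob_of_outcome))

# Hypothetical example: five probability forecasts of a rare event
p = [0.1, 0.7, 0.2, 0.9, 0.05]   # forecast probabilities
o = [0,   1,   0,   1,   1]      # observed outcomes

print("Brier score:", brier_score(p, o))          # lower is better (0 = perfect)
print("Ignorance score:", ignorance_score(p, o))  # the missed event (p=0.05, o=1) dominates
```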
For climate models simulating climate change, we are most often talking about century or decadal time scales, for which we have only a few realizations (with marginal external forcing for all but the most recent few decades). Evaluating previous ensemble climate projections against subsequent observations is probably most sensibly done using a bounding box approach (which is essentially what Lucia Liljegren has been doing).
Model outcome uncertainty
What exactly does falsification of a prediction mean? For an ensemble prediction, the prediction is said to have no skill if the actual realization falls outside the bounding box of the ensemble (or fails whatever skill score, for whatever variable, was decided on in advance). A prediction with no skill does not imply falsification or rejection of a model; falsification of a climate model is precluded by its complexity. Here is some text from an earlier, extended version of the uncertainty monster paper:
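As a concrete illustration of the bounding-box test (a sketch only, with invented ensemble values and observations), the observed realization is checked against the min-max envelope of the ensemble members at each verification time, and the projection is said to lack skill wherever the observation falls outside that envelope:

```python
import numpy as np

def in_bounding_box(ensemble, observation):
    """True where the observed value lies within the min-max envelope
    of the ensemble members (ensemble shape: members x times)."""
    ensemble = np.asarray(ensemble, float)
    lo, hi = ensemble.min(axis=0), ensemble.max(axis=0)
    return (observation >= lo) & (observation <= hi)

# Hypothetical temperature-anomaly projections (degC) from 4 ensemble members
# at three verification periods
ensemble = np.array([[0.10, 0.25, 0.40],
                     [0.05, 0.20, 0.35],
                     [0.15, 0.30, 0.50],
                     [0.08, 0.22, 0.45]])
observed = np.array([0.12, 0.18, 0.42])   # made-up observations

print(in_bounding_box(ensemble, observed))  # [ True False  True]
```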
Model outcome uncertainty, sometimes referred to as prediction error, arises from all of the aforementioned uncertainties, which are propagated through the model simulations and are evidenced in the model outcomes. Reducing prediction error is a fundamental objective of model calibration. Model prediction error can be evaluated against known analytical solutions, against other simulations, and/or against observations. Assessing prediction error by comparison with observations is not straightforward: simulations are generally used to generate representations of systems for which data are sparse. In addition to accounting for representativity error, it is important to judge the empirical adequacy of the model by accounting for observational noise.
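The last point, that comparison with observations has to account for observational noise and representativity error, can be illustrated with a toy consistency check. This is a sketch under assumed, made-up error magnitudes, not a method taken from the paper:

```python
import numpy as np

def consistent_with_obs(model, obs, sigma_obs, sigma_repr, k=2.0):
    """Crude consistency check: is the model-observation discrepancy within
    k combined standard deviations of observational noise plus
    representativity error? (All inputs here are illustrative.)"""
    combined = np.sqrt(sigma_obs**2 + sigma_repr**2)
    return np.abs(np.asarray(model) - np.asarray(obs)) <= k * combined

# Hypothetical global-mean anomaly with assumed error standard deviations
print(consistent_with_obs(model=0.45, obs=0.30, sigma_obs=0.05, sigma_repr=0.08))
```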
This challenge to model improvement arises not only from the nonlinearity of the model; Winsberg (2008) argues that climate models also suffer from a particularly severe form of confirmation holism that makes them analytically impenetrable. Confirmation holism in the context of a complex model implies that a single element of the model cannot be tested in isolation, since each element depends on the others, and hence it is impossible to determine whether the underlying theories are false by reference to the evidence. Continual ad hoc adjustment of the model (calibration) provides a means for a theory to avoid being falsified. Occam’s razor presupposes that the model least dependent on continual ad hoc modification is to be preferred.
Owing to inadequacies in the observational data and confirmation holism, assessing empirical adequacy should not be the only method for judging a model. Winsberg points out that models should be justified internally, based on their own internal form, and not solely on the basis of what they produce. Each element of the model that is not properly understood and managed represents a potential threat to the simulation results.
Model verification and validation
This leads us back to the issue of model verification and validation. Several previous Climate Etc. threads have been devoted to this topic:
- The culture of building confidence in climate models
- Climate model verification and validation
- Should we assess climate model predictions in light of severe tests?