Climate Etc.

Overconfidence in IPCC’s detection and attribution: Part I

by Judith Curry

Arguably the most important conclusion of IPCC AR4 is the following statement:

“Most of the observed increase in global average temperatures since the mid-20th century is very likely due to the observed increase in anthropogenic greenhouse gas concentrations.”

where “very likely” denotes a confidence level of >90%. The basis for this statement is a comparison of global climate model simulations with 20th century observations, for simulations conducted with natural forcing only (solar and volcanic) and with natural plus anthropogenic forcing (greenhouse gases and anthropogenic aerosols); see Figure SPM.4. The agreement between the 20th century global surface temperature observations and the simulations with natural plus anthropogenic forcing provides the primary evidence to support this conclusion.

I have made several public statements that I think the IPCC’s “very likely” confidence level for attribution is too high, and I have been chastised in the blogosphere for making such statements. Here I lay out my arguments in support of my statement regarding the IPCC’s overconfidence.

After digging deeply into this topic, I have finally diagnosed the cause of my recent head spinning symptoms:  overexposure to circular reasoning by the IPCC.

Basic references used here are the IPCC Third Assessment Report (TAR), the IPCC Fourth Assessment Report (AR4), and the IPCC (2009) expert meeting report on detection and attribution.

Many direct quotes from these three documents are used here, including entire paragraphs.  For ease of reading, I have not blocked or italicized the quotes, but indicate them with quotation marks and a parenthetical citation at the end of the paragraph.  (clarification for the plagiarism police :) )

Overview of IPCC’s detection and attribution

“The response to anthropogenic changes in climate forcing occurs against a backdrop of natural internal and externally forced climate variability that can occur on similar temporal and spatial scales. Internal climate variability, by which we mean climate variability not forced by external agents, occurs on all time-scales from weeks to centuries and millennia. Slow climate components, such as the ocean, have particularly important roles on decadal and century time-scales because they integrate high-frequency weather variability and interact with faster components. Thus the climate is capable of producing long time-scale internal variations of considerable magnitude without any external influences. Externally forced climate variations may be due to changes in natural forcing factors, such as solar radiation or volcanic aerosols, or to changes in anthropogenic forcing factors, such as increasing concentrations of greenhouse gases or sulphate aerosols.” (IPCC TAR)

The presence of this natural climate variability means that the detection and attribution of anthropogenic climate change is a statistical “signal-in-noise” problem. Detection is the process of demonstrating that an observed change is significantly different (in a statistical sense) than can be explained by natural internal variability. (IPCC TAR)

“An identified change is detected in observations if its likelihood of occurrence by chance due to internal variability alone is determined to be small, for example, <10%. Attribution is defined as the process of evaluating the relative contributions of multiple causal factors to a change or event with an assignment of statistical confidence. Attribution seeks to determine whether a specified set of external forcings and/or drivers are the cause of an observed change in a specific system.” (IPCC 2009)
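To make the detection criterion concrete, here is a toy sketch of the logic (my own illustration with synthetic red noise standing in for internal variability, not an IPCC calculation): an observed trend is compared against a null distribution of trends that unforced variability alone can produce.

```python
# Hypothetical illustration of the "detection" criterion: test whether an
# observed trend could plausibly arise from internal variability alone.
# All series and parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

def ar1_series(n, phi=0.6, sigma=0.1):
    """Synthetic red-noise series standing in for unforced internal variability."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0.0, sigma)
    return x

def linear_trend(y):
    """Least-squares trend (units per time step)."""
    t = np.arange(len(y))
    return np.polyfit(t, y, 1)[0]

n_years = 50
observed = 0.015 * np.arange(n_years) + ar1_series(n_years)   # toy "observations"
obs_trend = linear_trend(observed)

# Null distribution of trends from many segments of unforced variability
null_trends = np.array([linear_trend(ar1_series(n_years)) for _ in range(5000)])
p_value = np.mean(null_trends >= obs_trend)

# "Detected" in the IPCC sense if the chance of the trend arising from
# internal variability alone is small (e.g., < 10%)
print(f"observed trend = {obs_trend:.4f}, p = {p_value:.3f}, detected = {p_value < 0.10}")
```

The catch, of course, is that the null distribution must come from somewhere; everything that follows turns on how well that unforced variability is characterized.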

“[U]nequivocal attribution would require controlled experimentation with our climate system. Since that is not possible, in practice attribution is understood to mean demonstration that a detected change is “consistent with the estimated responses to the given combination of anthropogenic and natural forcing” and “not consistent with alternative, physically-plausible explanations of recent climate change that exclude important elements of the given combination of forcings” (IPCC 2009)

Let me clarify the distinction between detection and attribution, as used by the IPCC.  Detection refers to change above and beyond natural internal variability.  Once a change is detected, attribution attempts to identify external drivers of the change.

“Information about the expected responses to external forcing, so-called ‘fingerprints’, is usually derived from simulations by climate models. The consistency between an observed change and the estimated response to a forcing can be determined by estimating the amplitude of a ‘fingerprint’ from observations and then assessing whether this estimate is statistically consistent with the expected amplitude of the pattern from a model.” (IPCC 2009)  Details of the optimal fingerprint method employed by the AR4 are given in Appendix 9A.  This method is a generalized multivariate regression that uses a maximum likelihood method to estimate the amplitude of externally forced signals in observations. Bayesian approaches are increasingly being used in this method.
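For readers who want a feel for what the fingerprint regression does, here is a deliberately simplified sketch: the amplitude of a model-derived fingerprint is estimated from “observations” by generalized least squares, weighted by a covariance of internal variability estimated from control-run samples. This is not the Appendix 9A algorithm (which adds EOF truncation and total least squares); all data and parameter values below are invented placeholders.

```python
# Much-simplified sketch of fingerprint scaling: estimate the amplitude beta of a
# model-derived fingerprint X in observations y by generalized least squares,
# weighting by the covariance C of internal variability taken from control-run
# samples. Names and numbers are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
n_space = 40                                   # number of spatial points / EOF coordinates

fingerprint = np.linspace(0.0, 1.0, n_space)   # toy forced-response pattern from a model
control = rng.normal(0.0, 0.3, (200, n_space)) # toy control-run samples of internal variability
C = np.cov(control, rowvar=False) + 1e-3 * np.eye(n_space)   # regularized noise covariance

true_beta = 0.8
y = true_beta * fingerprint + rng.normal(0.0, 0.3, n_space)  # toy "observations"

Cinv = np.linalg.inv(C)
X = fingerprint[:, None]
beta_hat = np.linalg.solve(X.T @ Cinv @ X, X.T @ Cinv @ y)   # GLS amplitude estimate
var_beta = np.linalg.inv(X.T @ Cinv @ X)                     # its variance under the noise model
ci = 1.96 * np.sqrt(var_beta[0, 0])

# Detection: scaling factor significantly greater than zero;
# consistency with the model: confidence interval includes one.
print(f"beta = {beta_hat[0]:.2f} +/- {ci:.2f}")
```

Note that both the fingerprint and the noise covariance come from models, which is where the questions raised below about internal variability enter.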

I haven’t delved into the statistics of the fingerprinting method, but an “eyeball analysis” of the climate model results for surface temperature (see Figures SPM.4 and 9.5) is sufficient to get the idea. Note that other variables are also examined (e.g., atmospheric temperatures), but for simplicity the discussion here focuses on surface temperature.

“Climate simulations are consistent in showing that the global mean warming observed since 1970 can only be reproduced when models are forced with combinations of external forcings that include anthropogenic forcings (Figure 9.5). This conclusion holds despite a variety of different anthropogenic forcings and processes being included in these models. In all cases, the response to forcing from well-mixed greenhouse gases dominates the anthropogenic warming in the model. No climate model using natural forcings alone has reproduced the observed global warming trend in the second half of the 20th century. Therefore, modelling studies suggest that late 20th-century warming is much more likely to be anthropogenic than natural in origin, a finding which is confirmed by studies relying on formal detection and attribution methods (Section 9.4.1.4).” (IPCC AR4)

“Modelling studies are also in moderately good agreement with observations during the first half of the 20th century when both anthropogenic and natural forcings are considered, although assessments of which forcings are important differ, with some studies finding that solar forcing is more important (Meehl et al., 2004) while other studies find that volcanic forcing (Broccoli et al., 2003) or internal variability (Delworth and Knutson, 2000) could be more important. . .  The mid-century cooling that the model simulates in some regions is also observed, and is caused in the model by regional negative surface forcing from organic and black carbon associated with biomass burning. Variations in the Atlantic Multi-decadal Oscillation (see Section 3.6.6 for a more detailed discussion) could account for some of the evolution of global and hemispheric mean temperatures during the instrumental period; Knight et al. (2005) estimate that variations in the Atlantic Multi-decadal Oscillation could account for up to 0.2°C peak-to-trough variability in NH mean decadal temperatures.”  (IPCC AR4)

In summary, the models all agree on the attribution of warming in the latter half of the 20th century, but do not agree on the causal factors for the early century warming and the mid-century cooling.

Detection and the issue of natural internal variability

My assessment of the IPCC’s argument for detection and attribution starts with the issue of detection, which relates to the background of natural internal variability against which forced variability is evaluated.  Detection (ruling out that observed changes are only an instance of internal variability) is thus the first step in the process of attribution. The issue of detection receives much more attention in the TAR; it appears that the AR4 bases its arguments on the detection analysis done in the TAR.

The modes of natural internal variability of greatest relevance are the Atlantic modes (AMO, NAO) and the Pacific modes (PDO, often referred to as the IPO) of multidecadal climate variability, with nominal time scales of 60-70+ years. A number of studies (journal publications and blogospheric analyses) have attributed 20th century regional and/or global surface temperature variability to the PDO and AMO; no attempt is made here to document these studies (this will be the topic of a future post), but see Roy Spencer and appinsys for the general idea.
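For orientation, the sketch below shows the generic recipe behind a multidecadal index such as the AMO: area-average the North Atlantic SST anomalies, detrend, and low-pass filter. The data, trend, and smoothing window here are synthetic placeholders, not any group’s operational definition.

```python
# Illustrative sketch of how a multidecadal index like the AMO is commonly
# constructed: area-average North Atlantic SST, remove a linear trend, and
# low-pass filter. Data are synthetic; numbers are placeholders.
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(1856, 2011)
n = len(years)

# Toy North Atlantic mean SST anomaly: warming trend + ~65-yr oscillation + noise
na_sst = (0.005 * (years - years[0])
          + 0.15 * np.sin(2 * np.pi * (years - years[0]) / 65.0)
          + rng.normal(0.0, 0.1, n))

# Remove the linear trend (a crude stand-in for removing the forced signal)
detrended = na_sst - np.polyval(np.polyfit(years, na_sst, 1), years)

def running_mean(x, window=11):
    """Simple centered running mean as a stand-in for a formal low-pass filter."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="same")

amo_index = running_mean(detrended)
print(f"AMO-like index range: {amo_index.min():.2f} to {amo_index.max():.2f} deg C")
```

Even in this toy recipe, the detrending step already embeds an assumption about what counts as forced versus internal, a point that recurs below.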

There are three possible methods for assessing the background of natural internal variability:  examination of the historical data record, examination of the paleoclimatic proxy data record, and long-term climate model simulations.

There are several problems with using the historical surface temperature observations. The time period (~150 years) is short relative to the ~70 year oscillations of interest. Further, the method of constructing the global sea surface temperature data sets uses EOFs to infer missing data and smooth the available observations, making assumptions about the statistical properties of the observations based on the relatively data-rich period 1960-1990. This assumption almost certainly damps the longer internal multidecadal oscillations, particularly in the data-sparse Pacific Ocean (this will be discussed in detail in a future post). Another problem is that in order to infer natural internal variability from the historical data set, the forced variability must be removed. This is accomplished using a climate model; however, the accuracy of this method is limited by incomplete knowledge of the forcings and by the accuracy of the climate model used to estimate the response.
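The damping mechanism can be illustrated with a toy calculation (emphatically not the actual HadSST or ERSST reconstruction algorithms): when a field is reconstructed from a truncated set of EOFs fitted to a short, data-rich calibration window, variability whose spatial pattern is weakly expressed in that window is filtered out of the rest of the record.

```python
# Toy illustration of how reconstructing a field from a truncated set of EOFs
# fitted to a short calibration window can damp variability whose pattern is
# weakly expressed in that window. Everything here is synthetic.
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(1900, 2011)
n_t, n_x = len(years), 30

# Orthogonal spatial patterns: one "multidecadal" mode and two "interannual" modes
patterns = np.linalg.qr(rng.normal(size=(n_x, 3)))[0]
p_multi, p_int1, p_int2 = patterns.T

multi_amp = 0.3 * np.sin(2 * np.pi * (years - 1900) / 70.0)   # ~70-yr oscillation
field = (np.outer(multi_amp, p_multi)
         + np.outer(rng.normal(0, 0.5, n_t), p_int1)
         + np.outer(rng.normal(0, 0.5, n_t), p_int2)
         + rng.normal(0, 0.05, (n_t, n_x)))

# EOFs computed from the data-rich 1960-1990 window only, keeping the two leading modes
calib = field[(years >= 1960) & (years <= 1990)]
calib = calib - calib.mean(axis=0)
_, _, vt = np.linalg.svd(calib, full_matrices=False)
eofs = vt[:2]                                                  # retained EOF basis

# Reconstruct the full record by projecting every year onto the retained EOFs
recon = field @ eofs.T @ eofs

true_multi = field @ p_multi   # multidecadal component of the true field
recon_multi = recon @ p_multi  # same component after the EOF-truncated reconstruction
print(f"std of multidecadal component: truth {np.std(true_multi):.3f}, "
      f"reconstruction {np.std(recon_multi):.3f}")
```

In this construction the retained EOFs are dominated by the interannual patterns, so the reconstruction substantially damps the multidecadal component; that is the qualitative behavior at issue for the real SST products.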

The problems with the paleoclimate data are well known and will not be summarized here; however, the issue of interest in this context is not the “blade” of the hockey stick, but rather the modes of variability and their magnitude seen in the stick handle.  Getting these multidecadal variations correct in the reconstructions would be very valuable in understanding the modes of natural internal climate variability, and to what extent such variations might explain 20th century climate variability.  Interpretation of the natural modes of variability from the paleoclimate record suffers from the same challenge as for the historical data set; the forced variability (e.g. solar, volcanic) must be removed.

Owing to the problems in using both historical and paleo data to document the magnitude of natural internal climate variability, climate models seem to be the best option. Several modeling groups have conducted 1000-year unforced control simulations, which are described in the IPCC TAR. Figure 12.2 shows the power spectra of global mean temperature as a function of period. For periods exceeding 60 years (the range of relevance for the PDO and AMO), the models all have less power than the spectrum of the historical observations. And recall that the historical spectrum at these periods is likely to be damped for two reasons: the EOF infilling damps variability at longer time scales, and assumptions about forced variability are made in the model calculations used to separate forced from internal variability.
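The flavor of the Figure 12.2 comparison can be sketched as follows (all series synthetic, with assumed red-noise parameters): estimate the power spectrum of a long unforced “control run” and of a shorter “observed” record, and compare the power at periods longer than 60 years.

```python
# Sketch of the kind of comparison behind TAR Figure 12.2: power spectra of
# global-mean temperature from a long unforced control run versus observations,
# examined at multidecadal periods (> 60 yr). All series are synthetic.
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(4)

def ar1(n, phi, sigma):
    """Red-noise stand-in for unforced global-mean temperature variability."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0.0, sigma)
    return x

control_run = ar1(1000, phi=0.7, sigma=0.08)                  # 1000-yr "control simulation"
observed = (ar1(150, phi=0.7, sigma=0.08)
            + 0.1 * np.sin(2 * np.pi * np.arange(150) / 65.0))  # 150-yr "observations" with a ~65-yr mode

f_ctl, p_ctl = welch(control_run, fs=1.0, nperseg=256)        # frequency in cycles per year
f_obs, p_obs = welch(observed, fs=1.0, nperseg=128)

def power_beyond(f, p, period=60.0):
    """Summed spectral power at periods longer than `period` years."""
    mask = (f > 0) & (f < 1.0 / period)
    return p[mask].sum()

print(f"power at periods > 60 yr: control {power_beyond(f_ctl, p_ctl):.4f}, "
      f"observations {power_beyond(f_obs, p_obs):.4f}")
```

A 150-year record contains only about two realizations of a 70-year cycle, so the observational estimate at these periods is itself highly uncertain, quite apart from the damping issues noted above.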

The summary from the TAR is: “These findings emphasise that there is still considerable uncertainty in the magnitude of internal climate variability.” The AR4 did little to build upon the TAR analysis of natural internal variability. Relevant text from AR4 Chapter 8: “Atmosphere-Ocean General Circulation Models do not seem to have difficulty in simulating IPO-like variability . . .  [T]here has been little work evaluating the amplitude of Pacific decadal variability in AOGCMs. Manabe and Stouffer (1996) showed that the variability has roughly the right magnitude in their AOGCM, but a more detailed investigation using recent AOGCMs with a specific focus on IPO-like variability would be useful.”  “Atmosphere-Ocean General Circulation Models simulate Atlantic multi-decadal variability, and the simulated space-time structure is consistent with that observed (Delworth and Mann, 2000).” Note that these models capture the general mode of variability; they do not simulate the timing of the observed 20th century oscillations, which reflects ontic uncertainty.

In spite of the uncertainties associated with documenting natural internal variability on time scales of 60-70+ years, natural internal variability plays virtually no role in the IPCC’s explanation of 20th century climate variability (which depends solely on natural and anthropogenic forcing).

Increasing attention is being paid to IPCC misrepresentations of natural oceanic variability on decadal scales (Compo and Sardeshmukh 2009):  “Several recent studies suggest that the observed SST variability may be misrepresented in the coupled models used in preparing the IPCC’s Fourth Assessment Report, with substantial errors on interannual and decadal scales (e.g., Shukla et al. 2006, DelSole, 2006; Newman 2007; Newman et al. 2008). There is a hint of an underestimation of simulated decadal SST variability even in the published IPCC Report (Hegerl et al. 2007, FAQ9.2 Figure 1). Given these and other misrepresentations of natural oceanic variability on decadal scales (e.g., Zhang and McPhaden 2006), a role for natural causes of at least some of the recent oceanic warming should not be ruled out.”

Rethinking detection

The relative lack of attention to natural internal variability in the AR4 leads to the inference that the IPCC regards natural internal variability as noise that averages out in an ensemble of simulations and over the timescales of interest. However, the primary modes of interest have timescales of 60-70+ years, comparable to the time scale of the main features of the 20th century global temperature time series.
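A toy calculation shows why this treatment buries exactly the variability at issue: if each ensemble member carries a ~70-year oscillation with an independent phase, the oscillations largely cancel in the ensemble mean, while the single observed realization retains its full amplitude. The numbers below are invented for illustration.

```python
# Toy illustration of why ensemble averaging suppresses internal variability:
# members whose ~70-yr oscillations have independent phases largely cancel in
# the ensemble mean, while any single realization keeps the full amplitude.
import numpy as np

rng = np.random.default_rng(5)
years = np.arange(1900, 2001)
n_members = 10

members = np.array([
    0.2 * np.sin(2 * np.pi * years / 70.0 + rng.uniform(0, 2 * np.pi))  # random phase per member
    + rng.normal(0, 0.1, len(years))                                    # weather noise
    for _ in range(n_members)
])

ensemble_mean = members.mean(axis=0)
print(f"std of one realization: {members[0].std():.3f}")
print(f"std of {n_members}-member ensemble mean: {ensemble_mean.std():.3f}")
```

The observations are, of course, a single realization, not an ensemble mean.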

The temperature “bump” in the 1930s and 1940s, which has its greatest expression in the Arctic (Polyakov et al. 2003 Fig 2), is ambiguously explained by the IPCC as a combination of solar forcing, volcanic forcing, and anthropogenic aerosols. An explanation in terms of the AMO and PDO is at least as plausible as the IPCC explanation for this feature.

The climate community and the IPCC need to work much harder at characterizing natural internal variability on timescales of 60-100 years. The historical sea surface temperature data need to be expanded and cleaned up. A focus of paleo reconstructions for the past 2000 years should be detecting multidecadal variability, rather than trying to demonstrate that the most recent decade is the warmest.

The experimental design for elucidating internal variability from climate models needs rethinking. A single 1000-year simulation is inadequate for ~70-year oscillations. An ensemble of simulations of at least 2000 years is needed. Owing to computational resource limitations, it seems that relatively low-resolution models could be used for this (without flux corrections).
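A rough sketch of the sampling problem (purely illustrative, with made-up red-noise parameters): the fraction of variance falling in the 60-100 year band fluctuates considerably from one 1000-year realization to the next, and pooling an ensemble of runs narrows that spread.

```python
# Rough sketch of the sampling issue: how much does the estimated amount of
# 60-100 yr band variability vary across unforced realizations, for a single
# 1000-yr run versus a pooled multi-member ensemble? Purely illustrative.
import numpy as np

rng = np.random.default_rng(6)

def red_noise(n, phi=0.9, sigma=0.05):
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0.0, sigma)
    return x

def multidecadal_fraction(x, lo=60.0, hi=100.0):
    """Fraction of variance at periods between `lo` and `hi` years (periodogram)."""
    freqs = np.fft.rfftfreq(len(x), d=1.0)
    power = np.abs(np.fft.rfft(x - x.mean()))**2
    band = (freqs > 1.0 / hi) & (freqs < 1.0 / lo)
    return power[band].sum() / power[1:].sum()

single = [multidecadal_fraction(red_noise(1000)) for _ in range(100)]
pooled = [np.mean([multidecadal_fraction(red_noise(1000)) for _ in range(4)])
          for _ in range(100)]

print(f"single 1000-yr run:      spread (std) of band-fraction estimate = {np.std(single):.3f}")
print(f"4-member, 4000-yr total: spread (std) of band-fraction estimate = {np.std(pooled):.3f}")
```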

Spectral studies such as Figure 12.2 in the IPCC TAR need to be expanded, but using climate models to eliminate the forced behavior in the observed time series introduces circular reasoning into the detection/attribution argument (more on this in Part II).

And finally, attribution studies can’t simply rely on model simulations, since model simulations (even if they capture the correct spectrum of variability) won’t match the observed realization of the multidecadal modes in terms of timing.

Part II: forthcoming

The uncertainty monster associated with the IPCC’s detection and attribution argument is of the hydra variety: the more I dig, the more heads the monster develops. In Part II, we will examine issues surrounding the forcing data and model inadequacy as they relate specifically to the attribution problem (and how this then feeds back onto the detection problem).
