Climate Etc.

Waving the Italian flag. Part I: uncertainty and pedigree

by Judith Curry

The Italian flag (IF) is a representation of three-valued logic in which evidence for a proposition is represented as green, evidence against is represented as red, and residual uncertainty is represented as white.  The white area reflects uncommitted belief, which can be associated with uncertainty in evidence or unknowns.

The IF was introduced and applied previously on the hurricane, doubt, and detection and attribution threads. The IF was used on these threads as a heuristic device to enable understanding of the role of uncertainty in scientific problems where there are “conflicting certainties” and expert assessments of confidence levels.

These applications of the IF have engendered much confusion in the climate blogosphere.  In the interests of further developing ideas for applying the IF to aspects of the climate problem, I am devoting a two-part series to the IF.

If you intend to follow this closely, you need to download:

I refer to numerous figures from these papers (note: if you don’t like homework, just download these to refer to the figures).

The IPCC’s logic for evidential judgments

Before describing the IF, I provide context regarding the IPCC’s framework. The IPCC’s logic for evidential judgments is described by Moss and Schneider.  As summarized by Morgan et al.:

Guidance developed by Moss and Schneider (2000) for the IPCC on dealing with uncertainty describes two key attributes that they argue are important in any judgment about climate change: the amount of evidence available to support the judgment being made and the degree of consensus within the scientific community about that judgment.

In Fig 4. of Moss and Schneider, the judgment has two dimensions:  on the x-axis is the amount of evidence (e.g. model output, observations) and on the y-axis is the level of agreement/ consensus.  On this chart, four boxes are delineated that are referred to as “state of knowledge” descriptors:

• Well-established (high amount of evidence, high consensus): models incorporate known processes; observations largely consistent with models for important variables; or multiple lines of evidence support the finding

• Established but Incomplete (low amount of evidence, high consensus): models incorporate most known processes, although some parameterizations may not be well tested; observations are somewhat consistent with theoretical or model results but incomplete; current empirical estimates are well founded, but the possibility of changes in governing processes over time is considerable; or only one or a few lines of evidence support the finding

• Competing Explanations (high amount of evidence, low consensus): different model representations account for different aspects of observations or evidence, or incorporate different aspects of key processes, leading to competing explanations

• Speculative (low amount of evidence, low consensus): conceptually plausible ideas that haven’t received much attention in the literature or that are laced with difficult-to-reduce uncertainties or have few available observational tests
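The four descriptors amount to a two-by-two lookup on evidence and consensus. A minimal Python sketch of that lookup follows, assuming both dimensions are scored on a hypothetical 0-to-1 scale with an arbitrary 0.5 cut-off. Moss and Schneider's chart is qualitative and gives no numeric thresholds, so the function name, the scale, and the cut-off are illustrative assumptions only.

```python
def state_of_knowledge(evidence: float, consensus: float, cutoff: float = 0.5) -> str:
    """Map (amount of evidence, level of consensus) to a descriptor.

    The 0-1 scale and the cutoff are assumptions made for illustration;
    the source chart is qualitative.
    """
    high_evidence = evidence >= cutoff
    high_consensus = consensus >= cutoff
    if high_evidence and high_consensus:
        return "Well-established"
    if high_consensus:
        return "Established but Incomplete"
    if high_evidence:
        return "Competing Explanations"
    return "Speculative"


# Plenty of evidence, little agreement about what it means.
print(state_of_knowledge(evidence=0.8, consensus=0.3))
```

Note that nothing in this two-dimensional lookup records how the scores were arrived at, which is where the traceability question comes in.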
The Moss and Schneider document also includes this recommendation:

6. Prepare a “traceable account” of how the [uncertainty] estimates were constructed that describes the writing team’s reasons for adopting a particular probability distribution, including important lines of evidence used, standards of evidence applied, approaches to combining/reconciling multiple lines of evidence, explicit explanations of methods for aggregation, and critical uncertainties. In constructing the composite distributions, it is important to include a “traceable account” of how the estimates were constructed.

Good recommendation.  Unfortunately the IPCC hasn’t heeded it.

Three-value logic

The following description is from the Tesla document:

Evidential judgments based on classical probability theory follow two-value logic, whereby evidence must either be in favour of a hypothesis, or against it.  [That is,] evidence for and against are treated as complementary concepts (i.e. p(A) + p(not A) =1, where p(A) is the probability of event A occurring, or in other words the evidence supporting the occurrence of A.)

Three-value logic extends this to allow for a measure of uncertainty as well, recognizing that belief in a proposition may be only partial and that some level of belief concerning the meaning of the evidence may be assigned to an uncommitted state.  Uncertainties are handled as “intervals” that enable the admission of a general level of uncertainty, providing a recognition that information may be incomplete and possibly inconsistent (i.e. evidence for + evidence against + uncertainty = 1).  This is represented visually by the “Italian flag”, in which evidence for a proposition is represented as green, evidence against as red, and residual uncertainty is white. . . As an alternative to the Italian flag representation, the values may be simply represented in the triplet form [evidence for, uncertainty, evidence against].

The overall assessment of degree of belief in the evidence, b(E), needs to take into account the net value of the evidence that exists, n(E), and the estimated uncertainty due to lack of knowledge, k(E).  Thus b(E) = n(E) k(E).   The residual uncertainty [is] given by 1 – (b(E) + b(not E)). A parallel analysis is then conducted for b(not E).  With the three-valued formalism, evidence for and evidence against can be evaluated independently, each ranging from 0 to 1, with uncertainty taking a value from -1 to 1.  An uncertainty of 1 implies that there is no evidence at all on which to base a judgment, whereas a negative value indicates a situation in which the evidence appears to be in conflict.  [W]here evidence is in conflict (say, for example, [0.65, -0.32, 0.67]), [the white portion of the IF] is indicated by a yellow central bar.
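As a quick check on the conflict example in the quoted passage, the middle term of the triplet follows directly from the residual formula, with 0.65 as the evidence for and 0.67 as the evidence against:

```latex
\[
1 - \bigl( b(E) + b(\mathrm{not}\ E) \bigr) = 1 - (0.65 + 0.67) = -0.32
\]
```

The residual goes negative because the two independently assessed beliefs sum to more than 1, which is what "conflict" means in this formalism.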

The uncertainty due to lack of knowledge, k(E), is determined (using expert judgment) as the ratio of the information you actually have to the information you would ideally wish to have in order to be confident in the judgment.

The net value of the evidence, n(E), is determined as a function of the face value of the evidence and the confidence in the evidence (Figure 5 in the Tesla document).  Figure 5 in Tesla is similar to Figure 4 in Moss and Schneider, if we equate consensus/agreement with confidence and face value of evidence with amount of evidence, and hence the IPCC judgment could be interpreted as b(E) = n(E).  Equating the labels in these two diagrams is not entirely appropriate, given the verbal descriptions for the four boxes in Figure 4 of Moss and Schneider. The IF expression for belief in the evidence is

b(E) = n(E) k(E) = 1 – b(not E) – residual uncertainty
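To make the bookkeeping concrete, here is a minimal Python sketch that builds the [evidence for, uncertainty, evidence against] triplet from n and k. The class and function names are hypothetical, the input numbers are invented for illustration, and using a separate n and k for "not E" is my reading of the "parallel analysis" described in the Tesla document rather than something it spells out.

```python
from typing import NamedTuple


class ItalianFlag(NamedTuple):
    green: float  # belief for, b(E)
    white: float  # residual uncertainty; negative means conflicting evidence
    red: float    # belief against, b(not E)


def belief(net_value: float, knowledge: float) -> float:
    """Return b = n * k, requiring both inputs to lie in [0, 1]."""
    for name, x in (("net_value", net_value), ("knowledge", knowledge)):
        if not 0.0 <= x <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {x}")
    return net_value * knowledge


def italian_flag(n_for: float, k_for: float,
                 n_against: float, k_against: float) -> ItalianFlag:
    """Assess E and not-E independently, then take the residual as white."""
    green = belief(n_for, k_for)
    red = belief(n_against, k_against)
    white = 1.0 - (green + red)
    return ItalianFlag(green, white, red)


# Invented inputs: moderate evidence for, weaker evidence against,
# and only half of the information one would ideally want.
flag = italian_flag(n_for=0.5, k_for=0.5, n_against=0.25, k_against=0.5)
print(flag)
print("conflict" if flag.white < 0 else "no conflict")
```

Setting k to 1 for both sides recovers the case b(E) = n(E), in which any remaining white area comes only from the net value of the evidence.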

Relative to the IPCC method that considers the amount of evidence and the consensus regarding this evidence, the IF method explicitly (and traceably) includes the estimated uncertainty due to lack of knowledge, the incompleteness of the information, and belief in competing hypotheses.

Assessing uncertainty

With regard to the “white” portion of the Italian flag, the Tesla document describes the residual uncertainty in the following way:

There are many potential contributions to residual uncertainty in the treatment of evidence; in essence, the assignment of a level of belief to an uncommitted state (i.e. neither for nor against) ought to reflect “anything we are not sure of.”  This incorporates not only the awareness that exists in relation to uncertainties in the system under review and its behaviour, but also a measure of degree of belief in that understanding.

The assessment of uncertainty is not straightforward.  As a reminder, consider the uncertainty lexicon on the previous uncertainty monster thread and also the section on climate model imperfections on the “what can we learn from climate models” thread.   Figure 3 in Refsgaard et al. provides a useful summary of uncertainty taxonomy.

Refsgaard et al. provide a framework and guidance for assessing and characterizing uncertainty in the context of environmental modeling. They review 14 methods commonly used in uncertainty assessment and characterization, 12 of which are relevant to the type of problem considered here (IPCC WG1): data uncertainty engine (DUE), error propagation equations, expert elicitation, inverse modelling (parameter estimation), inverse modelling (predictive uncertainty), Monte Carlo analysis, multiple model simulation, NUSAP, quality assurance, scenario analysis, sensitivity analysis, and uncertainty matrix.

Table 5 of Refsgaard et al. categorizes the different methods according to their utility for the following:

The focus here is on qualitative methods for identification and characterization of uncertainty and assessment of levels of uncertainty.  Propagation of uncertainty is the topic of Part II in this series.

The uncertainty matrix (Table 1 in Refsgaard et al.)  can be used to provide an overview of the various sources of uncertainty.  The vertical axis lists the locations or sources of uncertainty while the horizontal axis covers the level and nature of uncertainty for each uncertainty location.

Funtowicz and Ravetz (1990) introduced the NUSAP system for multidimensional uncertainty analysis.  The NUSAP acronym stands for numeral, unit, spread, assessment, pedigree.  NUSAP combines quantitative analysis (numeral, unit, spread) with expert judgment of reliability (assessment) and the reliability of the knowledge base (pedigree).

As described by Refsgaard et al.,

The strength of NUSAP is its integration of quantitative and qualitative uncertainty. It can be used on different levels of comprehensiveness: from a ‘back of the envelope’ sketch based on self elicitation to a comprehensive and sophisticated procedure involving structured, informed, in-depth group discussions on a parameter by parameter format. The key limitation is that the scoring of pedigree criteria is to a large extent based on subjective judgements. Therefore, outcomes may be sensitive to the selection of experts.

Pedigree

Funtowicz and Ravetz’s concept of pedigree  is described in the Tesla document, whereby pedigree relates to the origin and trustworthiness of the knowledge.   Pedigree is evaluated in a chart (Figure 6 in the Tesla document), whereby the columns are quality indicators that include theoretical basis, scientific method, auditability, calibration, validation, and objectivity.  The rows describe quality scores, ranking from very low to very high.
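A pedigree chart of this kind can be reduced to a single number for comparison between pieces of evidence. The Python sketch below is one hedged way to do that: the six indicator names come from the description above, but the intermediate score labels ("low", "medium", "high"), the 0-to-4 numeric mapping, and the simple averaging are all assumptions for illustration, not something the Tesla document prescribes.

```python
from typing import Mapping

# Column headings of the pedigree chart, as listed in the text.
INDICATORS = (
    "theoretical basis",
    "scientific method",
    "auditability",
    "calibration",
    "validation",
    "objectivity",
)

# Hypothetical numeric scale; the text only gives the end points
# "very low" and "very high".
SCORES = {"very low": 0, "low": 1, "medium": 2, "high": 3, "very high": 4}


def pedigree_strength(ratings: Mapping[str, str]) -> float:
    """Average the indicator scores and normalise the result to [0, 1]."""
    missing = [name for name in INDICATORS if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    total = sum(SCORES[ratings[name]] for name in INDICATORS)
    return total / (len(INDICATORS) * max(SCORES.values()))


# Invented ratings for a single piece of evidence.
ratings = {name: "high" for name in INDICATORS}
ratings["validation"] = "low"
print(pedigree_strength(ratings))
```

Because the ratings are expert judgments, any such number inherits the subjectivity that Refsgaard et al. flag as NUSAP's key limitation; recording the individual ratings alongside the average keeps the assessment traceable.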

Objectivity is described by the Tesla document as:

Whilst the scientific method provides a logical framework for improving understanding it does not guarantee objectivity.  The influence of entrenched values, motivational bias and peer and institutional pressures may obscure true objectivity.  In order to maintain a check on the quality and objectivity of our interpretations we rely on peer review and exposure to critique through peer reviewed publication.  This indicator is used to give a judgment on the extent to which information can be said to be objective and free from bias.

Why use the Italian flag?

As described by the Tesla document:

Moreover, whilst there may be a large volume of information relating to the [hypothesis] at hand, it may on the whole be only of partial relevance, incomplete and/or uncertain, or even conflicting in terms of the level of support it provides for a given interpretation.  The range of available evidence may appear to give an indistinct picture, with no clear indication of how best to target resources in order to improve understanding.  There may be disputed interpretations, perhaps because some practitioners appear to be biased by excessive reliance on a particular source of evidence in the face of contradictory, or seemingly more equivocal, evidence from elsewhere.  Hence, in order to provide a justified interpretation of the available evidence, which can be audit-traced from start to finish, it is necessary to examine and make visible judgments on both the quality of the data and the quality of the interpretation and modelling process.

Returning to the issue of the IPCC’s statement in the AR4 regarding attribution of 20th century warming:

Most of the observed increase in global average temperatures since the mid-20th century is very likely due to the observed increase in anthropogenic greenhouse gas concentrations.

which was discussed previously on the detection and attribution threads.  While much evidence is presented in the AR4, the uncertainty analysis behind the statement is not traceable, which makes the statement ambiguous and hard to defend other than by appeal to a “consensus.”

Reasoning about uncertainty in the presence of substantial amounts of often conflicting information with varying levels of quality and reliability will be the topic of Part II.
