by Judith Curry

The new International Journal for Uncertainty Quantification has some very interesting papers.  Let’s take a look at a paper entitled ‘Error and Uncertainty Quantification and Sensitivity Analysis in Mechanics Computational Models.’

Error and Uncertainty Quantification and Sensitivity Analysis in Mechanics Computational Models.

Bin Liang and Sankaran Mahadevan

Abstract.  Multiple sources of errors and uncertainty arise in mechanics computational models and contribute to the uncertainty in the final model prediction. This paper develops a systematic error quantification methodology for computational models. Some types of errors are deterministic, and some are stochastic. Appropriate procedures are developed to either correct the model prediction for deterministic errors or to account for the stochastic errors through sampling. First, input error, discretization error in finite element analysis (FEA), surrogate model error, and output measurement error are considered. Next, uncertainty quantification error, which arises due to the use of sampling-based methods, is also investigated. Model form error is estimated based on the comparison of corrected model prediction against physical observations and after accounting for solution approximation errors, uncertainty quantification errors, and experimental errors (input and output). Both local and global sensitivity measures are investigated to estimate and rank the contribution of each source of error to the uncertainty in the final result. Two numerical examples are used to demonstrate the proposed methodology by considering mechanical stress analysis and fatigue crack growth analysis.

Citation:  International Journal for Uncertainty Quantification, 1 (2): 147–161 (2011).  [Link] to complete paper online.

The paper is quite readable, and well worth reading. From the Introduction:

The motivation of this paper is to develop a methodology that provides quantitative information regarding the relative contribution of various sources of error and uncertainty to the overall model prediction uncertainty. Such information can guide decisions regarding model improvement (e.g., model refinement, additional data collection) so as to enhance both accuracy and confidence in the prediction. The information sought is in the form of rankings of the various errors and uncertainty sources that contribute to the model prediction uncertainty. It is more advantageous to spend resources toward reducing an error with a higher ranking than one with a lower ranking. The rankings are based on systematic sensitivity analysis, which is possible only after quantifying the effect of each error source on the model prediction uncertainty.

The error in a computational model prediction consists of two parts: model form error (ε_model) and solution approximation error or numerical error (ε_num). The model form error depends on whether the selected model correctly represents the real phenomenon (e.g., small deformation versus large deformation model, linear elastic versus elasto-plastic model, or Euler versus Navier-Stokes equation). The solution approximation error arises when numerically solving the model equations. In other words, the model form error is related to the question “Did I choose the correct equation?”, which is answered using validation experiments, while the solution approximation error is related to the question “Did I solve the equation correctly?”, which is answered through verification studies.

Note that the numerical error depends on the choice of the model form; thus, the two errors are not independent. The numerical error ε_num is a nonlinear combination of various components. This paper first considers three typical numerical error components and their quantification and combination, including input error, discretization error in FEA, and surrogate model error.

One concern in this paper is how to obtain a model prediction y_c corrected for numerical error sources. Among all errors, some errors are stochastic, such as input error and surrogate model error, and some errors are deterministic, such as discretization error in FEA. In this paper, a simple but efficient approach is developed to obtain y_c. The basic idea is to quantify and correct for each error where it arises. Stochastic error is corrected for by adding its randomly sampled values to the original result. Deterministic error is corrected for by directly adding it to the corresponding result. For example, to correct for the discretization error, every time a particular FEA result is obtained, the corresponding discretization error is calculated, added to the original result, and the corrected FEA result is used in further computation to obtain y_c.
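
To make the correction idea in the paragraph above concrete, here is a minimal Python sketch. It is not the authors' implementation: the functions `fea_result` and `discretization_error`, the toy response `2.0 * load`, and the Gaussian input error are all assumptions made up for illustration. The only point is the bookkeeping: a stochastic error enters as a randomly sampled value, while a deterministic error is added directly to each result.

```python
import random
import statistics

def fea_result(load, mesh_size):
    """Hypothetical stand-in for a finite element run (not from the paper)."""
    # Toy assumption: a coarse mesh underestimates the true response 2.0 * load.
    return 2.0 * load * (1.0 - 0.1 * mesh_size)

def discretization_error(load, mesh_size):
    """Deterministic error estimate for this mesh (assumed known in this toy case)."""
    return 2.0 * load * 0.1 * mesh_size

def corrected_prediction(load_nominal, mesh_size, n_samples=10_000, seed=0):
    """Return samples of the model prediction corrected for both error types."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Stochastic input error: add a randomly sampled value to the nominal input.
        load = load_nominal + rng.gauss(0.0, 0.05)
        # Deterministic discretization error: add it directly to each FEA result.
        y = fea_result(load, mesh_size) + discretization_error(load, mesh_size)
        samples.append(y)
    return samples

samples = corrected_prediction(load_nominal=1.0, mesh_size=0.5)
print(statistics.mean(samples), statistics.stdev(samples))
```

The output is a distribution rather than a single number, which is what allows the later sensitivity analysis to rank error sources.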

In addition to the model form and solution approximation errors mentioned above, another error arises due to Monte Carlo sampling used in the error quantification procedure itself. This error is referred to here as uncertainty quantification (UQ) error. For example, when estimating the cumulative distribution function (CDF) of a random variable from sparse data, there is error in the CDF value, and methods to quantify this UQ error are available.

Then, if more samples are generated by the inverse CDF method using the CDF estimated from sparse data, then the UQ error is propagated as sampling error to the newly generated samples. An approach is developed in Section 3 to quantify this sampling error. This method is particularly useful in quantifying model form error (Section 4).
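
The propagation of CDF error into newly generated samples can be illustrated with a small sketch. This is a swapped-in technique, a plain bootstrap, and not the approach of Section 3 of the paper; the data sets, the piecewise-linear empirical CDF and the function names are all hypothetical. It only shows that samples drawn by the inverse CDF method inherit the uncertainty of a CDF estimated from sparse data.

```python
import random
import statistics

def inverse_ecdf_sample(sorted_data, rng):
    """Draw one value by inverting a piecewise-linear empirical CDF."""
    n = len(sorted_data)
    pos = rng.random() * (n - 1)   # uniform CDF level mapped to a fractional index
    i = int(pos)
    frac = pos - i
    upper = sorted_data[min(i + 1, n - 1)]
    return sorted_data[i] + frac * (upper - sorted_data[i])

def bootstrap_spread_of_mean(data, n_boot=200, n_new=500, seed=0):
    """Spread of the mean of new samples across bootstrap re-estimates of the CDF."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        # One plausible alternative data set, giving one alternative CDF estimate.
        resampled = sorted(rng.choice(data) for _ in data)
        new = [inverse_ecdf_sample(resampled, rng) for _ in range(n_new)]
        means.append(statistics.mean(new))
    return statistics.stdev(means)

rng = random.Random(1)
sparse = [rng.gauss(10.0, 1.0) for _ in range(8)]     # hypothetical sparse data
richer = [rng.gauss(10.0, 1.0) for _ in range(200)]   # hypothetical larger data set
print(bootstrap_spread_of_mean(sparse), bootstrap_spread_of_mean(richer))
```

Under these assumptions the spread should be noticeably larger for the sparse data set, however many new samples are generated from it.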

After a probabilistic framework to manage all sources of uncertainty and error is established, sensitivity analyses are performed in Section 5 to assess the contribution of each source of uncertainty and error to the overall uncertainty in the corrected model prediction. The sensitivity analysis result can be used to guide resource allocation for different activities, such as model fidelity improvement, data collection, etc., according to the importance ranking of errors in order to trade off between accuracy and computational/experimental effort.

The contributions of this paper can be summarized as follows:

1. A systematic methodology for error and uncertainty quantification and propagation in computational mechanics models is developed. Previous literature has developed methods to quantify the discretization error and to propagate input randomness through computational models. However, the combination of various error and uncertainty sources is not straightforward: some are additive, some multiplicative, some nonlinear, and some even nested. Also, some errors are deterministic and some are stochastic. The methodology in this paper provides a template to track the propagation of various error and uncertainty sources through the computational model.

2. The error and uncertainty quantification methodology itself has an error due to the use of limited number of samples when large finite element models are used; therefore, this UQ error is quantified. A methodology is proposed in this paper to also quantify the propagation of this UQ error through further calculations in model prediction and assessment, and is used in the quantification of model form error.

3. Sensitivity analysis methods are developed to identify the contribution of each error and uncertainty source to the overall uncertainty in the model prediction. Previous literature in global sensitivity analysis has only considered the effect of input random variables, and this paper extends the methodology to include data uncertainty and model errors. The sensitivity information is helpful in identifying the dominant contributors to model prediction uncertainty and in guiding resource allocation for model improvement. The sensitivity analysis method is further enhanced to compare the contributions of both deterministic and stochastic errors on the same level, in order to facilitate model improvement decision making.
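
As a rough illustration of ranking error sources by a global sensitivity measure, the sketch below estimates a first-order variance-based index, Var(E[Y | X]) / Var(Y), for each source by simple binning. It is a generic textbook-style estimator and not necessarily the one used in the paper; the linear toy model, its coefficients and the name `error_term` (a stochastic error treated like any other random input) are assumptions for illustration only.

```python
import random
import statistics

def first_order_index(xs, ys, n_bins=20):
    """Estimate Var(E[Y | X]) / Var(Y) by sorting on X and averaging Y in equal-count bins."""
    pairs = sorted(zip(xs, ys))
    size = len(pairs) // n_bins
    bin_means = [
        statistics.fmean([y for _, y in pairs[k * size:(k + 1) * size]])
        for k in range(n_bins)
    ]
    return statistics.pvariance(bin_means) / statistics.pvariance(ys)

rng = random.Random(0)
n = 20_000
x1 = [rng.gauss(0, 1) for _ in range(n)]           # hypothetical input 1
x2 = [rng.gauss(0, 1) for _ in range(n)]           # hypothetical input 2
error_term = [rng.gauss(0, 1) for _ in range(n)]   # hypothetical stochastic error source

# Toy model: the prediction is a weighted sum of the three sources.
y = [3.0 * a + 1.0 * b + 0.5 * e for a, b, e in zip(x1, x2, error_term)]

indices = {
    "x1": first_order_index(x1, y),
    "x2": first_order_index(x2, y),
    "error_term": first_order_index(error_term, y),
}
ranking = sorted(indices, key=indices.get, reverse=True)
print(indices, ranking)
```

For this toy model the exact first-order indices are 9/10.25, 1/10.25 and 0.25/10.25, so the ranking tells you that effort spent reducing the uncertainty in `x1` pays off most.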

JC comments:  This paper tackles the whole problem of uncertainty quantification in dynamical models in an integrated way.  It is rare to see someone trying to quantify model functional form error (although I am not sure how this would work for climate models).  I also like the global sensitivity analysis approach.  I am not at all sure about the proposed method for actually using the error analysis to correct the model prediction.  I look forward to discussion on this.

UQ in climate models

I am starting to hear the term ‘UQ’ in the context of climate models, which is a really good development IMO:

This Fact Sheet from ORNL lays it all out:

Complex simulation systems, ranging from the suite of global climate models used by the Intergovernmental Panel on Climate Change (IPCC) for informing policy-makers to the agent-based models of social processes which may be used for informing military commanders and strategic planners, typically cannot be evaluated using standard techniques in mathematics and computer science. Uncertainties in these systems arise from intrinsic characteristics which include complex interactions, feedback, nonlinearity, thresholds, long-range dependence and space-time variability, as well as our lack of understanding of the underlying processes compounded by the complicated noise and dependence structures in observed or simulated data. The complicated dependence structures, interactions and feedback rule out the use of mathematical or statistical methods on individual differential equations or state transition rules. The direct application of computational data science approaches like spatial or spatio-temporal data mining is limited by the nonlinear interactions and complex dependence structures. The complexity of the systems precludes multiple model runs to comprehensively explore the space of input data, random model parameters, or key component processes. The importance of extreme values and space-time variability, as well as the ability to produce surprising or emergent behavior and retain predictive ability, further complicates the systematic evaluation and uncertainty quantification. The area of verification and validation within modeling and simulation is not yet equipped to handle these challenges. We have developed a set of tools, which range from pragmatic applications of the state-of-the-art to new adaptations of recent methodologies and all the way to novel approaches, for a wide variety of complex systems.
These include systematic evaluation of social processes and agent-based simulations for an activity funded by DARPA/ DOD, multi-scale uncertainty characterization and reduction for simulations from the IPCC suite of global climate models towards research funded by ORNL / DOE and DOD, as well as high-resolution population modeling funded by multiple agencies. The set of tools and methodologies, which represents an interdisciplinary blend of mathematics, statistics, signal processing, nonlinear dynamics, computer science, and operations research, has demonstrated wide applicability over multidisciplinary simulation systems.

Sounds like this is exactly what is needed for climate models.  I hope this project is well supported and is successful.

Moderation note:  this is a technical thread and comments will be moderated for relevance.

159 responses to “UQ

  1. The ORNL statement seems to imply that the Liang et al approach will not work for climate type models, to which I agree. Liang et al are looking at relatively simple mechanics models that are applied many, many times to actual situations, so they are extensively tested. This means, for example, that one can do Monte Carlo analysis. Climate models cannot be tested once, much less many times, for the reasons given by ORNL.

    Also, most of the Liang analysis and UQ method seems to be focused on what they call numerical error, as opposed to model form error. But in the climate case it is arguably model form error that is most significant, by far, especially the twin terrors of feedback and natural variability.

    In short, UQ is unlikely to be possible in cases where one does not understand the underlying physical processes, especially if they are probably nonlinear. Premature UQ of climate models may well cause more problems than it solves, just as we see with quantitative CO2 sensitivity estimates. If you don’t know what you are doing it is probably not possible to quantify by how much.

    • I don’t think there’s anything in David’s post to improve on. That is exactly the problem I have with the Climate Science community’s approach to modeling. They seem generally to give it weight that is disproportional to what it can reasonably be expected to provide.

    • I agree. David explains well why UQ (uncertainty quantification) is not solving the problem:

      “If you don’t know what you are doing it is probably not possible to quantify by how much.”

      1. Inability to grasp reality
      2. Inability to communicate reality
      3. Ignorance of observational reality
      4. Trying to force reality to fit simplified models of reality
      5. Using sophisticated calculations to generate gobbledygook outcomes

      1. Matter is stored energy (E = mc^2)
      2. The universe moves in only one direction
      3. Matter is being converted into energy (m is becoming E)
      4. That is why acorns become oak trees, water runs downhill, the wind blows, a few large quanta of nuclear energy become many small quanta of heat energy, and we all march unidirectionally from birth to death as the universe expands!

      Using Reality
      1. Wind mills
      2. Nuclear reactors
      3. Hydroelectric generators
      4. Steam and combustion engines

      Not Models of Reality
      e.g., Fusion reactors that operate like the Bilderberg solar model [Bilderberg model, Solar Physics 3, 5-25 (1968)]

      For more details, see: http://joannenova.com.au/2012/04/when-is-a-free-market-solution-not-the-answer-when-it-isnt-free/#comment-1046350

    • UQ is not the answer to unreliable climate models, solar models or fusion reactors, . . . designed to meet society’s future energy needs by operating like the supposedly reliable Bilderberg solar model ["Bilderberg model. . . ", Solar Physics 3, 5-25 (1968)]

    • Gareth Williams

      Yes, this relates to the point I was making here:


      There are two very different ways of using computer models in science. When you have a thoroughly understood and validated physical model you can think about using it to make real-world predictions with a useful degree of accuracy. It is at that point you need to calculate your error bars. These error bars (deterministic and stochastic) are *based on the assumption that the underlying model is correct*.

      Model form error is a different matter. As the authors of this paper say (at the start of section 4) model form error can only be assessed by comparing the results of the model with experiment (hardly any surprise here – this is the basis of the scientific method).

      When you are learning how to model a phenomenon you run computer simulations to see how your model compares with the real world. This kind of model makes “predictions”, but they are predictions to compare with experiment, not predictions to go to the bank with, or base policy on. We predict such-and-such *if our theory is correct*. And it may take many such verified predictions before we have confidence to use the model to make real-world predictions.

      The problem with GCMs is that they are the latter kind of model. They have not been verified by experiment. None of them predicted a decade long hiatus in warming. So we just don’t know if they are modeling the right processes, the right feedbacks, but the evidence is there is something missing. (The ability to “retrodict” climate is not convincing in models with so many variable parameters). Now that would not matter if these “predictions” were understood for what they are. But they are widely understood by the public and politicians, and presented by the IPCC, as real-world predictions of what the climate will actually be like in future decades.

      GCMs in their current state are not going to be helped at all by the techniques in Liang and Mahadevan’s paper. It is simply premature to try to quantify the uncertainty. You first need to develop a demonstrably working model of climate, and that has not been done.

      Actually, the long-term effect of adding CO2 to the atmosphere is an important question that needs to be modelled. But I doubt that attempting to “wind forward” today’s climate will give us a useful answer. We are unlikely to be able to tell if the answers we get represent the real climate, or are artefacts of the model. If there is something missing, the prediction will only get worse the further forward you wind it.

      And we don’t actually need to know what the climate will be like in 2060. We just need to know the worst case. I would guess it would be more fruitful to approach this problem based on general physical principles such as energy balance. What is the highest plausible temperature from doubling CO2? What has to happen for that to occur? Maybe that analysis has been done, but I don’t remember reading about it.

    • David Wojick

      I do not understand your “plausible worst case” argument. The models are plagued by runaway positive feedbacks, which give catastrophic worst case predictions. That is what the debate is all about. There is no plausible worst case.

    • Most plausible ‘worst case’ is a warmer world sustaining more life and more diversity of life. But it’s cooling, folks; for how long even kim doesn’t know.

    • Steven Mosher

      there are no runaway feedbacks.

    • So the 6 to 10 degree warmings are realistic in your mind? There is a literature on this problem, as you should know.

    • Moshe means they’re unrealistic.

    • Steven Mosher

      David, please dont misuse the term “runaway”.
      Second, it doesnt matter if there is “literature” on this problem.
      “literature” doesnt determine what is “realistic” and what is not.

    • David Springer

      Mosher, I’m glad to hear there are no runaway feedbacks. Tell me then what limits the positive feedbacks in climate models. I asked here for what it might be and physicistdave and Fred Moolten responded that it is 5500C (the temperature of the sun’s photosphere).

    • Gareth Williams

      The IPCC publishes a range of model results. Eg in the “high emissions” scenario temperature rises by 2100 range from 2C to 6C. The IPCC regards these as a range of possible outcomes. But they are all based on different assumptions, some of which are surely wrong.

      If I had the time I would want to look very carefully at the 6C scenario. Clearly under the set of assumptions that go into that model, this is the result that pops out. But what are those assumptions? Which ones make the difference? What does that 2100 world look like in detail? It may be possible to compare that scenario with paleoclimate data. We may be able to rule out the catastrophic predictions for reasons that were not included in the model.

      This is not a detailed proposal, just based on my experience of numerically modelling various systems. The model is just telling you the consequences of your own assumptions. But it can be hard to understand what the results are really telling you. Calculation is not a substitute for physical insight. For example there are better ways to approach the problem of whether an orbit is stable than naively integrating the equations over a billion years.

      My main point is just that it might be possible to rule out the 6C scenario without settling the question of whether the right answer is 2C or 3C. Knowing whether or not that is possible would change the whole climate debate.

    • There’s a bit more detail on the make-up of these projections in Chapter 10

      There weren’t any GCM realisations performed with the A1FI scenario you’re talking about. Lower complexity models were used instead. I can see three main assumptions complicit in producing the largest amount of warming:

      – Fast-feedback climate sensitivity: the likely maximum is 4.5ºC.

      – Carbon-cycle mechanisms produce likely maximum CO2 concentrations. See this figure for details.

      – Scenario (i.e. what humans will do over the coming decades) follows a path of accelerating carbon emissions.

      I don’t think any of these can be ruled out at this point.

    • I agree. “model form error depends on whether the selected model correctly represents the real phenomenon” — and the models are thus invalidated from the get-go. They are incapable of giving useful or meaningful output.

    • Well of course there is runaway global warming. It ran away about 15 years ago.

    • Steven Mosher

      “None of them predicted a decade long hiatus in warming.”

      Wrong. If you look at the individual runs of the models you will find a small percentage that “predicted” a hiatus. The average behavior of all the runs of all the models does not.

    • MattStat/MatthewRMarler

      Steven Mosher: Wrong. If you look at the individual runs of the models you will find a small percentage that “predicted” a hiatus. The average behavior of all the runs of all the models does not.

      The mean of the model runs is presented sometimes as a sort of prediction, with the variation among the model runs as a sort of estimate of the error of the prediction. Would it make sense to compute a weighted mean instead of an unweighted mean, with the weights being chosen to be proportional to the inverse of the squared prediction error? Year after year this could be updated using weights proportional to the cumulative sum of the squared prediction error. If some models were consistently more accurate, they would come to dominate the mean prediction for the future, and guide study into why they are the more accurate models.

      There is a question in there: What do you think?
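
The weighting scheme proposed in this comment can be sketched in a few lines of Python. The model names and all numbers below are invented for illustration and do not come from any real model output; the weights are taken as proportional to the inverse of each model's cumulative squared prediction error, as the comment suggests.

```python
def inverse_sse_weights(past_predictions, observations):
    """Weights proportional to 1 / (cumulative squared prediction error) per model."""
    sse = {
        name: sum((p - o) ** 2 for p, o in zip(preds, observations))
        for name, preds in past_predictions.items()
    }
    # Guard against division by zero for a model that happened to match exactly.
    inverse = {name: 1.0 / max(err, 1e-12) for name, err in sse.items()}
    total = sum(inverse.values())
    return {name: w / total for name, w in inverse.items()}

def weighted_mean(next_predictions, weights):
    """Combine the models' next predictions using the given weights."""
    return sum(weights[name] * next_predictions[name] for name in weights)

# Hypothetical observed anomalies and hypothetical past predictions of three models.
observations = [0.10, 0.12, 0.11, 0.13]
past_predictions = {
    "model_a": [0.11, 0.12, 0.12, 0.13],
    "model_b": [0.15, 0.18, 0.17, 0.20],
    "model_c": [0.10, 0.14, 0.10, 0.15],
}
next_predictions = {"model_a": 0.14, "model_b": 0.22, "model_c": 0.15}

weights = inverse_sse_weights(past_predictions, observations)
unweighted = sum(next_predictions.values()) / len(next_predictions)
print(weights, weighted_mean(next_predictions, weights), unweighted)
```

Re-running `inverse_sse_weights` each year with one more observation gives the cumulative update described above. One caveat: a model that matched the past by luck or by tuning would also be rewarded, so the weights by themselves do not demonstrate skill.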

    • Steven Mosher

      There is a discussion on one method of discounting here


    • Indeed this is true, because the model runs are all over the place. Warming a lot, or a little, or not at all, a run has predicted it. (Although none shows cooling that I know of, their greatest error.) The real question is why do not all these contradictory runs falsify the models? A model that predicts everything predicts nothing. It is untestable under a huge range of possible outcomes. That is not science.

      Even worse of course is the IPCC showing just a select two of these runs to prove AGW. That is true fraud.

    • The various model runs are not contradictory, they show natural variability, which is something they are trying to model.

    • Steven Mosher

      you are wrong again. the models are not “all over the place” to use your precise technical term. how many show cooling?
      Well, first off the earth doesnt “show cooling”. What we have is an uncertain ESTIMATE of the temperature of the earth. You know that record you dont trust. Next a small percentage of the model runs run cooler than the earth has run.

      These models do not predict everything. The only “model” that predicts everything is the “model” that says natural variation explains everything.

      They are not untestable. You are wrong.
      Further the IPCC does not just show two of these runs. You are wrong again.

    • David Springer

      Do any of the models reproduce ice ages?

      If they do then by definition some of them “show cooling”. If they do not then they are flawed.

      It’s really that simple. So which is it, Mosher? Do any models show cooling or are they all flawed?

    • A coin should give a 50% chance of heads or tails. The probability of getting a particular sequence (HTHTHHTTTHTHT) is just 0.5^n. However, one could take the annual warmer/equal/cooler outcomes from all the model runs, calculate the frequency distribution of 7-year chunks (e.g. WWNCCWW), and see how well the frequency distribution of the models matches the frequency distribution of reality (whatever that is).
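
A minimal sketch of the chunk-counting idea in this comment follows. The tolerance `tol`, the label "N" for "no change", and the use of total variation distance to compare two frequency distributions are assumptions added for illustration; the comment itself does not specify them.

```python
from collections import Counter

def classify(series, tol=0.02):
    """Label each year-to-year change as W (warmer), C (cooler) or N (within tol)."""
    labels = []
    for prev, cur in zip(series, series[1:]):
        diff = cur - prev
        labels.append("W" if diff > tol else "C" if diff < -tol else "N")
    return "".join(labels)

def chunk_frequencies(labels, length=7):
    """Relative frequency of every overlapping chunk of the given length."""
    chunks = [labels[i:i + length] for i in range(len(labels) - length + 1)]
    counts = Counter(chunks)
    return {chunk: k / len(chunks) for chunk, k in counts.items()}

def total_variation(p, q):
    """Distance between two frequency distributions: 0 is identical, 1 is disjoint."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Hypothetical annual anomaly series standing in for one model run and the observations.
model_run = [0.00, 0.05, 0.04, 0.10, 0.08, 0.15, 0.15, 0.20, 0.18, 0.25, 0.30, 0.28]
observed = [0.00, 0.04, 0.06, 0.05, 0.12, 0.11, 0.16, 0.21, 0.20, 0.22, 0.27, 0.30]

p = chunk_frequencies(classify(model_run), length=3)
q = chunk_frequencies(classify(observed), length=3)
print(total_variation(p, q))
```

The example uses chunks of length 3 rather than 7 because there are 3^7 = 2187 possible seven-year patterns, so a short record leaves almost every pattern with a count of zero or one.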

    • With that zoo of models, you can find one or two that retro-dicted anything. But — will the SAME model(s) predict the next change/trend/datum? Good luck with that.

    • Steven Mosher

      Not talking about retrodicting. read.

    • Steve

      With all due respect, the margin of error that the model developers state is acceptable for near-term predictions is unacceptably high. Laughably so, in fact. Almost anything could happen and be within the error range of the models.

    • Steven Mosher

      I would agree that we do not need a GCM for policy. Much simpler models suffice.

    • MattStat/MatthewRMarler

      Steven Mosher: I would agree that we do not need a GCM for policy. Much simpler models suffice.

      What we need for policy is at least 1 model with a demonstrated record of sufficient accuracy for the purpose. If there is an extant model that is sufficiently accurate, I don’t think it has a demonstrated record of sufficient accuracy yet. Perhaps one of the models in the previous interchange, that forecast a decade of little or no warming?

    • Steven Mosher

      why do you think an accurate model is needed for policy?

      We don’t have very good flood prediction models, yet we set policy about flood plains all the time. We cannot predict earthquakes, yet we have earthquake building codes. Your notion that one needs an accurate model to set policy is not borne out by a simple examination of what we in fact do. We do in fact make decisions under uncertainty all the time.

      You think the accuracy is not good enough. That is subjective.

      All of the models have decades of random cooling. THAT is the basis of Santer’s conclusion about needing 17 years of data.

    • > We don’t have very good flood prediction models, yet we set policy about flood plains all the time.

      There is a difference though. Floods could possibly go to zero, but can’t go negative, like the CO2 impact on temperature perhaps can. (Or are there combined flood-drought models?)

    • David Springer

      Mosher doesn’t understand the difference between a flood and global warming, evidently. There are a great many floods so with a sample size that large we use actuarial techniques to assess risk. How many catastrophic anthropogenic global warmings have there been so that we may use actuarial techniques to make risk/reward decisions?


    • A simpler version of our lack of understanding is not better.

    • Steven Mosher

      your lack of understanding cant be accounted for

    • Arcs_n_Sparks

      >We cannot predict earthquakes, yet we have earthquake building codes.

      This engineer would explain that we have a good understanding of the bounding parameters regarding earthquakes (which, of course, can be and are revised with new data over time), and the country is divided into various zones for that building code feature. Moreover, we do not mandate building codes for an 11.9 quake because some model predicts it might happen in the next 1000 years; it would be far too costly.

    • Steve Milesworthy

      The ability to “retrodict” climate is not convincing in models with so many variable parameters

      What if you were to rerun an identical model with identical parameter settings, but instead of using the guesses you might have had about future greenhouse gas and aerosol levels, and future solar activity to force your model, you used the measured values.

      Or if you took your existing all-purpose model and ran it with the old and the new forcings and demonstrated that the new forcings showed the correct amount of warming and the old ones showed similar errors.

      Of course this could mean that the forcings have been fiddled with rather than the models. The point is that issues such as “retrodicting” can be all about the input data rather than the models, and the input data may be obtained entirely independently.

      (It could also mean that you’ve fixed your new model explicitly to deal with the 1998-2010 problem but anyone who thinks that needs to read up a bit more on the subject)

    • Gareth,

      Do you know how many model runs have been published?
      Have you looked at all of them?
      How can you say that none of them have predicted a decade long hiatus in warming?

      It is just a pet peeve of mine to see someone say something that is equivalent to saying the sun didn’t come up yesterday, because I was in bed all day.

    • Gareth Williams

      No, I have not looked at every model run. But the failure of the earth to warm since 1998 is a major embarrassment (e.g. Trenberth’s “travesty” statement). If anyone could plausibly claim to have predicted that, they would be making a loud noise about it.
      (The result of one model run is not necessarily a real-world prediction, particularly if the same model is producing a variety of results. For it to count as such, someone has to say so at the time).

    • The failure of the earth to warm since 1998, means you are not looking at the temperature record critically because according to some of the records there has been warming since 1998.

      Trenberth’s statement has more to do with the adequacy of our measurements of temperature and heat flow, in other words there are areas of the climate system that are not measured very well. When you account for the effects of ENSO and Solar variations it is apparent that the earth has continued to warm since 1998.

      Anyway, I dont believe you can produce any short term temperature predictions from any climatologist.

    • bob droege | April 18, 2012 at 6:09 pm |
      > When you account for the effects of ENSO and Solar variations it is apparent that the earth has continued to warm since 1998.

      It’s been 14 years now, and ENSO only has a 5-year cycle, so how does that work ?

    • Quote Gareth Williams | April 17, 2012 at 2:18 pm:

      “Actually, the long-term effect of adding CO2 to the atmosphere is an important question that needs to be modelled.”

      A ‘real time’ experiment is being conducted in Australia:

      “The project involves large cranes pumping carbon dioxide gas into the ecosystem at the canopy level of the forest to mimic expected atmospheric conditions in 2050.
      Over a ten-year period, changes such as water consumption and growth will be monitored to determine how carbon dioxide affects forestry ecosystems.”


      For a humorous angle:

      So really, you might not unreasonably think, any large-scale experiment to discover what kind of effect increased levels of CO2 has on eucalyptus trees would fall into roughly the same category as:

      * A large-scale experiment in the sea to discover whether or not water is wet.

      * A large-scale experiment at Christmas to discover whether or not Father Christmas is a big, fat, jolly man with a white beard and a red outfit trimmed with white fur.


    • “If you don’t know what you are doing it is probably not possible to quantify by how much.”

      Love it.

    • yes. It would appear to be one of those ‘inconvenient truths.’

  2. Bernie Schreiver

    The attention of Climate etc. readers is directed to the National Agency for Finite Element Methods and Standards (NAFEMS), which despite the word “national” in its name, is the world’s premier standards authority for engineering simulation software and training.

    The word “training” is important, because generally speaking, a skilled simulationist running buggy software will consistently obtain predictions that are more accurate and more reliable than an unskilled simulationist running perfect software.

    The common-sense point of NAFEMS (and its journals) is that experience has established that devoting resources to software V&V is futile, unless the V&V is accompanied by matching investments in human training with regard to the software’s foundations in physics and mathematics.

    • So, Bernie. What does the skilled simulationist have to do to obtain good results? Can you give some specific scenarios and what would be done to the simulation to obtain “more accurate” predictions?

    • I can give you a simple example. Some years ago a large part of my job was designing distance measuring probes. Due to mechanical tolerances a wide variance in response was typical for any production run. Prior to going into production a small number of probes were manufactured and curve fitting software was used to characterize the probes. The curve fitting software would typically produce 10 or more equations that would fit the response of the probes adequately. It wasn’t as simple as picking the first equation that was spit out by the software. The best equation was one that resembled the theoretical physical model.

    • I’ve created parametric equations to fit conductance to the concentration of a fixed ratio of certain ions, so I think I see what you are getting at. A parametric equation doesn’t contain the same level of “intelligence” as a purely physics-based model.

    • David Wojick

      I agree that getting the math and science right is far more important than getting the code right. But in the case of climate this is not a matter of training, rather it is a matter of science. The models are no good because we do not yet understand how climate works, the math as well as the physics. The ORNL fact sheet is pretty good in this regard.

    • I’m having trouble disagreeing with anyone today. This is usually not my problem. I was the lead architect and one of the engineers on an AI project about 15 years ago. The software eventually was quite helpful when used by users who understood the way the problem (telephony sales configuration) had to be approached. However, only a segment of the user community could ever use it effectively, because it was assumed by their management (and sadly mine) that the software would take care of everything. Actually it did, but you had to understand the limitations the software had in framing the problem.

    • There are a lot of steps between where they are and anything as anal as V&V. Basic common sense coding practices, tests, code reviews, documentation, naming, commenting, specs…

      I spent last night looking through a GISS codebase; it isn’t hilariously bad or anything, but it lacks the rigor and clarity of normal software. That will be holding it back in many ways, not just with bugs.

    • Steven Mosher

      you looking at ModelE? you might also have a look at the MIT code. In their approach I believe they have a scientist and a programmer. If you can keep them from killing each other that works.

    • Lol : ). Yes, I was looking at ModelE. I will definitely give the MIT code a look, thanks for the tip again. Donning Kevlar and going in ; ).

    • The word “training” is important, because generally speaking, a skilled simulationist running buggy software will consistently obtain predictions that are more accurate and more reliable than an unskilled simulationist running perfect software.

      Clearly and without question, this statement needs to be qualified relative to the nature of the bugs in the ‘buggy software’. It cannot be correct if the bug is such that failure to complete the calculation is the result. It also cannot be correct if the nature of the bug is to invalidate parts of the calculation that are critically important to the response functions of interest.

      The common-sense point of NAFEMS (and its journals) is that experience has established that devoting resources to software V&V is futile, unless the V&V is accompanied by matching investments in human training with regard to the software’s foundations in physics and mathematics.

      This is an interesting summary in that the objective of V&V is to determine the soundness of the software’s foundation in physics ( Validation ) and mathematics ( Verification ).

      Citations to reports and papers demonstrating that devoting resources to V&V is futile would be of interest, especially now that the concepts are being taken up by the climate science community following decades of development and application in engineering and other areas of science.

    • Of course you have to understand the problem, the way climate works, the science and math in order to create a climate model. But buggy software isn’t going to embody the correct math. You really have to have a solid understanding of what you are modeling AND solid code. The idea that buggy code is an accurate model of anything is kind of silly.

  3. I think it would be good to first define the level of software required. Not all software needs to be bulletproof, and there are a lot of competing requirements beyond quality – time to build, cost, speed, runtime size, usability, ease of iteration, maintenance, ease of extending, use by others, accuracy, language, target machines etc etc.

    It may well be that climate software doesn’t need to be high quality. That should be discussed first, decided, and stated in the spec (and even if low quality is fine, there should still be at least a ‘goals’ type document!). High quality is a bit of a misunderstood concept with software – people assume good software doesn’t contain errors or can’t be useful unless it is high quality, but that isn’t the case. Would I step into a plane running on code like Climate models? No I wouldn’t, but I’m not stepping on a plane. Is my home camera lens good enough quality for the Hubble? No again. Your goals are determined by your project. That said, the two climate code bases I’ve looked at could benefit from a more methodical approach to design and testing. As it is, it is more like prototype code.

    What is counterproductive is when it is clearly not ‘high quality’ and someone does a ‘study’ that claims that it is. I understand that there will be lots of grief in explaining how lower quality may still suit their needs, but making false claims about it is just throwing trust in the garbage.

  4. There is a relatively inexpensive way to quantify (and to get pessimistic upper bounds on) model errors and uncertainty: We should run the models not in a 32- or 64- or 128-bit precision, but in a 32- or 64- or 128-bit “interval arithmetic”, where each quantity is represented as a [lower bound, upper bound] pair. As an example, 1/3 would be represented as [1,1]/[3,3] = [0.3333,0.3334] because of rounding errors.

    The intervals capture not only an uncertainty resulting from a finite precision of computer calculations, but also a natural variability or uncertainty of data, be it a variation of a wind speed over an area, or a temperature distribution in a grid cell. Thus, a thermometer reading 12.0 probably means [11.9,12.1], or with a really good thermometer [11.95,12.05].

    On today’s computers the implementation of interval arithmetic with its required rounding up and down is extremely slow – slow enough to be completely impractical. But it should be relatively easy to design an interval arithmetic floating-point processing unit. We only need to run several FPUs in parallel, get both rounded-up and rounded-down results, and select maximums and minimums, e.g.,

    The design would only cost millions, not trillions (and, if mass-produced, the processor would probably be less than ten times more expensive than today’s floating-point processing unit). How much is the Met Office’s new supercomputer?

    This approach would also pinpoint any numerically unstable method, something that no amount of code validation could do.

    Even better would be the “triplex arithmetic” proposed by Prof. Nickel in the 1970s – represent a quantity as a triple [lower bound, standard result, upper bound], which makes it easy to compare model runs with older models.
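A minimal software sketch of the interval idea above, assuming Python 3.9+ for `math.nextafter`. The `Interval` class and its method names are hypothetical, chosen only for illustration. Rather than switching hardware rounding modes, it widens every result by one floating-point step in each direction. That is slightly more pessimistic than true directed rounding, but it should still enclose the exact result (overflow aside).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical [lower bound, upper bound] pair for illustration."""
    lo: float
    hi: float

    @staticmethod
    def _outward(lo, hi):
        # Widen by one floating-point step each way so the exact
        # result stays inside the interval despite rounding.
        return Interval(math.nextafter(lo, -math.inf),
                        math.nextafter(hi, math.inf))

    def __add__(self, other):
        return Interval._outward(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval._outward(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval._outward(min(products), max(products))

    def __truediv__(self, other):
        if other.lo <= 0.0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains zero")
        quotients = [self.lo / other.lo, self.lo / other.hi,
                     self.hi / other.lo, self.hi / other.hi]
        return Interval._outward(min(quotients), max(quotients))

    def width(self):
        return self.hi - self.lo


# 1/3 as an interval: the bounds bracket the exact value.
third = Interval(1.0, 1.0) / Interval(3.0, 3.0)

# A thermometer reading of 12.0 treated as [11.9, 12.1].
reading = Interval(11.9, 12.1)
doubled = reading + reading

print(third)
print(doubled, doubled.width())
```

In this sketch a numerically unstable calculation would show up as an interval whose width grows rapidly from step to step, which is the diagnostic use suggested above.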

    • peterdavies252

      @Jim Moudry “This approach would also pinpoint any numerically unstable method, something that no amount of code validation could do.”

      I agree with this general premise and consider that many GCMs suffer from numerical instability due to computer rounding algorithms, which in turn spawns hockey sticks and other junk.

    • peterdavies252

      Sorry Jiri not Jim. Need to wear glasses more often.

  5. The irreducibility of sensitivity over the last third of a century suggests that this is indeed evidence of irreducibility.

    The arguments (best advertised) suggest the problem lies in the radiative cloud feedback, and arguments that model errors are Gaussian distributed (e.g., Roe and Baker) are dubious at best.

    That there is not a single evolutionary law is a constraint on assumptions from GCM projections, and suggests bias in conclusions (by reducing natural variation that is poorly understood).

    That there is still an uncertain future in climate uncertainty is a topical problem, where legitimate scientific debate needs to be orchestrated across a number of areas. This is well identified (e.g., Zaliapin and Ghil 2010, 2011) and has been well signalled, and the problems need to be argued in a scientific manner and not as totems for the various factions.

    PS important paper in press on this issue

  6. I wish we were far enough along in the science of climate to consider the topics discussed here. But sorry to say we need another 25 years of serious and comprehensive observations to advance our understanding of this complex, non-linear, chaotic system. Readjusting the chairs on the deck of the Titanic will not keep it from sinking.

  7. “Convergent Evolution” and GCMs

    If I understand Stainforth’s work with ensembles of models, the IPCC’s anointed set of models (described in AR4 as an ensemble of opportunity) covers only a small set of “parameter space” that is consistent with current knowledge and realistic weather/climate. Optimizing one parameter at a time in a parameter space that apparently is loaded with shallow local minima and broad plateaus isn’t really possible. The only reason for stopping the optimization process is because some choice for a parameter clearly works better than other possible choices. In practice, “working better” could turn out to be caused by: a) getting results closer to those produced by other models (who wants to be the outlier?), b) doing a better job of reproducing the historical record (which could contain biases and multi-decade unforced variability), or c) moving closer to the assumed correct climate sensitivity. Models and modelers are under evolutionary pressure. Could model optimization ever have produced a model with a climate sensitivity of 1.2 that predicted only half of observed 20th century warming (the rest being natural variability and UHI)? Wouldn’t such a model have become “extinct” from difficulty publishing and lack of funding?

    I don’t know if the above speculation is reasonable. I’d appreciate hearing from anyone who could put such concerns to rest.

  8. ‘After a probabilistic framework to manage all sources of uncertainty and error is established, sensitivity analyses are performed in Section 5 to assess the contribution of each source of uncertainty and error to the overall uncertainty in the corrected model prediction. The sensitivity analysis result can be used to guide resource allocation for different activities, such as model fidelity improvement, data collection, etc., according to the importance ranking of errors in order to trade off between accuracy and computational/experimental effort.’

    This evaluation over systematically designed model families is needed to evaluate the range of intrinsic computational variability that arises from the non-linear nature of the models. As this has not been done with climate models – and possibly can’t be done – the range of ‘irreducible imprecision’ is unknown. There is quite literally no unique solution within the bounds of plausible inputs, and solutions diverge exponentially with time. To a certainty the ‘solution’ presented to the IPCC is the result of a post hoc subjective evaluation of the plausibility of the solution.
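The ranking step described in the quoted passage can be sketched roughly as follows. This is not the paper's procedure: it is a simple freeze-one-source-at-a-time Monte Carlo estimate, and the `prediction` function, the error-source names, and all weights and standard deviations are hypothetical values picked for illustration.

```python
import random
import statistics

# Hypothetical error sources: standard deviations and their weights
# in a toy additive prediction. None of these numbers come from the paper.
SIGMAS = {"input": 0.5, "discretization": 0.3, "measurement": 0.1}
WEIGHTS = {"input": 2.0, "discretization": 1.0, "measurement": 1.0}


def prediction(errors):
    """Toy stand-in for a corrected model prediction."""
    return 100.0 + sum(WEIGHTS[name] * errors[name] for name in errors)


def output_variance(frozen=None, n=20000, seed=1):
    """Variance of the prediction, optionally with one source set to zero.

    The same seed is reused on every call so that all runs see the same
    random draws, which keeps the comparison between runs stable.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n):
        errors = {name: rng.gauss(0.0, sigma) for name, sigma in SIGMAS.items()}
        if frozen is not None:
            errors[frozen] = 0.0
        outputs.append(prediction(errors))
    return statistics.pvariance(outputs)


total = output_variance()
# Share of output variance that disappears when a source is removed.
shares = {name: 1.0 - output_variance(frozen=name) / total for name in SIGMAS}
ranking = sorted(shares, key=shares.get, reverse=True)

for name in ranking:
    print(f"{name:15s} {shares[name]:.3f}")
```

For an additive toy model like this one, the variance shares line up with the squared weight times the variance of each source, so the source at the top of the ranking is the one where extra effort would reduce prediction uncertainty most. A strongly non-linear model would need a more careful global sensitivity method.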

  9. In the 16th century, Nicolaus Copernicus presented a heliocentric model for the motion of the planets. It was no better at predicting planetary motion than Ptolemy’s geocentric theory. Years later Kepler arrived with his three laws that included elliptical planet orbits. A climate modeler traveled back in time via a freak time warp (à la Star Trek). Finding himself in a gathering with Kepler and other scientists, he helped resolve the disagreement about the various models by suggesting, “why don’t you average them?”

    • What is that, some sort of ignorant joke? The average of the two models is obviously better than the worse of the two. That is part of what Bayesian techniques are about. Kumar’s team at the U of Minnesota will be using lots of these kinds of approaches to extract useful information from the data.

    • Obviously, the average of two discrepant models is worse than the best. And if you average 9 bad ones with one good one, you get garbage out as a result. Your comment is rude and inappropriate, as well as just plain wrong.

    • +10 for understanding.

    • And model runs don’t generate data – they generate numbers from a model. Data is from the real world.

    • Steven Mosher

      all data is theory laden.

    • Res ipsa loquitur? (The thing speaks for itself?) Just to be clear, this is NOT what Steven means when he says that data is theory laden. Because in science, the data does NOT speak for itself. Theory does all the talking. The same data may fit several different theories. For example, at low speeds and ordinary precision, velocity measurements confirm both Newton’s Laws of Motion and Einstein’s Theory of Relativity very well. Such data is not laden by either theory. (Or any other theory that could be fitted to the data.)

    • I guess letting institutions with crappy models combine there output with the better models allows them to play and keep a funding stream. Kind of like in public school where everyone passes even if they can’t find their desk in the morning.

    • “there” should be “their”

    • The modeling groups should be putting in strenuous effort to determine if their models actually behave like the climate. That is the critical metric.

    • Steven Mosher

      A policy maker who uses the model results might not agree.
      “behaving” like the climate is over general and non testable

    • I think you are wrong on this Steve. See Jim2 | April 17, 2012 at 10:05 pm | Reply
      for an example of what I mean. This is also the recommendation from one of the V&V papers Judy posted.

      If you truly think the models aren’t testable and you are right about that, then funds should be cut off immediately to these projects. In that case, they would be a total waste of resources.

    • Gareth Williams

      You can average over a series of real-world measurements. Assuming they are independent and drawn from some probability distribution, the average is better than any one measurement.
      You can also average over many runs of a model with randomly drawn parameters (that is a Monte-Carlo simulation).
      But it makes no sense at all to average over a number of different models that make fundamentally different assumptions. There is no statistical reason (Bayesian or otherwise) to think that will give a better answer than any one of them. If one of them happened to be right, you are just averaging it with wrong answers.
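A small numerical sketch of the distinction drawn above, using only the Python standard library. The true value, the noise level, and the four "model" outputs are invented for illustration, with the first model assumed to be right and the others biased in the same direction.

```python
import random
import statistics


def mean_abs_error(truth, sigma, n, trials, rng):
    """Average absolute error of the mean of n independent noisy measurements."""
    errors = []
    for _ in range(trials):
        sample = [rng.gauss(truth, sigma) for _ in range(n)]
        errors.append(abs(statistics.fmean(sample) - truth))
    return statistics.fmean(errors)


rng = random.Random(0)
TRUTH = 10.0  # hypothetical true value

# Case 1: independent measurements of the same quantity.
single_error = mean_abs_error(TRUTH, 1.0, 1, 2000, rng)
averaged_error = mean_abs_error(TRUTH, 1.0, 25, 2000, rng)

# Case 2: structurally different models; the first happens to be right,
# the others carry biases of the same sign (an assumption of this sketch).
model_outputs = [10.0, 13.0, 14.0, 12.5]
ensemble_error = abs(statistics.fmean(model_outputs) - TRUTH)
best_error = min(abs(m - TRUTH) for m in model_outputs)
worst_error = max(abs(m - TRUTH) for m in model_outputs)

print(f"single measurement error   ~ {single_error:.3f}")
print(f"mean of 25 measurements    ~ {averaged_error:.3f}")
print(f"best / ensemble / worst    = {best_error} / {ensemble_error} / {worst_error}")
```

Under these assumptions the average of independent measurements is clearly better than one measurement, while the model average lands between the best and worst model: better than the worst, as argued earlier in the thread, but worse than the one that was right. If the model biases happened to have opposite signs, the average could do better than this sketch suggests.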

    • The arrow of knowledge points in an increasing direction. Bayes is often subjective, which implies if you follow the belief system of increasing knowledge, averaged models will improve.

      However, this doesn’t always work for the Luddites and Malthusian skeptics commenting here. They don’t believe in increasing knowledge.

    • Web, averaging the Copernican and Ptolemaic models increased accuracy of prediction (aka knowledge) at that point in time but what did it do for the future (from that point on) of science?

    • Beliefs are for religion, not for science. Averaging garbage is still garbage; in fact, mixing together rubbish from various sources may dilute the odor of the worst, but the mixture will still smell. I doubt you even understand the use of Bayes. The climate’s dependence on CO2 isn’t governed by the laws of probability, but averaging together models which treat the dependence differently will make it seem so. More directly, earth-centered and sun-centered models of the solar system do not improve conceptually or by calculation no matter what you think Bayes justifies. Your use of “Luddites and Malthusian” in connection with each other and with “skeptics” in this context shows only that you are ignorant: Malthus feared that population growth would prevent progress toward utopia, while Luddites opposed progress by destroying machinery. Putting them together the way you’ve done shows you have no understanding even of the cliches you use.

    • Well you are essentially anti-science at the core, Malthusian in your intent to foster the growth in science and limiting scientists nutrient source, I.e. funding.

    • Bah, Humbug.

    • WebHubTelescope says, “Malthusian in your intent to foster the growth in science and limiting scientists nutrient source”.

      What was that again?

      You must be putting us on — that’s it. Best humor since the Marx Brothers.

    • I think this may be an “if you can’t pound the point, pound the podium” sort of response. :)

    • ceteris non paribus

      billc wrote:

      …averaging the Copernican and Ptolemaic models increased accuracy of prediction (aka knowledge)…


      That makes as much sense as averaging dogs and tables – i.e. other than the fact that they both usually have four legs, none.

      BTW – Copernicus’ theory was no better at prediction than Ptolemy’s – despite the 14 centuries between them. Copernicus rejected Ptolemaic astronomy because:
      1) it was geocentric,
      2) it assumed the ‘punctum equans’,
      3) it deviated from the ‘true’ & ancient formal principles of astronomy first stated by Hipparchus: epicycles and uniform motion.

    • WebHubTelescope | April 18, 2012 at 8:44 am | Reply
      The arrow of knowledge points in an increasing direction.

      Don’t mistake propaganda for knowledge though. Bear in mind close to 100% of the funding is from a single source – government – that has a huge vested interest in convincing us of CAGW (and whose agents don’t shirk from hiding data and other science fraud). Averaging multiple flavours of propaganda from the same source, and hence the same underlying agenda, will most likely point the arrow of knowledge down, not up.

    • I am glad it is not the oil industry. I am glad it is not the church, etc.

      Funding is through the people of various sovereign nations of the world.

    • What you mean, Webby, is that you are happy that climate science is funded solely by a grasping, self-interested force incomparably bigger and more powerful and ruthless than any oil industry or church could ever be, using funds seized from all of us, and that stands to gain vast sums of money and power if its CAGW propaganda is believed, and whose minions employ dishonesty at every turn. Heaven forbid we should get a second opinion, i.e. one not politically funded.

      OK, seems clear your overriding agenda is political, to advance the totalitarian/left cause – hence your approval of the politically-funded climate ‘consensus’ and IPCC. But let’s at least drop the pretence that the politically-funded ‘consensus’ is in the main anything but politically motivated.

  10. “It is more advantageous to spend resources toward reducing an error with a higher ranking than one with a lower ranking. The rankings are based on systematic sensitivity analysis, which is possible only after quantifying the effect of each error source on the model prediction uncertainty.” Let that be a lesson to all modelmakers. Better yet, forget about quantified prediction uncertainty and listen to Ernest Rutherford, father of the nuclear atom: “If your experiment needs statistics, you ought to have done a better experiment.”

    • Bernie Schreiver

      Arno, in quoting Rutherford’s “If your experiment needs statistics, you ought to have done a better experiment”, you have anticipated a lead strategy of many climate scientists. Namely, if within the next few years the observed rate of sea-level rise accelerates to 5 mm/year or more … as multiple research groups presently foresee … then a no-statistics verification of cAGW will be proclaimed … with considerable scientific justification.

    • We shall see how that plays out, but I’m thinking any attempt to stop either the temperature rise or the sea level rise will be for naught, if that is in the cards, that is.

    • Steven Mosher

      unfortunately rutherford was wrong. No experimental answer ever agrees exactly with the prediction.

    • Steve, that was not Rutherford’s point. One can design experiments to give binary answers, at least to have signal one order of magnitude greater than the internal control.
      Good experimental design can make data analysis trivial.

    • ceteris non paribus

      Good experimental design can make data analysis trivial.

      Easier, maybe. Trivial, never.

      Besides, science includes observation of natural systems where no “experimental design” is possible. Or do you think that observational astronomy is not scientific?

      Leaving aside the “argument from authority”, a quick look at Rutherford’s own publications will show that he, in fact, used statistics.

      And why not? Means and standard deviations are not mysterious.

    • Excluding instrument resolution I think Einstein is about as exact as there is.

    • Rutherford. By an Occam argument, wouldn’t an experiment without stats be more preferred (and convincing) than one that did ?

    • Occam would have rejected relativity in favor of Newton’s laws.

    • Not in the face of phenomena Newton’s laws couldn’t explain, he wouldn’t. As Einstein I think said: Simplify as much as possible – but not one bit more.

      Anyway, I was thinking more specifically of experiments.

    • Point being, Occam’s razor is a loose guideline, not a rule.

    • Bernie Schreiver
      Then do you agree that if within the next few years the observed contribution of some glaciers to sea level rise is negative, then cAGW will be rejected? See: Some Himalayan Glaciers Are Growing Nina Chestney, Reuters

      “We suggest that the sea-level-rise contribution for this region during the first decade of the 21st century should be revised from +0.04 mm per year to -0.006 mm per year sea-level equivalent,” the study said.

      Slight mass gain of Karakoram glaciers in the early twenty-first century, Julie Gardelle, et al. Nature Geoscience (2012) doi:10.1038/ngeo1450

  11. As I understand it, global temperature is calculated by weighting averages of (Tmax+Tmin)/2, which is a simple enough metric.
    One would have thought that the starting point for all temperature prediction models would be how well they reproduce the diurnal cycle in different ecosystems, at different latitudes.
    I understand this is not the case.

  12. I’m not commenting on the validity of these attempts, but we need more of this kind of effort. Even though we can’t expect a climate model to reproduce a given climate pattern to match climate data, there should be some recurring, similar patterns between the model and the climate. Things like ocean cycles, patterns of rain over a period of years in a given region, temperature profiles of the atmosphere, etc. Here are a few attempts:




    It would be great if anyone knows of other efforts in this direction.

  13. Vera, Chuck and Dave

    Admittedly not directly relevant, but
    (a) how much has been spent on climate modelling to date, and
    (b) how much is being spent annually now?

  14. Judith,

    Missing other parameters are not an issue?
    Of course not. Just the mathematical equation is all important.

  15. Bernie Schreiver

    Joe’s World: “Just the mathematical equation is all-important.”

    With respect, that is *NOT* how the mature discipline of computational fluid dynamics (CFD) works. Instead, large-scale fluid motions are governed by physical models, while small-scale (turbulent) motions are governed by semi-empirical models.

    Over the decades since the 1950s (when CFD began), as both simulation algorithms and computer hardware have improved, the relative portion of CFD predictions that are governed by physical models has increased … but there is still a substantial semi-empirical element that is present in all CFD codes (both free and commercial, public and proprietary). Although this physical-empirical boundary is fuzzy — and thus resistant to deterministic V&V protocols — this fuzziness doesn’t stop Boeing from simulating the fuel efficiency of its 787 jetliners to an accuracy of order ±1 percent!

    Similarly, Stefan Rahmstorf has created a well-conceived web page, Modeling Sea Level Rise, that carefully compares physical models of sea-level rise with semi-empirical models. Both physical and empirical models agree that the coming ten years or so will witness sea-level rise rates accelerating to 5 mm/year or more (although they differ substantially on the details, as Rahmstorf discusses in the sections “Physical Models” and “Semi-Empirical Models”).

    Should the predicted acceleration be observed, then foreseeably the “skeptical debate” element of climate change will largely end — because it will be plainly evident that AGW is real and accelerating — and the “V&V” element of climate change research will shift its primary focus to the accurate alignment of physical models with semi-empirical models (just as in CFD).

    Summary: There isn’t much controversy regarding turbulent drag on airframes (although that drag isn’t easy to simulate accurately), and foreseeably within a few years there will be similarly little controversy regarding AGW.

    • Bernie – Thanks for that summary. Does the model post-dict the dip from 2010-2012? Also, what dataset will be used to verify the model? This one?


    • Bernie Schreiver

      JIM2, readers of Climate Etc. who point their web browser at the satellite dataset that you referenced — a dataset that includes 2012 observations that are not yet plotted on their website — will find that the highest sea-level ever recorded by satellite was earlier this month, namely, GMSL of +56.056mm on 2012.0717.

      So yes, the sea-level dip seen during 2010-11 was transient … precisely as was predicted by both the physical models and the semi-empirical models.

      Since those same physical and semi-empirical simulations both predict acceleration in the rise-rate during the coming decade, the next ten years will be illuminating from a V&V perspective.

    • Bernie – I see a link to raw data, but not corrected in the manner of the plotted data. Am I missing something?

    • Bernie Schreiver

      JIM2, you need to point your browser to the supplied link “raw data (ASCII), 2012 Release 2”, scroll to the bottom of the data file, and just plain look — the old-fashioned way — at the 2012 (bottom four) sea-level numbers.

      The sea-level dip, that was so prominent in the mid-2011 data, now is gone … as was predicted by many climatologists (James Hansen, for one!).

    • Bernie – As I stated earlier, that is RAW, unadjusted data. The plotted data is adjusted in various ways. This isn’t really a complicated concept.

    • Bernie Schreiver

      With respect, JIM2, you may wish to verify for yourself that the plotted JASON-2 satellite data and the text-file JASON-2 satellite data sets are identical … except that the text data file includes additional data points for early 2012 — data points that are not yet included in the plotted data — in which previous declines are erased, and global sea-levels now are observed to be reaching new record highs.

      It’s thought-provoking, eh? That sea-level declines are observed to be transient and sea-level increases are observed to be cumulative?

    • I’m paying attention.

    • Bernie – looking at the numbers that don’t have “seasonal signals” removed, 2012 isn’t looking out of line. How are the seasonal signals removed? Do you have a reference I can access? (Not behind a paywall.)

    • Hey Bernie – What are the name(s) of the group(s) who run the models to which you refer? And the names of the models, please.

    • So, if in another 5 to 10 years the rate of sea level rise has shown no significant change from the steady 20-year trend, will you be ready to admit that you and the models that you rely upon to predict a terrible future are WRONG?

    • B, your default, that warming is from AnthroCO2, is faulty.

    • Bernie Schreiver
      CFD is “easy” compared to climate. The largest uncertainties are in cloud models, where we don’t even know the feedback sign. See my post below, especially Graeme L. Stephens, Charney Lecture http://vimeo.com/33321693

    • @David Wojick, we should try to get comment 200,000 attached to this longest thread ever started by Joe.

    • Jim2 | April 18, 2012 at 9:34 am
      Does the model post-dict the dip from 2010-2012?

      Bernie Schreiver | April 18, 2012 at 12:40 pm
      So yes, the sea-level dip seen during 2010-11 was transient … precisely as was predicted by both the physical models and the semi-empirical models.

      But that doesn’t answer the question. All Bernie is saying is the models predict an upward trend, even if there are temporary blips down. He has not explained by what reasoning the models specifically identified 2010-2012 as one of the dips. Or even if they singled out 2010-2012.

    • Yes, Punksta, I was wondering if Bernie was here to illuminate or simply denigrate skeptics. The sea level data on U of C web site (TOPEX/JASON) are labeled raw data, but obviously are not the initial sea level data – the average of the actual sea level before additional adjustments. It is frustrating that some climate scientists conflate raw data with adjusted data. Mosher’s argument that all data is filtered through theory is obfuscation. Data is the result of a measurement. That data may and usually does depend on theory, but even so it is a measured value. Models do not measure anything and model output is not data. These issues make climate scientists look sloppy and undisciplined.

  16. Is there any table that lists model parameters and their sensitivities? I would expect cloud effects and total precipitation would have high sensitivities.

  17. Judith, the past two threads have been fantastic.

  18. JC
    Thanks for highlighting that excellent paper. Following are examples of major uncertainty issues in climate relating to those issues:
    Feedback model structural uncertainty
    Lindzen addresses a critical systemic issue of feedback uncertainty and the major variations due to the models. e.g. See his presentation:
    Reconsidering the Climate Change Act Global Warming: How to approach the science. Seminar at the House of Commons, Westminster, London 22nd February 2012.
    Response as a function of Total Feedback Factor

    For negative feedbacks, large variations in the feedback lead to only small changes in response.

    For positive feedbacks, relatively small variations in feedback lead to large changes in response. It is the positive feedbacks in the models that leads to the uncertainty.
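The asymmetry in the two statements above follows from the standard feedback relation, response = (no-feedback response) / (1 − f). A small sketch, assuming the 1 C no-feedback response for doubled CO2 quoted later in this comment. The function name and the sample values of f are illustrative only and are not taken from Lindzen's slides.

```python
def response(feedback_factor, no_feedback_response=1.0):
    """Equilibrium response for a total feedback factor f < 1.

    Uses the standard relation: response = no_feedback_response / (1 - f).
    """
    if feedback_factor >= 1.0:
        raise ValueError("f >= 1 gives an unbounded (runaway) response")
    return no_feedback_response / (1.0 - feedback_factor)


# Illustrative feedback factors, negative through strongly positive.
for f in (-1.0, -0.5, 0.0, 0.5, 0.75, 0.9):
    print(f"f = {f:+.2f}  ->  response = {response(f):.2f} C")

# A change of 0.5 in f on the negative side versus 0.25 on the positive side.
negative_side_change = response(-0.5) - response(-1.0)
positive_side_change = response(0.75) - response(0.5)
print(negative_side_change, positive_side_change)
```

With these numbers, moving f from -1.0 to -0.5 shifts the response by well under 0.2 C, while moving it from 0.5 to 0.75 doubles the response from 2 C to 4 C, so the same uncertainty in f matters far more on the positive side.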

    See especially slides 44-57. Lindzen finds negative feedbacks as measured by ERBE and CERES. (Slide 48). However he finds that ALL the global climate models show positive feedbacks. (Slide 49)
    Model sensitivity
    Lindzen shows the feedback structural model causes sensitivity to vary enormously between negative and positive feedbacks (e.g., see Slides 51-53). With positive feedback, small changes in feedback can cause the response to grow without bound, which underlies the major warmist alarmism. Lindzen observes:

    Without positive feedbacks, doubling CO2 only produces 1C warming. Only with positive feedbacks from water vapor and clouds does one get the large warmings that are associated with alarm. What the satellite data seems to show is that these positive feedbacks are model artifacts.

    Why has climate remained “stable”, oscillating between glacial and non-glacial periods, even when CO2 was more than an order of magnitude higher?

    Clouds: Cloud modeling causes the greatest uncertainty in models. Graeme L. Stephens, JPL Climate Change – A Very Cloudy Picture
    A21G Charney Lecture, Moscone West Rooms 2022-2024, AGU FALL Meeting 2011
    Even the sign of cloud feedback is uncertain. See Stephens on cloud feedback v total feedback (at 18:20-19:00 )

    “Conclusion: Differences in cloud feedback are again the largest single source of uncertainty of all feedbacks (range from -0.5 W/m^2/K to + 0.7 W/m^2/K)” – Andrews et al. 2012

    Data Uncertainty: Metrologist Nigel Fox of the National Physical Laboratory (NPL) highlights major measurement uncertainties:

    Nowhere are we measuring with uncertainties anywhere close to what we need to understand climate change and allow us to constrain and test the models. Our current best measurement capabilities would require more than 30 years before we have any possibility of identifying which model matches observations and is most likely to be correct in its forecast of consequential potentially devastating impacts.

    Phil. Trans. R. Soc. A (2011) 369, 4028-4063 doi:10.1098/rsta.2011.0246 Accurate radiometry from space: an essential tool for climate studies
    See Nigel Fox’s lecture in NPL’s Celebrating Science series Seeking the TRUTHS about climate change, 18 April 2011 http://www.youtube.com/watch?v=BalCag7fQdE&feature=player_detailpage

    It appears that cloud models, whether we have positive or negative feedback, and data uncertainty dominate the issue of catastrophic anthropogenic warming – yet we hardly ever see these issues clearly presented or highlighted in IPCC/alarmist discussions.

    Systemic Bias – Missing physics
    The Press-Courier (Milwaukee) June 11 1986 reported James Hansen as stating:

    the average U.S. temperature had risen from one to two degrees since 1958 and is predicted to increase an additional 3 or 4 degrees sometime between 2010 and 2020.

    However, Lucia Liljegren’s statistical evaluation of UAH global temperature trends vs IPCC models shows:

    As you can see, if “we” believe that the underlying trend is linear and the noise is “red”, and using the trend since Jan 1980 to test the range of trends, the 0.2C/decade is currently excluded from the 2-sigma range of trends. Specifically: the data says warming is slower than that. If we use ARIMA(1,0,1) (which I believe is… uhmm… someone’s… you can guess whose currently favored choice), the 0.2C/decade is also excluded.

    That appears to show that the global warming models and alarmists like Hansen have major systemic bias (are systemically “wrong”) in cloud feedback or are missing some major physics.

  19. Dr. Curry – if you would favor us, what is your opinion on the use of sea level rise as the better metric of global warming? (Better than global atmospheric temperature, that is.) Do you believe the sea level rise model(s) encompass all the input variables? Other thoughts?

    • A while ago Steve Fitzgerald had a nice analysis of this question at lucia’s Blackboard. I seem to recollect that there is a lot of uncertainty in fresh water figures, dammed and water tabled.

    • Omigod, Fitzpatrick.

    • Jim2
      A major issue with sea level is that pumping groundwater for irrigation appears to contribute about 25% of the rate of sea level rise. Yet that is rarely mentioned in discussing “rapid acceleration”.

      25% of sea-level rise is due to groundwater depletion ClimateSanity

      Global Depletion of Groundwater Resources, Yoshihide Wada et al. GEOPHYSICAL RESEARCH LETTERS, VOL. 37, L20402, doi:10.1029/2010GL044571, 2010

      That is another “systemic bias” missing in many evaluations.

    • since Dr C hasn’t chimed in I will offer that “SLR AND Surface temp” OR “SLR AND UAH” is a much better metric of GW. SLR captures the expected steric (thermal expansion) rise + ice melt (at least land ice). For those who discount the thermal capacity of the land/atmosphere compared to the sea, it’s true it is low, but that’s the radiating layer and so controls the T^4 “negative feedback”. Both are better than just one.

      I second Kim’s nod to Steve F’s analysis and note that I only addressed your first question…

  20. Thanks for the references, DLH.

  21. There seems to be quite a range of opinions on what the V&V requirements should be for climate modeling. Perhaps we should talk a little bit about V&V in the real world of software development.

    Few software projects use the full extent of what the official documents on V&V describe. Cost and time constraints are obvious pressures to minimize deviations from a straight-line develop-and-deploy path. After all, much of full V&V involves cycles of develop, review, feedback, and correct. When you consider that the great majority of software products are not complex non-linear natural system models, computational accuracy evaluation is typically straightforward and easily defined.

    Verification of business software more often tends to focus on code structure, commenting, user interface style and components, input data error detection and recovery, boundary error detection and recovery, and such. Reviews are often relatively informal and testing is handled by handing the product to a group of users who are told to “break it!”.

    Validation of business software is somewhat more rigid in content but still usually quite well defined. The business folks know what they will put into the system and what they expect to see come out. It may not be trivial but there is no doubt if the software performed correctly or not at the end of test runs.

    Now, with that said, what does it have to do with climate modeling software? What determines the level of V&V that should be applied during development of software is testability and final use. When considering the verification phase, can system inputs, compensation parameters, calculation algorithms, and correct outputs be fully specified? If not, Verification (big V) will be somewhat ad-hoc and loose. But what about Validation?

    Validation is a measure of the accuracy of the software in the real world application. Remember that validation for most business software is handled using a suite of specifically designed test procedures with exact results defined. Once we move away from software that automates accounting and clerical jobs, the concept of validation gets a little harder.

    Ultimately, validation is driven by the ultimate use of the software. If you are attempting the first landing on the moon using a computer with a capacity little better than a modern toaster, validation takes on a very serious style. If I am writing a small C program to dig some data out of a text file to plot on my computer screen for my own use, a “that looks OK” level of validation is appropriate.

    Engineering modeling software V&V will also depend upon its intended use. Aerodynamic modeling for automobile bodies and aircraft lift surfaces obviously require different levels of V&V. However, both are critically tested against real data from wind tunnel tests. The results of designs based upon that modeling are tested in wind tunnels as a final verification. They may both ultimately be equally reliable and accurate, but the aircraft model will be used in situations where construction material strength will be much closer to yield limits.

    In the case of our climate modeling software, though the scientists involved wish to claim it to be simply a “what if” tool, it is being used as engineering level software. Its output is used as a firm definition of future climate states. That output is presented to politicians and the public in terms that give no indication that real world validation of the underlying software has not been successfully performed, at least in the sense that software engineers would believe and the public assumes.

    Are the results of our climate software experimental runs correct? I don’t know. Professional software developers certainly doubt the accuracy claimed for that software’s output. Engineering software modelers from other fields question the validity of claims for its accuracy.

    Basically, we have what most have described as software of less than stellar coding quality and best guess evaluation of output. That does not rise to the level of quality required for the serious engineering decisions made from that output. Though politicians are making choices about what to do with the output, they are real world engineering decisions, not political ones. Although the choice made has enormous political and economic ramifications, they are still engineering decisions. That software should have at least an engineering level of quality, which it is not designed or tested to have.

    • Wow. Just wow.

      For those questioning the need for V or V, I just listened to an NPR story on AMSC vs Sinovel. I wonder why the Chinese company needed to steal the code?

  22. It seems the US Govt is going to build a new system of climate models, with proposals due next month.

    Decadal forecasts coming to your town. Woohoo!

    • David, my reading of this announcement is that earth system models are focused on adding new components to existing GCMs, mainly atmospheric chemistry and carbon cycle. So this is not so much new models, as embellished models. Decadal simulations have been completed for CMIP5, we have a paper in press comparing the different models (will feature a post on that soon).

    • Would your new paper identify exactly what each model is designed to accurately predict? It would certainly be nice to have the models clearly state what criteria they are designed to predict over what timescales with what margin of error.

    • David Wojick

      Rob, you will have to ask the modelers that. They will probably say they are doing science, not making predictions, whatever that means. They will use a lot of words to say it.

    • David Wojick

      I fear you are correct, Dr. Curry. More of the same, with no chance of starting over and getting it right. In fact atmospheric chemistry and carbon cycle is what I call AGW research, since it assumes the AGW paradigm. The crucial known unknowns lie elsewhere.

      I look forward to your post.

    • ceteris non paribus

      Atmospheric chemistry and carbon cycles are part of “the AGW paradigm”?

      As if these things would disappear forever in a puff of incommensurability if the paradigm were changed…

      Of course, by “AGW paradigm” you mean what most people call “physics” and “chemistry”.

      Too much Kuhn is bad for you.

    • I predict that all climate models funded will consider as axiomatic that anthropogenic CO2 drives the climate.

    • If the model developers clearly define what criteria the models have been created to forecast, and clearly explain the relative margin of error of each criterion over time, I do not really care what they have done to develop the model (except from a personal curiosity standpoint). That said, I would care very much if a model developer said that in year one of a model’s forecast there was an excessive margin of error. That would actually be evidence of an unreliable model. What I care about is whether the model’s design is useful and whether it effectively performs to its design specification.

    • It would be nice if the model as well as the design of the model were useful.

  23. Uncertainty estimates aren’t meaningful when there are key lurking variables (which is the case for most of the uncertainty addressed in the climate discussion).

  24. The scientist has a lot of experience with ignorance and doubt and uncertainty, and this experience is of very great importance, I think. When a scientist doesn’t know the answer to a problem, he is ignorant. When he has a hunch as to what the result is, he is uncertain. And when he is pretty damn sure of what the result is going to be, he is still in some doubt. We have found it of paramount importance that in order to progress, we must recognize our ignorance and leave room for doubt. Scientific knowledge is a body of statements of varying degrees of certainty — some most unsure, some nearly sure, but none absolutely certain. Now, we scientists are used to this, and we take it for granted that it is perfectly consistent to be unsure, that it is possible to live and not know. But I don’t know whether everyone realizes this is true. Our freedom to doubt was born out of a struggle against authority in the early days of science. It was a very deep and strong struggle: permit us to question — to doubt — to not be sure. I think that it is important that we do not forget this struggle and thus perhaps lose what we have gained.
    ~Richard Feynman, 1955

  25. Jim2 | April 19, 2012 at 9:22 am
    It is frustrating that some climate scientists conflate raw data with adjusted data

    There may well be non-innocent reasons for this, ie data needs to be tortured before it pledges allegiance to the theory. Wasn’t this one of McIntyre’s bones of contention with Jones, because McIntyre was trying to see what was done with/to the raw data in the adjustment process ?

    Mosher’s argument that all data is filtered through theory is obfuscation.
    No, he is quite right. Sure, data is just a measurement, but which data you choose to show – or hide – speaks volumes about your theory, whether announced or not.

    • Punksta – Some climate scientists refer to model output as data. It isn’t. That was my point. Model output isn’t a measurement of an attribute of a physical object.

  26. Isn’t there always going to be an overarching uncertainty in future climate projections related to the timing and size of future volcanic eruptions?
    Do the IPCC model runs show us volcano-free future scenarios, or is there a random volcano generator out there, somewhere?

    (In the back of my mind is the suggestion that four decadely spaced eruptions might have dropped us into the LIA).

  27. Hats off to Punksta and the other commenters here for poking the holes in the modeling approach to assessing climate change.

    You’ve made my point again that simple observation and even simpler arithmetic will quickly physically prove the falsity of AGW. No models needed. The only reason to have models is to rig and corrupt the data to make it support a foregone conclusion that has everything to do with leftist-reactionary politics and nothing to do with science. All climate models, at least all the existing ones, are necessarily fraudulent.

  28. Well I wouldn’t say climate modelling is an inherently fraudulent process.
    But I would say the attribution of exaggerated confidence is, and that such fraud is motivated by a combination of totalitarian ideology and grant-farming.

    And I do wonder just how much a CAGW outcome is programmed into them right from the start, both consciously and unconsciously. Does the poor layman have any chance of getting a handle on this, I wonder?

  29. Punksta,

    I would agree that modeling may be possible and useful if done honestly
    with recognition of its limitations. But I would say that none of the kinds of models I know of partake of these necessary qualities, and the ones that do exist and are being used to justify AGW are fraudulent. And I still think there is ample proof that AGW is a lie without having to prepare and use models. It’s so much simpler than that.

    As for the poor layman, I have written my congressman asking him to introduce legislation to compel the news media to present the other side of the AGW controversy and the academic journals to publish articles by skeptics. The news media are acting positively to suppress the facts concerning AGW, and so are the academic journals. In effect, they have the same power that a totalitarian government has to conceal truth and propagate disinformation. While we can’t stop them from perpetrating their lies, I think we could – and should – force them to tell the truth along with their lies.

  30. Chad Wozniak | April 21, 2012 at 1:53 pm
    And I still think there is ample proof that AGW is a lie without having to prepare and use models. It’s so much simpler than that.

    Do tell.
