Self-organizing model of the atmosphere

by Frank Lemke

A recent post in our Global Warming Prediction Project discusses the question “What Drives Global Warming?” based on a self-organized interdependent nonlinear dynamic system of equations of 6 variables (ozone, aerosols, clouds, sun activity, CO2, global temperature). It also predicts using this system global warming 6 years ahead (monthly resolved) and it compares the known IPCC AR4 projections with this system prediction and the observed anomalies of the past 23 years.

“Looking at observational data by high-performance self-organizing predictive knowledge mining, it is not confirmed that atmospheric CO2 is the major force of global warming. In fact, no direct influence of CO2 on global temperature has been identified for the best models. This is what the data are seriously telling us. If we believe them, it is the sun, ozone, aerosols, and clouds – and possibly other forces not considered in this model – that drive global temperature in an interdependent and complex way.” [link]

So the question arises: What is self-organizing predictive knowledge mining?

Briefly, knowledge mining is data mining taken several steps further. It is a data-driven modeling approach, but unlike conventional data mining, self-organizing knowledge mining also builds the model itself, autonomously, including self-selection of the relevant inputs. It extracts the “knowledge” needed to develop the model from noisy observational data alone, as objectively as possible, in an inductive, self-organizing way. It generates optimal complex models according to the noise dispersion in the data, which systematically avoids overfitting the data. This is a very important condition for prediction. The resulting models are then available explicitly, for example in the form of nonlinear difference equations.
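To give a feel for what an explicit model "in the form of nonlinear difference equations" means, here is a minimal sketch. The two equations, their coefficients, and the variable names are purely hypothetical and are not the model from the post; the point is only that each variable is computed from lagged values of itself and of the others, so the system can be iterated forward month by month.

```python
def step(x, y):
    """Advance a hypothetical two-variable system by one time step.

    x and y are histories (lists, most recent value last). The equations
    and coefficients are invented for illustration only:
        x[t] = 0.1 + 0.8*x[t-1] + 0.05*y[t-2] - 0.02*x[t-1]*y[t-2]
        y[t] = 0.2 + 0.5*y[t-1] + 0.1*x[t-1]
    """
    x_next = 0.1 + 0.8 * x[-1] + 0.05 * y[-2] - 0.02 * x[-1] * y[-2]
    y_next = 0.2 + 0.5 * y[-1] + 0.1 * x[-1]
    return x_next, y_next


def forecast(x_hist, y_hist, horizon):
    """Iterate the system `horizon` steps ahead, feeding predictions back in."""
    x, y = list(x_hist), list(y_hist)
    for _ in range(horizon):
        x_next, y_next = step(x, y)
        x.append(x_next)
        y.append(y_next)
    # Return only the newly predicted values.
    return x[len(x_hist):], y[len(y_hist):]


if __name__ == "__main__":
    # 72 monthly steps correspond to a 6-year-ahead prediction.
    fx, fy = forecast([1.0, 1.0], [2.0, 2.0], horizon=72)
    print(fx[:3], fy[:3])
```

Because every variable appears both as an output and as a lagged input of the other equation, the system is interdependent in the sense used above.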

So this approach is different from the vast majority of climate models, which are based on theories.

Why self-organizing predictive knowledge mining is needed

I think there is no dissent in the community that there is no complete, not even sufficient, a priori knowledge about the complex processes in the atmosphere and its external influences and interdependencies with the ocean, land, and the universe. We only know a few things for sure. This alone makes it an ill-defined modeling problem that is characterized by:

  • Insufficient a priori information about the system for adequately describing the inherent system relationships. In real-world systems, each variable is dynamically related to many others, which often makes it difficult to differentiate which are the causes and which are the effects.
  • Possessing a large number of variables, many of which are unknown and/or cannot be measured. Even using only variables that have been measured, you easily get many hundreds to tens of thousands of variables when considering a higher degree of system memory (dynamics).
  • Noisy data available in data sets with a small number of observations. This is a serious problem, especially when the number of samples is smaller than the number of variables (a so-called under-determined modeling task), which is usually the case for climate modeling. Temperature records start in 1850 and records for other factors start in the middle or the end of the last century. This is very limited “true” information.
  • Vague and fuzzy objects whose variables and results have to be described and interpreted adequately, which leads to uncertainty.
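The under-determined case can be made concrete with simple arithmetic. The sketch below uses figures given for the model discussed here (6 variables, time lags of up to 120 months, monthly data over roughly 33 years); counting one candidate input per variable and lag from 1 to the maximum is an assumption made for illustration.

```python
def candidate_inputs(n_variables, max_lag):
    """Number of lagged candidate inputs: one per variable and lag 1..max_lag."""
    return n_variables * max_lag


def usable_samples(n_months, max_lag):
    """Samples left once the first max_lag months are consumed as lag history."""
    return n_months - max_lag


if __name__ == "__main__":
    inputs = candidate_inputs(n_variables=6, max_lag=120)
    samples = usable_samples(n_months=33 * 12, max_lag=120)
    print(inputs, samples, inputs > samples)
```

With these numbers there are 720 candidate inputs but only 276 usable samples, so an ordinary regression on all inputs has no unique solution.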

This means that there is a serious methodological problem in climate modeling. For ill-defined systems, the classical hard approach, which is based on the assumption that the world can be understood objectively and that knowledge about the world can be validated through empirical means, needs to be replaced by a soft systems paradigm that can better describe vagueness and imprecision. This approach is based on the observation that humans only have an incomplete and rather vague understanding of the nature of the world but nevertheless are able to solve unexpected problems in uncertain situations. Theory-driven modeling approaches have been used to advantage in cases of well-understood problems, where the theory of the object being modeled is well known and obeys known physical laws. Theory-driven approaches are, however, unduly restrictive for ill-defined systems because of insufficient a priori knowledge, complexity and the uncertainty of the objects, as well as the exploding time and computing demands. This is the case in climate modeling as well as for other environmental, life science, and socio-economic problems.

On the other hand, we have all these observational data; although limited, they are priceless information about the system and its behavior. This information only needs to be extracted appropriately so as to transform it into useful knowledge and (non-physical) models that predict and simulate the system development, that help to get a better understanding and increase our knowledge about the system, and that support decision-making under uncertain conditions.

The adaptive learning path

Each methodology has its strengths and limits. Every single model reflects only a fraction of the complexity of real-world systems. What is necessary is a holistic, combined and interdisciplinary approach to modeling that takes into account the incompleteness of a priori knowledge. Knowledge mining benefits from well-known, justified theoretical knowledge about the system to get the most reliable and accurate predictive models out of the data. These models in turn may reveal new knowledge that could be used in further steps, and they can be applied – together with physical models where and when suitable – in simulation and scenario development. In this unifying way it should be possible to better, more completely and more adequately describe, understand, and predict the complex behavior of the Earth’s climate, including uncertainty as its integral part.

Even more difficult is the task of control. In complex systems, there is no simple, single cause-effect relationship that we humans apparently find so appealing. There is no single control knob that only needs to be moved up or down to get a desired result, and only this result, with no unwanted side effects (this is another inconvenient truth). Instead, the system variables interact highly dynamically in time and space; they are inputs and outputs, causes and effects at the same time. This interaction pattern is very difficult to understand and interpret even if it were fully known. We need reliable tools in the form of interdependent system models that help us understand and deal with it (see fig. 2 in the mentioned post). I believe – and I think I’m not alone here – no individual person or expert group will ever be able to come close to formulating these complexities by theoretical approaches alone.

The primary question is not whether CO2 – or any other single factor – drives global warming, but first of all, whether the modeling approach is adequate to describe the system under research. To me, models based entirely on “CO2 theory” are methodologically inadequate for predicting, describing, and explaining global warming, because of the above reasons, especially given their claimed evidence. AGW protagonists as well as the skeptics (the “deniers”; it’s only the viewpoint that defines who denies what), we all suffer from the same fact: we cannot prove our theories and concepts in the general view due to lack of sufficient knowledge and understanding of the climate system. This could be a never-ending story until reality decides who is right. But this way we would not have learned or gained much.

The other, more sophisticated way would be to understand and respect the sound arguments and conclusions of the “opposite party” as partial (mental) models of the complex, hidden Truth, as different views on the same object, as pieces of the entire picture that we all are trying to reveal. No single model can reflect reality completely. We need an ensemble of models that use different information and modeling approaches. We have to continue to learn and to gain new knowledge to further reveal Truth. New knowledge, however, cannot be obtained without external information. A major, but so far unused, source of external information in climate modeling is knowledge mining.

This path would be the adaptive learning path.

As of today, we do not know enough about climate and climate change to be sure of anything. We can learn and gain knowledge from different sources, approaches and views and eventually will achieve the understanding and clarity needed to make sound decisions. We have to start thinking differently.

JC comment:  I think this is a fascinating methodology for attempting to untangle the complex relationships between clouds, surface temperature, atmospheric composition, etc.   The potential predictive capability on timescales of a year to a decade fills a critical gap in our understanding and predictive capability.  Conclusions regarding AGW and the role of CO2 cannot be drawn from 23 years of data, but this methodology in principle could be extended to longer time periods.  I look forward to the discussion on this.

Moderation note:  this is a technical thread and comments will be moderated for relevance.

216 responses to “Self-organizing model of the atmosphere”

  1. Exactly: Nature will have the last say.

  2. How is this different from neural nets?

    • First off, if it is to be based on the model of the brain then the perpetuation of the genotype is paramount. But, nature does not care one whit about that.

      • No, no. The “neural net” is a method, that is roughly based on a mammal brain, and its ability to learn. It’s not necessarily evolutionary. Genetic algorithms are something else.

        http://en.wikipedia.org/wiki/Neural_network

        The software net needs to be “trained”, using (lots of) empirical data. This method seems conceptually similar, but not exactly the same.

      • I think you missed my point. What will spark your neural net? What is to be the God factor that breathes life into the model?

        I would say that nominally, it is the sun, stupid. But in doing so, instead of modeling the beginning of life, the only thing that interests the global warming alarmist community is modeling the coming climate apocalypse and predicting society’s downfall unless free enterprise capitalism and individual liberty is subjugated to the will of the state and replaced with the Left’s Marxist Utopia.

      • What happened to “moderated for relevance”?

      • I think you missed the point as well. These models are not used to reinforce preconceived ideas about reality but rather to eliminate bias by explaining it using variables–like cloud cover and solar activity–that were previously either not considered or not given much weight. When it comes to getting honest results, everything lies in determining the motivation.

    • It sounds like it generates nonlinear difference equations that fit the data, using pre-specified parameters, and possibly some specified theoretical relations among these parameters and/or the data variables. Nothing like neural nets, which are basically weighting functions.

      • But what they have in common is adaptivity. And real (biological) neural nets can also do differential operations, so the method could be generalized to handle calculus. Again, there are some similarities; this seems more straightforward at the expense of being less general.

        From what I understand, the computing power required for a net to “learn” something as unwieldy as the climate system would be prohibitive. With more mips (gips? nips? tips?), a neural net might be able to tell us something about the computability of the climate system in the future.

      • I doubt that brains are neural nets, but never mind. Also, neural nets might be interesting as climate models; I don’t see how but I really have no idea. The point is that this methodology sounds nothing like a neural net mathematically.

      • Of course (vertebrate) brains are neural nets. However, the neurons in vertebrate brains are far more complex and “intelligent” in their behavior than the modules used in software “neural” nets.

      • “Of course”? Neural nets embody a very specialized set of mathematical functions. There is no reason to believe that brains embody these functions, except sales hype. I personally doubt that brains are computational at all, as there is no one to read the output. The computational model of the mind is a metaphor at best.

      • Neural nets were originally developed as simplistic simulations of small parts of the vertebrate brain. Once they were in silico they evolved away from their origin, but real nervous systems still come within the original definition.

      • @David Wojick…

        I personally doubt that brains are computational at all, as there is no one to read the output. The computational model of the mind is a metaphor at best.

        I’d have to disagree with you. If you regard the axons as signal carriers specifically analogous to the connections in an in silico neural network, with the neurons corresponding to the nodes, neurons perform a (potentially) very powerful analog calculation, based on signal timing of action potential arrival at the synapse. The shape of the signal arriving at the synapse is also important, although it usually depends on only the immediate chemical environment of the nearby axon. (In some cases, however, the shape of the action potential at the soma can actually influence the shape at the synapse.)

        Simplifying enormously, the inputs of the brain are sensory, the outputs are motor and hormonal.

        Neural networks started out by simulating a very tiny subset of what real neural networks do. They evolved from there, based on the needs of their designers. But biological neural networks are very real networks, in every sense of the word, and they perform massive computations. In many ways these computations are analog where normal in silico computations are digital at the base (but don’t forget that there’s an analog system at the base of every logic gate). However, that doesn’t make them not computations, just not the kind of computation you may be used to.

    • There are several differences. The main difference is that it performs model parameter AND structure identification, autonomously. It starts from an empty model. You put in hundreds or thousands of potentially relevant inputs of all sorts and a number of noisy observational records (which actually can be much smaller than the number of inputs), and at the end you get an “optimal complex model” that is composed of a rather small number of relevant inputs (usually < 20, but this is not predefined). Modeling stops automatically (no parameters) and you get the model in explicit analytical form. This is the concept of inductive self-organization, which also formed the theory of noise immunity modeling. This is knowledge extraction from data that originates from cybernetic principles of N. Wiener and S. Beer (not biology), but also includes concepts of systems and control theory (the author of this method, A.G. Ivakhnenko, was at that time (1968) a recognized scientist in control theory), mathematics, information and computer sciences.
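The idea of identifying structure as well as parameters can be sketched roughly as follows. This is not the KnowledgeMiner implementation and not a full multilayer algorithm; it is a single, simplified selection layer under stated assumptions: every pair of candidate inputs is fitted by least squares on a training part of the data, and the pair is then scored on a separate checking part that took no part in the fit. All function names, the polynomial form of the pairwise model, and the train/check split are illustrative choices.

```python
import random
from itertools import combinations


def solve(A, b):
    """Solve the linear system A w = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def features(a, b):
    """Terms of the pairwise model y = w0 + w1*a + w2*b + w3*a*b."""
    return [1.0, a, b, a * b]


def fit_pair(xa, xb, y):
    """Least-squares weights via the normal equations (A^T A) w = A^T y."""
    rows = [features(a, b) for a, b in zip(xa, xb)]
    k = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(AtA, Aty)


def mse(w, xa, xb, y):
    """Mean squared error of a fitted pairwise model on the given samples."""
    preds = [sum(wi * fi for wi, fi in zip(w, features(a, b)))
             for a, b in zip(xa, xb)]
    return sum((p - yi) ** 2 for p, yi in zip(preds, y)) / len(y)


def best_pair(inputs, y, n_train):
    """Fit every input pair on the first n_train samples, score on the rest."""
    scored = []
    for na, nb in combinations(sorted(inputs), 2):
        xa, xb = inputs[na], inputs[nb]
        w = fit_pair(xa[:n_train], xb[:n_train], y[:n_train])
        err = mse(w, xa[n_train:], xb[n_train:], y[n_train:])
        scored.append((err, (na, nb), w))
    return min(scored)  # lowest error on the unseen checking samples wins


if __name__ == "__main__":
    rng = random.Random(0)
    u = [rng.uniform(-1, 1) for _ in range(40)]
    v = [rng.uniform(-1, 1) for _ in range(40)]
    noise = [rng.uniform(-1, 1) for _ in range(40)]
    y = [1 + 2 * a - 3 * b + 0.5 * a * b for a, b in zip(u, v)]
    print(best_pair({"u": u, "v": v, "noise": noise}, y, n_train=25)[1])
```

Scoring on data not used for fitting is what lets irrelevant inputs drop out without a user-set threshold. A fuller version would feed the best pairwise models into further layers and stop once the checking error no longer improves.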

      • AFAIK neural nets have been successfully used for this sort of thing. It tends (IIRC) to be very “theory neutral” in ways that more traditional data mining isn’t.

      • @knowledgeminer…

        This is knowledge extraction from data that originates from cybernetic principles of N. Wiener and S. Beer (not biology)

        Reading the description at your website (of the book you co-authored), I find reference to something like an evolving neural net combining some of the principles of neural nets (vaguely simplified from biology) with Darwinian Evolution:

        this book introduces principles of evolution – inheritance, mutation and selection – for generating a network structure systematically enabling automatic model structure synthesis and model validation. Models are generated from the data in the form of networks of active neurons in an evolutionary fashion of repetitive generation of populations of competing models of growing complexity and their validation and selection until an optimal complex model – not too simple and not too complex – has been created. That is, growing a tree-like network out of seed information (input and output variables’ data) in an evolutionary fashion of pairwise combination and survival-of-the-fittest selection from a simple single individual (neuron) to a desired final, not overspecialized behavior (model). Neither, the number of neurons and the number of layers in the network, nor the actual behavior of each created neuron is predefined. All this is adjusted during the process of self-organisation, and therefore, is called self-organising data mining.

        Sounds like something one might call “Neural Nets 2.0” when selling it to the right audience. I suspect I’ve read bits and pieces about this in the literature.

      • Oops, forgot the link again.

      • We need more information about what inputs you used. I gather that they were time series of (1) measured value (2) first difference and (3) second difference for each of a bunch of variables such as (a) global mean temperature, (b) global mean CO2, (c) global mean solar activity (perhaps several), and others. From there it looks like a neural network algorithm, but it’s hard to tell.

        I don’t recall names exactly, but if you presented KnowledgeMiner in one of the product booths at the JSM 2011 in Miami, then I saw it.

      • steven mosher

        Inputs: sunspots, ozone, aerosol index (log of a ratio), reflectivity from a time series with 4 different platforms in its history.

        Note: the measurements of ozone, aerosols and reflectivity ALL depend upon radiative physics.

        So basically the model disproves the science used to generate the inputs

      • no temperature, no CO2, no deforestation, no first or second differences?

      • CO2, global temp, ozone, aerosols, reflectivity, sunspot numbers with time lags of up to 120 months. This list is surely not complete, which adds noise to the data. This is a characteristic property of ill-defined systems.
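Inputs with time lags, as listed above, are typically arranged as a table in which each row holds the past values available at one point in time. The sketch below shows one plausible way to build such a table; the function name, the naming scheme for columns, and the choice to start lags at 1 are assumptions for illustration, not a description of how the tool prepares its data.

```python
def lagged_rows(series, target, max_lag):
    """Build lagged candidate inputs for predicting series[target][t].

    series: dict mapping variable name -> list of values (equal lengths).
    Returns (rows, y, names), where rows[i] holds every variable at lags
    1..max_lag and y[i] is the target value at the same time step.
    """
    order = sorted(series)
    names = [f"{name}(t-{k})" for name in order for k in range(1, max_lag + 1)]
    n = len(series[target])
    rows, y = [], []
    # The first max_lag time steps have no complete lag history, so skip them.
    for t in range(max_lag, n):
        rows.append([series[name][t - k]
                     for name in order for k in range(1, max_lag + 1)])
        y.append(series[target][t])
    return rows, y, names


if __name__ == "__main__":
    demo = {"co2": [0, 1, 2, 3, 4, 5], "temp": [10, 11, 12, 13, 14, 15]}
    rows, y, names = lagged_rows(demo, target="temp", max_lag=2)
    print(names)
    print(rows[0], y[0])
```

First or second differences, if wanted, could be added as extra series before calling the function.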

      • OK. Somehow I got the idea that first differences were included, not merely lags. And the lags, sunspots only, or all measured variables?

  3. Censored, what a surprise.

    • What you are talking about is WAY off topic for this thread. Take it to week in review.

      • On the boards it usually comes down to the same topic which I related to your own comment on the article. I’ll come back after 200 posts to comment and I’ll obviously be correct.

        “Conclusions regarding AGW and the role of CO2 cannot be drawn from 23 years of data, but this methodology in principle could be extended to longer time periods.”

        Science that demands policy needs conclusions, as the thread sinks into the usual partisan abyss I’ll post later if permitted.

  4. Frank, this methodology is certainly interesting and it probably comes with a large learning curve. The apparent lack of CO2 influence is especially interesting to skeptics. Where can we learn more?

    However, you are mistaken regarding skepticism when you say we share with AGW proponents the fact that “we all suffer from the same fact: we cannot prove our theories and concepts in the general view due to lack of sufficient knowledge and understanding of the climate system.”

    Skeptics are not trying to prove any theories or concepts. Our point is specifically what you say yourself, namely that we lack sufficient knowledge and understanding of the climate system. In short you are a skeptic. Welcome aboard.

    • Thanks, David! :)

      No, it’s very easy to learn, especially if you are familiar with modeling. The primary goal is to build models that are as objective as possible by limiting subjective influences (parameters, assumptions etc.) conceptually. In other words, automation. In the easiest case, you just define your data set (the inputs and number of samples) and the modeling task (prediction in time, identification, classification) then press “Self-organize”. Depending on the size of the data set you get the model in a few seconds (small), minutes (several hundreds of inputs), or tens of minutes (thousands of inputs).

      More info on:

      http://www.knowledgeminer.com

      I agree with your point on skepticism. What I mean is that skepticism cannot disprove the claims of the protagonists.

      • Thanks Frank (I assume you are Frank),
        I will look at this with interest, however, I can’t believe this is mathematically simple. Also, I wonder what you mean by modeling. For example, I do not consider statistical analysis to be modeling. I use that term for what you may be calling theoretical modeling. For example, I have used a disease model to explore the dynamics of scientific information. See http://www.osti.gov/innovation/research/diffusion/

        On the other hand, if you are generating equations then it looks like modeling. Fascinating.

      • In fact, it is modeling by knowledge extraction from data.

      • Remember that in reality or genuine science it’s not necessary for skeptics to disprove those claims. Don’t fall for the “reversed Null Hypothesis” trick. The protagonists must prove that all hypotheses except their CO2-causality one fail. They fudge that by parameterizing them into insignificance first, then “discovering” that only CO2 is operative.

        Your method, by contrast, discovers that CO2 cycling does not interact with any of the other variables at all. This has long been my “intuition”, but I get excoriated by warmists, lukewarmists, and all “reasonable compromisers” alike when I express it.

        Thank you.

      • Leonard Weinstein

        knowledgeminer
        You stated: “skepticism cannot disprove the claims of the protagonists”. This is not true if the AGW hypothesis specifically makes a falsifiable claim (hot spot in the upper troposphere in low latitudes, increasing sea heat content to compensate for non-rising temperature while CO2 is increasing, etc.) and the claim is falsified in a time period sufficient to be representative. There can never be an absolute “proof” of the validity or non-validity of a general hypothesis, since the specific claim may be wrong, but a revised version of the hypothesis can be made without the previous falsified claim. However, the claim as made can be falsified.

  5. It’s hard to know what to make of Frank Lemke’s article, because the descriptions here of his approach were fairly general, at least in the sections I’ve read. I agree with him that uncertainties regarding the quantitative results from some of the complex models lead to a wide range of possible outcomes. On the other hand, the claim that CO2 is probably not a major factor in global warming over the long term (a few decades and longer) is contradicted by a combination of simple models and observational data – see the very informative thread on Probabilistic Estimates of Transient Climate Sensitivity. In addition, our knowledge of basic principles of radiative transfer, again supplemented by observational data from satellites and ground stations, tells us that CO2 and other greenhouse gases must play a significant role, even if the role of other factors may also be important. Much of this has been discussed in past threads on anthropogenic vs natural variability, and forced vs unforced variability, which indicate that short term fluctuations are often the product of various unforced factors, while longer trends are dominated by the anthropogenic component – at least since 1950 and to a lesser extent earlier in the century. This issue appears to arise frequently. I refer to it as the “default theory of CO2”, which proposes that the effects of CO2 on climate are what is left over after everything else is accounted for, and if everything is accounted for, CO2 has little or no role. In fact, we now have more than sufficient affirmative evidence for the radiative forcing/feedback role of CO2 and other GHGs to see it as independent of any default role – i.e., it must be an active force regardless of the other variables.

    Despite these caveats, I don’t want to dismiss Lemke’s approach as without value, but its current utility, however great or little that may be, seems to me to be probably limited to short term predictions, and I think it would be unwise to extrapolate beyond a few years. There is also some danger in the application of theoretically derived paradigms by individuals without reasonably detailed knowledge of climate physics, because this can easily lead to small misinterpretations that generate inaccurate conclusions. Regardless, it would be interesting to learn more details. It may also be that the principle is more usefully applied to important phenomena other than climate rather than to climate change, at least at present.

    • steven mosher

      Agreed.

      Looking at some of the model performance

      http://www.climateprediction.eu/cc/Media/slideshow.html?backgroundColor=rgb(0%2C%200%2C%200)&reflectionHeight=100&reflectionOffset=2&captionHeight=0&fullScreen=1&transitionIndex=2

      I would have to give the approach an F.

      One problem is that he appears to have used only 23 years of data to extract model parameters. When the physics tells us that the CO2 signal is lagged, trying to build a model from that short a period will fail.

      • If that’s the case, we’re truly hosed, though. The further back you go, the more suspect the data quality gets. It seems like this is a very basic dilemma.

      • You mean when the speculation tells us that the CO2 signal is lagged. I love the way you warmers confuse your speculations with “the physics” or “the science.” Ultimate hubris. (Hint: there may be no CO2 signal.)

      • steven mosher

        Any analysis that purported to extract knowledge from data would have to at least “consider” the notion that the effect that CO2 has is lagged. If it doesn’t, then that is clearly an assumption. Simply, by trying to model the temperature from 1987 to today, by using CO2 measures from 1987 to today, you have assumed that the effect is instantaneous. When you show that CO2 doesn’t show up in the model, all you have shown is that the assumption that CO2 is not lagged is wrong.

        The other things I would check are: Did they include TSI? What curve? Did they include volcanic forcing? What curve? (And we know this is lagged.) Did they use CO2 as input in ppm, or did they do a log transform? To be physically meaningful, they would have to do a log transform. Or rather, if they didn’t do a log transform, then they really haven’t shown anything. Almost without exception, most skeptics (and lukewarmers) recognize LTP in the climate. If you have a process that has long-term memory, then you need to recognize that in the data you put into a model. LTP would be something like 30- or 60-year or longer processes. So if this approach even hoped to explain natural variation, it would also need to consider longer time frames.

      • Steve, it is these kinds of suggestions that this post was intended to elicit, thanks.

      • “Any analysis that purported to extract knowledge from data would have to at least “consider” the notion that the effect that CO2 has is lagged. If it doesn’t, then that is clearly an assumption.”

        The ice core records seem to have an 800-year lag. So, I don’t know the degree of lag one should consider. The possible range of lag is, say, a day to 800 years. Not making this lag more precise seems to make it almost useless to consider.

      • Simply, by trying to model the temperature from 1987 to today, by using CO2 measures from 1987 to today, you have assumed that the effect is instantaneous.

        No, just that the time constant is on that order. First order lag isn’t the same thing as dead time. You do get some response, and thus information within that time horizon.

        The only process that’s likely to behave like a dead time is the oceans. Unfortunately, that may turn out to be the governing process.
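The difference between a first-order lag and a dead time can be seen in a small step-response sketch. The time constant and delay of 120 months are arbitrary values chosen for illustration, and the simple explicit Euler update is an assumption of this sketch, not something taken from any climate model.

```python
def first_order_lag(u, tau, dt=1.0):
    """Response of a first-order lag: y moves toward the input u by a
    fraction dt/tau of the remaining gap at every step (explicit Euler)."""
    y, out = 0.0, []
    for uk in u:
        y += (dt / tau) * (uk - y)
        out.append(y)
    return out


def dead_time(u, delay):
    """Response of a pure dead time: the input shifted by `delay` steps,
    with zeros until the delayed signal arrives."""
    return ([0.0] * delay + list(u))[:len(u)]


if __name__ == "__main__":
    step_input = [1.0] * 24           # a step held for 24 months
    lag = first_order_lag(step_input, tau=120.0)
    dead = dead_time(step_input, delay=120)
    print(round(lag[-1], 3), dead[-1])
```

Within a window much shorter than the time constant, the first-order lag already shows a partial response, whereas the dead-time process shows none at all until the delay has elapsed.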

      • Yes, it would be interesting to see a more detailed description of the model, perhaps a recent publication on the topic.

        After reading a bit more on the subject, I get a hunch that this is not that mainstream – this is just a hunch mainly based on lowish citation counts. What I was able to find (quickly) were written almost solely by Frank Lemke himself. This doesn’t mean it is somehow false or anything, just wondering!
        Seemingly it is kind of a neural network that is built after all, perhaps with somewhat similar ideas as Kohonen maps employ (preserving topology etc).

        As said before, it would be interesting to see how the predictions of these ‘simpler’ models and supercomputer-GCMs compare say, after 10 years or so.

      • Leonard Weinstein

        Steve,
        Please tell me why you and the modelers seem to think CO2 effects are lagged. Are you making the point CO2 persists in the atmosphere? That is not an additional heating effect, just persistence of heating. Is it the ocean warming (collecting Joules)? Where is that now (it is not heating for quite a while now)? Is it somehow driving ENSO? How? The statement of cause of lagging needs to be specifically stated and supported, not just repeated.

      • When a volcano gives off CO2 and other gases there should be abnormally high concentrations of CO2 in the surrounding area. I would be interested in knowing (if someone here knows the answer) whether these areas get the extra 2-5 degrees C per doubling of CO2 as the lukewarmers claim, or not. I would expect there should be some sort of heat island effect.

      • steven mosher

        Look at the model.

        They used sunspot numbers rather than TSI

        http://www.climateprediction.eu/cc/Main/Entries/2011/9/13_What_Drives_Global_Warming.html

        Sunspot numbers are known to have been overcounted recently.

        The decision about what “counts” as a spot is arbitrary, so it’s more physically realistic to use TSI, which has the benefit of actually being in Watts (so your units are at least right).

        And again, using CO2 concentration (ppm) would be wrong as well, since you want to use the log of CO2. And no complaining about using the log, because they used the aerosol index, which, you guessed it, is a log itself.
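What a log transform changes can be shown in a few lines. The reference concentration of 280 ppm and the function name are assumptions for illustration; the sketch only demonstrates that, after taking the log of the ratio, each doubling contributes the same increment, whereas the raw ppm differences grow.

```python
import math


def log_ratio(ppm, ref_ppm):
    """Natural log of the concentration relative to a reference value."""
    return math.log(ppm / ref_ppm)


if __name__ == "__main__":
    first_doubling = log_ratio(560.0, 280.0)     # 280 -> 560 ppm
    second_doubling = log_ratio(1120.0, 560.0)   # 560 -> 1120 ppm
    # Raw differences are 280 and 560 ppm; the log increments are equal.
    print(first_doubling, second_doubling)
```

Whether a data-driven method needs this transform supplied beforehand, or could approximate it from the nonlinear terms it builds itself, is a separate question that the sketch does not settle.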

      • @anander
        “After reading a bit more on the subject, I get a hunch that this is not that mainstream – this is just a hunch mainly based on lowish citation counts. What I was able to find (quickly) were written almost solely by Frank Lemke himself. This doesn’t mean it is somehow false or anything, just wondering!”

        Well, this is really an ill-posed task. Try solving it with the info you have and observe yourself how assuming different aspects impacts your answers. :)

        No, maybe this helps:

        http://is.gd/aJtT3e

      • OTOH, if this method could be married to a more dynamic-aware method, 23 years might be able to glean some better information. Maybe a hybrid of this and some of the dynamic CS theory may lead somewhere.

    • Fred, looking at the post you are referring to, and at the Padilla et al. paper that is discussed in that post, you should have noticed that the warming rate computed by models for the [1910 – 1940] period is 3 to 4 times lower than the observed one.

      Furthermore, models are unable to produce the [1940 – 1970] cooling, except during the short [1960 – 1965] period corresponding to the Agung eruption. Indeed, volcanic (i.e., aerosol) forcing is the only way for models to produce significant cooling, and, as Judith mentioned somewhere else, all models show an excessive sensitivity regarding volcanic forcing.

      • I can’t confirm your statement about the data in Padilla et al. I also disagree that volcanic forcing is the only source of aerosol cooling, or that you can generalize about “all models”. It’s true, though, that observational data during the 1950s to 1990s interval showed variations in the amount of solar radiation reaching the surface under clear sky conditions that would be expected from aerosol cooling effects and their later reversal.

  6. Predicting the future climate is as accurate as predicting the future?

    • “Prediction is very difficult, especially about the future.” — Yogi Berra, Samuel Goldwyn, Niels Bohr, Robert Storm Petersen, (Mark Twain?)

  7. “Conclusions regarding AGW and the role of CO2 cannot be drawn from 23 years of data, but this methodology in principle could be extended to longer time periods.”

    Limited data is an issue, yes. Actually, the models have been developed from data starting in 1978 (33 years) using time lags of up to 120 months.
    This was what was available by public sources for all variables used (aerosols, ozone, clouds, CO2 especially).
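To make the data budget concrete, here is a minimal sketch of how a monthly series turns into lagged training rows. It is not the software used for the models discussed here; the function name and the placeholder series are hypothetical, and only the 33-year span and the 120-month maximum lag come from the comment above.

```python
def lagged_rows(series, max_lag):
    """Return (target, lags) pairs for every month that has a full lag history.

    lags[0] is the value one month back, lags[-1] the value max_lag months back.
    """
    rows = []
    for t in range(max_lag, len(series)):
        lags = [series[t - k] for k in range(1, max_lag + 1)]
        rows.append((series[t], lags))
    return rows


months = 33 * 12                            # 33 years of monthly data
series = [float(i) for i in range(months)]  # placeholder values, not observations
rows = lagged_rows(series, 120)             # lags of up to 120 months
usable = len(rows)                          # months left with a complete history
```

With 396 monthly values and lags of up to 120 months, the first 120 months serve only as history, so fewer rows remain for fitting than the raw record length suggests.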

    • 33 years sounds good enough to me. As Lindzen says, if you can’t find it, it probably isn’t there. But I would be happy to put this issue on hold for 100 years or so, if that’s what the warmers need to make their case.

      • Don’t tell that to a conspiracy nut. The fact that you can’t find it is proof that they’re hiding it.

      • steven mosher

        David, when we talk about LTP, when we talk about natural cycles that may exceed 60 years in length, what we are arguing is that what happens today is conditioned by processes that last 60 years or longer. The point is that by limiting the data to 10 years of lag, or 20, or 30 or 60, all these choices preclude finding LTP. They make the tacit assumption that there is no inertia or memory in the system. The debate between the hardcore AGW types and the hardcore natural variation types TURNS on the very question of what LTP is in the system. By looking at short time frames, short relative to the putative cycles in the observables, you rule out A PRIORI that there is LTP in the system and you rule out that the CO2 effect can take a long time to emerge.

        The approach of deducing a physical model from data can work, but it’s limited by the data you feed it. And the data you feed it is a choice, an analytical choice that, like all analytical choices, is subject to bias. The way to control for the bias is not by asserting that you think 33 years is long enough. The way to control for that is to TEST.

      • Why not feed it all the tree-ring data etc.? Not as temps, but as separate classes of measurement? Let the model-builder decide what they mean.

      • Why would we want to do that when Mann-o-matic is the state of the art?

    • steven mosher

      There are public sources going back much further than that and there are several other variables you would need to consider. In addition, there is some data that you will want to transform before putting it into the model.
      I’ve seen similar work going back centuries; maybe I’ll scare up a reference here. In general, the response function takes decades. Decades of lag, not 10 years. You can look at it this way.

      What you’ve shown is that if you consider only a few variables of interest and neglect lags and LTP, the model will not understand the climate.
      Look at your regional predictions; the skill scores are not anything to write home about.

  8. Frank – That is somewhat disconcerting, because we have excellent evidence from the physics that CO2 played a dominant role between 1978 and recent years, along with declining cooling aerosols. It’s also difficult to interpret what you mean by including clouds, since clouds change in response to changes in forcing agents such as CO2 and aerosols, and so cloud changes are to some extent surrogates for these agents. In other words, if you subtract the effect of clouds, you are also subtracting some CO2 effects, although your modeling would then also need to include changes in the critical variables of tropospheric water vapor and in planetary albedo since 1978. Without a detailed evaluation of the physics, assigning independent roles to these variables, some of which are effects of others, is problematic.

    • Fred, the evidence isn’t all that excellent; see the recent NCAR climate model simulations, particularly this figure. Doesn’t seem to me that we understand what has been going on since 1978 all that well.

      http://judithcurry.com/2011/05/08/ncar-community-climate-system-model-version-4/

      See Figure 12 of the Gent et al. paper

      The whole point of Frank’s approach is that it identifies interactions among these variables.

      • Judy – I saw nothing in that discussion to contradict the conclusion that the interval since 1950 was dominated by anthropogenic GHGs, mainly CO2. The critical element is the starting date – 1950. If one starts later (e.g., the 1970s), the relative roles are less certain, but to some extent that reflects the greater importance of natural variations on shorter timescales, and also the effect of interval start/end dates on the averaging of natural climate variables such as the AMO and PDO.

      • Look at the model results, compare to observations. By 2005, model results are almost 0.5C higher than the observations.

      • That is the CCSM4 model compared with CCSM3. Models do differ, but that was implicit in the different climate sensitivity values that emerged, and did not signify to me that one particular warming factor was more or less important than another, as I mention in my comment below.

      • The significance is the comparison of CCSM3 and CCSM4 with the observations. Using lots of tuning, CCSM3 agreed well with observations. Without that tuning, and with a better model, CCSM4 shows warming that is way too strong and ends up 0.5C too warm by 2005.

      • @Fred Moolten…

        That is the CCSM4 model compared with CCSM3.

        Isn’t that also CCSM4 compared to observations?

      • CCSM3, CCSM4, and observations are all compared. In my MIT talk I discussed the circular reasoning and tuning that produced the good agreement with CCSM3 and observations. CCSM4 simulations used a much better and more honest experimental design (with a better model).

      • Dr. Curry

        Looking through the NCAR model version 4 results, you are correct that NCAR version 4 seems to overpredict temperature change by about 40%. When I look at precipitation predictions, it seems more difficult to make a definitive assessment of the model’s accuracy. In general, the predicted changes seemed favorable for the US, if you could believe the model.

        Are you aware of where I could find specific precipitation predictions that I could compare with observed results?

      • Judith,

        You’ve said in a number of places that CCSM3 had been ‘tuned’ to fit the 20th Century where CCSM4 was not, but I think you’ve been taken in by an optical illusion: Run the model on a few years (using the IPCC A2 scenario) to 2005 and you see that it hasn’t matched well with 20th Century obs. at all. Indeed CCSM3 is running about 0.25ºC warmer than the ensemble mean by 2005.

        As noted in Gent et al. 2011 the main cause of this discrepancy is most likely the lack of an indirect aerosol effect.

      • Paul, the CCSM3 simulations used in the AR4 matched the surface observations to 2000 very well (and the model was tuned to the 20th century obs). The fact that the CCSM3 does not do well during the untuned period 2000-2005 also supports my point.

        I am not very convinced by the aerosol argument. The model seems hypersensitive to aerosols: note the big dips in modeled surface temp associated with volcanic eruptions, which are much larger than observed.

      • Dr Curry,
        It seems that CCSM3 is not doing so well over 1910–1940, with a warming rate about 2 to 3 times lower than observed.
        The 1940–1970 cooling is also not well represented.

        The fit is good over the 1970–1998 period but, as you already pointed out:
        – Totally wrong after 1998 (temperature anomaly overestimated by 0.5°C)
        – Excessively sensitive to volcanic forcing.

        This formally invalidates:
        – All climate models
        – CO2 control knob theory

      • doskonaleszares

        > “The significance is the comparison of CCSM3 and CCSM4 with the observations. Using lots of tuning, CCSM3 agreed well with observations. Without that tuning, and with a better model, CCSM4 shows warming that is way too strong and ends up 0.5C too warm by 2005.”

        > “Paul, the CCSM3 simulations used in the AR4 matched the surface observations to 2000 very well (and the model was tuned to the 20th century obs). The fact that the CCSM3 does not do well during the untuned period 2000-2005 also supports my point.”

        Judith,

        you keep saying that, but I haven’t seen any evidence for this claim, here or in your uncertainty papers.

        Also

        1) Bitz et al., “Climate Sensitivity of the Community Climate System Model Version 4,” identified several reasons for the higher sensitivity of CCSM4, and they have nothing to do with tuning;

        2) Meehl et al., “Climate system response to external forcings and climate change projections in CCSM4,” demonstrated that tuning by adjusting forcings was impractical, and even if you by pure chance managed to get your model to “agree well” with one specific metric (global mean temp), it would fail with others;

        3) CESM1 CMIP5 runs show less warming than both CCSM4 and CCSM3. Is it “tuned to the 20th century obs” or not?

      • Read the Gent et al. paper, they describe this. Re CESM1, I haven’t seen these simulations yet nor the documenting publication.

      • doskonaleszare

        > “Read the Gent et al. paper, they describe this.”

        Where? Could you please provide a quotation?

      • The relevant text is on page 4, in section 3

      • > The relevant text is on page 4, in section 3

        I assume you mean the following fragment:

        “In this section, the setup of preindustrial control and twentieth-century integrations of CCSM4 will be described, which used a strategy designed to address problems of energy balance and climate drift in CCSM3. For that model, most development effort went into producing a present-day control run, which was energetically well balanced at the top of the atmosphere (TOA) (Collins et al. 2006). The CCSM3 1870 preindustrial control run kept the same parameter values as the present-day control, but changed the forcings, which meant that the system lost heat at a rate of nearly 0.6 Wm-2. Thus, the entire ocean cooled in the CCSM3 1870 control run so that the total ocean heat content decreased very significantly. The CCSM3 twentieth-century runs were branched from this 1870 control, and the ocean heat content changes over the twentieth century had to be calculated with respect to the large drift in the 1870 control run (Gent et al. 2006). This strategy was less than optimal, and a different strategy was chosen for CCSM4. It was decided to concentrate on a preindustrial control run, and 1850, rather than 1870, was chosen because the carbon dioxide (CO2) and aerosol concentrations are closer to preindustrial levels in 1850. A real disadvantage of this choice is that it is much harder to compare the long 1850 control run with observations. However, this is outweighed by the advantage of having more realistic twentieth-century runs, where the climate system, including the ocean component, is gaining heat.”

        But in this case I’m afraid you are mistaken. They didn’t tune the CCSM3 to “give the correct 20th century variability”, or to “give a good 20th century simulation”, they didn’t even tune the CCSM3 to “the 20th century observations”, and it makes absolutely no sense to claim that the period 2000-2005 is “untuned”, and this is the reason for the temperature discrepancy.

        The NCAR team used 1990s values of forcings and observed climatology as the initial state, but
        1) their goal was to tune the model to near-zero TOA imbalance for the UNFORCED control run, which is of course inconsistent with “a good 20th century simulation.”
        2) since the proper preindustrial control used in 20CEN runs wasn’t that stable, and kept drifting cold with the TOA imbalance of 0.6 Wm-2 even after centuries of integration, your claim that the CCSM3 “was tuned to the 20th century observations” makes no sense at all.

        Oh, and by the way, Bitz et al tested what happens when you initialize the CCSM4 with a 1990s baseline climate. As it turned out, it raises, not decreases, the ECS.

      • doskonaleszare

        Well, ignoring this issue won’t make it go away. If you made a mistake and misrepresented what the NCAR team did in their CMIP experiments, you ought to issue a correction and retract the false statements you made here and elsewhere (e.g. in your MIT talk and response to a comment on your Uncertainty Monster paper).

      • I did not make a mistake. No one from the NCAR team has objected to what I said. For the CCSM3, the model parameters were tuned to the 20th century runs. Each modeling group also had the option of selecting whichever external forcing data they wanted, as long as it was within the range of the published forcing data sets. If you would like to challenge my statements, go ahead and work on it; find the documentation and interview people at the modeling centers. I have provided documentation to support my statements, and I have also spoken with climate modelers at NCAR.

      • doskonaleszare

        > I did not make a mistake. No one from the NCAR team has objected to what I said.

        Well, I’m afraid that’s no longer true. I contacted Dr. Gerald Meehl, and he confirmed that they didn’t “tune the model response to fit 20th century observations, and never have.”

        > For the CCSM3, the model parameters were tuned to the 20th century runs.

        No, they were not. Look at this graph:

        Does it look like the 20th century temperature change?

        > Each modeling group also had the option of selecting whichever external forcing data they wanted, as long as it was within the range of the published forcing data sets.

        For the sake of argument, let’s assume it’s true. But even if they had this option, it doesn’t necessarily follow that they could use it. For instance, if a modeling group implemented a tropospheric chemistry module in their GCM, and derived aerosol optical depth from emission data, it wouldn’t make sense to prescribe whichever aerosol forcing they wanted.

        Also, you claimed that

        “the 20th century aerosol forcing used in most of the AR4 model simulations (Section 9.2.1.2) relies on inverse calculations of aerosol optical properties to match climate model simulations with observations”

        I ask again: can you identify these (“most”) model simulations, or not?

        If not (because “the available documentation on each model’s tuning procedure and rationale for selecting particular forcing data sets is not generally available”), then how can you possibly claim that “most” of the models did inverse calculations of the aerosol forcing?

        > If you would like to challenge my statements, go ahead and work on it; find the documentation and interview people at the modeling centers.

        I did exactly that (did you? you describe the documentation as “not generally available”, which suggests you didn’t bother with checking the facts), and the documentation directly contradicts your claims.

        > I have provided documentation to support my statements, and I have also spoken with climate modelers at NCAR.

        So far, you have only provided a reference to the Gent et al paper, which you clearly misunderstood, since it doesn’t describe the “tuning to the 20th century runs”; and a reference to the AR4 section 9.2.1.2, which you also misunderstood, since it doesn’t describe the 20CEN model simulations.

        You haven’t shown any other evidence, besides the anecdotal “I have spoken with modelers and they didn’t object to what I said”.

        However, we do know that one modeller has objected to what you had said

        http://www.collide-a-scape.com/2010/08/03/the-curry-agonistes/#comment-13587

        “However, Judy’s statement about model tuning is flat out wrong. Models are not tuned to the trends in surface temperature. The model parameter tuning done at GISS is described in Schmidt et al (2006) and includes no such thing. The model forcings used in the 20th Century transients were also not tuned to get the right temperature response. Aerosol amounts were derived from aerosol simulations using the best available emissions data. Direct effects were calculated simply as a function of the implied changes in concentrations, and the indirect effects were parameterised based on the median estimates in the aerosol literature (-1 W/m2 at 2000) (Hansen et al, 2005; 2007).”

        To give another example, here are some relevant quotes from the CCSM3 paper describing their 20CEN runs (Meehl et al 2006, “Climate Change Projections for the Twenty-First Century and Climate Change Commitment in the CCSM3”):

        “For the 1870 control run, a version of the model was formulated including an interactive sulfur cycle which increased the model run time about 20%, and a branch was run from the 1990 control run with GHGs and solar forcing instantaneously set to 1870 values. Sulfates were set to near zero. The model underwent an initial cooling, but after about 300 yr, the surface climate stabilized, with a net radiative imbalance of about -0.6 Wm-2 at the top of atmosphere. This was associated with somewhat greater cooling in the deep ocean than in the 1990 control run. After the surface temperatures stabilized (i.e., long-term trend of -0.011°C century-1), this 1870 control run was continued for another 400 yr.”

        “The forcings included in the twentieth-century simulation are as follows.
        Sulfate: direct effect from sulfur cycle model using time- and space-varying SO2 emissions (Smith et al. 2001, 2004);”

        Where Smith et al are

        Smith, S. J., H. Pitcher, and T. M. L. Wigley, 2001: Global and regional anthropogenic sulfur dioxide emissions. Global Planet. Change, 29, 99–119.
        ——, R. Andres, E. Conception, and J. Lurz, 2004: Sulfur dioxide emissions: 1850–2000. Tech. Rep. PNNL-14537, JGCRI, 16 pp.

        So it’s again a “forward calculation”, where aerosols were derived from emission data.

        Another AR4 model, HadGEM1 (Stott et al, 2006): “Transient Climate Simulations with the HadGEM1 Climate Model: Causes of Past Warming and Future Climate Change”

        Section 2:
        “The interactive schemes implemented in HadGEM1 to deal with sulfate, black carbon, and biomass aerosols are described in detail in Martin et al. (2006). Natural emissions of sulfur from dimethylsulfide (DMS) and volcanoes are taken to be time invariant and were based on Kettle et al. (1999), Jones and Roberts (2004), and Andres and Kasgnoc (1998). Pre-2000 anthropogenic emissions of sulfur used data provided by S. J. Smith (Pacific Northwestern National Laboratory, 2004, personal communication). In the ALL and ANTHRO past transient simulations, emissions between 2000 and 2004 are assumed to follow those specified by the SRES A1B scenario (Nakicenovic and Swart 2000) and the two predictions follow the SRES A1B and A2 scenarios. Past and future fossil fuel black carbon and biomass smoke emissions were based on datasets provided by T. Nozawa (National Institute for Environmental Studies 2004, personal communication).”

        No sign of inverse aerosol calculations either.

        GFDL CM2.0 and CM2.1:

        http://nomads.gfdl.noaa.gov/CM2.X/faq/question_13.html

        “Tropospheric ozone —–> SOURCE: using MOZART chemistry-transport model-generated distributions {Horowitz et al., 2003; Tie et al., 2005}; “1990” climatology from NCAR’s MACCM3 used for all years; emission values used are estimates at the beginning of each decade.
        All Tropospheric Aerosols: [Sulfate, Black and Organic Carbon]; [Dust and Sea-salt not assumed to vary – values held fixed at the 1990 estimates].
        Same methodology as for Tropospheric Ozone. SOURCE: black carbon and organic carbon from fossil-fuel sources {Cooke et al., 1999}; biomass burning from {Hao and Liu, 1994; Andrea and Merlet, 2001}. Historical emissions produced by scaling present-day values based on the EDGAR-HYDE historical emissions inventory {Van Ardenne et al., 2001}.
        Only sulfate aerosols are assumed to have a hygroscopic growth. External mixtures for radiation calculations.”

        Another forward calculation.

        HadCM3 (Johns et al, 2003: “Anthropogenic climate change for 1860 to 2100 simulated with the HadCM3 model under updated emissions scenarios”). Section 2.3:

        “In addition to the improvements mentioned in existing physical schemes, HadAM3 also now includes the capability to model the transport, chemistry and physical removal processes of anthropogenic sulfate aerosol which is input to the model in the form of surface and high level emissions of SO2 (outlined later; a fuller description of the interactive sulfur cycle in the model may be found in Jones et al. 1999). Note that this is in contrast to some earlier climate model studies (e.g. Mitchell and Johns 1997), which only included sulfate aerosol effects based on prescribed burdens.
        Other recent climate change studies including an interactive sulfur cycle include Roeckner et al. (1999) and Boville et al. (2001).”

        MIROC3.2 (Nozawa et al, 2005: “Detecting natural influence on surface air temperature change in the early twentieth century”)

        “The atmospheric component of MIROC has an interactive aerosol transport model [Takemura et al. 2000] which can handle major tropospheric aerosols (sulfate, black carbon, organic carbon, sea salt, and soil dust).”

        “Four ensembles with different external forcing factors were carried out for the period from 1850 to 2000. The first one is FULL, where the simulations were forced with the both natural and anthropogenic forcings: changes in solar irradiance [Lean et al., 1995], stratospheric volcanic aerosols [Sato et al., 1993], WMGHGs [Johns et al., 2003], tropospheric and stratospheric ozone [Sudo et al., 2002; Randel and Wu, 1999], surface emissions of anthropogenic carbonaceous aerosols and precursors of sulfate aerosols [Nozawa and Kurokawa, 2005; Lefohn et al. 1999], and land-use [Hirabayashi et al., 2004].”

        See? I’ve already provided evidence that “most” of the AR4 model simulations didn’t use the inverse method, contrary to what you have been claiming.

        Now, I ask again: what’s your evidence?

      • You asked the wrong correction of Meehl. They tuned the model parameters for the 20th century runs.

      • doskonaleszare

        > You asked the wrong correction of Meehl.

        Oh, really?

        > They tuned the model parameters for the 20th century runs.

        You can keep repeating that claim, but since you haven’t provided us with a single shred of evidence, I’m sorry, I’m not going to buy it, especially since you’re directly contradicted by papers describing the CCSM3 20CEN runs, and by Dr. Meehl himself.

        Also:
        1. Your reference to section 3 of Gent et al paper proves that you misunderstood the purpose of the CCSM3 tuning.

        2. Your inability to name models which relied on inverse calculations of aerosol optical properties to match climate model simulations with observations suggests that you have never checked how they derived aerosol forcings; probably because, in your words, you “ran into a dead end (rather dead link) referenced to in Table S9.1. Sorting this out requires reading 13 different journal articles cited in Table S9”, and reading 13 different journal articles was just too time-consuming.

      • I realize you were referring to 1978, not 1950, so I should have addressed that point, but this is still what I had in mind in stating that ghgs plus aerosol changes combined to account for the warming, with ghgs a more dominant influence. If it had been 1950, I wouldn’t have invoked the aerosols. The figure you refer to shows that different climate sensitivity estimates yield different projections, but because both aerosols and ghgs (as well as solar changes) are more or less equally affected, the apportionment shouldn’t be significantly altered. I’m unaware of any other significant competitors for most post-1978 warming, although there are certainly changes consistent with the operation of cooling factors that offset some of the warming.

      • Gent et al also suggested that the excessive warming in CCSM4 might also reflect inadequate accounting for indirect aerosol effects, which are an offsetting cooling effect rather than a warming effect.

        I do take your point, though, that the Lemke method might in principle be useful in evaluating interactions. It doesn’t seem to have come out with the right attributions for the interval in question, however. Maybe that can be improved in the future.

      • I’m unaware of any other significant competitors for most post-1978 warming, although there are certainly changes consistent with the operation of cooling factors that offset some of the warming.

        The inclusion of chapter 4 in the WMO expert ozone assessment was to identify a number of problematic issues that arose in the IPCC 2007 such as the dynamical response to ozone “forcing” on SH climatology.

        By dynamical response, what we mean is the effects on transport (read: heat) that are poorly understood away from equilibrium, and the transitive effects on various metrics.

        The current chapter helps to place the Protocol’s climate impact within a wider context by critically assessing the effect of stratospheric climate changes on the troposphere and surface climate, following a formal request for this information by the Parties to the Montreal Protocol. As requested, the current chapter also considers the effects on stratospheric climate of some emissions that are not addressed by the Montreal Protocol, but are included in the 1997 Kyoto protocol. Hence, the chapter covers some of the issues assessed in past Intergovernmental Panel on Climate Change (IPCC) reports (IPCC, 2007; IPCC/TEAP, 2005). The current chapter is designed to provide useful input to future IPCC assessments.

        The troposphere and surface climate are affected by many types of stratospheric change. Ozone plays a key role in such stratospheric climate change, but other physical factors play important roles as well. For this reason, we consider here the effects on the stratosphere of not only emissions of ozone-depleting substances (ODSs), but also of emissions of greenhouse gases, natural phenomena (e.g., solar variability and volcanic eruptions), and chemical, radiative, and dynamical stratosphere/troposphere coupling.

        The assessment identified a number of plausible assumptions, e.g.:

        Observations and model simulations show that the Antarctic ozone hole caused much of the observed southward shift of the Southern Hemisphere middle latitude jet in the troposphere during summer since 1980. The horizontal structure, seasonality, and amplitude of the observed trends in the Southern Hemisphere tropospheric jet are only reproducible in climate models forced with Antarctic ozone depletion. The southward shift in the tropospheric jet extends to the surface of the Earth and is linked dynamically to the ozone hole induced strengthening of the Southern Hemisphere stratospheric polar vortex.

        The southward shift of the Southern Hemisphere tropospheric jet due to the ozone hole has been linked to a range of observed climate trends over Southern Hemisphere mid and high latitudes during summer.

        Because of this shift, the ozone hole has contributed to robust summertime trends in surface winds, warming over the Antarctic Peninsula, and cooling over the high plateau. Other impacts of the ozone hole on surface climate have been investigated but have yet to be fully quantified. These include observed increases in sea ice area averaged around Antarctica; a southward shift of the Southern Hemisphere storm track and associated precipitation; warming of the subsurface Southern Ocean at depths up to several hundred meters; and decreases of carbon uptake over the Southern Ocean.

        This allows a number of papers to open legitimate debate on the observations in the SH and the efficacy and understanding behind them, e.g., Polvani 2011 JCLIM.

        The CMIP3 model integrations (Meehl et al. 2007a) have provided some evidence that stratospheric ozone depletion may be a major player in SH climate change. Approximately half the CMIP3 models did not include the significant, observed changes in polar stratospheric ozone in the SH in the 20C3M scenario simulations for 20th century climate. Taking advantage of this, several studies have shown that SH atmospheric circulation changes in the CMIP3 model simulations that did include stratospheric ozone depletion are much larger than for the models that did not (Cai and Cowan 2007; Karpechko et al. 2008; Son et al. 2009a).

        In the accompanying paper, Polvani 2011 GRL, there is a nice argument:

        It is now widely documented that stratospheric ozone depletion has played a major role in causing the atmospheric circulation changes that have been observed in the Southern Hemisphere (SH) during the second half of the 20th century [see e.g., Polvani et al., 2011, and references therein]. It is thus likely that the projected ozone recovery will have a considerable impact in the coming decades: understanding that impact is the goal of this paper. It may be worth recalling, as originally pointed out by Shindell and Schmidt [2004], that in the late 20th century the depletion of stratospheric ozone has added to the circulation changes associated with increasing greenhouse gases (GHGs), whereas in the 21st century ozone recovery will subtract from them.

        There is an opinion piece in Nature that is simplistic and overextends (assumes), but it provides a vehicle for legitimate enquiry:

        http://www.nature.com/nclimate/journal/v1/n1/full/nclimate1065.html

      • Those are interesting articles, but they don’t change the assignment of most post-1978 global warming to a combination of ghg and aerosol effects.

      • Aerosols are not an escape clause in the SH, as there are problematic issues with the observations at actinometric stations in the SH ITCZ, as was identified in Wild, e.g.:

        Evidence for a decrease of SD from the 1950s to 1990 and a recovery thereafter was also found on the Southern Hemisphere at the majority of 207 sites in New Zealand and on South Pacific Islands [Liley, 2009].

        Liley [2009] pointed out that the dimming and brightening observed in New Zealand is unlikely related to the direct aerosol effect, since aerosol optical depth measurements showed too little aerosol to explain the changes. On the basis of sunshine duration measurements he argued that increasing and decreasing cloudiness could have caused dimming and brightening at the New Zealand sites.

        The symmetry breaking in the T record for NZ is also problematic, i.e., the antipersistence and reversibility, e.g., Carvalho 2007.

        In this study, low-frequency variations in temperature anomaly are investigated by mapping temperature anomaly records onto random walks. We show evidence that global overturns in trends of temperature anomalies occur on decadal time-scales as part of the natural variability of the climate system. Paleoclimatic summer records in Europe and New-Zealand provide further support for these findings as they indicate that anti-persistence of temperature anomalies on decadal time-scale have occurred in the last 226 yrs. Atmospheric processes in the subtropics and mid-latitudes of the SH and interactions with the Southern Oceans seem to play an important role to moderate global variations of temperature on decadal time-scales.

        http://www.nonlin-processes-geophys.net/14/723/2007/

        Reversibility suggests natural variation that is not well understood by the CS community; the same constraints on weather forecasts are applicable to CS. The non-periodicity of flows constrains predictability; averaging non-periodic flows does not make them more predictable. This is well described in the literature, e.g. Lorenz.

      • Maksimovitch – It is a matter of courtesy, as I’ve mentioned earlier, to link to references if you are going to use them as the basis for comments. I’m familiar with the Wild data, but in this case, I don’t think I’ll bother trying to track down the rest on my own, because they don’t alter the main conclusion that the dominant cause of post-1978 warming was a combination of ghg and aerosol effects, including both indirect aerosol effects on clouds as well as their direct effects. That there were other factors and regional variations is not in doubt, but they accounted for less of the change than did the ghgs and aerosols.

      • While I didn’t try to track down the Liley article, the Carvalho et al paper reinforces the point that there do not appear to be any apparent strong competitors with aerosol and ghg effects to explain most of the post-1978 warming, but that there are likely to have been some degree of offsetting cooling factors, without which the warming would have been greater.

    • “There is also some danger in the application of theoretically derived paradigms by individuals without reasonably detailed knowledge of climate physics, because this can easily lead to small misinterpretations that generate inaccurate conclusions.”

      Fred, to make it clear: There is no application of theoretically derived paradigms in this modeling approach at all! EVERYTHING, including the model and the composition of inputs is derived from observations, only. Observations, measurements of system variables, hide essential information about the behavior of the system. This knowledge can be extracted by self-organizing modeling and transformed into predictive models.

      Clouds means Radiative Cloud Fraction as measured by the NASA TOMS satellite:

      http://toms.gsfc.nasa.gov/reflect/reflect_v8.html

      • My reply is here.- sorry I didn’t get it in the right place.

      • Good point. Such models should not be used by global warming alarmists to fine tune parameters to reinforce their preconceived notions about reality. Rather, to eliminate bias global warming alarmists would–if they really cared to inject honesty into the GCM fabrication business–use these techniques to show how other unconsidered variables–like cloud cover–help to explain climate even better than AGW theory.

      • steven mosher

        “Fred, to make it clear: There is no application of theoretically derived paradigms in this modeling approach at all!”

        The decision to use short lags IS a paradigm.
        The decision to use sunspots rather than TSI IS a paradigm.
        The decision to use CO2 concentration rather than log CO2 IS a paradigm.
        The decision to ignore methane, to ignore black carbon, to ignore sulfur
        IS a paradigm.

        You can build models out of data, but when you select data you are making theoretical assumptions.

        For kicks, go get the forcing database for AR5. It’s public. Build your model from that. Use a training period and a validation period. With 150 years of data, I’d suggest that you train on 1850 to 1979 and predict 1980-2010 as your validation period.

        Just a thought. But seeing the variables selected makes the result much less interesting
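The train-then-validate procedure suggested above can be sketched in a few lines. This is only an illustration under stated assumptions: the series below are synthetic placeholders (not the AR5 forcing database or any observed temperature record), the single-predictor linear model is a stand-in for whatever model is being tested, and the helper names `split_by_year`, `fit_line` and `rmse` are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def rmse(a, b, xs, ys):
    """Root mean squared error of the fitted line on the given points."""
    return (sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

def split_by_year(years, xs, ys, last_train_year=1979):
    """Split (x, y) pairs into a training part and a later validation part."""
    train = [(x, y) for yr, x, y in zip(years, xs, ys) if yr <= last_train_year]
    valid = [(x, y) for yr, x, y in zip(years, xs, ys) if yr > last_train_year]
    return train, valid

# Synthetic placeholder data, NOT real forcing or temperature values.
random.seed(0)
years = list(range(1850, 2011))
forcing = [0.01 * (yr - 1850) + random.gauss(0, 0.1) for yr in years]
temp = [0.5 * f + random.gauss(0, 0.05) for f in forcing]

train, valid = split_by_year(years, forcing, temp)
a, b = fit_line(*zip(*train))          # coefficients come from 1850-1979 only
print("train RMSE:     ", rmse(a, b, *zip(*train)))
print("validation RMSE:", rmse(a, b, *zip(*valid)))  # 1980-2010, never seen in fitting
```

The point of the split is that the validation years play no part in estimating the coefficients, so the validation error is an honest out-of-sample check rather than a measure of fit.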

    • Fred,

      [“Frank – That is somewhat disconcerting, because we have excellent evidence from the physics that CO2 played a dominant role between 1978 and recent years…”]?

      With all due respect, I find your point disconcerting. What ‘excellent evidence’ is there that CO2 is the dominant factor after 1978 and not before? Between c1910 and c1944 the gt rose by appx 0.8 deg C. The rise from c1975 to 1998 was appx 0.8 deg C over a slightly smaller timescale. The two warmings are almost identical in size, yet you are confident that the latter is due to CO2 but not the former! This is illogical thinking. The lack of further warming since 1998 (thirteen years) effectively invalidates the idea that CO2 was the main contributor between 1978 and 1998, as CO2 has continued to increase.

      What you are essentially doing is condensing AGW into a twenty-year period and claiming that CO2 is the culprit for that period.

      • What ‘excellent evidence’ is there that CO2 is the dominant factor after 1978 and not before?

        I’m sorry if I wasn’t clear, Arfur. CO2 and other anthropogenic ghgs were a dominant factor since 1950, and an important factor earlier in the century, along with solar and aerosol forcing, as has been discussed by several of us in recent threads – I hope you’ll forgive me if I ask you to review those recent threads rather than having me repeat the evidence still again. What I stated was that the combination of ghgs and aerosols was the dominant factor after 1978. We can’t say CO2 alone was dominant for that interval, although it was certainly significant.

        Your statement that warming from 1910 to 1944 was 0.8 C is incorrect. Is that what you intended to say?

      • Fred.
        Arfur has just mixed °C and °F.
        A variation of 0,8°F equals a variation of 0,45°C.

        For the rest he is completely right.

        The fact that the T° variation between appx 1910 & 1940 was almost exactly the same as the one observed between appx 1970 & 2000 (0,45°C), whereas [CO2] has increased by about 20% between both periods (manmade CO2 emissions being multiplied by 6), formally falsifies your claim that CO2 is the dominant factor.

      • Thanks for the explanation. CO2 and other ghgs can be assigned a dominant role after 1950, but they were only part of a multiplicity of factors responsible for the positive forcing earlier in the century, which also included solar changes and a reduction in volcanic aerosols from the beginning of the century until mid-century. For an overview of the relative forcings, an informative source is Figure 5 in Gregory and Forster 2008.

      • Eric & Fred…

        Eric, thank you for your supportive post. You are quite correct that, irrespective of the actual figures, it is the similarity of the warming difference that is key here.
        .
        Fred, you were absolutely clear in your first post and you DID say quite clearly that it was CO2 that played the dominant role. Now you say it was CO2 and other GHGs (I take it you except water vapour from that statement). It doesn’t matter. Whether it is just CO2 or all the ‘dry’ GHGs, my (and Eric’s) point still stands. Your reference to the G&F 2008 paper below does not explain the anomaly. That paper is based on model assumptions to give the anthropogenic forcings result. It also takes the overall warming during the 20th century and does not take into account the discrete warmings of c1910-1945 and c1875-1878 which quite clearly occurred without – according to your argument – the ‘main contributor’ CO2. (You can ignore the other dry GHGs due to their relative concentration for the purpose of my point.) Here is an excerpt:

        [“…because in these HadCM3 experiments the overall warming in the 20th century, and its trend since 1970, are similar with and without natural forcing [see Gregory et al., 2006, their Figure 2], consistent with the assumption that natural forcings have little long-term effect.”]
        .
        Now look at this graph which explains our argument…

        http://www.woodfortrees.org/plot/hadcrut3vgl/from:1917/to:1941/plot/hadcrut3vgl/from:1917/to:1941/trend/plot/hadcrut3vgl/from:1976/to:2000/plot/hadcrut3vgl/from:1976/to:2000/trend

        .
        Please note that the two slopes are virtually identical. Both are over a 24 year period (yes, I adjusted my original dates to cater for this to avoid any argument about different periods). The OLS trends both give – as Eric said – appx a 0.4 deg C increase. But also, if you look carefully, you will notice that the difference between the trough and the peak of each graph is about 1.0 deg C. I was eyeballing a graph when I quoted the 0.8 C rise which you said was incorrect. I was incorrect, the rise was actually greater than 0.8 C. This means that the post-1950 warming which was ‘mainly contributed by CO2 and other dry GHGs’ was preceded by an equally large warming which was ‘not’…

        I repeat the logical point that you cannot claim the veracity of the radiative forcing theory to explain the warmings post 1950 when there have been similar warmings prior to 1950. It doesn’t matter how many simulations you run based on assumptions of forcing, the data does not support such an exclusive approach. Furthermore, if G&F had continued the graph further into the future, they would now have to explain why the anthropogenic forcings were continuing to increase, along with CO2 (and others), whilst the temperature has not.

        Respectfully yours…

      • Arfur – What you state is incorrect. We can conclude with confidence that CO2 along with aerosols were the dominant source of post-1978 forcing, and it matters little whether one adds in the other anthropogenic ghgs, although they also contributed.

        From your comment, I surmise that you didn’t understand the GF-08 paper and the model they used, but if you think you did, please state what “assumptions” by the authors you are challenging in the way they arrived at their conclusions and why those assumptions matter. Do the same for the Padilla et al paper on the same thread about transient climate sensitivity. Your reference to intervals prior to 1978 is irrelevant to the discussion of the warming since that time.

        The fact that you introduced some of these irrelevancies strikes me as a warning signal. It seems to me that you are simply trying too hard to justify a preconceived notion rather than to understand how the climate has actually behaved. As I explained above, absolute certainty about post-1978 warming is impossible, but the conclusion that the anthropogenic ghgs (or CO2 by itself) plus aerosol effects were the dominant warming factors is robust and is not undermined by your claims about 1875, 1910, and a mythical 0.8 C rise from 1910-1944 (you can’t look at spikes, you have to look at the curve).

        Finally, it’s also clear from GF-08 and many other sources that the earlier and later warmings in the twentieth century, although comparable in magnitude, involve a different mix of contributions. The anthropogenic ghgs dominate after 1950. They contribute prior to 1950, but so do declining volcanism and a rise in solar irradiance. Even so, anthropogenic ghgs can account for a sizable contribution to the pre-1950 warming. There is no inconsistency.

        I don’t want to preclude further discussion, but I also don’t want to become entrapped in non-productive arguing. If you have further evidence to introduce, I’ll certainly be glad to look at it, but otherwise, it would probably be best for interested readers simply to look at these exchanges and linked evidence themselves.

      • Fred, first let me say that I am as unimpressed by your sanctimonious attitude as I am at the alacrity with which you accept a paper such as the G&F 2008 you have used to support your ‘there is excellent evidence…’ statement. Your appeals to authority hold no particular sway with me. As a non-scientist, I am unburdened with the need to display any loyalty to those who seek to further enhance the dogmatic belief in the radiative forcing qualities of CO2 and other GHGs. I prefer to think about the problem and use observed data to try to explain whatever may or may not be happening to the climate. So far, there is no supportive evidence that the expected warming due to CO2 etc is as quantifiable as you or other supporters of cAGW seem to think.
        .
        This is the point you seem unable to grasp. The observed data does not support the models. Trying to use more models to support your argument doesn’t make that fact any less so. Trying to deflect the argument away from your clear and unambiguous statement that ‘there is excellent evidence that CO2 played a dominant role between 1978 and recent years…’ by getting dragged into a discussion on a particular paper which in itself uses estimation, assumption and poor thinking won’t cut it. However, I will indulge you for the sake of fairness, although I understand that someone arguing against your dogma can sometimes be stressful and will therefore understand if you don’t want to listen…
        .
        When considering the G&F 08 paper, please remember that the first line of the paper states “Observations and simulations of time-dependent twentieth-century climate change indicate a linear relationship… between radiative forcing and global mean surface air temperature change.” and then goes on to state
        “Disregarding any trend caused by natural forcing (volcanic and solar), which is small compared with the trend in anthropogenic forcing, we estimate…”. There is the first assumption. Later, the paper will state “…consistent with the assumption that natural forcings have little long-term effect…”. There’s another one. The paper also states “… but assumes it affects only the trend, not fluctuations since the years strongly affected by volcanoes have been excluded.” Yet another assumption. I’ll stop there, somewhat quizzical that volcanoes can be both dismissed in one sentence as having little long-term effect but later excluded in spite of their ‘strong effect’. Interesting, from my point of view, is the further assumption that in the AOGCM simulations of recent climate change, the unperturbed climate is often taken to be the late nineteenth century… and that the perturbations are small enough to be treated as linear, an assumption that is implicit…

        Linear? They think warmings of 0.4 C in two years (between 1876 and 1878) and 0.7 C in 33 years (between 1911 and 1944) can just be dismissed as ‘linear’ and yet they can justify only using data from 1970 to 2006 (see introduction)? They assume a climate response (TCR) based on model integrations ending in 1999? The same year that the MBH98 was sold to Joe Public as ‘proof’ of the impending climate disaster? And they publish ten years later? When the temperature had not risen above the 1998 level?
        .
        Oh, whatever. Have it your way Fred. You carry on in your self-delusional haze of blind acceptance of any pro-AGW literature without any hint of an open mind and then splutter the usual superior, holier-than-thou nonsense about why the assumptions made by the AGW modellers are actually more important than observed data.
        .
        I’ll stick to the real world.
        .
        Oh, and I am also happy for readers to make their own mind up.

      • Arfur – As you say, you are not a scientist. I think any scientists who read the GF-08 article as well as the other evidence I’ve cited and linked to will see that you didn’t understand it. In any case, they can make up their own minds.

      • Fred,

        LOL!

        Fred Moolten, 17 Oct 2011…

        http://judithcurry.com/2011/10/17/self-organizing-model-of-the-atmosphere/#comment-123427

        [“…because we have excellent evidence from the physics that CO2 played a dominant role between 1978 and recent years…”]

        Fred Moolten, 17 Oct 2011 slightly later…

        http://judithcurry.com/2011/10/17/self-organizing-model-of-the-atmosphere/#comment-123562

        [“It would be wrong of me to be too dogmatic in assigning most post-1978 warming to the combined effects of changes in CO2, other ghgs, and anthropogenic aerosols…”]
        and…
        [“I would emphasize that my original point could be better expressed by saying that the role of CO2 during this interval was significant, and leave the question of what was “dominant” unanswered.”]

        Which was exactly my point in my first response.
        .
        Well done on at least being able to admit that you were wrong. Many wouldn’t. Respect, dude.

      • Arfur – You have given me undeserved credit for admitting I was wrong, when I didn’t admit being wrong and I don’t consider my conclusions wrong. I simply said I couldn’t be too dogmatic because of the possibility of error, particularly since there was no need for a statement about dominance in order to point out an important role for CO2. My original conclusion stands however – I think the post-1978 warming can be attributed to a combination of anthropogenic ghgs and aerosol effects with high confidence.

        Arfur – I hesitate to impute motives. However, it occurs to me that the intention of your false praise was not to commend me but to pretend I was wrong earlier. I believe you are still pushing too hard to advance a partisan agenda without a knowledge of the science adequate for the task (e.g., your misinterpretation of GF-08). In any case, we agree that the paper and others should be looked at by readers who want to make their own judgments.

        As you imply with your praise, I try to be scrupulously honest, which is what I’m doing right now.

      • Fred,

        For goodness sake, stop trying to argue around the point. Where you were wrong was in making a clear and unambiguous statement that CO2 was the main contributor to the greenhouse effect. I quote:

        [“because we have excellent evidence from the physics that CO2 played a dominant role between 1978 and recent years, along with declining cooling aerosols.”]

        I, among others, questioned your ‘excellent evidence’. You then, amongst other tangential arguments, moved your position to one of:

        [“My original conclusion stands however – I think the post-1978 warming can be attributed to a combination of anthropogenic ghgs and aerosol effects with high confidence.”]

        The ‘wrongness’ is in making a confident statement which you later adjust under questioning but still try to suggest it was your initial point. My praise might have appeared faint to you but it was a genuine feeling that you had admitted your change of position. Perhaps if you could refrain in future from making firm statements which you later feel compelled to adjust then we may avoid antagonistic discussion.

        By the way, the change of ‘CO2’ to ‘anthropogenic ghgs’ does not explain the lack of ‘excellent evidence’…

      • Arfur – I’ve provided what I consider excellent evidence that post-1978 warming was dominated by anthropogenic ghgs and aerosol effects, including the links to the GF-08 article and Isaac Held blog post. I’m confident that conclusion is correct. I did state it would be wrong of me to be too dogmatic, but that was not because I wasn’t confident in the conclusion, but rather to avoid a quarrel about a point tangential to the main one I was making that CO2 was a significant contributor – a point that didn’t require the use of the word “dominant”. Anyone interested should simply review what has been said and make his or her own judgment.

        I hope the last sentence in the previous paragraph is enough to discourage further arguing here. I’m always willing to continue to pursue an important topic, but in the absence of new evidence on this particular topic, there is probably no point, and I don’t like becoming engaged in long arguments with no resolution.

    • Fred, I think Judy and Lindzen are right on this one. Hansen was criticized in the 1990s for using too coarse a grid. Forget subgrid modeling errors, which are probably big. The idea is “I know there are large numerical errors, but the patterns look reasonable and agree with climate data.” Judith is pointing out that they DO NOT agree with data. So, there seems to me to be no reason to believe the models. Even Andy, whom I respect, had no answer for this.

      • Fred, it’s easy to get fooled by the vast array of conclusions that come from running the models. The scientific basis is very weak. We need better methods and data, not just massive generation of CO2 by power plants powering massive computers running models that need better methods. Just an example: the models relegate convection to a subgrid model, even though it’s a very complex process with lots of turbulence. What’s the rationale for that?

      • David – It would be wrong of me to be too dogmatic in assigning most post-1978 warming to the combined effects of changes in CO2, other ghgs, and anthropogenic aerosols, but my confidence that this is probably true is based on the observed changes in these substances, and their known radiative properties, as well as the limited nature of contributions easily attributable to other factors. Judy has accurately pointed out that the CCSM4 model overpredicted recent warming, and so your point about disagreement with data is correct.

        What I don’t see as a logical corollary of the overprediction is that something other than ghgs plus aerosol changes (mostly a reduction in cooling aerosols) did most of the warming. For that to be true, the CCSM4 model would have had to overestimate the sensitivity to these factors while not doing the same for other phenomena that alter temperature with consequent feedback effects. That’s possible, but there is no good mechanism for this nor evidence that it occurs. Rather, I would conclude that this model simply overestimated everything, while leaving the relative strength of the most important variables unchanged or nearly so.

        Rather than pin too much on this point, where I might ultimately prove to be wrong, I would emphasize that my original point could be better expressed by saying that the role of CO2 during this interval was significant, and leave the question of what was “dominant” unanswered.

  9. I wonder if anyone has considered writing a climate model using LISP? The simple syntax tree and the way LISP treats data as code would seem to make it the ideal language for such a model. It is ideal for writing a program where you are not even sure of the questions.

    • steven mosher

      you must be crazy. LISP would be the last language one would use on this kind of problem.

      • FORTRAN would be good, it is possible to incorporate genetic algorithms and evolutionary programming principles with it. That seems to be what the OP seems to be implying is needed.

      • steven mosher

        Argg. Well, most are written in Fortran because Fortran is the language of choice for scientific computing. It’s what scientists learn (although one GISS guy uses Python).

        There is also legacy code to consider. I hate Fortran, but considering all the legacy code, the skill set of the teams, the fast compilers on supercomputers, the large number of tested libraries, I can’t see many better choices.

      • It’s been a long time since I had anything to do with Fortran, but IIRC Fortran77 supported structured programming (IF – ELSE- ENDIF). So are most climate models built using structured programming?

      • AK: Dr. Curry directed me to downloadable codebase for a climate model. It’s on a retired computer but I’ll have a look. It was in Fortran and appeared well-organized and commented. It was better than I expected after Climategate.

      • Really, and what are your reasons for saying that? The only drawback I can see is that it is a bit dated. If one wanted to approach the problem as a NN, there are certainly a lot of worse languages one could use.

  10. Joachim Seifert

    Dear JC: Exciting method for independent climate research without having to rely on calculations of warmist institutes…..! Free software for Excel – this will advance science, no doubt…..I wish I could get Lemke to run my model calculations given in ISSN 978-3-86805-604-4 , which allows accurate climate calculations for more than 50,000 years in a way that is transparent for everyone…. including an accurate forecast for this century: Accurate, because a seventh completely new variable: the “Libration” of the Earth (see Wikipedia, animated picture for the Moon, doing this movement as well) is taken into account…. This seventh variable, the Libration, still remains omitted by the IPCC until today, but is now finally recognized by the IPCC in a first step (lamentably only in private AR4 error report correspondence…)……
    ……Good move, good progress, we keep on the right way….

  11. Is this any different from calibrating models by hindcasting data?

    • References:

      Farlow, S.J. (ed.): Self-Organizing methods in Modeling. GMDH Type Algorithm. Marcel Dekker. New York, Basel. 1984

      Ivakhnenko A.G.: Group Method of Data Handling as a Rival of Stochastic Approximation Method, Journal “Soviet Automatic Control”, No. 3 (1968), pp. 58-72.

      Ivakhnenko A.G.: Heuristic Self-Organization in Problems of Automatic Control, Automatica (IFAC), No 6 (1970), pp. 207-219

      Ivakhnenko A.G.: Polynomial theory of complex systems, IEEE Trans. Sys., Man and Cyb., 1 (1971), No 4, pp. 364-378.

      Madala, H.R., Ivakhnenko, A.G.: Inductive Learning Algorithms for Complex Systems Modelling. CRC Press Inc., Boca Raton, Ann Arbor, London, Tokyo. 1994

      Müller, J.-A., Lemke, F.: Self-Organising Data Mining. Libri, Hamburg, 2000

    • Yes, it is. Backtesting is taking an existing model and checking its performance on new data. This method is about creating the model algorithmically (which also includes backtesting).
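To make the distinction concrete, here is a heavily simplified sketch in the spirit of the GMDH literature cited above. It is not the software discussed in the post: it builds one candidate quadratic polynomial per pair of inputs, fits each on a training part, and lets a held-out part choose among them. The data are synthetic, and the names `select_pair`, `features`, `least_squares` and `solve` are hypothetical helpers written for this illustration.

```python
import itertools
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def least_squares(rows, ys):
    """Least-squares coefficients via the normal equations."""
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(A, b)

def features(u, v):
    """Terms of a quadratic polynomial in two inputs."""
    return [1.0, u, v, u * v, u * u, v * v]

def mse(w, rows, ys):
    return sum((sum(wi * ri for wi, ri in zip(w, r)) - y) ** 2
               for r, y in zip(rows, ys)) / len(ys)

def select_pair(X, y, n_train):
    """Fit one candidate per input pair on the training part,
    then rank the candidates by their error on the held-out part."""
    best = None
    for i, j in itertools.combinations(range(len(X[0])), 2):
        rows = [features(x[i], x[j]) for x in X]
        w = least_squares(rows[:n_train], y[:n_train])
        err = mse(w, rows[n_train:], y[n_train:])
        if best is None or err < best[0]:
            best = (err, (i, j), w)
    return best

# Synthetic data: only inputs 0 and 2 actually drive y.
random.seed(1)
X = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
y = [1.0 + 2.0 * x[0] - 1.5 * x[2] + 0.5 * x[0] * x[2] + random.gauss(0, 0.1) for x in X]

err, pair, w = select_pair(X, y, n_train=120)
print("selected inputs:", pair, "held-out MSE:", err)
```

Backtesting alone would correspond to the single `mse(...)` call on held-out rows for a model that already exists; here the held-out error is also what decides which inputs and which model survive.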

  12. Interesting article, let’s see how the predictions hold up against the ‘official ones’.

    Meanwhile, more references to literature would be welcome, perhaps some key references or a text book? For instance, is it correct to say that this technology has nothing (or very little) to do with neural networks (i.e. linear optimization) or self organizing maps?

    What I learned from http://www.climateprediction.eu/cc/About.html is that the software promoted by the site and the climate prediction site is not an academic project, but a commercial one. This is not to say that it would be false or fake or anything, and I suppose there is a sound theory behind.

  13. “Noisy data available in data sets with a small number of observations”

    It sort of depends on what you mean by noise. If I were to calculate the effect of aging on the mean heart rate, I don’t believe that I would take a person’s lowest heart rate during their sleep and their maximum heart rate during the day, then average that, then average the averages, then average the averages. Nor would I then at this point calculate the statistical distribution, padding the early in utero points and later post-death heart rhythm.

  14. Frank – I don’t understand your point. Certainly you are using observations, although many critical ones may not have been available to you – I suggested tropospheric water vapor, for example, but perhaps you included it. The data you used for clouds were inadequate for warming/cooling estimates; you probably should have used HIRS/ISCCP data instead that included different cloud types and altitudes. Regardless, the issue is whether those observations are applied in a way that yields an accurate output. This didn’t happen for the interval you mention, and your general conclusions about CO2 are wrong if you imply an insubstantial role over a few decades or more. I think you may have overestimated the extent to which your extraction method can be transformed into predictions, although this might improve with more, better, and more relevant observations than those you used.

    • The presented system model is not meant to be complete and final. I’m aware that system variables (known and maybe some unknown) are missing in this model, as they are in CO2-theory-driven models. In fact, truly missing variables add noise to the data, which is why in almost all cases we have to deal with noise (noise is not only a category of data quality). The modeling method has to take this into account appropriately.

      About your second point, this refers to what I called the adaptive learning path: We have to use all substantial knowledge to come up with models that better suit the complex nature of climate.
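The remark that missing variables show up as noise can be illustrated numerically. The following is a minimal sketch on synthetic data, assuming a linear system with two independent drivers; the coefficients, sample size and the helper name `residual_variance` are made up for the illustration and do not come from the model in the post.

```python
import random

def residual_variance(xs, ys):
    """Variance left over after fitting y = a + b*x by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / n

random.seed(2)
n = 2000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]   # driver that the model will omit
measurement_sd = 0.2                          # assumed instrument noise
y = [u + 0.8 * v + random.gauss(0, measurement_sd) for u, v in zip(x1, x2)]

# The model sees only x1: the x2 contribution ends up in the residual.
var_omitted = residual_variance(x1, y)

# For comparison, remove the x2 contribution first (as if x2 were known).
var_full = residual_variance(x1, [yi - 0.8 * v for yi, v in zip(y, x2)])

print("residual variance, x2 omitted:", var_omitted)
print("residual variance, x2 known:  ", var_full)
print("measurement noise variance:   ", measurement_sd ** 2)
```

With the second driver omitted, the residual variance is far larger than the measurement noise alone, which is the sense in which an unobserved variable behaves like noise from the model's point of view.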

      • Frank – You make good points. Would it be worthwhile for you to consult with climate science professionals in applying your method to climate change, or are you already doing that? The reason I ask is suggested by points I made earlier. For example, factors responsible for post-1978 temperature change are listed as including aerosols, clouds, and CO2, but over an interval of that length, clouds are not an independent variable – the effects of clouds are the effects of CO2, and of aerosols, and ENSO, etc, all operating at different times and in some cases different directions. In light of these sometimes complex relationships, some advice on how to assign attributions might be useful.

      • Yes, this is the way to go.

        Also, from a current European research flagship proposal that targets a holistic, data-driven approach on environmental, economic, social, and technological problems:

        “Paradigm Shift in Our Understanding of the World
        The conventional ‘medicines’ to tackle the problems of our world fail more and more often. But many problems today are due to an out-dated understanding of our world. In fact, our traditional way of thinking is fundamentally wrong, because the world has changed: While its parts still look pretty much the same, we have networked them and made them strongly interdependent. When ‘self-organization’ sets in, the components’ individual properties are no longer characteristic for the system behaviour, but collective behaviour takes over. Group dynamics and mass psychology are two typical examples. …

        As a consequence of the above, we have to turn our attention away from the visible components of our world to the invisible part of it: their interactions. In other words, we need a shift from an object-oriented to an interaction-oriented view, as it is at the heart of complexity science. This paradigm shift is perhaps of similar importance as the transition from a geocentric to a heliocentric view of the World. It has fundamental implications for the way in which complex techno-socio-economic systems must be managed and, hence, for politics and economics. Focusing on the interactions in a system and the emergent dynamics resulting from them opens up fundamentally new solutions to long-standing problems”.

        http://www.futurict.eu/

  15. Two thirds of the planet covered in H2O, thousands of times more H2O in atmosphere than CO2, and clouds all over planet night and day, but this guy picks six variables that do not include H2O. Good job.

    • Good point mkelly. I agree that the vast amount of water on this planet must surely have an influence on climate. Hence a model that does not factor in any effect (notwithstanding that the net effect may well be to cushion the total system against wild positive feedbacks) would be a significant shortfall.

  16. And, in this manner we can perhaps more easily predict the end of the world, assuming it continues on past this Friday.

    Fyi–

    Abstract. A novel approach based on the neural network (NN) technique is formulated and used for development of a NN ensemble stochastic convection parameterization for numerical climate and weather prediction models. This fast parameterization is built based on data from Cloud Resolving Model (CRM) simulations initialized with TOGA-COARE data. CRM emulated data are averaged and projected onto the General Circulation Model (GCM) space of atmospheric states to implicitly define a stochastic convection parameterization. This parameterization is comprised as an ensemble of neural networks. The developed NNs are trained and tested. The inherent uncertainty of the stochastic convection parameterization derived in such a way is estimated. The major challenges of development of stochastic NN parameterizations are discussed based on our initial results.

    ~Krasnopolsky, V.M.; Fox-Rabinovitz, M.S.; Belochitski, A.A.; Development of neural network convection parameterizations for numerical climate and weather prediction models using cloud resolving model simulations, Neural Networks (IJCNN), The 2010 International Joint Conference on, 18-23 July 2010, Pages: 1-8

  17. Judith Curry

    Yes. I would agree that this methodology brings something new and fresh into the equation. I like the fact that it is not necessarily “consensus” seeking (i.e. dismissive of conflicting views). The only caveat I would have is that it should get “reset” continuously with actual empirical data.

    Those believing in the “consensus” view (i.e. that CO2 IS the principal “control knob” for our climate) will find little solace in the opening paragraph describing the project, which I have copied here:

    What Drives Global Warming?

    To say it upfront: It is NOT CO2. Not necessarily and not exclusively. Looking at observational data by high-performance self-organizing predictive knowledge mining, it is not confirmed that atmospheric CO2 is the major force of global warming. In fact, no direct influence of CO2 on global temperature has been identified for the best models. This is what the data are seriously telling us. If we believe them, it is the sun, ozone, aerosols, and clouds – and possibly other forces not considered in this model – that drive global temperature in an interdependent and complex way.

    Let’s see where this leads. It should be interesting.

    Max

  18. Can someone explain what’s going on? What is the difference between this and polynomial/rational function modeling?

    We know: “Polynomial models have poor extrapolatory properties. Polynomials may provide good fits within the range of data, but they will frequently deteriorate rapidly outside the range of the data.”

    We also know: “Rational functions have excellent extrapolatory powers. Rational functions can typically be tailored to model the function not only within the domain of the data, but also so as to be in agreement with theoretical/asymptotic behavior outside the domain of interest.”

    Fred Moolton’s very good suggestions for tailoring seem to be assuming a rational function approach is being done. (He is probably aware of the poor extrapolatory properties of trying to fit polynomials to data.) But the post’s author, Frank Lemke, states that: “There is no application of theoretically derived paradigms in this modeling approach at all! ”

    So what approach exactly is Lemke using?

    • On one level, there is no difference between this and a fit. However, unlike the polynomial, which is essentially informationless as regards projection, the models generated by a neural net can be predictive. They can sometimes tell you that you have a relationship between two things you believed to be unconnected, or present an underlying, previously unidentified waveform.
      Normally you get very pretty rubbish, but some people get quite good at this stuff.

  19. The claim that there is “no application of theoretically derived paradigms” seems wrong to me – anytime you decide what data to put in (and, as Steven M notes, with what transformations and what lags) there is theory involved…

  20. It is what you mentioned: self-organization of a discrete form of the Volterra functional series (polynomial, rational function).

    The basic concept is a “model of optimal complexity”. Model complexity is a unimodal function of noise. If noise in the data grows, the complexity of the optimal model decreases. For purely random data the optimal model is the (climate) mean. This leads to so-called non-physical models. Only in the case of noise-free data do you get the physical model. This concept avoids overfitting the design data, i.e., modeling just chance correlations.

    You are right, model stability/extrapolation is a problem the algorithm/model has to take care of.
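
    The “optimal complexity” idea can be illustrated with a minimal sketch. It assumes polynomial candidates and a simple split of the data into a fitting part and a checking part; it is not the algorithm used for the climate models, and `select_degree`, the synthetic cubic signal and the noise levels are all made up for illustration. In runs like this the selected degree would be expected to shrink as the noise grows, although any single random draw can deviate.

```python
import numpy as np

def select_degree(x, y, max_degree=8):
    """Pick the polynomial degree with the lowest error on held-out points."""
    # Even-indexed points fit the coefficients; odd-indexed points judge the fit.
    x_fit, y_fit = x[0::2], y[0::2]
    x_val, y_val = x[1::2], y[1::2]
    errors = []
    for degree in range(max_degree + 1):
        coeffs = np.polyfit(x_fit, y_fit, degree)
        residual = np.polyval(coeffs, x_val) - y_val
        errors.append(float(np.mean(residual ** 2)))
    return int(np.argmin(errors)), errors

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 80)
signal = 1.0 - 2.0 * x + 3.0 * x ** 3   # hypothetical "true" process

for noise_sd in (0.0, 0.5, 5.0):
    y = signal + rng.normal(0.0, noise_sd, size=x.size)
    degree, _ = select_degree(x, y)
    print(f"noise sd {noise_sd}: selected degree {degree}")
```

    The held-out points play the role of outside information here: the coefficients never see them, so a candidate that merely chases noise in the fitting points is penalized.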

  21. Frank
    Compliments on an insightful application.

    Recommend adding neutron counts as an independent input as proxy for galactic cosmic rays. See NOAA Cosmic Rays

    Also separately input UV and Visible portions of TSI as source data.

    Global fossil fuel use would be another good input to predicting CO2.

    What ways are there to evaluate the data mining results to give insight into parameters, e.g. the relationship between CO2 and temperature conventional outputs?
    e.g. the lead/lag between solar insolation versus CO2 and temperature.

    Predicting the Length Of Day (LOD) would be another test of a precise observable climate parameter that integrates winds varying from TSI etc.

    • Thanks David.

      The software and models that describe global temperature can be downloaded free (models are part of the examples in the software; you need a Mac to run):

      http://is.gd/Z3NeB2

      Models of the other variables (aerosol, ozone, …) I can send you by email.

  22. The objective function of the algorithm is not well described. To continue a thought posted by Vaughan Pratt, within a range of 90% of the optimal objective function achieved, and the optimal itself, you can usually find a large set of models that have different entities included, different parameter estimates, and different interpretations. The difference in the objective function between the best and all these second and third raters can’t be known to be other than chance variation. With many variables and few observations, it is next to impossible to avoid overfitting and over-interpreting. So tell us more about how you are not excessively fitting noise.
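
    The overfitting concern can be made concrete with a small sketch. It assumes ordinary least squares rather than the algorithm from the post, and every number in it (12 observations, 11 candidate inputs, the random seed) is hypothetical. The inputs and the target are independent random noise, so there is nothing real to find.

```python
import numpy as np

def r_squared(y, y_hat):
    """Fraction of variance in y explained by y_hat."""
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - np.mean(y)) ** 2))
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)
n_obs, n_inputs = 12, 11            # hypothetical: few observations, many candidate inputs

def design(n):
    # Intercept column plus random inputs that carry no information about the target.
    return np.column_stack([np.ones(n), rng.normal(size=(n, n_inputs))])

X_fit, y_fit = design(n_obs), rng.normal(size=n_obs)
X_new, y_new = design(n_obs), rng.normal(size=n_obs)

coeffs, *_ = np.linalg.lstsq(X_fit, y_fit, rcond=None)
r2_in = r_squared(y_fit, X_fit @ coeffs)
r2_out = r_squared(y_new, X_new @ coeffs)
print(f"in-sample R^2 {r2_in:.3f}, out-of-sample R^2 {r2_out:.3f}")
```

    With as many free coefficients as observations the in-sample fit is essentially perfect even though the target is noise, which is why only the score on data the fit never saw says anything about the model.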

    FWIW, this post reads like an advertisement.

    • FWIW, this post reads like an advertisement.

      Have you seen his website? But still, it’s better than didactic “this is how it is ’cause I say so” that you get from a lot of papers.

      • I did visit the website, where different versions of the software are offered at different prices.

    • 1. There is a proven history of this approach of more than 40 years. References are also given in this discussion.

      2. Follow the “advertisement”. Sometimes people also call it transparency.

      • Interesting so far, but the comment from above:

        >The objective function of the algorithm is not well described<

        well enough describes my current puzzlement.

        Could you define more precisely how the algorithm actually works, please ?

      • It would still be nice to have more details in this post, and in the comments, about what exactly the input is, and what exactly the objective function is, and how the algorithm homes in on the optimum.

      • If you noticed, the threads about control theory were pretty vague, too. I think the idea is to present the concept rather than a finished proposal, and see what the reactions are.

      • It employs the concept of external supplement which states that without information from the external world it is not possible to get new knowledge. You find more on your questions here:

        http://is.gd/aJtT3e

  23. The described adaptive learning path methodology looks a little bit like Kalman filtering methodology applied to climate system…
    Interesting approach that definitively throws “[CO2] unique control knob” to the basket.

  24. Data mining is a wonderful tool, but it is just another tool in the analytical toolbox. Sometimes it provides wonderful insight, sometimes it serves up spurious or nonsensical results. As with any analytical technique, out-of-sample validation is a must.

    There are two basic types of data mining approaches (or ways to ‘let the data do the talking’). ‘Unsupervised’ learning (no prior information – one way conversation). ‘Supervised’ learning (prior information – structure the conversation). You can combine these approaches in an iterative process of ‘Unsupervised-Supervised’ learning and this approach works best.

    Data mining suffers one great drawback. It assumes that the data captures all of the needed information to conduct the conversation.

    From my meager understanding of the climate sciences, the observed data is not very rich in information (length of the historical record), the observed data is contaminated by noise, and the system is very complex. So, when you ‘let the data do the talking’, you are talking to a two year old child. Little kids will tell you any story that you want to hear.

    It does not surprise me that data mining found no evidence of CO2 influence of global temperatures. If it exists, it is hidden in the babbling conversation.

    People have stated that there is 30 years of satellite data of all types – plenty of data. I suspect that we are observing information that takes much longer to reveal itself.

    My own silly unsupervised ‘let the data do the talking’ of the Central England Temperatures shows a clear 60 year (irregular) cycle. The cycles are not fixed, they wiggle with time. The range of the cycles is around 0.6C. Is there something really there? Again, out-of-sample validation is needed.

    Judith, you have excelled by inviting the ‘Control Theorists’ and ‘Data Miners’ (knowledge discovery) folks to contribute. It is time to invite the ‘Information Theorists’.

    Is there enough information to claim anything within the ‘standards of proof’?

    BTW, what are the ‘standards of proof’ in climate science?

    • I think that, in essence, is what the statisticians are arguing about. In a broad sense, statistics is about information theory.

  25. Would you be willing to publish the actual functions, with the coefficients and lags?

    Thanks

    • Paul, go download the software free. It contains all the models for global temperature (it is a model ensemble to describe uncertainty) and a PDF book. Let me know if you also need the other models on ozone, CO2 etc.

      http://is.gd/Z3NeB2

      • Thanks – I was able to download it and open it without difficulty.

        I think I understand how the models are built up. What I would be most interested in is whether it is possible to create an “aggregate” equation which shows the coefficients of each of the input variables (counting each “lagged” variable as a separate input variable) and their influence in the final output. As it is, I find it very difficult to figure out what the influence of the various input variables is (which is what I would be looking for if this were an ordinary regression analysis).

      • I get your point and I agree. This is future work. It is not trivial since models can be highly nonlinear. If you try to aggregate the model into a single equation it may fill tens of pages, which is not really easier to understand.

  26. For climate science, I think that the best ‘unsupervised’ learning approach is Singular Spectrum Analysis (SSA). The SSA approach uses Principal Components Analysis (PCA) but SSA focuses on time series data.
    Google “Singular Spectrum Analysis Climate”.

    SSA is not truly ‘unsupervised’ it requires a little tweaking. You have to play with SSA.
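
    For readers who have not met SSA, here is a minimal sketch of the basic version: embed the series in a trajectory matrix, decompose it with an SVD, and rebuild the series from the leading components. The function name `ssa_reconstruct`, the window length of 40 and the synthetic 20-step oscillation are assumptions chosen for illustration; they are the kind of settings one has to play with.

```python
import numpy as np

def ssa_reconstruct(series, window, n_components):
    """Basic SSA: embed, decompose with SVD, rebuild from leading components."""
    series = np.asarray(series, dtype=float)
    n = series.size
    k = n - window + 1
    # Trajectory matrix: each column is a length-`window` slice of the series.
    trajectory = np.column_stack([series[i : i + window] for i in range(k)])
    u, s, vt = np.linalg.svd(trajectory, full_matrices=False)
    approx = (u[:, :n_components] * s[:n_components]) @ vt[:n_components]
    # Diagonal averaging: entries with the same i + j refer to the same time step.
    total = np.zeros(n)
    count = np.zeros(n)
    for i in range(window):
        total[i : i + k] += approx[i]
        count[i : i + k] += 1
    return total / count

t = np.arange(120)
clean = np.sin(2 * np.pi * t / 20)                       # synthetic oscillation
noisy = clean + np.random.default_rng(3).normal(0.0, 0.3, t.size)
smooth = ssa_reconstruct(noisy, window=40, n_components=2)
print("rms error before:", np.sqrt(np.mean((noisy - clean) ** 2)))
print("rms error after: ", np.sqrt(np.mean((smooth - clean) ** 2)))
```

    Two components are kept because a single sinusoid occupies exactly two singular vectors of the trajectory matrix; on real data the number of components and the window length are the parts that need tweaking.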

    As far as statistics using ‘Information Theory’, I totally agree. However, climate science needs a specialist.

  27. I checked out the site, and nowhere did I see a methods section, nor a reference to a journal article with a methods section. I couldn’t find source code, or any specific statement of what was actually done. Did I miss something? I’d love to see a clear statement, enough to reproduce the results. I don’t see it here. Are there references to this method? Are there links to the data used?

    • This site primarily aims at the general, interested public, not at (climate) scientists. It tries to help ordinary people to understand climate science a little bit better from their viewpoint. It’s about communication between science and the public. This issue has also been discussed at Climate Etc., already.

      Anyway, the cited article on this GWP site lists at the end a number of references. Also, after the first paragraph there is a link for downloading models and software (which also includes a PDF on the method, a book actually) and further data are available on request.

      • I’m a little unclear as to how, as an electrical engineer, you “help ordinary people to understand climate science a little bit better . . . [via] communication between science and the public.”

        Not that you need to be a scientist to improve science communication, but given that you are advancing a fringe theory of mathturbatory curve-fitting quite at odds with conventional science, I question whether you can realistically present yourself as advancing science communication. If that involves presenting yourself as a climate scientist, that would be deceptive.

      • OK, thanks.
        You missed the basic point, however: “… from their viewpoint”. I’m educated enough to know that your approach in this matter will continue to fail.

      • I’m educated enough to know that your approach in this matter will continue to fail.

        That’s an intriguing claim. How did your degree in electrical engineering prepare you to analyze my “approach” and determine that it will “continue to fail” (if your petulant response is any indication, I’d say I haven’t failed to touch a nerve with you.)

        Promoting climate denial via mathturbation doesn’t seem to be persuading very many people not already determined to disregard the science. Does promoting climate denial help you sell your modelling software? If not, I fail to see how your self-published curve-fitting exercises can be regarded other than as unqualified failures.

        More here: http://theidiottracker.blogspot.com/2011/10/frank-lemke-mathturbates-in-public.html

      • I’m just talking about your communication approach, your communication style…

      • You still haven’t explained what part of your degree in electrical engineering made you an expert in communication styles, rhetoric, etc. I’d really like to know.

        You also haven’t explained how I’ve “failed.” Failed to do what exactly?

        Your own crude efforts — to pass yourself off as a scientist, and to hawk your software — do not give the impression of a master of communication at work. But you’ve claimed your education made you such, so I’m eagerly awaiting details.

      • That’s fair enough. Had you presented more, I’d have probably asked for more anyway.

        The proof of your data mining will be whether this model still looks good 20 years from now. Some forecasts necessarily have to be conditional on events not yet known, such as the current and subsequent volcano eruptions. However, the present model can be rerun in 2015 with the intervening observed aerosols as data, and the model predictions checked.

        It would be nice to see your 10 best models: how close are they in terms of the objective function, and are their interpretations dramatically different. A later post perhaps, or JSM 2012 in San Diego.

  28. brianblais

    Welcome to the mysteries of climate science.

    Replication, quantification of uncertainty, validation, etc.

    These are secondary concerns.

  29. As an undergrad I used to use self-organizing solutions to generate fragments of the Mandelbrot Set, as they held a certain fascination at the time.

    However, I found that self-organizing solutions suffer problems of scope and scale and selection. Choose poorly, and you get nothing at all.

    Choose less poorly, and you obtain uninteresting results.

    Choose well, and you may yet obtain completely different outcomes from the previous well-chosen starting condition, even when you appear to be obtaining convergence to a solution.

    There’s nothing magic in the automation being discussed; it’s just math and a bit of exuberant overconfidence. The same rules of predictability apply to predictions by self-organizing models as apply to other forms of predictions — which is to say, these predictions will fail (or will succeed in some trivially nonmeaningful way) and will tell us little or nothing about what in particular made the method invalid at which point.

    But pretty graphics will result, and it will be fascinating to watch.

  30. “we have all these observational data; although limited it is priceless information about the system and its behavior. It only needs to be extracted appropriately so as to transform it into useful knowledge”

    That’s the trick, isn’t it? But without knowing how to unlock the information the data won’t lead us to a better understanding of what we are trying to learn. Hard as it is to believe, and difficult as it is for many people to accept, it actually is possible to come out the worse for trying. On its face this seems implausible. If “A” is information without the data and “A union B” is the same information with the data, isn’t it obvious that the quality of actions operating under “A union B” must dominate those under “A” alone? This is true provided that the model for extracting the added information is correctly specified. If the model is not known and must be deduced (and then estimated) from the same data, the win-win does not necessarily follow. The models that relate climate outcomes to signals in data have large numbers of parameters that require estimation and nonlinear functional forms that also are not completely known and must be estimated as well. And then there is the convolution of noise with the signal (not to mention bias) which as far as I can tell is not fully understood. And spatio-temporal dependence as well. I doubt that a self-organizing model can figure out how to deal with these issues correctly by itself.

    • Bob K.

      It’s practically a truism that self-organizing models have exactly the weakness you explain.

      I’ve seen it myself hands on; I don’t know anyone who has worked with such tools for any length of time who has not seen this problem for themselves.

      I’m very surprised Frank Lemke appears unaware of this issue.

  31. Looking at observational data by high-performance self-organizing predictive knowledge mining, it is not confirmed that atmospheric CO2 is the major force of global warming. In fact, no direct influence of CO2 on global temperature has been identified for the best models.

    Agree 100%.

    There is something that drives global warming that has been constant at 0.06 deg C per decade for the last 160 years!

    http://bit.ly/pJhIaw

    What is it?

  32. Being just a dumm hick from the stick(s) and noticing the first few paragraphs: “6 variables (ozone, aerosols, clouds, sun activity, CO2, global temperature). It also predicts using this system global warming 6 years ahead”………Question: Who figured out using CO2 as a driver instead of SST? Does it predict global cooling 6 years ahead? Seems to me all this fancy nitpicking about neural nets and adaptive learning paths is fascinating to the cognoscenti, but does very little for climate understanding. The fact is, not enough facts are available, yet. A few hundred years at the minimum is required to get a data BASE.

  33. Frank Lemke

    Thanks so much for an excellent article.

    On the other hand, we have all these observational data; although limited it is priceless information about the system and its behavior.

    Here is what that “priceless information” tells us regarding global warming:

    http://bit.ly/ocY95R

    And its prediction is also correct so far:

    http://bit.ly/nz6PFx

  34. “What is self-organizing predictive knowledge mining? …..”

    David Wojick suggests it “sounds like it generates nonlinear difference equations that fit the data, using pre-specified parameters, and possibly some specified theoretical relations among these parameters and/or the data variables.” OK. Possibly.

    There is, however, one certainty in this approach. If it produces results which agree with the IPCC, the climate science rejectionistas are going to have lots of fun taking the mickey! On the other hand, if it doesn’t, they are going to like it a lot.

    • tt

      Conversely, “if it produces results which disagree with the IPCC, the climate “consensus” supporters are going to have lots of fun taking the mickey! On the other hand, if it doesn’t, they are going to like it a lot.”

      Max

  35. When there’s a lot of data to explain, methodologies that try to minimize assumptions and include a lot of adaptability may be valuable tools. When there’s only little data, which has also a lot of noise, any flexible model is likely to find spurious solutions, which fit the data as well or better than theoretically sound models.

  36. Richard Saumarez

    I think data quality is a massively important point.

    Given a model, one can predict the sampling, errors etc in inputs to obtain a model solution with stated degrees of confidence. It’s generally rather laborious if approached through simulation.

    Since this definitely isn’t my field, have the calculations on data quality required to resolve a model been performed, and does the available data meet the criteria?

    • There is no way to estimate the errors in much of the basic climate data, and these errors are likely to be large.

      Take the core temperature data: the pre-satellite global temperature estimates produced by the Jones type surface statistical models. The sample is not representative. The area averaging method defies statistical sampling theory. Much of the coverage is based on SST proxies. There is UHI and local heat contamination. The raw temperature readings are dodgy. All this before sampling error comes in.

      Many of the other data sets are even worse, being based entirely on proxies, or questionable sources like ice cores. Refined modeling makes no sense in this context, but we do endless amounts of it. The modeling is a house of cards.

      • Richard Saumarez

        Yes, I agree. I have been looking at the Ceres documentation (I am hugely impressed by the skill and effort underpinning the radiation budget programme) and this data is designed to be used in attacking a specific question.

        One problem, I guess, is that if models are “verified” using hindcasting, they are being verified with questionable data.

      • The standard case is extreme even without data problems. This is where people are trying to match the precise HadCRU mean value global temperature line. The probability of that line being true is nearly zero. It would be fun to plot the 49% confidence interval, such that the probability that the true value is outside that interval is greater than that it is inside. All we really have is a wide swath of possibility.

        Note too that the UAH data contradicts the HadCRU data. So in addition to bad data we have contradictory data. So far as I know the models are oblivious to these uncertainties.

    • You won’t make many friends by starting a discussion about data quality, not in climate science or in anything else. As often as statisticians encounter data quality problems in real life, it is remarkable how little is said about it in textbooks or academic papers. The usual answer is “go get better data”. When Frank Lemke writes that “as of today, we do not know enough about climate and climate change to be sure of anything” and expects data analysis to somehow bridge the gap, it is indeed an ominous sign. The next big thing coming out of machine learning won’t be able to sprinkle pixie dust and make all of climate science’s problems go away.

  37. Frank Lemke

    Very impressive! Open data & open predictions. Just brilliant.

    Frank, please be prepared for the definite attack that will come your way. Take it like water off a duck’s back.

    Frank, how about including the IPCC’s projections that are in the public domain (like the 0.2 deg C warming per decade in the next two decades) in your prediction graphs? Please do?

    Good luck.

  38. Frank

    What is your subject background?

  39. Frank

    Do you have something like a tutorial solving a simple problem using knowledge mining?

    Thanks in advance.

  40. Joey Priestley

    http://i55.tinypic.com/2wgf11f.png

    “eduation”

    are you this careful with your software?

  41. Hi Frank
    As well as producing temperature predictions, is it possible to extract an understanding of the physical processes that emerge in the model? This would allow a side by side comparison of the physics contained in your algorithm-developed model and the science underpinning the models used in conventional climate science.

    • RobB

      the science underpinning the models used in conventional climate science.

      Do you still believe in the “conventional climate science” when its predictions are found to be wrong?

      Read IPCC boasting about the accuracy of its projections just 5 years ago

      Since IPCC’s first report in 1990, assessed projections have suggested global average temperature increases between about 0.15°C and 0.3°C per decade for 1990 to 2005. This can now be compared with observed values of about 0.2°C per decade, strengthening confidence in near-term projections.

      Look now, 5 years later, at what the observations show compared to the IPCC projections.

      http://bit.ly/p5976o

      The models are too hot and sensitive!

    • Hi RobB

      We are dealing with noisy data (noisy measurements, errors, missing variables). The concept of a “model of optimal complexity” states that on noisy data – depending on the noise level – so-called non-physical models are the “best” models. “Physical” models require noise-free data. Therefore, these two model types you suggest are not directly comparable. Due to uncertainty in the data you can only get uncertain, qualitative interpretations of the models like which input variables have been self-selected as relevant from the number of given potential variables and their relevant time lags. Of course you also get an equation, but this should not be so much subject of interpretation for noisy data.

  42. Frank

    Assume we are living in the 1940s.

    You are given the global mean temperature data up to the 1940s shown below:

    http://bit.ly/rpXLHL

    Using your data mining method, could you please show us your prediction for the period from 1940 to 2000?

    • That would be possible, sure. The problem is that most of the satellite data used start in 1978, only. This is what I have.

      The idea I have in mind is to publish results, predictions on the site transparently and update them from time to time to show how they compare with observations. I did this in this article already. The models have been built using data till April 11 and the true observations for May to August are added for reference. This risk is necessary to take to produce confidence (or not) and to learn more. Also, see the sunspot number predictions 1 and 2 (http://is.gd/RbOyhR and http://is.gd/rW1M0f).


  43. Data Mining Example

    For example, one Midwest grocery chain used the data mining capacity of Oracle software to analyze local buying patterns. They discovered that when men bought diapers on Thursdays and Saturdays, they also tended to buy beer. Further analysis showed that these shoppers typically did their weekly grocery shopping on Saturdays. On Thursdays, however, they only bought a few items. The retailer concluded that they purchased the beer to have it available for the upcoming weekend. The grocery chain could use this newly discovered information in various ways to increase revenue. For example, they could move the beer display closer to the diaper display. And, they could make sure beer and diapers were sold at full price on Thursdays.

    http://bit.ly/qlcpU4

  44. Frank

    My estimate is for 0.54% increase in CO2 concentration per year

    Yours is for 0.47% increase per year.

    I think this is a big difference for the very predictable annual CO2 concentration growth

    Year CO2
    2009 387.38
    2010 389.78

    My Estimate
    2010 estimate => 2009 Value * 1.0054 => 387.38 * 1.0054 = 389.47

    Your Estimate
    2010 estimate => 2009 Value * 1.0047 => 387.38 * 1.0047 = 389.20
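
    The same comparison as a short sketch, for anyone who wants to rerun it with other growth rates. Only the two concentrations and the two rates come from the figures above; the helper name `project` is hypothetical.

```python
observed = {2009: 387.38, 2010: 389.78}  # ppm, from the table above

def project(value_ppm, annual_growth):
    """Grow a concentration by a fixed fractional rate for one year."""
    return value_ppm * (1.0 + annual_growth)

for label, rate in (("0.54%", 0.0054), ("0.47%", 0.0047)):
    estimate = project(observed[2009], rate)
    miss = observed[2010] - estimate
    print(f"{label}: estimate {estimate:.2f} ppm, short of observed by {miss:.2f} ppm")
```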

    • Girma,

      The 0.47% increase per year is a linear approximation of the initial model, which is:

      CO2(t) = a*CO2(t-12) – b

      with a = 1.018, b = 4.805, and t – Month. This is actually a nonlinear model. Interesting by the way that CO2 growth is described at 99% accuracy by just a simple auto-regressive process.

      Using this model, averaging monthly predictions for 2010 results in an estimate of 389.54 ppm. For 2011 it estimates 392.04 ppm.
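
      A minimal sketch of this recurrence, using only the a and b quoted here and the 2009 annual mean of 387.38 ppm from the comment above. Because the same a and b are applied to every month, averaging the twelve monthly predictions gives the same number as applying the recurrence to the annual mean, which is the shortcut taken below. The function name `co2_next_year` is hypothetical.

```python
A = 1.018   # coefficient a quoted in the comment
B = 4.805   # offset b quoted in the comment

def co2_next_year(co2_ppm):
    """Apply CO2(t) = a*CO2(t-12) - b to a value from 12 months earlier."""
    return A * co2_ppm - B

co2_2009 = 387.38  # ppm, annual mean quoted earlier in the thread
co2_2010 = co2_next_year(co2_2009)
print(f"2010 estimate: {co2_2010:.2f} ppm")
```

      The result lands within rounding of the 389.54 ppm quoted for 2010.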

  45. Why should it matter if CO2 really does drive Global Warming or not?

    The current mental model of Global Warming that has been communicated worldwide is this (fig. 3): CO2 and other greenhouse gases cause global warming and if CO2 emissions are continuously growing global temperatures will do so, too, proportionally.

    X4 CO2 Concentration => f => X6 Global temperature anomalies

    Figure 3. Communicated mental model of a supposed CO2-driven global warming as a linear chain cause-effect relationship.

    If this is true, it will indeed have dramatic consequences. Believing that it is true, huge efforts have been propagated and also undertaken in many countries in recent years, including the introduction of CO2 certificates trading as a questionable tool to mitigate CO2 emissions. To fight Global Warming we have to fight CO2 emissions. That’s the conclusion.

    But what if Global Warming has not been driven by greenhouse gas concentrations or not in the assumed way? Or what if Global Warming takes a different path than projected by the present communicated model due to other dependencies and effects that exist in reality than assumed and described by this model? Can we really afford failing in this matter? Wouldn’t we have to take other actions in these cases? Maybe rather taking care of aerosol and ozone concentrations, for example?

    Thanks Frank.

  46. Joachim Seifert

    To Girma :
    Your question: But what if global warming has not been driven by GHG or not in the assumed way? …..
    This is the yellow of the egg….because it is not GHG but it is something completely different: “Libration” forcing (see Wikipedia, the animated picture for the Moon):
    The Earth’s orbit given by the IPCC is a “Two body gravitation problem”, i.e. Sun + Earth as two bodies. In reality, it is a “3- body- problem”, taking
    gravitational pushing and pulls onto the Earth’s orbit (by the other planets – third body) into account.
    The IPCC works with a wrong astronomic model, failing to take the real trajectory into account: They simply “steal” the radiative forcing of the real trajectory and put it into the pockets of the GHG, therefore their models must be uncertain… and fail to predict the present climate plateau….
    Including Libration, you will get the real fore- and hindcast, not even needing statistics of the past….
    JSei

  47. Is this similar to the methods used by Verdes in these papers:

    http://prl.aps.org/abstract/PRL/v99/i4/e048501

    http://pre.aps.org/abstract/PRE/v72/i2/e026222

    • From my understanding they used an Artificial Neural Network with a predefined number of input, hidden, and output neurons and predefined transfer function. This is similar to developing a model by theoretical means: The model structure is predefined and the analysis method then does estimation of parameters or weights. This is what data mining does on large data sets.

      I’m talking of knowledge mining, which shares some concepts with data mining, but goes beyond that. Knowledge mining, in addition to parameter identification, also performs structure identification: it builds the model from scratch autonomously. No model is given at the start of the modeling process.

      So to some extent these methods are similar, but there are also essential differences.
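
      The distinction between parameter estimation and structure identification can be sketched in a few lines. This is only a toy under stated assumptions, not the knowledge mining software: the inputs `x1` and `x2`, the two-step delay, the 0.8 coefficient and the single-input linear candidates are all invented for the example. Each candidate has its parameters estimated on the first half of the data, and the structure (which input, which lag) is chosen on the second half.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
inputs = {"x1": rng.normal(size=n), "x2": rng.normal(size=n)}  # hypothetical drivers
max_lag = 3

# Synthetic target: depends only on x1 delayed by two steps, plus noise.
y = np.zeros(n)
y[2:] = 0.8 * inputs["x1"][:-2]
y += rng.normal(0.0, 0.1, size=n)

def lagged(series, lag):
    """Values of series at t - lag, aligned with t = max_lag .. n-1."""
    return series[max_lag - lag : n - lag]

target = y[max_lag:]
half = target.size // 2
scores = {}
for name, series in inputs.items():
    for lag in range(max_lag + 1):
        x = lagged(series, lag)
        # Parameter estimation on the first half ...
        slope, intercept = np.polyfit(x[:half], target[:half], 1)
        # ... structure judged on the second half, which the fit never saw.
        residual = slope * x[half:] + intercept - target[half:]
        scores[(name, lag)] = float(np.mean(residual ** 2))

best = min(scores, key=scores.get)
print("selected input and lag:", best)
```

      A network with a predefined layout would correspond to fixing one entry of `scores` in advance and only estimating its slope and intercept.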

  48. Frank, your approach is refreshing. Your method approaches the matrix of information free of preconceptions as to cause and effect. It observes relationships to identify dependencies. That’s what a smart human being is supposed to do.

    But humans are not very smart, and many of those who write comments on blogs seem to be massaging their own egos or pushing a barrow (e.g. Robert, who you very sensibly ignored). They are not actually interested in mining the truth.

    I like the idea of a mechanical observer that tests relationships. It’s the ultimate in terms of detachment. At the very least it points us to things that seem to be related.

    Ozone is very definitely related to cloud cover and the intensity of surface insolation and therefore surface temperature. Ozone is not confined to the stratosphere. It is present in highly variable levels in the upper troposphere influencing dew point and cloud cover because it absorbs outgoing radiation from the Earth. If you measure it in the stratosphere, or simply measure the temperature of the stratosphere near the equator you will find that it varies with sea surface temperature in the mid and low latitudes.

    Ozone content in the stratosphere depends upon solar activity (aa index) because there is an interaction between the mesosphere and the stratosphere at the poles, the Arctic in winter and the Antarctic all year, involving the night jet, NOx compounds in the stratosphere that vary with solar activity and variation in the strength of the night jet itself depending upon the variation in surface pressure also subject to solar influence. That’s because the atmosphere contains ionized gas that behaves like plasma, reacting to an electromagnetic field.

    “sun, ozone, aerosols, and clouds” . You have nailed it.

    No, not CO2. But that is a politically incorrect statement and all the radiative theory protagonists who are very poor observers of the atmosphere and know very little of the geography of the Earth will discount what you say because they reckon that they understand the ‘physics’.

    Engineers are people who like to be trust-worthy, so I am inclined to trust you.

    • erl,
      As an engineer, I am sure that you would want to focus more on the cause and effect relationships of whatever problem you may be looking at, rather than to rely on numerical correlations with respect to a noisy data set that is influenced by many variables.

      Ozone is a very important climate system component with significant shortwave as well as longwave impacts, including also its impacts on atmospheric chemistry. We take all these impacts into account in our climate modeling of global climate change. We also take the radiative effects of CO2 into account, and these are significantly greater than the radiative effects of ozone. Our approach in doing climate modeling is more akin to solving an engineering problem than it is to investigating some new science question.

      The political incorrectness of CO2 is mostly the result of someone’s thinking process that has nothing in common with either the engineering or science aspects of climate change.

      • A Lacis, I think you misunderstand me.

        Yes I am very much interested in cause and effect. I think the influence of ozone in the troposphere is important in modulating upper troposphere cloud cover.

        The coupled circulation of the stratosphere and the troposphere at the poles drives ozone into the troposphere lowering surface pressure. In consequence there is a good relationship between mid latitude sea surface temperature and surface atmospheric pressure in high latitudes. So, I see the variable presence of ozone in the upper troposphere as modulating cloud cover via its ability to intercept radiation at 9.6 micrometers.

        One can observe a temperature change at 200hPa at 20-30° south that is three times the temperature change at the surface. It’s not the latter driving the former but the independent influence of ozone affecting the temperature of the cloud layer.

        I am mindful also of the relationship described here: http://mls.jpl.nasa.gov/library/2011GL047865.pdf

        The change in ozone content in the troposphere over the western Pacific must influence cloud cover and radiation that is received at the surface.

  49. Frank: “… It also predicts using this system global warming 6 years ahead (monthly resolved) and it compares the known IPCC AR4 projections with this system prediction … So this approach is different from the vast majority of climate models, which are based on theories.”

    If that means what I think it means, I don’t think that’s the best way forward.

    I read it as meaning that every so often you do a 6-year prediction based on past data only – very sophisticated but still based on past data only – freeze it, then see how it goes over the following 6 years.

    I think it would be better to produce (and freeze) models which can’t make predictions until certain future observations are applied. For example, you could prepare the model without solar activity data then feed future solar observations into the model to generate predictions (retrospective, from the freeze date onwards). That way you can restrict your predictive process to only those factors that you have a chance of predicting (I am assuming in this particular example that you cannot predict future solar activity – a reasonable assumption I think, based on attempts to date to do so and our lack of knowledge about how the sun works).

    It means that you can’t make predictions ahead of time, only retrospectively, yet your prediction is firm in the sense that you can’t change what it is going to say once you have frozen it. That can be externally audited too. It means you can incorporate unpredictable events such as volcanoes. It also means that you can prepare models which target only a part of climate (e.g. only some of the 6 rather than all 6 at once). And I haven’t thought this through but I suspect that it may make it easier to incorporate or at least test theories too – eventually there are going to have to be theories.

    Note that this is very different to the IPCC approach of continuously tweaking models so they match observations but still promoting their future predictions as gospel.

    Apologies if I have misunderstood you.

  50. Hi Mike,

    The self-organized system presented here is a flexible approach. It can be reduced or extended by other variables, and it is a predictive system. This means all system variables have to be – and can be – predicted over the forecast horizon of the system. This leads to status-quo predictions. What-if predictions are possible too. This is true also for sun activity, which has actually been predicted till 2020:

    http://is.gd/rW1M0f
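
    To make the distinction between status-quo and what-if predictions concrete, here is a minimal sketch in Python. It is not the actual system: the two variables, the linear form, and every coefficient in `COEFFS` are invented for illustration (the real system has 6 variables and nonlinear difference equations), and `forecast` and `overrides` are hypothetical names. In a status-quo run every variable is predicted from the previous state; in a what-if run some variables are replaced by externally supplied values, which could equally be later observations fed into a frozen model.

```python
def step(state, coeffs):
    """One-month update: each variable is a weighted sum of last month's state."""
    return {name: sum(c * state[var] for var, c in row.items())
            for name, row in coeffs.items()}

def forecast(initial, coeffs, horizon, overrides=None):
    """Iterate the difference equations `horizon` steps ahead.

    overrides maps a variable name to a list of imposed values, one per step
    (a what-if scenario or later observations). Without overrides every
    variable is predicted by the system itself (status quo).
    """
    overrides = overrides or {}
    state = dict(initial)
    path = []
    for t in range(horizon):
        state = step(state, coeffs)
        for name, series in overrides.items():
            state[name] = series[t]  # replace the predicted value
        path.append(dict(state))
    return path

# Invented coefficients, for illustration only.
COEFFS = {
    "sun":  {"sun": 0.9},
    "temp": {"temp": 0.8, "sun": 0.1},
}
initial = {"sun": 1.0, "temp": 0.0}
horizon = 72  # 6 years, monthly resolved

status_quo = forecast(initial, COEFFS, horizon)
what_if = forecast(initial, COEFFS, horizon, overrides={"sun": [1.0] * horizon})

print("status quo, final temp:", round(status_quo[-1]["temp"], 4))
print("what-if,    final temp:", round(what_if[-1]["temp"], 4))
```

    Because the model coefficients stay fixed in both runs, the only difference between the two paths is where the values of the overridden variable come from.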

    As to incorporation of theories, I agree. This is part of regularization of ill-posed tasks and it is important:

    “Each methodology has its strengths and limits. Every single model reflects only a fraction of the complexity of real-world systems. What is necessary is a holistic, combined and interdisciplinary approach to modeling that takes into account the incompleteness of a priori knowledge. Knowledge mining benefits from well known, justified theoretical knowledge about the system to get most reliable and accurate predictive models out of the data…”

    Thanks.

  51. Thanks. That answers some questions. It’s an interesting project and I will try to keep an eye on progress.