by Frank Lemke
A recent post in our Global Warming Prediction Project discusses the question “What Drives Global Warming?” based on a self-organized, interdependent, nonlinear dynamic system of equations in 6 variables (ozone, aerosols, clouds, sun activity, CO2, global temperature). Using this system, it also predicts global warming 6 years ahead (at monthly resolution), and it compares the known IPCC AR4 projections with this system's prediction and with the observed anomalies of the past 23 years.
“Looking at observational data by high-performance self-organizing predictive knowledge mining, it is not confirmed that atmospheric CO2 is the major force of global warming. In fact, no direct influence of CO2 on global temperature has been identified for the best models. This is what the data are seriously telling us. If we believe them, it is the sun, ozone, aerosols, and clouds – and possibly other forces not considered in this model – that drive global temperature in an interdependent and complex way.” [link]
So the question arises: What is self-organizing predictive knowledge mining?
Briefly, knowledge mining is data mining that goes a few steps further. It is a data-driven modeling approach, but in addition to mining the data, self-organizing knowledge mining also builds the model itself, autonomously, including self-selection of relevant inputs. It extracts the necessary “knowledge” to develop the model from noisy observational data alone, as objectively as possible, in an inductive, self-organizing way. It generates models of optimal complexity according to the noise dispersion in the data, which systematically avoids overfitting the data. This is a very important condition for prediction. The resulting models are then available explicitly, for example in the form of nonlinear difference equations.
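The post does not spell out the algorithm behind this, but the best-known method in this family is GMDH (the Group Method of Data Handling). As a purely illustrative sketch (on synthetic data, not the climate series), here is a minimal GMDH-style self-organizing run in Python: candidate pairwise polynomial models are fitted on a training split, ranked by an external criterion (error on a separate validation split), and the survivors feed the next layer. Growth stops as soon as out-of-sample error no longer improves, which is what keeps model complexity matched to the noise in the data:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_quadratic_pair(xi, xj, y):
    """Least-squares fit of y ~ a0 + a1*xi + a2*xj + a3*xi*xj + a4*xi^2 + a5*xj^2."""
    A = np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_pair(xi, xj, coef):
    A = np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])
    return A @ coef

def gmdh_layer(X_tr, y_tr, X_va, y_va, keep=4):
    """Fit all pairwise candidate models; rank them by the external criterion
    (validation error, never seen during fitting) and keep the best few."""
    candidates = []
    for i in range(X_tr.shape[1]):
        for j in range(i + 1, X_tr.shape[1]):
            coef = fit_quadratic_pair(X_tr[:, i], X_tr[:, j], y_tr)
            err = np.mean((predict_pair(X_va[:, i], X_va[:, j], coef) - y_va) ** 2)
            candidates.append((err, i, j, coef))
    candidates.sort(key=lambda c: c[0])
    return candidates[:keep]

# Synthetic example: y depends nonlinearly on 2 of 5 noisy candidate inputs;
# the self-selection of relevant inputs happens via the ranking above.
X = rng.normal(size=(200, 5))
y = 1.5 * X[:, 0] * X[:, 1] + 0.5 * X[:, 3] + rng.normal(scale=0.1, size=200)
tr, va = slice(0, 120), slice(120, 200)

best_err = np.inf
X_tr, X_va = X[tr], X[va]
for layer in range(3):
    survivors = gmdh_layer(X_tr, y[tr], X_va, y[va])
    layer_err = survivors[0][0]
    if layer_err >= best_err:   # external criterion: stop when complexity stops paying off
        break
    best_err = layer_err
    # Outputs of surviving models become the inputs of the next layer.
    X_tr = np.column_stack([predict_pair(X_tr[:, i], X_tr[:, j], c) for _, i, j, c in survivors])
    X_va = np.column_stack([predict_pair(X_va[:, i], X_va[:, j], c) for _, i, j, c in survivors])

print(f"best validation MSE: {best_err:.4f}")
```

All names and the data here are placeholders of my own; the actual software behind the post is certainly far more elaborate, but the layered grow-and-select loop is the core idea.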
So this approach is different from the vast majority of climate models, which are based on theories.
Why self-organizing predictive knowledge mining is needed
I think there is no dissent in the community that we lack complete, or even sufficient, a priori knowledge about the complex processes in the atmosphere and its external influences and interdependencies with the ocean, land, and the universe. We know only a few things for sure. This alone makes it an ill-defined modeling problem, characterized by:
- Insufficient a priori information about the system for adequately describing the inherent system relationships. In real-world systems, variables are dynamically related to many others, which often makes it difficult to differentiate which are the causes and which are the effects.
- A large number of variables, many of which are unknown and/or cannot be measured. Even using only the variables that have been measured, one easily gets from many hundreds to tens of thousands of candidate variables when considering a higher degree of system memory (dynamics).
- Noisy data available only in data sets with a small number of observations. This is a serious problem, especially when the number of samples is smaller than the number of variables (a so-called under-determined modeling task), which is usually the case in climate modeling. Temperature records start in 1850, and records for other factors start in the middle or at the end of the last century. This is very limited “true” information.
- Vague and fuzzy objects whose variables and results have to be described and interpreted adequately, which leads to uncertainty.
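The second and third points above are easy to make concrete: with monthly data, each observed variable contributes one candidate input per lag considered, so the candidate set quickly outgrows the sample. A small illustration (the numbers, 6 variables and a 276-month record, are chosen to roughly match the post's setting; the data are random placeholders):

```python
import numpy as np

def lagged_design(X, max_lag):
    """Stack lags 1..max_lag of every column into one candidate-input matrix.
    Row t holds [x1(t-1)..x1(t-max_lag), x2(t-1)...]; the first max_lag rows
    of the record are consumed as history."""
    n, k = X.shape
    cols = [X[max_lag - lag : n - lag, j] for j in range(k) for lag in range(1, max_lag + 1)]
    return np.column_stack(cols)

months, variables = 276, 6          # ~23 years of monthly data, 6 observed factors
X = np.random.default_rng(1).normal(size=(months, variables))

for max_lag in (12, 60, 120):
    D = lagged_design(X, max_lag)
    n_samples, n_inputs = D.shape
    print(f"lag depth {max_lag:3d}: {n_samples} samples x {n_inputs} candidate inputs"
          f" -> under-determined: {n_inputs > n_samples}")
```

At a memory depth of 120 months the task has 720 candidate inputs but only 156 usable samples, i.e. it is under-determined before modeling has even begun. This is exactly the regime in which input self-selection against an external criterion matters.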
This means that there is a serious methodological problem in climate modeling. For ill-defined systems, the classical hard approach, which is based on the assumption that the world can be understood objectively and that knowledge about the world can be validated through empirical means, needs to be replaced by a soft systems paradigm that can better describe vagueness and imprecision. This approach is based on the observation that humans have only an incomplete and rather vague understanding of the nature of the world but nevertheless are able to solve unexpected problems in uncertain situations. Theory-driven modeling approaches have been used to advantage for well-understood problems, where the theory of the object being modeled is well known and obeys known physical laws. Theory-driven approaches are, however, unduly restrictive for ill-defined systems because of insufficient a priori knowledge, the complexity and uncertainty of the objects, and the exploding time and computing demands. This is the case in climate modeling as well as in other environmental, life science, and socio-economic problems.
On the other hand, we have all these observational data; although limited, they are priceless information about the system and its behavior. They only need to be extracted appropriately, so as to transform them into useful knowledge and (non-physical) models that predict and simulate the system's development, that help us get a better understanding and increase our knowledge of the system, and that support decision-making under uncertain conditions.
The adaptive learning path
Each methodology has its strengths and limits. Every single model reflects only a fraction of the complexity of real-world systems. What is necessary is a holistic, combined, and interdisciplinary approach to modeling that takes into account the incompleteness of a priori knowledge. Knowledge mining benefits from well-known, justified theoretical knowledge about the system to get the most reliable and accurate predictive models out of the data. These models in turn may reveal new knowledge that can be used in further steps, and they can be applied – together with physical models where and when suitable – in simulation and scenario development. In this unifying way it should be possible to describe, understand, and predict the complex behavior of the Earth's climate better, more completely, and more adequately, with uncertainty as an integral part.
Even more difficult is the task of control. In complex systems, there is no simple, single cause-effect relationship of the kind we humans apparently find so appealing. There is no single control knob that only needs to be moved up or down to get a desired result, and only this result, with no unwanted side effects (this is another inconvenient truth). Instead, the system variables interact highly dynamically in time and space; they are inputs and outputs, causes and effects, at the same time. This interaction pattern would be very difficult to understand and interpret even if it were fully known. We need reliable tools in the form of interdependent system models that help us understand and deal with it (see fig. 2 in the mentioned post). I believe – and I think I'm not alone here – that no individual person or expert group will ever come close to formulating these complexities by theoretical approaches alone.
The primary question is not whether CO2 – or any other single factor – drives global warming, but first of all whether the modeling approach is adequate for describing the system under research. To me, models based entirely on “CO2 theory” are methodologically inadequate for predicting, describing, and explaining global warming, for the reasons above, and especially given the certainty claimed for them. AGW protagonists and the skeptics (the “deniers”; it is only the viewpoint that defines who denies what) alike suffer from the same fact: we cannot prove our theories and concepts in the general view, due to a lack of sufficient knowledge and understanding of the climate system. This could be a never-ending story until reality decides who is right. But along that way we would have learned and gained little.
The other, more sophisticated way would be to understand and respect the sound arguments and conclusions of the “opposite party” as partial (mental) models of the complex, hidden Truth, as different views on the same object, as pieces of the entire picture that we all are trying to reveal. No single model can reflect reality completely. We need an ensemble of models that use different information and modeling approaches. We have to continue to learn and to gain new knowledge to further reveal Truth. New knowledge, however, cannot be obtained without external information. A major, but so far unused, source of external information in climate modeling is knowledge mining.
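A toy illustration of why an ensemble of models helps (the models and data here are hypothetical, invented only to show the averaging argument): predictions that are individually biased and noisy in different ways combine into an estimate with smaller error, because independent errors partly cancel:

```python
import numpy as np

rng = np.random.default_rng(2)
truth = np.sin(np.linspace(0, 6 * np.pi, 240))   # stand-in for an anomaly series

# Three hypothetical models, each with its own bias and its own noise.
model_preds = [
    truth + rng.normal(scale=0.3, size=truth.size) + 0.1,
    truth + rng.normal(scale=0.3, size=truth.size) - 0.1,
    truth + rng.normal(scale=0.3, size=truth.size),
]

def mse(p):
    return float(np.mean((p - truth) ** 2))

# The ensemble: a plain average of the member predictions.
ensemble = np.mean(model_preds, axis=0)

for k, p in enumerate(model_preds, 1):
    print(f"model {k} MSE: {mse(p):.3f}")
print(f"ensemble MSE: {mse(ensemble):.3f}")
```

The ensemble's error falls below that of every member. Real climate-model ensembles are of course combined far more carefully, but the principle, that models built from different information and approaches contribute complementary pieces of the picture, is the same.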
This path would be the adaptive learning path.
As of today, we do not know enough about climate and climate change to be sure of anything. We can learn and gain knowledge from different sources, approaches and views and eventually will achieve the understanding and clarity needed to make sound decisions. We have to start thinking differently.
JC comment: I think this is a fascinating methodology for attempting to untangle the complex relationships between clouds, surface temperature, atmospheric composition, etc. The potential predictive capability on timescales of a year to a decade fills a critical gap in our understanding and predictive capability. Conclusions regarding AGW and the role of CO2 cannot be drawn from 23 years of data, but this methodology in principle could be extended to longer time periods. I look forward to the discussion on this.
Moderation note: this is a technical thread and comments will be moderated for relevance.