I’m starting a new thread for this topic, since interest in the previous thread has been rekindled by this AGU abstract by Steve Easterbrook, entitled Do Over or Make Do? Climate Models as a Software Development Challenge.
Dan Hughes writes:
My reading of these posts at Professor Easterbrook’s site is that all is sweetness and light within the climate science software development community:
Verification and Validation of Earth System Models
Do Climate Models need Independent Verification and Validation?
Should science models be separate from production models?
The AGU abstract, on the other hand, seems to indicate that there are significant problems within that community:
The testing processes are effective at removing software errors prior to release, but the code is hard to understand and hard to change. Software errors and model configuration problems are common during model development, and appear to have a serious impact on scientific productivity. These problems have grown dramatically in recent years with the growth in size and complexity of earth system models. Much of the success in obtaining valid simulations from the models depends on the scientists developing their own code, experimenting with alternatives, running frequent full system tests, and exploring patterns in the results. Blind application of generic software engineering processes is unlikely to work well.
My direct experience with the application of recent engineering and scientific software development methodologies has shown that many of the problems noted above can easily be avoided. Documentation of the specifications, for one example, provides an excellent starting point for avoiding these problems. Coding guidelines, including careful specifications for interfaces, are another example.
This last sentence in the above quote:
Blind application of generic software engineering processes is unlikely to work well.
is a strawman in that ‘generic software engineering processes’ are not the only processes employed for engineering and scientific software.
The following statement from this post The difference between Verification and Validation:
For climate models, the definitions that focus on specifications don’t make much sense, because there are no detailed specifications of climate models (nor can there be – they’re built by iterative refinement like agile software development).
is especially troubling.
Professor Easterbrook has also cited the infamous paper by Oreskes, Verification, Validation, and Confirmation of Numerical Models in the Earth Sciences. That paper has been refuted several times within the engineering and scientific software development community. It says, in effect, that we might as well not even think about starting development of software for complex natural physical phenomena and processes. And yet, at the same time, some such software (climate science software, for example) is readily accepted, while other software is simply rejected based solely on that paper.
I find comments like this to be particularly galling:
Much of the success in obtaining valid simulations from the models depends on the scientists developing their own code, experimenting with alternatives, running frequent full system tests, and exploring patterns in the results.
It may be quicker and more “agile” to have people throw together code and try alternatives, but certainly not more valid. Steve’s argued that “you can’t delegate ill-defined problems to software engineers”, but that begs the question of whether you can validly code a solution for an ill-defined problem. The process of specifying requirements not only serves to communicate to the development team, but also serves to firm up the problem in the mind of the “customer”.
My first suspect for the issues mentioned (software and configuration errors) would be an amateur development team working without adequate process or supervision.
Sounds like prayer more than science.
As I mentioned in a previous thread, the validation and testing of these models MUST be performed IN-PROCESS, not after the fact.
The over-reliance on ‘fudge factors’ (usually different in each model) shows just how unsuitable they are.
Apply INDUSTRY STANDARD QC to the models and you’ll soon get FAR better results (and discard all the trash along with it).
Research MUST learn from industry.
Yes, I totally agree. Research must learn from industry in terms of quality control and quality standards. We have all these nice Six Sigma methods that can be used and adapted. That this is not done is very telling. Moreover, I do think there is a lack of appropriate education about quality methods within the community around the IPCC.
I agree. In industry, all software development would only begin after fully detailed and approved specification. Subsequent changes would only be made after further fully detailed specification.
I am highly troubled by calling undisciplined work “agile.” It is precisely the opposite of sound Agile development. Small-team Agile programming means a different set of disciplines from big-team Waterfall programming, but it never, ever means lack of discipline. It never means losing sight of what the program is for, or what it is doing. It never means lack of documentation, just a different kind. It never means lack of coordination, just a different kind. Please (re)read Beck’s “Extreme Programming Explained” if you are fuzzy about this. It will put you back in love with proper programming practice.
With all due respect for the people who work hard at these programs, we should not be teaching anybody in these teams to do undisciplined work, or letting them get away with it.
I do waterfall software development. For a large project like this, I would prefer waterfall over agile. I guess either could work, but the requirements gathering stage could be very useful and illuminating. It could fill in gaps in the problem definition that people aren’t even aware of now.
You would also want statisticians in the room, along with experts in the various Earth systems. That would be a fun project, IMO.
Agile tends to be the best choice where aspects of the requirements are still being discovered. It’s one quick step forward, and NO steps back. Waterfall is 10 slower steps forward, then several re-done steps, because of incomplete requirements. Undisciplined Agile is 1 step forward, and half a step back, because of undisciplined work.
It would be interesting to see what would come out of a waterfall development. Steve makes some interesting hypotheses about software development in climate science. Those hypotheses are testable.
One could for example, take an existing GCM and rewrite it using waterfall practices. I worked on projects where we often had to get “research” code running and working and then later, start from scratch with the specs and the reviews and all the various stages. In the end you have a stable, documented, shareable code base.
Interestingly, NCAR’s CESM is “stable, documented, shareable”.
I also liked MIT’s approach. Judith has recommended that I look at NCAR. I was asked a while back to estimate the cost of setting up a lab to do work with CESM (on the cheap). I didn’t dig much into the code; rather, I was just looking at bare minimum HW requirements.
A key element, of course, would be stamping CESM with a seal of approval. For example, on a DOD contract we would be required to use MODTRAN. I’m suggesting the same sort of validation.
Having MODTRAN as a standard didn’t stop the science going forward; in fact, you could get contracts to improve it. What it provided was a common, accepted way of estimating.
Thank you, Mrs Curry, for your somewhat refreshing (if I may use this word) blog.
I’ve been very interested in your previous thread related to climate model validation. Even though I’m not a professional climate scientist, I have a degree in aerospace engineering (somewhere between a Master’s and a PhD, since the French system is different from yours). This means I’m a little bit familiar with model development and validation. Therefore I’ve been quite surprised by the climate model validation process you describe. As I understand it, this process is mainly based on inter-comparison, which is far from the state of the art with respect to model validation. The standard process is to run the model with different (well defined) sets of initial conditions and to compare the outputs with the results of lab tests performed using the same sets of conditions. Then you tune the model by adjusting some of the parameters (such as feedback factors) until the model faithfully reproduces the test observations. Then you have a formally calibrated and validated model. Applied to climate science, this means a model can only be declared validated once it has proved its ability to reproduce past climate observations. This comparison with past observations is the only possible way, since climate experiments are of course hardly feasible. But looking back at the AR4 report (WG I, Chapter 8, FAQ 8.1), one should recognize that the climate models’ ability to reproduce past climate trends turns out to be dramatically poor.
– Between 1912 and 1942, the warming trend reproduced by the (mean) model (roughly 0.06°C per decade) is less than half the observed one (0.15°C per decade).
– Furthermore, the model reproduces warming until ~1960, whereas observations indicate stagnation and even a slight cooling starting roughly at the beginning of the 1940s. It is noticeable and significant that models are not able to reproduce any cooling (while CO2 concentration is still growing) except when using volcanic forcing (all significant cooling trends produced by models actually correspond to a volcanic eruption).
– The only period that models correctly reproduce is the 30-year warming trend from 1972 to 2002.
This correlation is definitely too limited and insufficient to declare any model validated, especially knowing that a new stagnation/cooling trend started in 2002, which none of these models has ever been able to foresee!
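For anyone who wants to check figures like the per-decade trends quoted above, a decadal trend is just the ordinary least-squares slope of a temperature series, scaled to degrees per decade. A minimal sketch (the series below is synthetic and purely illustrative, not real observational data):

```python
def decadal_trend(years, temps):
    """Ordinary least-squares slope of temps vs. years, in deg C per decade."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    den = sum((x - mean_x) ** 2 for x in years)
    return 10 * num / den  # slope is per year; scale to per decade

# Synthetic series warming at exactly 0.15 deg C per decade, 1912-1942
years = list(range(1912, 1943))
temps = [0.015 * (y - 1912) for y in years]
print(round(decadal_trend(years, temps), 3))  # -> 0.15
```

Run against the observed and modelled series, this is the calculation behind the per-decade numbers being compared in this thread.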
Therefore I’m pretty convinced that model validation will be one of the key issues for climate science in the coming years. Today’s status is that climate models are so far formally invalidated by comparison with temperature observations. And as long as this situation remains, any attempt to prove the validity of AGW theory will fail.
Eric – I recommend that readers view the Chapter 8 FAQ 8.1 model description to make their own judgments. Model imperfections are well recognized, but if one compares model output (red line) with observations (black line), I find the correspondence quite good over the long term. In particular, anomalous events around 1940-1950, possibly reflecting combined effects of variations in ENSO, AMO, and PDO, create a bump in the observations that otherwise would simply continue the relatively modest warming that preceded that interval and would have also created a slight rise in the following decades instead of an apparent slight cooling or flat interval until the 1970s. It is well known that these particular internal climate variations are not well modeled, but over the long term of the twentieth century, they have tended to average out. In essence, those models have done a fairly good job on a multidecadal basis. As models evolve, they will perform even better, but it is unrealistic to expect them to simulate the real world in precise detail at every time point. They are most valuable in conjunction with basic physics principles and real world data.
With the same type of background as the original poster (PhD in CFD/solid mechanics, a lot of finite element code development and validation), I strongly disagree with the apparent leniency you grant the climate models. Fitting only the trend is far from sufficient, because of the large number of unknowns and parameters that can be “adjusted”. The more complex your model is, the more not-completely-fixed parameters there are, and the more thorough a validation you need. Fitting only the trend, for the time period considered, can be done with a simple power law or exponential, and it can be reproduced by any simple linear function with well tuned parameters. Much too easy to give any confidence in a predictive model, especially a complex model with many degrees of freedom…
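The degrees-of-freedom point can be made concrete with a toy calculation (all numbers hypothetical, chosen only for illustration): two different one-parameter "models" can each be tuned to reproduce exactly the same 30-year temperature rise, yet diverge wildly when extrapolated, so matching the trend alone discriminates between them not at all.

```python
# Two toy "models", each tuned so temperature rises 0.51 deg C
# between 1972 and 2002 (about 0.17 deg C/decade).
RISE, T0, T1 = 0.51, 1972, 2002

def linear(year):
    """Constant warming rate, tuned to the 1972-2002 rise."""
    return RISE / (T1 - T0) * (year - T0)

def accelerating(year):
    """Quadratic warming, tuned to the very same 1972-2002 rise."""
    return RISE / (T1 - T0) ** 2 * (year - T0) ** 2

# Both match the fitting window exactly...
print(round(linear(2002), 2), round(accelerating(2002), 2))   # -> 0.51 0.51
# ...but their 2100 extrapolations differ by more than a factor of 4.
print(round(linear(2100), 2), round(accelerating(2100), 2))
```

Both functions are "validated" if trend-matching is the test; only out-of-sample data can tell them apart.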
In GCMs, the parameters are not “adjusted” to make the model fit the trend. They are often adjusted to fit the starting climate, but whether or not they fit the trend is a test of model skill. They perform fairly well (with room for improvement) for long term, global trends – less well for short term or regional projections.
What do you mean by “fit the starting climate”? Does it mean “fit the 20th century temperature record”? Or just fit a single point in the 20th century, asking for a steady state around this point if forcings are kept constant?
In the first case, testing the model by its ability to fit the 21st century trend is risible. In the second case, I would like to know much more about the procedure; it would be interesting to see how the reference point is selected and why any steady state is assumed. But as a CFD guy, even in this “best” case, I maintain that performing fairly well in predicting the 20th-21st century trend after adjustment to a reference climate is far from enough to inspire confidence in the models, given that the global trend is so simple!
The initial conditions for a climate model are far less important than the boundary conditions.
How do you back up this assertion, since the programs are unvalidated?
And since the climate scientists, according to what is being shown in this thread, are claiming to validate against projections and other models, how is any of this anything more than a very expensive exercise in time wasting?
Think of it this way – to get a general notion of sunrise and sunset times, which is more important? Longitude or latitude?
Please explain the relevance of your cryptic comment to the matter under discussion.
Writing in riddles may impress your fellow climatologists with your wit and erudition, but failure to address the points directly gives the rest of us the impression of shiftiness and possibly something worse.
When making a weather forecast, initial conditions are paramount, because the boundary conditions (“forcings”) change very little over the length of the forecast, whereas starting from (say) a day with pleasant temps and no precip vs. starting from the next day, with wind and snow, makes a huge difference.
When projecting climate over decades to centuries, whether or not one starts from a nice or stormy day (which isn’t how climate models work anyway) doesn’t matter, but the evolution of the forcings over time (boundary conditions) is critical. You need to have the correct solar irradiance, volcanic aerosol loading, GHGs, etc. to make your projection viable.
It’s all timescale.
How can you be so sure, Derecho, that nobody has correctly projected climate yet?
The main method for validating climate models is to validate them against a selection of climatological data. In other words, is it windy and calm in the right proportions, are the temperatures about right, is the amount of extreme weather similar to the current climate, etc. etc.
Climate model runs to project future climate or model past climate are not done until the model is frozen.
The 1910-1944 warming period (Delworth + Knutson) showed linear warming of 0.53°C over the 35 years (a linear trend of 0.15°C per decade).
This report confirms that the mean of the climate models was 0.06°C per decade, or 0.21°C over the 35-year period.
In other words, Delworth and Knutson confirm that the actual warming was 2.5 times what the models projected (confirming what you stated in your post by comparing AR4 WG1 FAQ 8.1 Fig. 1).
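The arithmetic behind these numbers is easy to check (a sanity-check sketch, using the figures exactly as quoted):

```python
obs_total, decades = 0.53, 3.5        # observed: 0.53 deg C over 35 years
obs_rate = obs_total / decades        # observed trend, deg C per decade
model_rate = 0.06                     # model-mean trend, deg C per decade
model_total = model_rate * decades    # model warming over the 35 years
ratio = obs_total / model_total       # observed vs. modeled total warming

print(round(obs_rate, 2))     # -> 0.15
print(round(model_total, 2))  # -> 0.21
print(round(ratio, 1))        # -> 2.5
```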
To your last point:
Post 2000 the models projected warming at 0.2°C per decade
2001-2010 actual shows slight cooling of 0.07°C per decade
The above examples show that the models need verification and validation, and I hope this new thread will either provide this or clearly confirm what you have concluded, namely that the models are unable to project climate changes.
Here’s the link to the Delworth + Knutson study on Early 20th Century Warming:
@ Manaker: Thanks for backing my post and providing a good reference supporting the conclusion with respect to the models’ invalidity.
Two additional and important remarks I forgot in my initial post:
1) The observed warming trend in the 1912-1942 period (0.15°C per decade) is very similar to the one observed from 1972 to 2002 (0.17°C per decade). This is also a key issue, because none of the models is able to reproduce and explain these similar trends. This actually calls the models’ basic equations into question, since the mean CO2 concentration in the ’20s and ’30s (about 295 ppmv) was roughly 16% lower than the one observed in the ’80s and ’90s (about 350 ppmv). And as Delworth and Knutson noticed in their paper, “causes of the earlier warming are less clear because this period precedes the time of strongest increases in human-induced greenhouse gas (radiative) forcing”.
2) We can observe the same factor of about 2.5 between the CO2 concentration trend in the ’20s and ’30s (about +1.7% per decade) and the one observed in the ’80s and ’90s (about +4% per decade). This actually confirms that the models are mainly driven by GHG (and especially CO2) concentrations, and it explains the low warming trend the models reproduce for the 1912-1942 period. Since CO2 concentration is the main driver, the warming trend that the models reproduce is the direct translation of the CO2 concentration increase. But this “GHG-centred” approach is formally falsified by the comparison to observations, since:
– There is obviously no correlation between observed warming trend and CO2 concentration increase (same warming rate for different CO2 concentrations)
– This approach will always fail to explain cooling (1942 – 1972) while CO2 concentration is continuously growing!
My conclusions are therefore:
1) Climate is not mainly driven by CO2 concentration.
2) Climate is actually driven by many different parameters, CO2 concentration being only one of them, and probably not a leading one.
3) Subsequently, models that are mainly based on the assumption that GHG/CO2 concentration is the main driver fail to reproduce observed temperatures and are therefore formally invalidated.
4) As long as models focus on CO2 forcing and positive feedback factors, and under-estimate or even ignore other leading parameters or physical phenomena (sun, clouds, ocean oscillations like the AMO, PDO, or ENSO, negative feedbacks, etc.), their validity as well as their predictions will remain highly questionable.
Easterbrook has a thread on the entire AGU session on Software Engineering for Climate Modeling
Discussions regarding climate change appear to have moved far beyond the fundamental questions that good science demands be answered before proceeding, with assumptions treated as if they were proven facts.
One of the first principles I was taught was to question to the void. Unfortunately, due to the very large number of scientifically unskilled people involved in this debate, such as news media and politicians, wild unsubstantiated claims are accepted without careful study. Instead, intuitive logic and highly selected correlations are taken as scientific proof, which of course they are not.
By my last count there were 14 different variables that affect the temperature of the earth, of which CO2 is one. Until the effect of the other 13 variables is known, it is impossible to evaluate the impact of any CO2, man-made or natural. Many of these variables have only been observable since the beginning of satellite observations in 1979. Thus all data prior to 1979 regarding global temperatures are subject to a wide error margin, and an attempt to correlate global temperature with CO2, when the effect of most of the 14 variables could not be tracked, is total nonsense.
Two data sets which are continually referenced as “gold standards” and indisputable are the average global temperatures produced by NASA (GISS), the USHCN, and the Hadley CRU. In actual fact the raw data are distributed by the CRU, but they are processed by all three organizations using their own computer programs, and each employs its own selection methods regarding which data it considers worthy of inclusion in its analysis.
Therein lie several major sources of error. First, which meteorological stations were selected since 1880? The answer is that there has been no consistency in the number of stations or even the location of the stations, and there are often gaps in the data which are filled in by the ever-present computer.
Next, after the selection process, the data are “homogenized” by a computer program which no one is allowed to audit. Then, since the stations are land based, there is no data for the oceans (71% of the world’s surface) and many remote land masses, so a second computer program has to be employed to fill in all the missing data. Again, no auditing allowed.
GISS is constantly revising its computer programs and reprocessing data all the way back to 1880, so the global average graphs are getting steeper each year. This is in the public record; the GISS graphs from the past are readily available, but our news media and other special interest groups totally ignore them. So when the news media scream “hottest decade” or “hottest year”, one has to ask: by whose measurement, and how valid is it?
The temperature rise as calculated from the satellite data since 1979 is about half that published by GISS and others. Unfortunately, satellite data are rarely referenced; they just do not have the shock appeal.
The second data set is the CO2 record from the ice core proxy.
Again, this is published as “incontrovertible”; however, there are a number of other CO2 proxies that vary significantly from the ice core data, probably the most accurate being the leaf stomata proxy. A number of papers have been written concerning the ice core proxy method and why it will return low values, and yet the IPCC puts the blinders on and will not even consider other evidence.
With reference to returning to scientific fundamentals: in 22 years and after 4 assessment reports, not a single scientific paper written by an eminent physicist has been referenced which validates the IPCC hypothesis in terms of the laws of radiation physics as related to the CO2 molecule. But many have been written proving the IPCC wrong, and all have gone unchallenged in an intelligent and analytical way.
The mandate of the IPCC when it was formed in 1988 was to produce as much evidence as possible to show that increased temperatures produce unfavorable results; any reference to increasing temperatures producing favorable results was discarded. Secondly, they were tasked to show that CO2 was responsible for increasing temperatures and that human activities were responsible for the increase in CO2.
So the conclusions were formed before the examination which can hardly be termed a scientific evaluation. It is not too difficult to assemble the “solid scientific proof” by careful selection and editing when the conclusions have already been established.
So with respect to computer models, where do you start to gather the data in order to produce the 14 algorithms required? In particular, where is the pre-1979 data for input into the GCMs? Thus any correlation between calculated and observed temperature is purely coincidental or contrived. Since 1979, as stated, there is considerable variation between satellite data and GISS or others; again, between which data sets are these comparisons being made?
If instrumental observations cannot agree, how is it possible to construct a computer model that forecasts 100 years hence? Come down from your ivory towers, folks, and get involved in the real world.
To distinguish anthropogenic changes, natural changes have to be accurately modeled and their uncertainties quantified. Research by Syun-Ichi Akasofu supports the null hypothesis of a multi-decadal 50-60 year oscillation (peaking in 1940 and 2000) superimposed on a linear 0.5 deg C / 100 years temperature rise.
Syun-Ichi Akasofu, On the recovery from the Little Ice Age
Natural Science Vol.2 No.11, 2010
Statistically significant deviations from such null hypotheses need to be shown to quantify anthropogenic influences. (Previous IPCC reports do not include the PDO oscillations.)
Software engineering for climate model development is so far outside my realm of knowledge that I can’t ask any technical questions, so I’ll ask a more general one. To what extent do current deficiencies in the process lead to spread among different model outputs, and to what extent do they lead to systematic errors in the same direction? We already know that in some cases, the mean results from a multiplicity of models match observed data better than the best of individual models. Are we seeing an averaging out of errors?
I realize that software is not the only attribute that differs among models, but it is clearly an important determinant of performance.
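The "averaging out" intuition is easy to illustrate with a deterministic toy calculation (the true value and the model biases below are invented purely for illustration): if model errors are independent and roughly symmetric, the ensemble-mean error is much smaller than the typical single-model error. The caveat, which bears directly on the question above, is that shared systematic biases do not cancel, no matter how many models are averaged.

```python
truth = 0.6                                # hypothetical "true" quantity
biases = [-0.30, -0.10, 0.05, 0.15, 0.25]  # assumed independent model biases
models = [truth + b for b in biases]       # each model's answer

# Error of the ensemble mean vs. the average error of a single model
ensemble_error = abs(sum(models) / len(models) - truth)
mean_single_error = sum(abs(b) for b in biases) / len(biases)

print(round(ensemble_error, 2))     # -> 0.01
print(round(mean_single_error, 2))  # -> 0.17
```

If instead every bias had the same sign (a systematic error in the software or physics), the ensemble mean would inherit it in full.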
There is for example a very large spread in the actual global temperatures modelled by the various models. See here for example:
The spread is about 7 degrees (11-18 degrees centigrade), which means that the coldest-running models give results equal to fully glacial conditions, while the warmest have temperatures that have not occurred since the early Pliocene. This extreme spread and inability to simulate actual temperatures is not obvious in most illustrations of model runs, since the results are usually given as anomalies rather than absolute temperatures.
Steve dismisses the “Do Over” strategy because
1. it would cost too much
2. There are too few experts.
3. The science would be outdated.
Let’s look at that. Well, #3 is a problem right off the bat. If the answers we get today from GCMs are “good enough” to drive some (like me) to conclude that we should do something, then there is an honest question whether we need or would benefit from more exacting physics. On the other hand, if the models need to be improved, then it becomes difficult to say they form a solid basis for convincing people that we need to do something.
#2. too few experts. As Steve notes there are 25 modeling teams out there.
When we started 3D on the PC there were 30 chip companies building chips, and in the whole world about 30 architects who understood 3D. In the end we ended up with 3 companies doing the work. Given the LOC in a GCM, I’m not convinced there are not enough SMEs; there may rather be an unwillingness, or no motivation, to merge.
#1, cost. Steve puts it at 1000 man-years, say $150 million. I’m not sure if he ran an analysis (say, COCOMO) to estimate that, but 1000 man-years to do 1 million LOC seemed a bit unproductive. Nevertheless, since these people are already being paid, this isn’t an increase in cost; it’s just a redirection of how we spend money.
He also put in a cost of $200M for a new computing center. That’s an old trick those of us in the defense industry used to use to fight the rules about switching from Fortran to Ada: “Oh, if you are going to make me write my working code over again in Ada, and this time do it right, then I’ll just load the bill up with a brand new computer.” What this really is is just cover for the sociological desire to preserve the status quo. Because, you understand, the planet is not important enough to use the hardware that we already have, hardware that was good enough to give you answers for AR4. And the software that was good enough for AR4 cannot be rewritten from ground zero, because we are developing better physics right now that we can’t even specify will produce better answers. Maybe folks ought to think of GCM code as a national treasure and not as a tool to get through the next AR. You see, despite Steve saying that the customer is internal, there is another “customer” driving the process, and that customer is AR5.
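For what it’s worth, the classic basic-COCOMO formulas (standard published coefficients; the choice of mode is my assumption, though a tightly coupled, performance-constrained GCM would plausibly count as "embedded") put a 1 million LOC rewrite in roughly the range Steve quotes:

```python
def cocomo_basic_py(kloc, mode):
    """Basic COCOMO effort estimate, converted from person-months to person-years."""
    a, b = {"organic": (2.4, 1.05),
            "semi-detached": (3.0, 1.12),
            "embedded": (3.6, 1.20)}[mode]
    person_months = a * kloc ** b  # classic effort equation: E = a * KLOC^b
    return person_months / 12.0

# Effort estimates for a 1,000 KLOC (1M line) system under each mode
for mode in ("organic", "semi-detached", "embedded"):
    print(f"{mode:14s} {cocomo_basic_py(1000, mode):7.0f} person-years")
```

Embedded mode lands near 1,200 person-years, the same order as the 1000 man-year figure, so the estimate is not obviously padded even if the computing-center line item is.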
So the community may like the way it is currently working, but the way it is currently working is not necessarily winning credibility points with the technically knowledgeable people who will be impacted by the decisions the models inform.
The goal of the “new” AGU was to help members get the message OUT about climate change. The message coming IN is that you might have to change some of the ways you prefer to do things. Planet’s at stake, seems like a simple choice.
(Folks might get annoyed if you play the “planet’s at stake, save the grandchildren” card in every debate, but it should be evident how easily it is played. Object lesson over.)
As a lay person, my take would be along the lines of this:
1- Where did our money go? How could over $50 billion be spent while an endeavor which claims to demonstrate a worldwide climate crisis has written software that fails at industry-standard QA/QC? Those who are footing the bill, and those who are going to be impacted by this, have a moral right to know.
2- Why should anyone hold confidence in the work product if its quality is so poor? Why should the claims and conclusions be held credible? And especially, why should the policy demands of the people who wrote this stuff be taken seriously?
3- The cost of getting it right is not too high, given that questionable, possibly GIGO, results lead policy makers to support incredibly expensive solutions. If things are as bad as they are claimed to be, and the cause is CO2, then we do not really need yet another confirmation paper showing that kidney stones are going to increase in Dallas, or pythons are going to live in the wilds of Kansas, due to AGW.
Frankly as a lay person I find the reluctance to apply professional industry standards to the climate science software enterprise a huge ‘tell’ that something is not right.
Perhaps the AGU should reconsider turning its Mooney communication gun on the public. Maybe it would be more productive to turn it on the climate science community and explain to its members how dodgy work, lack of transparency, hubris, and arrogance are generally considered bad by mere mortals.
Lay people may not, as another poster pointed out some time ago, understand physics, but they do understand the smell of a manure-covered field. What I read here smells like a field covered in manure.
1- Where did our money go? How could over $50 billion be spent while an endeavor which claims to demonstrate a worldwide climate crisis has written software that fails at industry-standard QA/QC?
The simple answer is that it’s more expensive to write working software using a dysfunctional process. Assuming that the released code is computationally correct (which should be a fair assumption; some of the processes outlined by Easterbrook make it seem that they have a good handle on verification), it’s still within the realm of possibility that poor process/technique has made it take longer to get there: poor requirements management can mean that bugs are found later in the process, poor structure can make it harder to track down bugs when they’re found, and poor practices can make the code “brittle” (changes are approached with trepidation because the downstream effects are poorly understood). Better process looks more expensive on the project schedule because remediation time is almost never adequately budgeted for, but is cheaper in the long term.
2- Why should anyone hold confidence in the work product if its quality is so poor? Why should the claims and conclusions be held credible?
Oddly enough, the code could be a mess and still be functional and valid. What justifies confidence is the process of validation (in a nutshell, validating that the requirements match the purpose of the effort). Verification (ensuring that the code matches the requirements) is important, but doesn’t tell you whether the system being built is the “right” system or not – validation does. Obviously, good practices enhance the ability to validate the system.
In my experience, and from reading history, I seriously doubt your view on 1 can be correct. Garbage makes garbage.
You are implying that these systems are unvalidated. To me that rules out relying on them.
I wish that were the case. I’ve spent years cleaning up after systems that worked to spec but were constructed in a truly horrifying manner.
That being said, performing to spec isn’t enough – validation is required to confirm that the spec matches reality. Easterbrook argues for a validation against theory rather than observation (which I find more than a bit hard to swallow, but he’s not been back yet to defend that). I don’t have enough evidence one way or another at this point to say whether they’re validated or not. I do agree that unvalidated = unfit for purpose.
“The simple answer is that it’s more expensive to write working software using a dysfunctional process. ”
The thing that annoys me more than the money spent on climate models is that they are only working on one basic model. The one driven by co2. If the custodians of the supercomputers had been required to provide cpu time to a variety of models parameterised in accordance with a variety of climate theories, so we could compare results, I think there would be a lot less bitching about the amount of money spent.
Climate models aren’t designed to react to just CO2. They are fed many climate forcings – CO2, CH4, N2O, various other gases, solar changes, volcanoes, ozone changes, sulfate and other aerosols, land use change, and so on.
You are missing my point. A young discipline such as climatology needs competing groups with different climate theories. The faux consensus created by excluding alternative theories is to the detriment of knowledge advancement.
What “alternate” theories have been excluded? In other words, what *credible* theories are there that have no CO2 dependency at all?
Figure 1 on page 3 of this from Steve Easterbrook gives a good conceptual overview of the components of a GCM. In fairness, it’s not all CO2.
And who got away with the 50 Billion Bucks? That’s serious money in anybody’s pocket book. All we have is an unreadable 800-page document produced by amateur authors in their spare time. And a lot of crap computer code that may or may not be doing something remotely useful. But we have no way of knowing which.
The money went somewhere…where is it?
Nowhere near $50 billion has been spent on climate modeling.
That’s a standard “skeptic” meme. Right up there with Obama’s trip to India costing $200 million/day.
OK. You say $50 billion is too much.
How much has been spent? Where can the taxpayer see the audited accounts for the lower amount?
When George Dubya came to the UK, the cost of his entourage alone made $200 million look not entirely unreasonable: 15 Lockheed Galaxies bringing ‘stuff’, Heathrow shut for hours, zillions spent on security, etc. etc.
Check with your government.
Also, to check claims, Snopes is your friend. Ever hear of it?
Snark is not your friend. It only makes you look more reactionary.
And if Snopes has something on this, let’s see the link to it. I looked and nothing is there. Sort of like your argument.
You assert that it is less than $50 billion. To do so, you must know the correct figure. Please tell us all what it is.
Why don’t you prove the positive?
Prove the negative.
Take a look at
Thanks. I will study it with interest.
At first glance it seems highly relevant to the topic under discussion.
Let’s take a look at NCAR, as an example. Going by the numbers in
The total NCAR FY2009 budget was ~$170 million. NCAR does much more than climate modeling – solar studies, atmospheric chemistry, oceanographic studies, mesoscale work, computational support, and so on. Yet NCAR is one of the elite climate modeling centers in the world. Let’s be generous and say 10% of its entire budget goes for climate modeling – ~$20 million.
GFDL is another climate modeling center of international stature, and so is NASA GISS. They’re smaller than NCAR, though, so let’s say $10 million each per year. There’s a few other climate modeling groups (MIT and so on) and some university stuff too, but they’re all quite small. Let’s give them $2 million/year.
Add all that up – $20 + (2 * $10) + (5 * $2) = $50 million/year, very roughly, for the entire US climate modeling effort. Over 20 years, being extremely generous, $1 billion dollars. Interestingly, the US National Weather Service gets roughly $1 billion per year.
To put that into perspective, compare that to (say) NOAA’s or NASA’s total budget over those 20 years, or, even more amazing, the DoD budget over those 20 years. The money spent on climate modeling in the US is so far down in the noise it’s hardly even there.
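For what it’s worth, the back-of-the-envelope arithmetic above is easy to reproduce (the per-center figures are the rough guesses from this comment, not audited numbers):

```python
# Rough annual US climate-modeling spend, in millions of dollars.
# All figures are the guesses from the comment above, not audited data.
ncar = 20               # generous ~10% of NCAR's ~$170M FY2009 budget
gfdl_and_giss = 2 * 10  # GFDL and NASA GISS at ~$10M/year each
small_groups = 5 * 2    # ~5 smaller groups (MIT etc.) at ~$2M/year each

annual_millions = ncar + gfdl_and_giss + small_groups
print(annual_millions)       # 50, i.e. ~$50 million/year
print(annual_millions * 20)  # 1000, i.e. ~$1 billion over 20 years
```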
Let’s see some justification for the 10% number that you pluck from the air. Because below (with assistance from Steve M) we’ve come up with a figure of about 5 times your estimate. So maybe 50% would be a better guess…(we would then have a consensus…so that would be OK then, whether right or wrong :-) ).
And in the supercomputer environment, just buying and running the compute power needed doesn’t come cheap. There are plenty of indirect costs that need to be charged to the modelling budget, not just directly employed staff salaries and benefits.
The taxpayer has got 30 models.
From what I have read on this thread and elsewhere, none of them have been shown to have any useful predictive abilities at all, and the chief proponents (the jobbing builders) are actively opposed to having anyone even attempt to do verification and validation. As a common expression in the UK has it: ‘well, they would say that, wouldn’t they’.
I wonder how you arrive at the conclusion that these are ‘very good’ models. If I order a garage, that is what I expect. I do not expect a car port, however pretty the curlicues or however intellectually prowessed the brickie may have been.
Please explain why you think that any of the models deserve continued funding if they are no good as predictive tools. After 25 years the endeavour has not left the starting blocks.
the chief proponents (the jobbing builders) are actively opposed to having anyone even attempt to do verification and validation.
In fairness, I don’t think this is the case. Based on what I’ve read from Easterbrook’s work, they seem to have a good handle on the verification aspects. Validation is a little bit more complex: Easterbrook’s position is that the model should be validated against the theory which should be validated against observation. I would question that, based on the usage of the models as a tool to inform policy (it may be adequate for a purely investigative tool). The main objection to rigorous validation is the danger of it slowing down the science – again, an understandable gripe for an investigative tool, not one that’s used to help shape policy.
I wonder how you arrive at the conclusion that these are ‘very good’ models. If I order a garage, that is what I expect. I do not expect a car port, however pretty the curlicues or however intellectually prowessed the brickie may have been.
Bear in mind that these have been developed over decades and started out life as investigative tools. I don’t think it’s fair or accurate to infer bad faith on the part of those involved. To use your analogy, we (the public) started paying for a car port and then realized we wanted a garage – we can get there, but we can’t fault the builders for the difference.
Please explain why you think that any of the models deserve continued funding if they are no good as predictive tools. After 25 years the endeavour has not left the starting blocks.
I’ve posted a link a few times during this thread that shows the progression of the coupled models over the years. The number of modules has grown considerably over the years. I’m far from convinced that the models are capable of making these types of projections, but progress on the scientific front appears to be taking place. Progress on the other fronts needs to be made as well.
NCAR doesn’t spend half its budget on climate modeling alone. Take a look at the business plan at
It’s a bit dated, but by FY2008, they were estimating ~$8 million/year. My $20 million estimate was roughly 2X too big, like I said earlier.
OK. Now I have had a chance to study your link here are my observations:
1. It is a link to a document entitled ‘The US Global Research Program for 2010’, and it includes financial information as appendices.
2. The US Global Change Research Program has a budget of between $2 billion and $2.5 billion per annum. Over 25 years that doesn’t make a total spend of $50 billion seem entirely unreasonable.
3. This budget is about 300 times more than Koch Industries are alleged to spend on ‘denial’ each year.
4. It is not possible to separate out exactly which components of this expenditure are directly spent on climate models. But given their fundamental importance in the whole field, we should assume that a fair chunk of the money will go on them. Maybe $50 billion overall is an exaggeration, but somewhere over $10 billion does not seem unreasonable. Still a large chunk of change.
5. The US is not the only government funding such work. In the UK we have the Met Office, who take about $250 million per annum, and the CRU/UEA guys. Other countries no doubt do the same.
6. Though the exact sums mentioned in the threads above may have been slightly overstated, the order of magnitude of the spend on GCMs was correct.
It remains an entirely legitimate question to ask what the taxpayers got for their money (seemingly very little) and who got the $10 Billion + that has been spent over the last twenty odd years.
US climate modeling efforts have not cost anywhere near $500 million/year for the last 20-25 years. How much of the UKMO’s annual budget is spent just on climate modeling?
You sent me the link from which I derived my numbers. Please show me how to arrive at a number other than my estimate.
Just asserting that I am wrong doesn’t necessarily make it so. Show us a better way — or better numbers.
The floor is yours.
Since most climate model scientists are public servants it is flawed thinking to suppose that somehow they have got away with $$$[your own figure here]. They’ve each “got away” with a government salary and benefits.
Most of the money will have gone on satellites.
Easterbrook estimates 1000 man years for a big model (Mosher suggests that’s $150 million). The models have been developed over, say, 20 years, so that is 50 people on average employed to develop and maintain a million lines of code.
From the taxpayers perspective, they have paid a bunch of people a lot of money to do a job for them.
Imagine you pay a jobbing builder to build you a garage. You pay him the money but only get a car port instead. You could quite legitimately wonder what he did with all the rest of the dosh. As you did not get much of value.
Whether or not the individual workmen involved were complicit in the deal is irrelevant to the big picture…little of value was delivered. The end customer was sold a pig in a poke. The deal was flawed.
And using your own figures, I believe there are about 30 climate models at $150M each that makes a total spend of $4.5Billion. So maybe my estimate of $10B was a bit high, but still in the right ballpark.
What the “jobbing builder” got is important because you are implying that someone illegitimately got away with billions when in reality all the jobbing builders got was a normal salary.
Many of the 30 or so models use shared components, and Easterbrook’s numbers relate to one of the biggest and oldest of the models, so multiplying up is a wrong assumption.
But you have only invoked these 30 models to try and enhance the amount of money involved and thereby to support your argument that the tax payer has got very little. It’s got 30 very good models. If 30 is too many then make the scientific argument for cutting them.
As noted in this comment, the estimate for the operational costs of the new NCAR supercomputing center alone is about 25 million US$ per year. And we all know how accurate these estimates are not.
LANL: Climate, Ocean and Sea Ice Modeling ( COSIM )
PNNL: Aerosol Climate Initiative ( ACI )
ORNL: Climate Change Science Institute ( CCSI )
ORNL Partners: SCIDAC and Supercomputer
Steve E has some valid points. Software development on the cutting edge of science is a different beast than software development within established science. The normal specification process would have to be adapted.
Let me give you an example from my experience. The DOD wanted my company to build AI controlled threat aircraft for Man in the Loop simulation. Part of that requirement was easy to specify. Take formation flying. It’s easily described, easily specified, and easily validated. No brainer. The other part was very hard to specify. “the threat shall react realistically in engagements with human opponents” What the hell?
The spec is broad and there is no clear way to validate it. what does “realistic” mean? beat a human? tie a human? pass a turing test? WTF.
If you hold the bar too low you end up with a simplistic thing. Hold it too high and you get nothing. On the development side there was also no clear implementation path. It was not clear which approach (rule-based approach, neural net, etc.) would guarantee a working system. It was trial and error. Agile development. Had to be, because we were using software development AS PART OF THE DISCOVERY PROCESS. I imagine that Steve E would make the same case for cutting-edge climate science. Doing the software is the science. So there will be processes that cannot be approached in a waterfall type development. You gotta write the code first. And the only guy who can write the code is the scientist. He is not gunna write a DID before he starts slammin the keyboard. Aint gunna happen. And if you pair him with a software engineer, be prepared for major culture clash. (I’ve seen fist fights when I tried that brilliant idea.)
Regarding this: “The spec is broad and there is no clear way to validate it. what does “realistic” mean? beat a human? tie a human? pass a turing test? WTF.” Was the requirement ever refined for you beyond that? As stated, I fully agree that there’s no way to validate it…I’m not sure if that’s a blessing or a nightmare from a contract basis.
I fully understand writing proof of concept code to pick a viable design path. I also understand the need for Agile methods when dealing with emergent requirements. However, Agile doesn’t mean no requirements at all. At the end of the day, I’d think you would have to have some end goal in mind for production quality code – basically starting with an impossibly broad idea and narrowing down. It may not be a formal DOD spec that you would use for a waterfall type project, but it’s not totally freeform either.
Not trying to pick a fight…just trying to get my head around the situation.
I can certainly sympathise with the notion that IT people tend not to be creating new things; they tend to work in the exciting world of (e.g.) dental billing, where pulling names from a database is an already well-established mechanism. Nobody needs to rethink how to obtain a name from a database (and chances are most IT people couldn’t anyway). IT people aren’t really doing software in the same sense as what you refer to.
You’re not describing anything different than development for robotics, where you have to hack code* to make things occur at all, much less correctly, and this will tend to always be the case when dealing with new ways of positional sensing, new mechanical design, etc.
*hack code = “Software is the Discovery Process”
That said, the job of the software pros is to turn these hacks into “production quality” code which is correctly modularised, gets unit tested (simulators get created and coded!) and otherwise is retrofitted with the error handling scheme of choice, etc.
This lets the mathematicians and science boffins do their mad science thing and then turns their effort into that which is robust; that is, it’s free from errors and does what it’s supposed to do. If it’s supposed to calc area under the curve, then this is what the module does, and if the input is NFG then it says so, etc.
This as far as I know is fairly standard. I don’t know why the cutting edge of climate science is necessarily different than the cutting edge of industrial automation.
Since all accept that models are imperfect simplifications (by virtue of lack of computing power as well as lack of knowledge), how does one decide what to improve? And having decided, how does one go about improving it?
If your model doesn’t produce very good ENSO, it takes judgement to work out what elements in your model are at fault and, indeed, whether your model is complex enough to demonstrate ENSO. One doesn’t simply set up a waterfall process to build the ENSO subroutine.
Any changes you make to “improve” your ENSO are changes that apply to the rest of your model planet as well, so may make things worse elsewhere. If it makes things worse elsewhere then you have to make a scientific judgement as to whether the worse behaviour is more important than the better ENSO – and different scientists will disagree. You don’t withdraw the change *just* because there is a red flag on the test harness due to a bit too much dust from the Sahara (though you would if it leaked memory, failed coding standards, was unreproducible and unreadable etc. etc. etc.)
>> If your model doesn’t produce very good ENSO
As far as I know, no GCM produces or models the major cycles in ocean currents. They are regarded as “internal variability” and ignored.
Certainly that is the case for the Met. Office models. I have that directly from one of their team in reply to a question I sent.
Their mission is to explain climate change as the result of “external forcing” such as CO2.
I picked ENSO deliberately because it is a complex issue which is not simulated well and where we don’t know all its drivers. You should have asked them if they were doing research to try and improve the situation because they would have said yes.
Very typically, I think it’s SOP almost everywhere, there are production-grade versions and developmental versions. At some time the new science discovered using the developmental version is incorporated into the next production-grade version.
I’m certain that Steven Mosher is aware of this.
It’s not rocket science. We mere mortal engineers do it all the time. Even when SQA controlled software is concerned.
“In the end we ended up with 3 companies doing the work. ”
The large divergence between GCM programs and between GCM and nature indicates major need for improvement.
Climategate HarryReadMe exposes requirement to eliminate shoddy fraudulent code and develop professional open code.
Time to immediately move to 3 “companies”
Pick the best two.
At least build the third from scratch based on those two.
Thoroughly document verify and validate all three.
Make the software modular.
Then let existing teams group into specialties to build the best modules.
All Under strict professional software management.
Ensure they incorporate ALL known physics with steady improvement on each factor.
The Harry “README” file has nothing to do with GCMs.
Harry is a Climatologist at the Climatic Research Unit, University of East Anglia.
Absent any reason to believe otherwise, it seems sensible to assume that his standard of work is representative of the field of Climatology as a whole. Amateur and undisciplined is a charitable way to describe it.
After all, everyone is happy to rely on CRU data (HADCRUT) as one of the three basic temperature records of the entire world. And none from within the field of Climatology have yet criticised the HarryReadMe as being an especially bad case. Instead you all got uppity that it came to light at all…but not about its contents.
Take a look at, say, NCAR’s CESM.
Realize that it’s very strong evidence against your claim that “his standard of work is representative of the field of Climatology as a whole.”
Let’s put it this way, Latimer – if the CRU part of the HadCRU dataset is so abysmally bad because of the software, then how come NOAA’s dataset, NASA’s GIStemp, and UAH and RSS are all very similar? Can you explain?
Since a large part of CRU’s work seems to be applying arbitrary and unrecorded ‘adjustments’ to raw data, they could probably make it sing three verses of O Come All Ye Faithful and dance an Irish jig around the Xmas tree. When you ‘adjust’ something it miraculously gives the result you’d like. Big surprise!
Doesn’t mean that they do it in a professional manner though.
And see Judith’s comments below about an upcoming thread.
Talking out your hat, again.
Compare HadCRU to the other four. What do you see?
FWIW, Steve McIntyre’s view is that HadCRUT does not involve a lot of adjustments, and probably is actually very primitive. It doesn’t involve much more than averaging the datasets. In fact, a lot of the fuss about Jones 90 (part of ‘the Wang Affair’, if you remember) was that it was the excuse used for HadCRUT not making any corrections for UHI.
Evidently something has been done to it, but not even Phil Jones knows what any more, which is why he was sending round those letters begging for permission to publish his diddled-with data as “raw”. They’ve lost the records. (Due to lack of tape storage capacity in the 1980s(!).)
The Harry code is in fact producing a completely different product, called CRU TS2.10, that has little or nothing to do with the global temperature anomaly series – one reason why I was hurting myself laughing in the earlier debate when D64 clearly thought that it was, and that replication of GISTEMP somehow absolved its errors. Harry was trying to update it to produce CRU TS3.0, but not getting on very well. The previous version passed peer review when it was published.
I’ll suggest that you look at MIT’s code and their process. Further, since at least 2007 Gavin has been working to improve practices with ModelE. That code is online. If you started looking at it in 2007, as I did, and look at it again today, you can see that he is working diligently in the right directions.
The community is moving in the right direction. Those of us with software backgrounds would obviously like to see our values rule the day. Those with science backgrounds prioritize differently. It’s not so much a conflict as it is a prioritization of values. As the GCMs have more influence on policy, folks in the science community may have to rebalance the priorities. That is not a straightforward process. Raising false problems does not help the process.
That carrot is not believable. Scientists could be forgiven for thinking there’s nothing in play when the subset of “technically knowledgeable people” give a strong impression of never-ending auditors and never-swaying low opinions. In which case the chips are not all in one hand and the onus of persuasion is not all one-sided.
On the other hand, I really liked the sentiment…
I found how he came up with the 1000 man-year estimate, here on page 21 of his slide deck:
“My back-of-the-envelope calculation of the cumulative cost of a current fully-coupled earth system model is around $350 million – representing the work of around 50 scientists working over a 20 year period.”
He does note right after this that:
“A rebuild might be done with a much smaller team over just a few years, but will still require substantial computing facilities – still in the order of hundred million dollars.”
That, however, isn’t reflected on the slide itself.
Perhaps it’s just me, but an estimate for a rewrite that counts both the initial development and all support and enhancement since is likely to be a tad high.
The new computing facility is already under construction.
Cost estimate at April 2009.
Those supers will not be used just for climate modeling.
How does that conflict, if at all, with what I said?
So what else will they be doing?
To this Joe Sixpack, it looks exactly like climate modelling. You may have some internal terminological differences, but viewed from even a short distance outside academe, that seems to be its purpose.
I assume that the other uses aren’t Air Traffic Control, running Google, being the data centre for Wikileaks or anything else?
‘Climate modelling’ is good enough for me.
Some of the research will be solar modeling, mesoscale modeling, observational analyses, and whatever the University of Wyoming and other universities will want to do with it.
I’m sure the climate models forecast the current cold over Europe. Cooling is much worse than warming, especially as it impacts food supply. It also requires more energy to keep from freezing to death. I’ll take warming any day.
Perhaps there is another aspect to the problem with validation of climate models. Maybe the historical data is being misused to start with. Most of the historic instrumentation records were collected with instruments of no better than plus-or-minus one degree F accuracy, in installations with variable reliability at measuring true temperature to even that accuracy. That data is then folded, spindled, and mutilated. In simply plotting the difference between the raw USHCN daily temperature records for my own town, I see that the official GISS record differs by minus 2.4 degrees F to plus 2.9 degrees F at various times. In any given year, there is a cyclic two degree variation between the two records. Which is correct? I’ve checked the USHCN data file against the PDF images of the manually collected temperature data and there is an exact match. The GISS records are “homogenized.” Yet our town only recently managed to grow to a population of 12 thousand, and historical records of this area show no rational reason to expect random multiple-degree corrections.
The 1 to 1.5 degree F trends that the models are trying to match are not only below the accuracy level of the instrumentation but are less than a fourth of the range of “homogenization” adjustments, if my town is a reasonable example. The trends the models are being asked to model may mostly be smoothed, meaningless noise. Smoothing noise from a thousand or so sources still gives you smoothed noise. Because it matches your expectations does not assure that it is useful as a base for comparison.
The current GCM programs may actually be very good but might just be tuned to what is mostly just random noise. Anyway, I figured somebody should at least mention the possibility.
Data quality is one of the most important issues the climate science community as a whole has yet to address. Instead they hire Mooney to browbeat the public into submission.
Right on, Gary W. It is just random noise, and the pretence that the “global average” is known to a hundredth of a degree is total absolute nonsense, and it is going to cost the taxpayers hundreds of billions of dollars. Get back to first principles, you scientific eggheads, and prove your hypothesis.
In any data set your “precision” is limited by that of the lowest factor or component, whether additive or multiplicative. That is, you can’t get more significant digits out than in. Averaging a large number of imprecise numbers on the theory that the errors will cancel is very poor practice, unless you know clearly what is causing the errors and you can be assured those influences are not biased.
A quickie test for that is to watch whether corrections and adjustments are randomly +/- distributed. For GCMs and the climate record, that is a major FAIL.
Like pretty much everything else about them.
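The point made above – that averaging only cancels unbiased errors – can be demonstrated with a toy simulation (the numbers here are invented for illustration, not real station data):

```python
import random

random.seed(42)
n = 1000
true_temp = 15.0  # a hypothetical true mean temperature, degrees C

# Unbiased case: +/-1 degree noise centered on zero largely cancels.
unbiased = [true_temp + random.uniform(-1, 1) for _ in range(n)]

# Biased case: the same noise plus a systematic +0.5 degree shift does not.
biased = [true_temp + random.uniform(-1, 1) + 0.5 for _ in range(n)]

def mean(xs):
    return sum(xs) / len(xs)

print(round(mean(unbiased) - true_temp, 1))  # ~0.0: the noise cancels
print(round(mean(biased) - true_temp, 1))    # ~0.5: the bias survives averaging
```

No number of additional stations fixes the second case; only understanding the cause of the bias does.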
There ought to be for each model a clearly specified description of each component and how those components work together. Models ought to be able to be examined not from the code itself but from the specification.
Maybe that’s the case for some, but not for any I’ve seen.
“All sound and fury; signifying nothing” (Shakespeare). I am sorry. It does not matter what sort of computer model you are talking about, the only way to validate ANY model is by predicting the future a sufficient number of times that the outcome could not have happened by chance. I know I have said this before, but…..
Would that not be a case of ‘confirming the consequent’?
No, it would be validation of the model.
climate model predicts climate –> valid climate model
valid climate model –> climate model predicts climate
So, it could be that the climate model gets a prediction right by chance. I think that is why the poster postulated that a model predict climate more than once in order to decrease uncertainty.
What I meant was that there was a little more to it than just making predictions. You have to make predictions about the right things, you have to explore the full range of inputs you want to use it for, where they are inaccurate (and all models are inaccurate) the boundaries, reasons, and reasons why it doesn’t matter must be understood, and most importantly, you must have independent reason to believe that any model that was in fact wrong or faulty in some relevant sense would fail to predict what you’re asking it to predict.
I come across a lot of models that work most of the time, over most of the parameter space – they wouldn’t even submit it for testing if it failed the obvious stuff. The problems mostly occur with the exceptional values – big numbers, zeros, singularities, boundaries between different cases, missing inputs, outliers, weird error distributions, unstated assumptions that are almost always true, numerical instability, etc.
The basic problem is with that expression “could not have happened by chance”, because it assumes that you know what could happen by chance. Remember, this is the output of a program that was designed to answer this question, and has already passed some basic tests. It’s got a bit of a head start over pure uniform random numbers. The distribution of outputs you can expect from buggy code is not necessarily anything straightforward.
Correct code gives a correct answer. Checking that you got the correct answer is (while still necessary) a case of confirming the consequent. What you need to know is that buggy code will reliably give a wrong answer. In that circumstance, and that circumstance alone, getting a correct answer will prove that the code is not buggy, and hence is correct.
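The “exceptional values” failure mode is easy to illustrate with a deliberately buggy routine (the function and its bug are invented for this sketch): it passes over most of the parameter space, and only a test aimed at the boundary catches it.

```python
import math

def relative_change(old, new):
    """Intentionally buggy: fine for typical inputs, breaks when old == 0."""
    return (new - old) / old

# Typical values pass, so casual testing says the routine is correct...
assert math.isclose(relative_change(10.0, 11.0), 0.1)
assert math.isclose(relative_change(-4.0, -2.0), -0.5)

# ...but the boundary case blows up, and only a test that probes the
# exceptional value (old == 0) ever finds it.
try:
    relative_change(0.0, 1.0)
    print("no error at the boundary")
except ZeroDivisionError:
    print("fails at old == 0")
```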
NIV – Are you of the opinion that climate is either chaotic or spatial-temporal chaotic ? (Or possibly some other brand of a chaotic system?)
Difficult to say. Weather is definitely chaotic. I am reasonably sure that chaotic behaviours apply over much longer timescales as well (decade-century). There is no sharp delineation between “weather” and “climate”. (The 30 year threshold appears to be arbitrary – I’ve sometimes asked AGWers how the number is – or should be – calculated, but not yet got an answer.) But whether the chaos eventually dies out when considered over a long enough period… I don’t know. Either answer seems possible.
It depends as well on what is meant by the word “climate”. It is in some sense the statistical distribution of the weather, but whether that means the distribution per se or a sample distribution taken over some finite interval of time isn’t always clear. You can (very likely) only get a chaotic climate in the latter sense.
Looking at the Milankovitch cycles chart, it kind of appears to be a chaotic system superimposed on the M. cycle, but that isn’t the point I’m getting to. If climate is chaotic longer term (define as you wish), then we couldn’t expect even a valid model to make predictions or forecasts. It could only tell us the general behavior of the system. Is that the way you see it?
No general circulation model will tell us anything more than “the general behaviour of the system” beyond a threshold of a few weeks.
Climate, however, is defined as the statistical distribution of the weather, and that might or might not be determinable. There are plenty of chaotic systems for which the distribution can be easily found.
However, it is also possible for long term behaviour – the distribution observed over a large block of time – to show sensitive dependence on initial conditions. Accidents of weather may tip the distribution into one regime or another, or may speed up or delay a climate cycle, and these differences may accumulate and be magnified, or slower movements in ocean currents may show their own turbulent, chaotic behaviour on their own time scale.
If this were the case, then even the general behaviour of the system couldn’t be predicted, although a correct model would tell us that it wasn’t predictable, and the sorts of general behaviour that might arise.
Don’t mistake apparently random, irregular behaviour for chaos. Chaos is a specific technical mathematical property, and not the only way to get complexity.
Climate models are only required to tell us the general behaviour – the statistical distribution – so the weather being chaotic doesn’t imply that climate models can’t predict changes in the climate. But yes, it’s quite possible that even a correct climate model couldn’t make predictions if the climate is chaotic.
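The distinction drawn above (unpredictable trajectories, potentially stable statistics) can be sketched with the logistic map, a standard toy chaotic system; the parameter values and tolerances here are purely illustrative.

```python
# Toy chaotic system (logistic map): trajectories started from nearly
# identical initial conditions diverge quickly (the "weather" is
# unpredictable), yet the long-run statistics (the "climate") are
# essentially unchanged.

def logistic_series(x0, r=3.9, n=100_000, burn=1_000):
    x, out = x0, []
    for i in range(n + burn):
        x = r * x * (1.0 - x)
        if i >= burn:
            out.append(x)
    return out

a = logistic_series(0.400000)
b = logistic_series(0.400001)   # perturbed by one part in a million

mean_a = sum(a) / len(a)
mean_b = sum(b) / len(b)
# Individual states decorrelate after a few dozen steps, but the means
# of the two runs agree closely.
print(abs(mean_a - mean_b))     # small, despite fully divergent trajectories
```

Whether the real climate system behaves like this simple case, or instead shows sensitive dependence in its distribution as well, is exactly the open question in the exchange above.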
Depends – if you have an education in liberal arts, or statistics.:)
Suppose someone takes a set of actual data, some bits of theory, and some hypotheses about variables, and forms a series of adjusted calculations that closely replicate what has happened across those data sets. That would be a successful model, but only within the realm of the data set. It does not prove the hypotheses to be correct, nor does it imply that the future outcomes from the model will be.
I believe I heard just the opposite in an interview with a modeler about 5 years ago on the CBC.
Need to get the statisticians involved in a big way to at least tell us what the probabilities & correlations are. That might not be a practical task.
Seems it would be better to have only a few models in the world and use the money saved to up the game of the modelers. Use it to pay statisticians, professional project managers, professional programmers, and others to up the quality of the model and the code. Also, make all model code and all model runs public. (Not just the code and runs of some models.)
Here are rough page counts for the theoretical basis of three engineering codes.
Theory Manual Code 1: 1220 pages
Theory Manual Code 2: 1064 pages
Theory Manual Code 3: 708 pages
This material is not in the public domain, so I cannot provide more information, and the manuals are not online. Maybe someone knows where such material might be available in the public domain. The source code, for only these theoretical aspects, runs to a few hundred thousand LOC; generally less than 500,000.
These counts are for the manual that describes the development of the continuous equations, the discrete approximations to these, and the numerical solution methods applied to the discrete approximations; typically labeled Volume 1 for this type of model / code. The models differ somewhat in their application areas, but there are many overlaps between them. Code 2 has more limited, more focused application areas than Code 1. The last code has not yet been the subject of applications by a wide audience of users. The oldest, Code 1, has been in development and application service for roughly 45 years.
These manuals are taken to be the specifications for only the theoretical aspects noted above. There are typically 3 to 5 additional manual volumes that cover other aspects of modern engineering and scientific software: a user guidelines manual, code structure and coding, verification and some validation, and validation for each specific application area and application system. There are numerous additional reports and papers behind the condensation into the code manuals, plus an enormous number of internal memos and reports that form the basis for selection, development, testing, and coding of the current-production models and methods.
All of this theoretical-basis material, and all other specification documents, has been subjected to independent peer review by organizations that are completely external to the organizations that developed the models and methods. The availability of theoretical details to this degree, in a single authoritative source, provides a significant advantage: whenever potential problems are identified by users, it is possible to know exactly what is in the code. Equally important, as the models and codes evolve, the developers know exactly where and how new theoretical pieces fit into the grand scheme of things.
Additional information about typical specification documents for these types of engineering codes is here.
Surely the public domain issue is not an obstacle for a science enterprise as heavily funded as climate and the computer models that support its claims?
I will bet good money not only that these manuals, or their equivalents, could be purchased and implemented within the budgets the modelers have, but also that decision makers have chosen not to be bothered to do so.
There is an enormous difference between engineering models, and climate models. Engineering models must be validated. The output of these models, when used, is supported by the signature of Professional Engineer. If the numbers are wrong, that PE can be sued.
But both climate and engineering models apply physics to make estimates of how systems will perform.
The difference you are alluding to, if I am following what you are saying, is a cultural difference.
The Royal Society put out a position paper entitled, “Climate change, a summary of the science, September, 2010”. It states:
“Attribution of climate change
37 The size and sustained nature of the observed global-average surface warming on decadal and longer timescales greatly exceeds the internal climate variability simulated by the complex climate models. Unless this variability has been grossly underestimated, the observed climate change must result from natural and/or human-induced climate forcing.
38 When only natural climate forcings are put into climate models, the models are incapable of reproducing the size of the observed increase in global-average surface temperatures over the past 50 years. However, when the models include estimates of forcings resulting from human activity, they can reproduce the increase.”
Now the “generally accepted” global average surface warming over the last 100 years is about 0.7C. There are many issues with the quality of the data and the manipulation/fabrication of the data. See:
Dr. Curry, do you have the confidence in both the surface temperature records and the ability of the climate models to simulate past temperatures with sufficient accuracy that climate change can be attributed to man’s influence with any degree of confidence?
See my series on detection and attribution, parts I, II, III
I really need to do something about a roadmap for this site, it took even me awhile to find these posts.
Judy and/or Dan Hughes – Forgive me for repeating here part of a question I asked above that may have been overlooked in the flood of subsequent comments. I address it to you as probably the two individuals qualified to answer in some detail, and because it is relevant to the thread. Steve Easterbrook could also comment if he happens to come across this thread. The question relates only to software and not to other elements of modeling with their own uncertainties:
” To what extent do current deficiencies in the process lead to spread among different model outputs, and to what extent do they lead to systematic errors in the same direction? We already know that in some cases, the mean results from a multiplicity of models match observed data better than the best of individual models. Are we seeing an averaging out of errors?”
There is spread not only in the outputs, but in the assumed inputs (e.g., aerosols for cooling), so the reason for output spread is a little indeterminate. There are also common assumptions in all the models, which would produce a bias such that the mean of models is offset from reality. The assertion that you can “average” models and that this represents a canceling of errors is an unproven assumption.
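The two effects described here (independent errors that averaging can cancel, versus a shared assumption that biases every model the same way) can be sketched with a toy Monte Carlo; all of the numbers below are made up for illustration.

```python
# Toy ensemble (made-up numbers): averaging many models cancels their
# independent errors, but a bias shared by every model survives the average.
import random

random.seed(0)
truth = 1.0
shared_bias = 0.2   # an assumption common to all models in this sketch

models = [truth + shared_bias + random.gauss(0.0, 0.5) for _ in range(400)]

ensemble_error = abs(sum(models) / len(models) - truth)
typical_error = sum(abs(m - truth) for m in models) / len(models)

print(ensemble_error)  # roughly the size of the shared bias (~0.2)
print(typical_error)   # larger: independent noise dominates single models
```

So an ensemble mean beating individual models is consistent with error canceling, but it says nothing about any offset the models have in common, which is the unproven part of the assumption.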
It is not entirely an assumption. There is empirical evidence for both error canceling and systematic bias – see, for example:
Because of the thread topic, my own question related primarily to the software component, but of course, inputs and parametrizations are also a source of uncertainty.
If the above link doesn’t work, here is another try:
Actually 0.7 degrees over a hundred years is not an unusual figure at all. It has been exceeded several times just during the Holocene, not to mention earlier intervals, so obviously the internal variability of the climate system has been underestimated.
Before all of the intricacies and details of models are worked out, I have a question for you. Since the output of modeling will have to be validated against observations, are you satisfied with the overall quality of land and sea temps? Seems to me that this should be nailed down first.
Bob, talking about “the overall quality of land and sea temps” says: “Seems to me that this should be nailed down first”.
Indeed it should.
In the latter part of my career in the mining industry I performed a number of third-party reviews of assay data sets – which are closely analogous to surface temperature data sets – to make sure that they met the minimum quality control criteria imposed by regulatory agencies. (The assay results couldn’t be reported to the public if they didn’t). These criteria are designed basically to prevent the ruination of yet more widows and orphans in mining scams, but they actually aren’t all that strict (and haven’t always worked, as survivors of the Bre-X Gold fiasco will attest.)
Yet the surface air and sea surface temperature time series the IPCC uses to support its conclusions on AGW – which if wrong could ruin far more people than just a few widows and orphans – don’t come remotely close to meeting them. I certainly wouldn’t dare sign off on a set of assay values that had been “corrected” the way the IPCC’s temperature data have. More than just my professional reputation might be at stake if I did.
In addition, many of the “corrections” applied to the raw temperature data are so large as to raise serious questions as to whether the time series the climate models are being “validated” against are themselves valid. I have done enough work to satisfy myself that they aren’t, but a full discussion of results will have to wait until we get a dedicated post on this issue, which hopefully will happen soon.
Judith, perhaps a dedicated thread on temp quality is warranted. Five sources of global temperature data exist. Three are anomaly estimates of surface temperature (NASA GISS, HadCRU, and NCDC). The remaining two are anomaly estimates of lower-troposphere temperature (RSS and UAH). Since they are anomalies, they are all estimated relative to a baseline, and the baselines have different start times. Numerous questions about data quality have been published, and the blogosphere is lit up with doubts about several, especially HadCRU.
It may be worthwhile entertaining a detailed discussion at Climate, Etc.
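Since the five series are anomalies against different base periods, they differ by constant offsets before anything else is compared. A minimal sketch of putting a series on a common baseline (the years and values below are invented for illustration):

```python
# Sketch: anomaly series from different base periods differ by a constant
# offset; subtracting each series' mean over a common period aligns them.
# The years and values below are invented, not real dataset values.

def to_common_baseline(years, anoms, base_start, base_end):
    base = [a for y, a in zip(years, anoms) if base_start <= y <= base_end]
    offset = sum(base) / len(base)
    return [round(a - offset, 3) for a in anoms]

years = list(range(1979, 1989))
series = [0.10, 0.05, 0.20, 0.15, 0.30, 0.25, 0.35, 0.40, 0.45, 0.50]

# Express the data relative to a 1979-1983 baseline:
adjusted = to_common_baseline(years, series, 1979, 1983)
# The adjusted series now averages zero over 1979-1983.
```

Any serious inter-comparison of the five datasets has to start with this kind of re-baselining, or the constant offsets masquerade as disagreements.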
I recommend reading Tamino’s analysis of these five datasets as a start.
On the other hand, I’d still like to see the UAH group’s code.
He knows his ‘analysis’ is so unworthy he is not willing to put his own name to it. And, that being the case, no sensible person will bother to consider it.
I concur Richard. Tamino is viewed as the “statistical mascot ” of the Hockey Team. You would get a much more robust discussion here.
Are you filling in for Watts? Galileo used pseudonyms in his writings; I guess that makes Galileo’s work suspect.
In any case, the data is out there and the analysis Tamino performed isn’t difficult; do it yourself and verify his findings. Go for it!
Not a pseudonym. It is just that the discussion is trying to move past the Taminos of the world. He is a smart guy, but you have to admit that at least some of his analyses have been biased. Climate science needs a fresh, objective look at the data on a go-forward basis.
So, go ahead and take a “fresh, objective look” at Tamino’s data and prove him wrong.
Derecho64, look at this link if you think Tamino is an honest broker. http://bobtisdale.blogspot.com/
Tisdale may want to look at Hansen’s latest:
I made a point to ask why Tamino shied away from serious debate with Tisdale. Diverting your response to a Hansen analysis is pure obfuscation.
Hansen explains why the GISS analysis doesn’t use SSTs in certain polar areas – and his explanation makes sense. Tisdale’s claim that Hansen “deletes” SSTs doesn’t really hold up in a scientific sense.
Derecho64, there was no reply button to your response. Fine, but why couldn’t Tamino say what you did and engage Tisdale? Why are you his proxy?
I’ve posted Hansen’s comments over at Tisdale’s blog, we’ll see what he says.
So if Hansen is now presenting a data set of SAT he should say so and not present it as SST. That does not “make scientific sense” it is bad science.
It is the whole 1200 km frig that needs putting into question. There are no longer any real temperatures for Bolivia in the data. They are “estimated” from readings in adjacent countries! Bolivia is on a high plateau; the other countries are not. The result of this change was a warming of the whole of Bolivia by over 2C.
Similar dubious estimates are done for the Arctic where supposedly “interpolated” data show more warming than the surrounding data they are drawn from.
These FICTITIOUS temperatures over large areas are used to offset the real cooling that real climate is experiencing and perpetuate the “hottest year ever” mantra.
Galileo used pseudonyms because his work was regarded as heretical, a charge that risked him being burnt alive.
Tamino is neither of the same calibre nor in the same situation.
Your comment , as so often, is specious and a deliberate distraction.
He knows his ‘analysis’ is so unworthy he is not willing to put his own name to it.
His name is hardly a secret. And unless you’re prepared to show us where he’s wrong why should any sensible person bother to consider your opinion?
My opinion is not relevant but his is.
He did the analysis and thinks it is of such little merit that he will not put it to publication and he will not put his own name to it.
When he – who did the analysis – thinks it is worth so very little then there is no reason for anybody else to waste their time considering it.
Well, if you want to dismiss any argument in the form of a blog post (i.e. not put to publication) then fine – that’s virtually anything ever posted at WUWT out of the window.
We can also ignore anyone who either blogs or comments under a pseudonym if you like, which would exclude many of the posters here (on either side).
That doesn’t leave an awful lot though.
Your response is typical of AGW-supporter ‘arguments’. It posits a ‘straw man’, distorts the case, and obfuscates.
The issue is NOT whether ‘Tamino’ posts on a blog. However, it is worth noting that the “it is not peer reviewed” mantra is used by AGW-supporters whenever that avoids answering an argument.
The issue is that ‘Tamino’ knows what he writes is so untrustworthy that he is not willing to put his own name to those writings. So, his refusal to publish under his own name is a declaration by him that his posts are rubbish.
Many people – including me – need a good reason to trawl through rubbish that others have thrown out.
I second this recommendation. If Tamino wasn’t such an ass (I can say this because he’s anonymous) he would make a welcome voice in a reasoned debate about the surface temperature. It’s gotten to the point where he will block any comment I make on his blog (like “good work!”), and he does this despite the fact that I recommend his analysis to readers on WUWT. So go figure: Anthony allows through a comment where I recommend Tamino, but Tamino doesn’t allow through comments on his blog where I praise his work. Why? You know the answer. You can wear a hat that says “I believe in AGW”, but you cannot wear a hat that says “Mann and Jones were wrong”. That sort of personal view is not allowed, even though it is not about the science of AGW.
This is definitely on my list, but I have been waiting on a few things before tackling this
And don’t forget Judith we need to address the fundamental point which is whether the concept of an average global temperature (as currently calculated) actually has any scientific merit.
Good point. I will be fascinated to see what is said, since I’ve long wondered ‘WTF’ such a figure is supposed to tell us about anything of any practical use.
Sea ice doesn’t melt because of the global average temperature, but because of the local water temperature. Animals don’t breed according to it. Plants don’t grow according to it (unless they are very special ones who can indulge in ‘teleconnections’ (ROTFL HMFS) and will subsequently be chopped up by palaeoclimatologists, if not in safehouse retreats (ROTFL HMFSA)).
I can think of absolutely nothing that is driven by this figure. So why the obsession with it?
do you believe in a LIA?
What does that mean?
What does it mean to say that the average global temperature is going up or going down? It does have a meaning, but the meaning is operational.
Think of it this way. When we say that the world will be 2C warmer in 100 years what do we mean by that? What do we mean when we say that the global average was colder in the LIA, or colder in ice ages. We clearly say this sort of thing. And we clearly understand each other when we say it, but what do we mean, operationally.
No skeptic, for example, argues that the LIA did not exist on the grounds that the concept of a global average is meaningless. So what do we mean when we say there was an LIA?
I thought it was clear. He means that the only temperature that matters in a practical sense is the temperature where you live.
For somebody standing on the frozen Thames in the middle of the LIA, it doesn’t matter one whit whether the mid Pacific is warmer or cooler than usual. The only thing that matters to them is the temperature in London.
Similarly for anybody living anywhere else. And as you know, the spread and variability of local temperatures at any given place acts very differently to the global average.
If it gets much colder we won’t need to go back to the Little Ice Age to try the experiment for real.
The Thames at Hampton Court is very very cold right now.
But NiV got it exactly right. What bits of physical reality are actually affected by the nebulous concept of the global average temperature.
If today’s GAV is 0.1C greater than yesterday’s, how would I, as an organism located in a small geographical area, a) know that it had changed, and b) why would I care?
This is where historical records are valuable. For example, the records kept by the gardeners in the Forbidden City in China: they kept notes on when different species flowered in spring, etc. When those kinds of records are pieced together from around the world, we can get an idea of global climatic conditions at the time.
It’s fairly well established that the period from 1000AD to 1200AD saw a wave of warmth pass across the world. We don’t have exact magnitudes, but we do have clues left to be deciphered by archaeologists, paleobotanists and archivists, which can be coupled with knowledge about the range of average contemporary temperatures across regions. So, for example, evidence of grapes being grown at Hadrian’s Wall by Roman legionaries suggests that the English/Scottish border was at least 3C warmer than today, and because temperature falls on average 1C per two hundred miles of northward travel, we can reconstruct temperatures in Spain at the same epoch. Imperfect, but useful for approximations.
The LIA was characterised by frozen rivers in northern Europe. Similarly, we can get clues about temperatures further south.
Careful TB. If you follow that line of argument you may come close to denying that today’s temperatures are unprecedented. Or even that it has been warmer before the start of the Industrial Revolution and the release of the Demon Gas.
Since this is clearly not possible, there must be something wrong with all those old records. I guess that if you examine them closely then you will probably see a small watermark saying ‘Copyright Big Oil Denial Machine 2005’ on most of them. And the ones not so marked are just forgeries put about by Creationists.
Do not let these wily Satanic manifestations lead you from the True Path.
When sports drug administrators announce that an unnamed participant is under investigation, it casts aspersions on all team members until the investigated player is named.
Greg Craven has cast aspersions on all paleoclimatologists. Until and unless these (4, possibly more, according to Craven) climatologists are named, I have no option but to suspect that any and all research conducted by paleoclimatologists may be a product of advocacy and/or confirmation bias. Either way, their research is useless for policy considerations.
Please can the survival retreat be a very long way away and remain undisturbed and undiscovered for hundreds of years. That way we may never have to hear from these guys again. Result!
Above, I listed a link to one article in a series on the theme of model development and evaluation. The entire series is at
As a guide to the entire content, the introductory article by Mat Collins is worth reading for a sense of both the capabilities of simple and more complex models (GCMs) and the current and future limitations of models that deserve to be addressed.
A number of items mentioned in some of the comments are addressed in the various articles.
For all you scientists who want to step up to the plate and condemn the “C” in CAGW, here is your chance.
Climate scientists say Earth’s climate also is changing thanks to man-made global warming, bringing extreme weather, such as heat waves and flooding.
The excessive amount of extreme weather that dominated 2010 is a classic sign of man-made global warming that climate scientists have long warned about. They calculate that the killer Russian heat wave – setting a national record of 111 degrees – would happen once every 100,000 years without global warming.
“The Earth strikes back in cahoots with bad human decision-making,” said a weary Debarati Guha Sapir, director for the World Health Organization’s Centre for Research on the Epidemiology of Disasters. “It’s almost as if the policies, the government policies and development policies, are helping the Earth strike back instead of protecting from it. We’ve created conditions where the slightest thing the Earth does is really going to have a disproportionate impact.”
“The extremes are changed in an extreme fashion,” said Greg Holland, director of the earth system laboratory at the National Center for Atmospheric Research.
For example, even though it sounds counterintuitive, global warming likely played a bit of a role in “Snowmageddon” earlier this year, Holland said. That’s because with a warmer climate, there’s more moisture in the air, which makes storms including blizzards, more intense, he said.
“These (weather) events would not have happened without global warming,” said Kevin Trenberth, chief of climate analysis for the National Center for Atmospheric Research in Boulder, Colo.
I see a lot of comparisons between modern temperatures and historical points of reference, for example how much warmer now than the 50s, 30s, 1800s, etc. How well do the models predict, for instance in the absence of increased CO2 from, say, the common 1800s point forward, the trending that would have taken place? Can they tell us anything, for example (since we were recovering from the Little Ice Age), as to when that warming would have peaked, levelled and then moved to further warming or cooling, with any accuracy?
I ask with some dubiousness as to precision, yes, but it would still be interesting to know how far we’ve come in our modelling. Graphically, when I see the anomaly plots, we often get them over a straight line representing a given date; surely the models aren’t saying the temperatures should or would have been fixed suddenly at the start date of mass production of anthropogenic CO2, or for that matter at any other chosen reference point.
At least a few models have been used to compare the effect of different forcings on the climate system. A good paper is
Meehl, G.A., W.M. Washington, C.M. Ammann, J.M. Arblaster, T.M.L. Wigley and C. Tebaldi, 2004: Combinations of Natural and Anthropogenic Forcings in Twentieth-Century Climate. J. Climate, 17, 3721-3727.
I have been asking the same question;
What kind of warming/cooling do the models produce for the 20th century when the anthropogenic factors are excluded? How much natural variability do they predict?
I would also like to see what warming/cooling the models estimate over a 1000 year period. Matching a rising trend over 100 years (excluding the bump they do not show) just is not very convincing. Even the ability to reproduce the 19th century would be more convincing, as evidence that the models react to parameters other than CO2.
In Hype Versus Reality on Indian Climate Change, Willie Soon and Selvaraj Kandaswamy raise a number of issues where projections of global climate models differ substantially from historic variations:
Coldest temperatures in England since 1639.
Cooling trends in northwestern & south India.
Himalayan glaciers reached their maximum extent about 260 years ago, during the 500-year Little Ice Age, and have been retreating since.
India’s 20 years of tide gauge records show a 1.3 mm/year rise, not the 4 mm/yr predicted.
At least four 1000 to 1800 yr periods showed 1-3 m higher sea levels than present.
The Middle Holocene was warmer than today. etc.
It appears GCMs need more verification & validation to improve predictions!
Steve McIntyre is exploring GISS Model E Data
In Lumpy vs Model E
Lucia shows a simple natural forcing with lag works remarkably well vs GISSE.
In Model Charged with Excessive Use of Forcing (http://wattsupwiththat.com/2010/12/19/model-charged-with-excessive-use-of-forcing/) Willis Eschenbach explores the forcings in GISSE. He finds very strange linear forcings.
He cites Kiehl, GRL 2007 “there are no established standard datasets for ozone, aerosols or natural forcing factors” as a cause. Furthermore: “So in addition to the dozens of parameters that they can tune in the climate models, the GISS folks and the other modelers got to make up some of their own forcings out of the whole cloth …”
Such results from simple checking raise serious questions as to the usefulness of current climate models. How seriously have they been verified or validated?
If such simple testing exposes such practices, what would result from a thorough evaluation and testing by an aggressive Red Team?
It looks like we need at least half of climate modeling funds to go into starting over from scratch with explicit documentation, transparency and thorough verification and validation.
The other half of modeling funds need to go to thorough testing by an antagonistic Red Team of experts in physics and statistics.
Let’s get hard results on which to base trillion-dollar policy decisions, not sophomoric alarmism.
Actually, modelers don’t get to pick forcings willy-nilly. For CMIP5, see
If a climate modeling group picks different forcings, they’re going to have lots of explaining to do.
‘If a climate modelling group ……they’re going to have lots of explaining to do’
A skill for which they are singularly ill-prepared IMO.
When they admit that even they no longer know what their models do, and do not seem at all concerned about this fact, one must conclude that explaining things comes well down their list of priorities.
They spend more time arguing about why actually testing their models against reality would be a waste of time than it would take to have done so.
Your “IMO” should be “IMVHO”, since AFAICT, your opinions are backed more by personal biases and bigotry than they are by facts.
Simple answer to that one. Present the facts as you see them.
Endlessly saying no more than “you don’t know what you’re talking about… believe me, I do”, while never presenting any arguments contrary to the ones I present, just makes you look ever more juvenile.
The floor is yours.
You’d have to read the literature to understand the choices a climate modeling group makes as regards forcings. Pick a modeling group and read the papers it produces.
It is good to see these beginnings for standard data sets to inter-compare GCMs. I am more concerned over what is NOT included. See my post below on the systemic bias of omission.
There is a glimmer of recognizing that there may be a major natural change in solar cycles 23 to 24. See: Recommendations for CMIP5 solar forcing data
Contrast the trends reviewed and predictions made by Archibald. The
Under cloud forcings:
At least the Met Office is now asking:
In a quick look, I did not see any reference to “cosmic” or “neutron”.
Perhaps you could point to where neutron count data or corresponding solar parameters (F10.7 etc) are accounted for to model their impact on clouds. See the parameters tracked by Archibald in making his accurate predictions from 2006 to this year.
Actually, with solar forcing in AR4 they had a choice. Only a few used a cyclical 11-year pattern going into the future, while the majority took a straight-line approach and extrapolated from the current high. In one case the modelling group appear to have EFFED up their solar forcing for the future, but their runs got included anyway. FWIW..
And for AR5 it actually helps if you press down and look at the actual data
( I’ve been looking at the HYDE land use data for some time). It’s not safe to assume that readers here are not aware of some of the details, for solar you also have this interesting note:
3. What to prescribe in the future?
Repeat the last cycle (cycle 23), with values from 1996 to 2008 inclusive mapping to 2009-2021, 2022-2034 etc. Please note that cycle 23 starts in 1996.4 and ends in 2008.6!!!
NEW: There have been some concerns that cycle 23 was unusually long and repeating this special cycle would give out of phase behavior of a normal 11-year solar cycle around 2050. Cycle 23 is actually only 12.2 years long not 13 years since it goes from 1996.4 to 2008.6. In Lean and Rind (2009, GRL, doi:2009GL038932) the irradiance was projected forward by just repeating cycle 23. Since it is unknown what the sun will do, there is going to be a lot of uncertainty for future solar irradiance projections. Also the two prior cycles (21 and 22) have been shorter than average – the official times of minima are 1976.5, 1986.8, 1996.4 and now 2008.6 so cycle 21 was only 10.3 years and cycle 22 was 9.6 years – which are not 11 years either! Cycles 21 and 22 have been some of the highest and shortest on record and its quite possible that cycle 23 may be more representative of the future – but of course nobody knows.
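The repetition scheme described in that note is simple to state in code; this sketch just maps each future year back onto its source year in cycle 23 (1996-2008 inclusive, a 13-calendar-year block), as the note specifies.

```python
# Sketch of the CMIP5 note above: future solar forcing reuses cycle 23,
# with 1996-2008 mapping to 2009-2021, then 2022-2034, and so on.

def cycle23_source_year(year):
    """Return the cycle-23 year whose forcing is reused for `year`."""
    if year < 2009:
        raise ValueError("mapping applies to 2009 onward")
    return 1996 + (year - 2009) % 13

print(cycle23_source_year(2009))  # 1996
print(cycle23_source_year(2021))  # 2008
print(cycle23_source_year(2022))  # 1996
```

The out-of-phase worry in the note follows directly from this arithmetic: repeating a 12.2-year cycle on a 13-calendar-year grid slowly drifts relative to a nominal 11-year cycle.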
Can I just throw the “cat amongst the pigeons” for a minute….I hope this isn’t too O/T.
I’m not persuaded at all, even by this detailed discussion, that the confidence levels in current climate models are anything like good enough to use as a basis for any policy strategy involving mitigation.
Indeed I seriously wonder if climate modelling will ever achieve sufficient confidence in such a chaotic system to have anything other than a theoretical value, fascinating though that is.
I’m not suggesting that we need the kind of certainty that we have in the idea that the sun will continue to rise and set each day – but to rush into drastic “sustainable” energy projects, when fusion power is probably less than a century away and fission power can “cleanly” bridge the ensuing gap, seems too great a leap of faith in these models to me.
> I’m not persuaded at all, even by this detailed discussion, that the confidence levels in current climate models are anything like good enough to use as a basis for any policy strategy involving mitigation.
Certainly climate models will have to be significantly improved if we’re going to try geoengineering. That’s an irony – the most ardent geoengineers will have to believe that climate models are correct much more so than even the most worshipful climate modeler does.
Indeed, it does seem that the lion’s share of these discussions start from the assumption that the scientific basis is sound, and that it is just the application of this science, or rather its ‘reporting’ or ‘presentation’, that is the issue.
I think a good thread would be one that examines this.
Dr Curry could list the, say, 5 most crucial aspects of the cAGW ‘theory’ and then present evidence for and against.
We the great unwashed could then submit further evidence to support/reject the ‘main’ evidence provided, and Dr Curry/mods could modify the post accordingly.
I think it would not only be a useful exercise for all, but it would be a very useful resource for all in this debate. Obviously the thread would need to be a sticky and be transferred to a new thread (as a copy) once the comments become unmanageable.
Models circle ceiling
Counting flowers on the wall.
Now, let’s all fall down.
It is nice to see that you have no confidence in current climate models. Manipulation of numbers is too easy for a certain outcome. The constant closing of stations and changing methodologies make temperature readings irrelevant to the actual physical planetary changes.
“Climate balance” is an illusion on a constantly changing planet.
Systemic Bias of Omission
A critical factor in “verification and validation” is not just what is IN the GCM’s, but identifying systemic bias (Type II errors) from what is NOT in the GCM’s, but should be.
With the UK under gridlock from the coldest winter ever recorded with high snowfall, the Telegraph trumpets: The man who repeatedly beats the Met Office at its own game
“Piers Corbyn not only predicted the current weather, but he believes things are going to get much worse, says Boris Johnson.”
Similarly in: A Dalton Minimum Repeat is Shaping Up
Anthony Watts cites:
David Archibald observes:
Why has Piers had a consistently higher prediction accuracy than the Met Office? Why was Archibald able to predict the current solar cycle parameters?
Do GCM’s consistently ignore or underestimate the impact of the sun and cosmic rays on climate, especially on clouds?
With the current solar cycles trending towards a repeat of the Dalton Minimum, this will be a classic natural “experiment” on how well GCMs perform. Those predicting catastrophic anthropogenic global warming appear to be projecting consistently warmer temperatures.
Those looking at neutron counts, cosmic rays and solar magnetic fields are projecting seriously colder temperatures.
On top of that, Don Easterbrook and others following the Pacific Decadal Oscillation (PDO) predict a cooler trend till about 2035 (compared to IPCC) followed by a warmer trend back to the long term average increase since the Little Ice Age (null hypothesis).
Based on the superior performance of Piers, Archibald and Easterbrook, I will invest in longjohns, not shorts.
Where is the verification and validation that GCM’s accurately account for these solar/cosmic/cloud factors?
Do these GCM’s demonstrate a Gigantic Cooperative Mania? (aka the lemming factor.)
Why should we tolerate such systemic group think?
Indeed. And I’ve been saying for some time that what we need is teams of climatologists with competing hypotheses and models getting equal time on the supercomputers which are capable of running high-resolution climate calculations.
This is the correct way to make more rapid progress in the young field of climatology. The stultifying exclusion of alternative hypotheses is simply a ruse to create an illusion of consensus and correctness.
My simple solar model matches historical temperature and OHC better than the IPCC co2 driven models do. The mechanism is more viable too.
Another comment from someone very familiar with model development for the aerospace industry.
It seems to me that those developing/commenting on models in this area have it substantially wrong regarding the determination of success/metrics in model development.
A climate model (one being developed to determine the impact of higher atmospheric CO2) needs to be able to accurately predict only two things (IMO): temperature and precipitation. The key is that the model needs to be able to predict these conditions at not less than a local level in order for the model to be validated as accurate and meaningful for use.
There are a great number of comments here, but it seems most miss that simple point. If a model is developed that can accurately “predict” the temperature and precipitation levels of a specific city, then it is one that is meaningful for policy development. If it cannot, it is not suitable for use in policy development and is actually worthless (except as a basis for future model development).
An observation and a question:
First, climate models aren’t just developed to determine the impact of higher CO2 levels, but rather to model the interactions between a variety of systems that work together to determine climate. Given that the outputs of the subsystems can serve as inputs to others, by necessity a model must be able to derive factors beyond temperature and precipitation. I posted a link in another comment to a conceptual model of a GCM that you may find helpful.
My question would be the timescale you’re looking for in the prediction. I’m certainly not in the convinced camp about the validity of existing models, but given the nature of the climate system, I think accurate prediction of the weather for some future date is setting the bar too high.
Gene- Since the actions that need to be taken from a policy perspective are based on the actual climatic impact to local/regional areas, it seems the models must be designed to predict at that level. The models would need to demonstrate a high degree of accuracy for 1 to 5 years in the future (IMO) with a wider margin of error farther out.
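Rob’s criterion can be made concrete. As a minimal sketch (in Python, with made-up numbers, not output from any actual GCM), here is the kind of check that “accurate at a regional level for 1 to 5 years” would imply: compare predicted and observed regional means with an error metric and an anomaly correlation.

```python
import numpy as np

def regional_skill(predicted, observed):
    """Compare predicted vs. observed regional annual-mean values.

    Returns RMSE and the correlation of interannual anomalies, two
    common (though not sufficient) measures of predictive skill.
    """
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    rmse = np.sqrt(np.mean((predicted - observed) ** 2))
    # Correlate anomalies so a shared mean bias doesn't inflate the score
    p_anom = predicted - predicted.mean()
    o_anom = observed - observed.mean()
    corr = np.corrcoef(p_anom, o_anom)[0, 1]
    return rmse, corr

# Hypothetical 5-year regional mean temperatures (deg C)
model = [14.2, 14.5, 14.1, 14.8, 14.6]
obs = [13.9, 14.6, 14.0, 14.4, 14.7]
rmse, corr = regional_skill(model, obs)
```

Real verification would also have to handle observational uncertainty and autocorrelation, but even this toy version turns the pass/fail question from rhetoric into a testable statement.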
I guess it depends on how you define “regional”. Being able to determine that you’d need an umbrella in Peoria on 12/21/2015 would be neat, but almost certainly beyond realistic expectation. I can see utility for coarser grained projections.
This is where I’d echo Fred’s call for those with direct experience to chime in. Let’s hear what is feasible now and discuss if/how that can be improved.
Gene- then maybe we agree. I state that a climate model should be able to accurately predict how much rainfall, and what the average temperature, will be at the level of a specific region (say at an overall county level). If a model cannot reliably predict what the temp and rainfall will be at that level… how can it really be used for infrastructure planning? At less than that level… well, who cares other than the developers. Interesting, but of low value.
You’re wrong. And, you’re right.
Wrong: models are supposed to give you an idea of the global response to inputs. As far as I can tell that’s what they do, *assuming* that the system is understood well enough to know what a feedback is. It’s not supposed to be a detailed fortune telling for downtown Mumbai. Insofar as I know the gridding is too coarse for regional prediction (at least that’s what a recent paper from Greece seems to underscore.) Rather, a model ought to give some sense of direction.
Right: the advocates who embrace the correctness of models then go on to extrapolate everything from Acne severity to Mumbai rainfall patterns for the next 75 years. Suddenly and with great magic, the gridding too coarse for prediction is no longer too coarse. The advocates transit, with no embarrassment or clue, directly from science to fortune telling.
I really do think that 99% of the problem (i.e. why this blog exists) is neither the models nor the science, but the shrillness of advocates claiming to be able to tell fortunes. Fortune telling and black magic and alchemy have always claimed the veneer of science; medieval physicians argued loudly that their Galen model was correct and scientific. Science of the day was the club they beat their detractors with as well. Greg Craven is suggesting that there ought to be no line between fortune tellers and scientists, which is the undoing of the last 400 years of painstaking work to delineate the opposite.
“Wrong: models are supposed to give you an idea of the global response to inputs.”
A global response is inadequate. The impact of a couple of percent change in global cloud cover is completely different if it occurs predominantly at the equator, at the poles, or at the mid latitudes.
A global average is no more useful in determining climate response than trying to determine the flight characteristics of a 747 with 100,000 pounds of freight in the belly.
Without knowing where the freight is in relation to the mean aerodynamic chord of the wing, the impact of 100,000 pounds of freight in the belly of a 747 is somewhere between crash-and-burn and normal controlled flight.
The climate debate is the same: without knowing the distribution of the clouds, the impact is somewhere between ‘we are all going to burn/drown’ and ‘nothing to see here, move along’.
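The 747 analogy can be made quantitative with a toy moment balance (all numbers here are invented for illustration, not real 747 load data): the same total freight weight produces very different centre-of-gravity positions depending on where it sits.

```python
# Toy moment balance: identical total weight, very different outcomes
# depending on where it sits relative to the centre of lift.

def cg_position(loads):
    """loads: list of (weight_lb, arm_ft) pairs measured from a reference
    datum. Returns the centre of gravity in feet aft of that datum."""
    total_weight = sum(w for w, _ in loads)
    total_moment = sum(w * arm for w, arm in loads)
    return total_moment / total_weight

empty = [(400_000, 92.0)]           # aircraft weight acting at 92 ft
fwd = empty + [(100_000, 70.0)]     # 100,000 lb stowed forward
aft = empty + [(100_000, 120.0)]    # the same weight stowed aft

# Same totals, different CG: the "global average" hides what matters.
cg_fwd, cg_aft = cg_position(fwd), cg_position(aft)
```

A global-mean diagnostic is the analogue of knowing only the total weight; the CG, like the spatial distribution of clouds, is what decides the outcome.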
If that is your definition of what the models are designed to simulate, then I suggest that you are shooting for a relatively meaningless model. Unless models mature to the point where they will be able accurately predict relatively local climate reactions to CO2 levels then they will have minimal impact on policy development. (or at least the issue will continually be in dispute)
I do understand where climate models are now, but the goal should certainly be to accurately predict regional/local temperature/precipitation. It is only on a regional/local level that the impact of climate changes effects humanity.
GCMs will never be NWP models. Ever.
Something that concerns me re model testing is that what seem to be serious failings of the models are simply brushed off. For example, Willis at WUWT today shows that several models show a warming trend even under constant forcing. Lucia has shown that models give GMT as much as 4 deg C too high (doesn’t that suggest the physics is wrong?). Some recent papers at the continental scale show temps off by several degrees and ppt too high by up to 36%. The answer seems to always be that “nevertheless” we can trust the long-range predictions of trends. Really?
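The constant-forcing check Willis describes is easy to state precisely. As a sketch with synthetic data (nothing here comes from a real model run): generate a trendless control series, add an artificial drift, and verify that an ordinary least-squares fit flags it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "control run": constant forcing, so GMT should be trendless.
# A drifting model would show a nonzero slope in this diagnostic.
years = np.arange(100)
gmt_stable = 14.0 + 0.1 * rng.standard_normal(100)   # no drift
gmt_drift = gmt_stable + 0.002 * years               # 0.2 C/century drift

def trend_per_century(series, years):
    """OLS slope in degrees per century (ignores autocorrelation,
    which would widen the real uncertainty)."""
    slope = np.polyfit(years, series, 1)[0]
    return slope * 100.0

stable_trend = trend_per_century(gmt_stable, years)
drift_trend = trend_per_century(gmt_drift, years)
```

The interesting question is not whether such a test can be written, but what tolerance on the control-run trend counts as acceptable; that choice is a scientific judgment, not a software one.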
It’s not clear to me that Willis’ analysis is valid.
“It’s not clear to me that Willis’ analysis is valid.”
Then – as you say to others about temp. reconstructions – do your own analysis.
Willis is right, and he is right for reasons I have repeatedly explained (over the last decade), including on several threads of this blog. Check the ‘climate model … Part 1’ thread for my argument and the dispute of it from Fred Moulten.
Not necessarily. It could be the models are more or less correct and that the range of ‘natural’ variability is large. Suppose we could build lots of actual planets identical to the Earth (Magrathea from The Hitchhikers Guide to the Galaxy comes to mind). If we started them all from the same point, with all external forcings following the same trajectory, do you really think they would all have the same temperature time series within a small fraction of a degree? I don’t. I’m not at all sure that the variability within one realization is a good model for the variability between realizations either. Admittedly, it’s all we have.
How is the application complexity of the AOGCMs apportioned?
Are we talking solely about application algorithmic complexity, or is there a leakage of system issues into the application?
By the first part I mean a lot of difficult mathematics to be performed.
For the second part I would include multithreading at the application level.
I have been a software developer for over 30 years – much of it involved with compiler development.
At least with a compiler, you can test the latest build with a series of test examples – about 2000 examples in our case. Even with this level of testing, some problems leaked through the process, and the relevant code was then added to the test set!
Note that such tests are typically performed every day, so that when things go wrong, it is easy to find the changes responsible. ‘Validating’ the software after months of work would be almost useless.
A compiler test usually delivers a yes/no answer. When testing software designed to predict the future from a set of assumptions that did not apply in the past (CO2 levels), definitive tests are much harder to devise. Presumably there are a few conservation laws that can be checked in the results (energy, angular momentum, etc.) but little more.
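A conservation-law check of the kind mentioned above might look like this in outline (the diagnostics and tolerance are invented for the sketch; a real model would integrate actual fluxes over the run):

```python
def check_energy_conservation(net_toa_flux, heat_content_change, tol=0.1):
    """Sanity check: the time-averaged net top-of-atmosphere flux should
    match the change in system heat content to within a tolerance.
    Both arguments in W/m^2 averaged over the period. A persistent
    mismatch suggests the model is leaking (or creating) energy."""
    imbalance = net_toa_flux - heat_content_change
    return abs(imbalance) <= tol, imbalance

# Hypothetical diagnostics from a model run (W/m^2)
ok, leak = check_energy_conservation(net_toa_flux=0.85,
                                     heat_content_change=0.80)
```

Unlike a compiler test, this only yields a fuzzy yes/no: the tolerance has to come from physical judgment, which is exactly the difficulty described above.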
Even if a model was error free, it is entirely possible that its solution is chaotic, or just numerically unstable.
My gut feeling is that unless a model can be expressed in a reasonably simple way, a computer solution is almost certainly meaningless, whatever methodology is applied. Complex models for which adequate test data does not exist, perform a huge disservice by pretending to give answers to problems which are not soluble.
Data is essential, but models are more toys of the supercomputer age than fundamentals of science. Science is unlikely to make radical steps forward guided by computer models; that can only be achieved by the reasoning of an individual.
For some time now I have been advocating the idea that solar activity destabilises the Arctic magnetic field,
which, in association with the Arctic ocean currents, may have a profound effect on Arctic temperatures.
Of course this is currently labelled as a ‘nonsense’ by experts.
I wrote sunspot formulae in 2003 (published Jan 2004),
not only predicting a low SC24, but indicating even lower SC25 and SC26.
This was at the time when NASA’s top expert Dr. Hathaway was predicting the highest ever cycle for SC24, and declared my work ‘irrelevant’. How wrong one can be!? Now many experts are predicting a low SC24 and possibly SC25; where were the experts in 2003? A solar polar magnetic fields formula followed (vehemently disputed by solar science as numerology)
with one of the highest correlations between two very precisely defined but apparently unrelated natural events.
Solar science is trailing the events, continuously modifying its interpretations to suit the events. Climate science is not much different in that respect.
I’m a Software Engineer so for once I feel entirely at home with this post :p.
If you don’t know enough to produce a specification, you don’t know enough, period. Admitting you don’t know enough to specify a “correct” model is admitting your model isn’t suitable for use as a basis for deciding policy.
Any model you do produce containing representations of the concepts you think you know about, is purely a hypothesis generator for the domain of knowledge you are seeking to resolve. That is to say, it is not necessarily representative of the real world, only the world you’ve modelled.
The question of how to engineer the software is secondary and easily discovered once you have a specification. A specification is used as evidence that you’ve actually thought about how you’re going to solve the problem, and it demonstrates whether or not you actually understand the problem in the first place.
Building a model is fine, but unless you can generate a description of how it works and what it does, it’s not a finished article, it’s a prototype, the results of which can, if successful, be used to produce a proper, correct and well-engineered piece of software at a later stage. We almost never use prototype code in the final product.
In conclusion then, all climate models are prototypes; just as you wouldn’t fill your Airbus 320 prototype with 400 paying passengers, you probably don’t want to be basing public policy on a climate model prototype either, especially when you have past experience of looking out of the window and watching one of the engines fall off (the engines fall off our Met Office prototype twice a year: barbecue summer prediction and mild winter prediction).
Hear hear now!
As someone whose understanding of climate change is substantial in some areas and shallow in others, I have looked forward to increasing my depth in a shallow area – software development for climate models and its application to model design and implementation.
Among those whose experience I would hope most to learn from are those who specifically do the climate model software development, but also individuals who design and/or parametrize climate models – particularly GCMs, and even those who routinely utilize GCMs and are familiar with their performance. This roster probably includes Judith Curry, Dan Hughes, Steve Easterbrook, Andy Lacis, and others. I hope they will participate further.
My own knowledge of GCMs is that of an outsider, but I have found some useful reviews relevant to the thread that include the article at
Tebaldi and Knutti as well as other articles in the series that can be visited via the linked site. Comments on that material would be welcome.
To date, the thread has been less informative to me than it might be, because many of the comments appear designed to prove a point rather than inform, and most come from individuals who may be knowledgeable in related fields but not expert in GCMs, and have perhaps overestimated the extent to which their own knowledge can be extrapolated to this area – a temptation that may be hard to resist. They should continue to comment, but comments from experts who do the sort of thing I describe for a living would be even more welcome.
On the contrary Fred, it is precisely the experience of outside “experts” that is needed here. Start with the premise that the models are wrong and work your reasoning back from there.
“it is precisely the experience of outside “experts” that is needed here. “
Robinson – you may be right. My understanding of models is limited to basics without a detailed knowledge of intricacies, so I’m not prepared to dispute you. On the other hand, in areas of climate science that I know well, such as radiative transfer, greenhouse effect principles, and feedbacks, I’ve observed that it is more often than not the case that what you call “outside experts” exhibit a poor understanding and harbor serious misconceptions. Without impugning any particular branch of “outside” expertise, I would say, for example, that most individuals here with an engineering background get climate feedback quite wrong, possibly because the term is used differently in the two disciplines, but also for more fundamental reasons.
I certainly don’t challenge the right of those outside climate science to offer opinions, but they should probably offer their opinions from the perspective they would expect from individuals outside their own disciplines who chose to advise them on the mistakes they were making.
More important, though, is that some of the “inside experts” could contribute substantially to this thread if they made their voices heard louder. Finally, getting back to climate specifics, I re-recommend the link I cited above as a source of information with particular relevance to climate models in terms of design, parametrization, and assessment.
Fred- Successful model development is a difficult task for any complex system. At the end of the day, what is needed for a model to be considered useful? A simple answer really… it needs to accurately predict the future temperature and precipitation at not less than a regional level.
I see that many write “well, that is too difficult” or “you expect too much of the models”. I respectfully disagree, since it is only at the local/regional level that weather/climate matters and that the models can truly be verified as accurate.
I am amazed that there is even any disagreement on this most basic of goals
GCM’s may have problems that are unique to climate change, but they also share the problems that are common to all large software projects.
It is incredibly easy to over estimate the reliability of software. Industrial software uses many techniques to make it more reliable. Most of these – e.g. beta testing – rely on the assumption that someone can recognise when a program has gone wrong – at least for certain inputs!
I don’t believe it’s the program itself that’s wrong. I mean it would be very easy indeed to construct a physics model where acceleration due to gravity was dependent on the level of CO2 in an atmosphere around a perfect sphere (you could do it in a few lines of code!). The issue of actually constructing the software is secondary to understanding the basic physical processes that are involved. If you don’t understand those, your model has no predictive power in any domain other than the one you’ve modelled, which bears only a tenuous relation to reality.
Given that this is the case – and that any Computer Science major will be able to tell you so, why is it that Climate Scientists continually rush out press releases related to their latest model results which are spun as predictions in the media? I think I know the answer and I’m pretty sure I don’t need to spell it out. I won’t do so for risk of causing offence.
Your hypothetical gravity model, would, as you say, only take a few lines of code. The problems start with large programs, particularly in cases where the answers can’t be checked. It is worth reviewing the HARRY_README.TXT file, released along with the emails – it gives you some idea of the software problems that the CRU was having.
Of course, these problems are in addition to uncertainties in the actual physical processes.
I think it is also important to remember that a lot of research code is written by graduate students with little or no experience in writing software. Yes, they can write the sort of ‘model’ that you mention, but they don’t have the experience to structure a large program so that it can be maintained or extended, or checked for errors as it is changed. Ultimately, they get their PhD or move on, and another green recruit takes over the code. In my experience, it was rare indeed for a supervisor to take any interest in the actual computer code.
Like Dan Hughes, I found this statement by Steve Easterbrook very troubling:
“For climate models, the definitions that focus on specifications don’t make much sense, because there are no detailed specifications of climate models (nor can there be – they’re built by iterative refinement, like agile software development).”
No detailed specifications? If GCM software developers buy into this notion, they are doomed to fail.
I’ve been speaking out for a while about the necessity of high profile climate modeling groups to provide detailed documentation for the models they’re building and modifying. This is especially needed because the people working on these codes are never static. Workers (many being students and post-docs) come and go, and the knowledge of the logic behind the algorithms they develop comes and goes with them. For example, someone may come along and add a subroutine which calls a thermodynamic function (e.g. the specific heat of dry air as a function of temperature) which has limitations which are unknown to the user. These limitations may not show up until a lot of expensive runs have been completed. The worst bugs in computational physics are the ones that don’t cause your code to crash, but instead show up in results that appear (on the surface) to be plausible.
At the very least, each group can produce a document specifying in detail (1) the differential equations and their boundary and initial conditions, along with all additional equations associated with submodels, (2) the basic numerical methods being employed to solve these equations, and (3) the limitations associated with all modeling procedures. Pointers to specific subroutine modules would be helpful. And all code should be heavily commented. I don’t know why some groups (like GISS) seem to think that their codes are self documenting and don’t require comments…but looking at their source code, it’s amazing they even run at all, much less knowing what the individual routines do and how they fit together to perform a given task required by the numerical algorithms.
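The specific-heat example can be made concrete. Here is a sketch (the polynomial coefficients and validity range are placeholders, not a vetted correlation) of how documenting and enforcing a function’s limitations turns a silent extrapolation bug into an immediate, diagnosable failure:

```python
def cp_dry_air(T_kelvin):
    """Specific heat of dry air at constant pressure, J/(kg K).

    Illustrative linear fit (coefficients are placeholders, not a vetted
    correlation). The guard makes the validity range explicit, so a
    caller outside the fitted range fails loudly instead of silently
    getting extrapolated nonsense -- the class of bug described above.
    """
    T_MIN, T_MAX = 200.0, 400.0   # fitted range, K (assumed for this sketch)
    if not (T_MIN <= T_kelvin <= T_MAX):
        raise ValueError(
            f"cp_dry_air fit valid only for {T_MIN}-{T_MAX} K, "
            f"got {T_kelvin} K")
    return 1002.5 + 0.0275 * (T_kelvin - 250.0)

cp = cp_dry_air(288.0)    # within range: returns a value
# cp_dry_air(150.0)       # out of range: raises ValueError immediately
```

The point is not the particular numbers, but that the limitation lives in the code itself rather than in a departed post-doc’s memory.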
In contrast, groups like NCAR and GFDL do a great job with their code and documentation, as do the European groups.
Maybe the way forward is to consolidate the research efforts, thereby reducing costs (a necessity, given the explosion in government debt here in the US and abroad) and allowing for some better level of coordination in developing truly world-class, open source software.
A little nerdy disagreement here Frank. Personally I don’t like to comment so much any more. Experience has taught me that not only do I have to maintain the code, I also have to maintain the comments (twice as much work!).
The solution I’ve found is to structure the code – making it neat and tidy and somewhat self-explanatory to other developers. This includes plenty of white-space between blocks of statements in functions and the use of meaningful symbols.
Robinson – I agree that tidy code is easier to maintain (there are whole style manuals which discuss this). However, comments are both for you (in the future, when you’ve forgotten how a routine or function worked) AND the person who has to work on the code after you (who doesn’t have a clue as to what your code does, unless it’s fully documented elsewhere). My inclination when I see a badly written piece of code is to trash it and rewrite it in my own style – which assumes that I fully understand how the piece of code works both locally and globally.
You write as if these were new discoveries (‘the solution I’ve found’).
All commercial programmers were being taught this by the late 1970s, under the generic title of ‘structured programming’.
Only 35 years later it has finally broken through into academe……at what they delude themselves is the ‘cutting edge’ of programming :-)
Well, just take a look at the NASA GISS Model E source code.
I’ve worked in the commercial computational fluid dynamics (CFD) field for over 20 years, and I’ve seen some good and bad coding in my time. The GISS code is not the worst, but it is close. I’m truly amazed that the thing actually compiles, and that someone understands how to modify it without breaking important subroutines. And note that your average single phase RANS CFD code is about an order of magnitude less complex (in terms of the physics being modeled) than a GCM, and they are still hundreds of thousands of lines long…
I didn’t mean to say that structured programming was the solution I’d found; I was talking about commenting code particularly. From experience of maintaining other people’s code, it’s common for the comments to no longer reflect what the actual code is doing. Hence the need to maintain both, which is why I don’t do it too much. You’ll find plenty of guidelines out there that say you should, however.
Three cheers for Robinson and David L. Hagen! Models, whether they are mechanical, mathematical or computer-generated mean very little until they are tested in the future for the function for which they were designed. Use of computers cannot circumvent that need no matter how sophisticated the mathematics. Just ask the failed financial wizards of banking.
Having read most of the posts on this thread, I hope Judith takes notice of all the previous ones which say the same thing Morley has said. To me this is absolutely obvious, but how many times does it need to be said before everyone accepts that it is true?
It appears to me that we are discussing two topics here:
One is the question (raised by David L. Hagen) of whether or not the climate models cited by IPCC have adequately included the many natural climate forcing factors (or natural variability), in order to be able to estimate anthropogenic factors.
The second (raised by Eric Ollivet) is whether or not they have done a good job at hindcasting past periods or forecasting most recent periods of climate change.
Both questions bear on whether or not the models are likely to produce usable forecasts of future climate changes.
It appears to me that if the answer to either of the above questions is negative (for whatever known or unknown reasons), then the conclusion is that they are not able to forecast future climate.
Simply validating one model against another is meaningless if the above questions cannot be answered positively.
Am I missing something here?
Max, you are missing absolutely nothing at all. You have captured the essence of the problem. There are uses for non-validated models, but the way climate models have been used by the proponents of CAGW is not one of them.
Derecho64, I don’t see your posting at Tisdale’s blog.
This will be the year climate science is thrown under a bus, due to the rapid growth of the skeptical community and the incredibly bad weather currently happening.
No one seems to notice that the weather patterns have slowed down to generate these massive weather anomalies.
Test for Ice Extent in Arctic and Antarctic
One major test for climate models is how well they model BOTH Arctic AND Antarctic ice extents. e.g.
Sea Ice News #32 – Southern Comfort
While we hear a lot about the 2007 dip in the Arctic ice, where is the reporting on record HIGH ice in the Antarctic for this date? Currently the Antarctic ice is running >2 standard deviations ABOVE the 1979-2000 mean.
Compare: Another IPCC Error: Antarctic Sea Ice Increase Underestimated by 50%
Does this reflect quality assurance in climate science?
OR does this reveal systemic bias in GCMs and/or IPCC?
The last time I checked (today) Antarctic sea ice was running at almost exactly the 1979-2000 or 1979-2008 mean and has been for the last 6 weeks or so. In fact if you detrend the anomaly data by OLS (slope 0.015 Mm2/year +/- 0.0009, 95% confidence), the residuals have been negative for the last three weeks. Today’s data for Antarctic ice area from Cryosphere today has a residual of -0.346 Mm2. The standard error of the fit is 0.437 Mm2. I made no attempt to correct the degrees of freedom for serial autocorrelation, which is likely to be significant for daily anomalies.
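For readers who want to reproduce this kind of check, here is the general shape of it in Python with synthetic data (the 0.015 Mm2/year slope matches the figure quoted above, but the noise is invented; the real anomalies come from Cryosphere Today):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic daily anomaly series standing in for Antarctic ice area
# anomalies (Mm^2) over one year.
days = np.arange(365.0)
anom = (0.015 / 365.0) * days + 0.4 * rng.standard_normal(365)

# Fit and remove the OLS trend, then inspect the residuals, mirroring
# the check described in the comment above.
slope, intercept = np.polyfit(days, anom, 1)
residuals = anom - (slope * days + intercept)
stderr = residuals.std(ddof=2)        # standard error of the fit
recent_mean = residuals[-21:].mean()  # mean residual, last three weeks
```

As noted, serial autocorrelation in daily anomalies inflates the apparent significance, so the effective degrees of freedom are far fewer than 365; this sketch ignores that, just as the quoted calculation does.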
NSIDC tells us that Antarctic sea ice extent (end December) was 9.2% greater than the 1979-2000 baseline value at 12.10 versus 11.08 million square km.
At the same time, the (end December) Arctic sea ice extent was 10.3% lower than the 1979-2000 baseline value at 12.00 versus 13.37 msk.
Overall this looks pretty close to a wash to me.
I have just received this newsletter on the COMBINE project:
Interesting article on ClimatePolicy.org re: the organization of climate modelling efforts as production systems instead of purely research efforts.
I am out-of-date for this thread but responding to Dr Curry’s comment of June 12, 2011 at 3:58 pm, suggesting homework, on the Lindzen Choi part II thread.
I refer to Dessler 2010, Science 330: 1523-1527.
As I understand things, for the present climate régime, there are two well-recognized destabilizing contributory factors in the earth’s energy transport process, water vapour radiative absorption and ice albedo. Two well recognized stabilizing contributory factors are the Planck response and the lapse rate response. It seems agreed by the AOGCM people that of these four, the destabilizing factors outweigh the stabilizing ones. There remains room for argument about the cloud effects.
It seems to me that if the cloud effects were destabilizing, then, because of the combined effects, the system would be unstable, and we would see a transition to a new dynamical régime, not warmer than 5700K, more or less explosively, or as they like to say, a runaway global warming. Dessler as cited above tells me he thinks the cloud effects are indeed destabilizing. I suppose this means he thinks we are indeed looking at such transition.
It seems to me that if this runaway story were really so, then 1998 would have been followed by a distinct further rise in temperature, beyond the 1998 value. That seems not to have happened. So I think Dessler must be mistaken.
That is why I don’t think that the AOGCMs can be trusted. Can you put me right on this? Christopher Game