Assessing U.S. temperature adjustments using the Climate Reference Network

by Zeke Hausfather

Measuring temperatures in the U.S. is no easy task. While we have mostly volunteer-run weather station data from across the country going back to the late 1800s, these weather stations were never set up to consistently monitor long-term changes to the climate.

Stations have moved to different locations over the past 150 years, most more than once. They have changed instruments from mercury thermometers to electronic sensors, and have changed the time they take temperature measurements from afternoon to morning. Cities have grown up around stations, and some weather stations are not ideally located.

All of these issues introduce errors into the temperature record. To detect and deal with these errors, NOAA uses a process called homogenization which compares each station to its neighbors, flags stations that show localized changes in longer-term temperatures not found in nearby stations, and removes these local breakpoints. While the impact of these adjustments on temperature records is relatively small globally, in the U.S. it has a much larger effect due to the frequent changes that have occurred at our volunteer-run U.S. Historical Climatology Network (USHCN) stations (specifically time of observation changes and instrument changes). Fixes to errors in temperature data have effectively doubled the amount of U.S. warming over the past century compared to the raw temperature records.
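
As a rough illustration of the pairwise idea (a toy split-mean test on a difference series; the function name and threshold are made up for this sketch, and this is not NOAA's actual pairwise homogenization code), the core of the approach can be pictured like this:

```python
import numpy as np

def find_breakpoint(station, neighbor_mean, min_offset=0.3):
    """Toy pairwise test: find the split point where the difference between
    a station and the mean of its neighbors shows the largest persistent
    offset, returning None if that offset is below min_offset (deg C)."""
    diff = np.asarray(station) - np.asarray(neighbor_mean)
    best_idx, best_off = None, 0.0
    for i in range(12, len(diff) - 12):            # keep ~1 year on each side
        offset = abs(diff[i:].mean() - diff[:i].mean())
        if offset > best_off:
            best_idx, best_off = i, offset
    return best_idx if best_off >= min_offset else None

# Toy data: 30 years of monthly anomalies with a 0.5 C step halfway through
rng = np.random.default_rng(0)
regional = rng.normal(0, 0.2, 360)                 # shared regional signal
station = regional + rng.normal(0, 0.1, 360)       # candidate station
station[180:] += 0.5                               # e.g. an instrument change
print(find_breakpoint(station, regional))          # prints an index near 180
```

A localized problem shows up as a persistent shift relative to neighbors, which can then be removed from the station's record; a change shared by all the neighbors (i.e., real regional climate) produces no such shift and is left alone.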

A picture of the three redundant temperature sensors at a U.S. Climate Reference Network station.

To help resolve uncertainties caused by reliance on the historical network, NOAA began setting up the U.S. Climate Reference Network (USCRN) in 2001. USCRN includes 114 stations spaced throughout the U.S. that are well sited and away from cities. Each station has three temperature sensors that measure every two seconds and automatically send in data via satellite uplink. The reference network is intended to give us a good sense of changes in temperatures going forward, largely free from the issues that plagued the historical network.

While the USCRN will provide excellent data on the changing U.S. climate in the future, in the past we are stuck with the historical network. What we can do, however, is use the USCRN to empirically assess how well our adjustments to the historical network are doing. Specifically, since we know that the USCRN is largely free of issues, we can see if adjustments to USHCN are making station records more similar to nearby USCRN stations. That is the focus of our study recently published in Geophysical Research Letters (non-paywalled version here).

Figure 1 from Hausfather et al 2016.

For overall contiguous U.S. temperatures, the records from raw USHCN, adjusted USHCN, and USCRN are quite similar during the period of overlap, as shown in Figure 1 from our paper, reproduced above. USCRN does have a noticeably higher maximum temperature trend than both raw and adjusted USHCN data, though the cause of this is still unclear (see our paper for more discussion of this divergence).

We do see large differences between raw and adjusted USHCN data from individual stations when we compare them to nearby USCRN stations. Here we looked at all possible pairs of USHCN and USCRN stations within 50, 100, and 150 miles of each other. The 100-mile case is shown in Figure 2 from our paper below, but all distance cutoffs give fairly similar results (and are available in our supplementary materials).

Figure 2 from Hausfather et al 2016.

In the vast majority of cases adjustments served to make the USHCN trends much more similar to those of proximate USCRN stations, particularly for larger divergences. Since we know that the reference network is largely free of measurement problems, this increases our confidence that adjustments are effective at finding and removing problems in the historical network.
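
Conceptually, the pairing behind Figure 2 is straightforward; a minimal sketch (with hypothetical data structures, for illustration only, not the code used in the paper) might look like this:

```python
import numpy as np

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 3958.8 * 2 * np.arcsin(np.sqrt(a))

def trend_per_decade(anomalies, years):
    """Ordinary least-squares trend in deg C per decade."""
    return 10 * np.polyfit(years, anomalies, 1)[0]

def pair_trend_differences(hcn_stations, crn_stations, cutoff_miles=100):
    """Return (raw - CRN, adjusted - CRN) trend differences for every
    USHCN/USCRN pair within the distance cutoff.  Each station is a dict
    with 'lat', 'lon', 'years' and either 'raw'/'adj' (USHCN) or 'anom'
    (USCRN) anomaly series on a common set of years -- a hypothetical
    structure used only for this sketch."""
    pairs = []
    for h in hcn_stations:
        for c in crn_stations:
            if haversine_miles(h['lat'], h['lon'], c['lat'], c['lon']) <= cutoff_miles:
                crn_trend = trend_per_decade(c['anom'], c['years'])
                pairs.append((trend_per_decade(h['raw'], h['years']) - crn_trend,
                              trend_per_decade(h['adj'], h['years']) - crn_trend))
    return pairs
```

If adjustments are doing their job, the second number in each pair should generally be closer to zero than the first, which is what Figure 2 shows for the large majority of pairs.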

Figure 3 from Hausfather et al 2016. Trend differences in mean temperatures are shown.

The spatial structure of variation in adjusted USHCN data is also much more similar to that seen in the USCRN, as shown in Figures 3 and 4 in our paper. Figure 3, above, shows the distribution of trend differences between USHCN and USCRN stations for raw and adjusted data, and compares them to the distribution of trend differences between proximate USCRN stations. While the adjusted USHCN trends are still slightly biased low (due entirely to differences in maximum temperatures), the shape of the distribution of adjusted USHCN trend differences is much more similar to that found between homogeneous USCRN stations.
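
One simple way to quantify the "more similar distribution" claim (for illustration only; not necessarily the statistic used in the paper) is to compare each set of trend differences against the CRN-versus-CRN benchmark with a two-sample test. The numbers below are synthetic stand-ins, not data from the study:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
# Stand-in samples of pairwise trend differences (deg C per decade):
crn_vs_crn = rng.normal(0.00, 0.05, 500)   # benchmark spread between CRN pairs
raw_vs_crn = rng.normal(-0.10, 0.15, 500)  # raw USHCN pairs: wider, offset
adj_vs_crn = rng.normal(-0.02, 0.06, 500)  # adjusted USHCN pairs: closer

# A smaller KS statistic means a distribution more similar to the benchmark
print(ks_2samp(raw_vs_crn, crn_vs_crn).statistic)
print(ks_2samp(adj_vs_crn, crn_vs_crn).statistic)
```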

The Climate Reference Network was only established in 2004, so we can only directly test the adjustments for the recent period. However, the algorithm used to detect and adjust for problems in the historical network is applied equally over the past decade and the past century, so the fact that it seems to work well during the past 10 years increases our confidence that it also effectively deals with problems in the past, though this conclusion is somewhat tempered by the potentially changing nature of inhomogeneities over time. Other work using synthetic data by Williams et al. (2012) and Venema et al. (2012), as well as comparisons to reanalysis data by Vose et al. (2012), also suggests that adjustments are effective in removing localized inhomogeneities in the temperature record without introducing detectable spurious trend biases.

JC note:  As with all guest posts, keep your comments relevant and civil.

244 responses to “Assessing U.S. temperature adjustments using the Climate Reference Network”

  1. From Figure 2, CRN data suggests that there has been no trend in maximum, minimum or average temperature, and calls into question claims that 2015 was the hottest year on record.

    The lack of a CRN trend confirms previous studies that urbanization causes a rising minimum trend and a decline in the maximum in USHCN data.

    Due to the lack of a CRN trend, is Zeke’s conclusion that USHCN trends are a function of urbanization effects and are not a good estimator of climate sensitivity to CO2?

  2. Not a climate scientist (but a scientist, nonetheless). How exactly is it “no easy task to measure global temps in the US”? It’s impossible, which makes me wonder about the validity of every subsequent statement in the piece.

    • Hi Jarjobro,

      It’s certainly not impossible. We have 10,000 measurement stations with long records in the U.S., and we can get a decent amount of information from those. At this point multiple independent groups have gotten similar U.S. (and global) temperature records, which is certainly encouraging.

      One of the reasons we set up the Climate Reference Network is so we can know for sure that future temperature records are free from inhomogeneities. The network also has the benefit of providing a good empirical testing ground going forward for our adjustments to the historical network, as we discuss in this paper.

      • David Springer

        USHCN was a much different network in 1950 than in 2000.

        Therefore comparing more recent USHCN-adjusted with USCRN is valid for assessing adjustment performance in recent decades but it is not valid for the older USHCN-adjusted record.

        In particular I believe the methodology used in the adjustments is messed up for cotton-region shelters (CRS) which dominated the network in the more distant past. CRS exteriors darken with age. White paint fades and those closer to sources of soot darken from soot deposition. That causes a CRS to have a false, very gradual warming trend. Because the false trend is so slow it doesn’t trigger any adjustment to compensate for it.

        The case is not the same when a CRS is restored by cleaning and/or painting. Because the cleaning or painting happens in a single day the temperature measured by the box cools literally overnight and because neighboring stations don’t get cleaned or painted at the same time this sudden cooling of one station triggers an adjustment in the record. It triggers it in both NOAA and BEST methodologies.

        The end result is that false slow warming from CRS aging gets baked into the record while the sudden cooling from CRS exterior restoration gets rejected.
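
        A toy numerical sketch of that asymmetry, using a naive split-mean detector rather than NOAA's or BEST's actual algorithms (all numbers below are illustrative only):

```python
import numpy as np

def max_step(diff, guard=12):
    """Largest mean offset between the two sides of any split point."""
    return max(abs(diff[i:].mean() - diff[:i].mean())
               for i in range(guard, len(diff) - guard))

rng = np.random.default_rng(2)
months = 240
noise = rng.normal(0, 0.05, months)

drift = noise + np.linspace(0, 0.3, months)   # slow shelter darkening
step = noise.copy()
step[months // 2:] -= 0.3                     # repaint: abrupt cooling

# The abrupt step scores roughly twice the gradual drift with this detector
print(max_step(drift), max_step(step))
```

        Whether operational pairwise algorithms actually miss such gradual drifts is the kind of question the synthetic benchmarks cited in the post (Williams et al. 2012; Venema et al. 2012) are designed to test.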

      • Unfortunately as I understand it these independent groups use similar, area or field averaged, methods which all have questionable statistical validity. For example, different stations get what amount to different weights, depending on the regional spatial density.

        And of course this is a convenience sample of locations, not a random sample as required by the postulates of statistical sampling theory.

        Plus stations come and go so there is no true sampling. You cannot take one sample here, then a later one there, and call the difference a temporal trend.

      • Thanks, Zeke; let me frame my initial response in a different way: how exactly is the temp/temp record from devices located inside the US a measure of GLOBAL temps, as your statement clearly suggests? By comparison, it would seem that I could measure temps here in Wisconsin and say that’s a measurement of SE US temps, which clearly wouldn’t be the case.

      • How do you measure global temperature with (only) U.S. data?

      • I’m pretty sure that line is just a boneheaded mistake. I don’t think the post actually meant to claim it is simply “not easy” to measure global temperatures with data taken only from the United States.

      • jarjobro,

        Sorry, that was a typo on my end (it’s what I get for writing blog posts while listening to conference talks…). This study only looked at the U.S. and did not assess global temperatures. I’ve asked Judy to fix it.

      • I think he meant: first give us a definition of it… it would help. Then, once we have a definition, we can try to see how we can measure it and test it… Never forget, you measure the temperatures of thermometers; any step beyond that is based on hypothesis…

      • David.

        “Unfortunately as I understand it these independent groups use similar, area or field averaged, methods which all have questionable statistical validity. ”

        This post addresses the hypothesis that some people have put forward about adjustments. Namely, that they don’t work, or that they corrupt the record, or that they add warming which isn’t real.

        This skeptical concern is testable.
        We can test that hypothesis in two ways.
        1. We can create synthetic data and add random errors to it and test if the adjustment codes are able to change the data in the direction of the truth.
        If the trend in the synthetic data is 3C per century and we introduce errors to create versions with trends of, say, 2.7C and 3.3C, we can test whether the algorithm moves the adjusted answer closer to the truth. These tests have been done. Adjustments work. (A toy version of this kind of test is sketched at the end of this comment.)

        2. We can take gold-standard data, CRN, and observe its trend. That is a truth reference dataset, just like sondes are used to test satellite data.
        We can then look at the trends of all the neighborhood stations and see the trends before and after.

        Wrt area averaging and convenience samples: you are incorrect about area averaging. Many methods, such as kriging and thin-plate splines, don’t do that. The temperature at unsampled places is modeled. For that model you can use the convenience sample and test out of sample, or you can resample the convenience sample to get a sample that is representative and random.

        In the end it’s the skill of the model that is testable.
        For all approaches you get the same answer for the metrics of interest.
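
        A toy version of the synthetic test described in point 1 above, with a deliberately naive neighbor-based correction (this is not the Williams et al. benchmark code, and the breakpoint location is treated as known rather than detected):

```python
import numpy as np

def trend(series):
    """OLS trend in deg C per decade for monthly data."""
    decades = np.arange(len(series)) / 120.0
    return np.polyfit(decades, series, 1)[0]

rng = np.random.default_rng(3)
months = 1200                                    # a synthetic century
truth = 0.3 * np.arange(months) / 120.0          # "true" 0.3 C/decade climate
neighbors = truth + rng.normal(0, 0.1, months)   # well-behaved reference

station = truth + rng.normal(0, 0.1, months)
station[600:] += 0.4                             # inject a step inhomogeneity

# Remove the offset seen in the station-minus-neighbors difference series
diff = station - neighbors
adjusted = station.copy()
adjusted[600:] -= diff[600:].mean() - diff[:600].mean()

print(trend(station), trend(adjusted))           # ~0.36 raw vs ~0.30 adjusted
```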


      • To David Springer:
        Good observation, as usual.

      • Thank you below for acknowledging my initial point and clarifying your statement. I look forward to further digesting your post.

      • David Springer

        Mosher, the tests you described do not catch the problem with slowly darkening CRS stations. I know how the algorithms work. They won’t catch a slight warming trend caused by paint deterioration and soot accumulation that happens over many months or years. They will catch an abrupt cooling event from a shelter that gets cleaned and painted. The CRS is supposed to be painted once every two years. In a volunteer network, do you think that regimen is followed with any discipline?

        The end result is the slow warming gets baked into the record and the cooling gets adjusted out of it (NOAA) or for BEST the sudden cooling triggers the start of a new record. In either case the false warming trend is retained while the compensatory cooling is rejected.

      • David Springer

        Yeah that’s what I thought Mosher and Hausfather’s response would be: crickets chirping.

        Stick with the satellite temperature record. Older surface stations are simply not adequate for establishing micro-trends locally, regionally, or globally. You can’t make a silk purse out of a sow’s ear. Can’t be done. If you try to pass it off as a silk purse regardless then it’s dishonest. Period.

      • David Springer, disappointed as well that both Zeke and Mosher failed to reply to your comment. Zeke’s post is compelling, but your points far more so. Mosher talks about detecting synthetically introduced errors, which is impressive, but it would be nice to know what errors the correcting algorithms can actually detect: white noise? Red noise? Randomly added stepwise functions? Or, crucially, a gradual, steady, cumulative error signal?

      • Seems like jest too many uncertainties requiring too
        many suppositions leading to too many uncertainties …

    • “No easy task” raises a number of questions.

      1. Were the comparison stations MMTS, CRS, ASOS, Gill, or some combination? If some combination, why wasn’t a single type (MMTS) used?

      2. Is the raw data before or after compensation for station types?

      3. Why wasn’t the comparison limited to USHCN stations that were unaltered (continuous record) for the period?

  3. “Cities have grown up around stations, and some weather stations are not ideally located… Fixes to errors in temperature data have effectively doubled the amount of U.S. warming over the past century compared to the raw temperature records.”

    Is that despite ‘Fixes’ or because of ‘Fixes’ to the raw data?

    • In the U.S., adjustments increase the temperature trend. Their impact is much smaller globally, and once you include the oceans you actually get slightly lower trends post-adjustments:
      http://cdn.arstechnica.net/wp-content/uploads/2016/01/noaa_world_rawadj_annual.png

      It just so happens that in the U.S. we have two big systemic network trend biases due to TOBs changes and the MMTS transition, both of which are absent in most other countries.

      • “in the U.S. we have two big systemic network trend biases due to TOBs changes and the MMTS transition, both of which are absent in most other countries.”
        So the rest of the world did not make TOBs errors,
        inconceivable!
        And the rest of the world were already using electronic sensors rather than mercury thermometers?
        Or do you mean they never updated from mercury thermometers?
        Or the rest of the world uses unreliable satellites?
        or, well you get my drift, Inigo.

      • As a matter of caution, when Zeke Hausfather says:

        In the U.S., adjustments increase the temperature trend. Their impact is much smaller globally, and once you include the oceans you actually get slightly lower trends post-adjustments:

        He is very conveniently looking only at particular periods. Over other periods, that ceases to be true. This wouldn’t be much of an issue except natural variability is said to have a larger influence (percent wise) prior to 1950 than after 1950, and it is in that period when natural variability has a larger role (again, percent-wise) in which adjustments reduce warming.

        The reality is this can make it easier to “explain” the changes in temperature as being anthropogenic. Without the adjustments reducing past warming, there would be a stronger case for a meaningful component caused by natural variability. While people like Hausfather portray these adjustments as reducing global warming, as though that would be something “skeptics” would like, the reality is they are made in a way which only strengthens the case for AGW. It’s rather misleading.

        http://www.hi-izuru.org/wp_blog/2016/01/proof-adjustments-dont-exaggerate-gw-they-just-exaggerate-agw/

      • Why did the large adjustment suddenly appear in 2015?

        http://imgur.com/cLIeE3g

      • Glenn,

        If you read the fine print on that graph, it says that the raw data only goes through 2014. NOAA haven’t released their raw gridded data past July 2015 or so, though hopefully they will soon so I can extend the graph.

        NOAA does produce a version of the same graph that goes through the end of 2015; you can see it here:

        http://s12.postimg.org/n5i79qxe5/Screen_Shot_2016_02_10_at_7_44_52_AM.png

      • Strange choice of datasets there. NOAA “global temp” is surely a land + sea dataset; why plot it on the graph as US land temps? Likewise, HadCRUT is land + sea; why not choose CRUTEM4, which is the land record? Also HadSST3, which is the other part of HadCRUT, has similar “corrections” which remove most of the early variability.

        It seems that this graph is supposed to ‘prove’ that the USHCN ‘corrections’ make it match the other datasets and that they corroborate what is done.

        But it is intentionally misleading because it plots oranges and apples on the same graph but avoids making that clear.

      • climategrog, I think you’ve gotten confused somehow. That figure is to show global temperatures, not temperatures for the CONUS.

        Though the figure does seem kind of weird. For instance, GISS uses data from NOAA which has been adjusted by NOAA. Knowing that, does it really tell us anything that GISS’s results are more similar to the NOAA adjusted data set than to the unadjusted one?

  4. Speaking of Profiles of Courage – Zeke once again into the breach!

  5. Amazing. “Temperature” is still being used as code for min/max, the record which dares not speak its name.

    It’s like the whole climatariat knows that the game is over once you recognise the simple fact that cloud has tampered with the historical records far more than the wildest UHI or adjustment geekery.

    Me, I don’t care if things have warmed a bit or paused a bit. Apart from cooling a bit, what else can the climate do? But pretending that min/max=temp really is like ignoring a mammoth in the phone booth…after the mammoth just ate several tons of beans.

    Burp.

    • Harry Twinotter

      mosomoso.

      “Amazing. “Temperature” is still being used as code for min/max, the record which dares not speak its name.”

      What on earth are you talking about?

      • Noticed that heat is intercepted and retained by cloud, that a higher minimum and lower maximum are brought about by such interception?

        Whereas, absent cloud at the potential warmest/coolest parts of the day, a higher max and lower min can be recorded? And that this is a common effect in those parts of the world where people live and record temps, namely the places where cloud forms often? (This is putting aside the interesting question of wind change.)

        Harry, this is why after a chilly overcast day I find I often don’t need the fire but I do need it after a warm clear winter day. The difference can be 10 degrees C…and only due to the cloud or absence thereof. So when they took the highest temp achieved in a 24 hour period somewhere on some day in 1950, that’s all they were taking. They were not measuring the duration of the heat, nor the potential for more heat absent cloud, wind etc.

        I find it interesting that my region had lots of high monthly and yearly max readings in the period between 1910 and 1919, far too many for coincidence. So maybe the afternoon cloud pattern changed, as it did again after the 1970s? How can I know?

        You might ask how one can establish a global temp if one takes into account all such factors, even with improved measuring like RTD. Well, it’s a bit like the transmutation of metals all those centuries ago. Maybe you can’t.

      • Harry Twinotter

        Mosomoso.

        You are asking rhetorical questions about weather, followed up by anecdotes.

        I recommend you say something scientific instead. If you think certain effects are happening for “reasons”, then provide data.

      • I’m staggered people need references to science or data to know that cloud frequently raises minima and lowers maxima, sometimes an awful lot.

        “Rhetorical” question: Why was 1950 the year with the highest mean minimum in my region and a number of others (some with records going back to the 19th century)? Was it so “hot”?

        “Reasons”: No, it wasn’t so “hot”. 1950 was the year eastern Australia just about floated away. In our winter/spring-dry climate, instead of skies clearing at night to make temps plunge and carpeting the ground in frost, what heat we got during the day hung about. Where was it going to go?

        Why were 1929 and 1974 our ”coolest” years here by mean max? Well?

        You guessed it. (I hope.)

    • Then, too, life is an abstraction constrained by two concretes — birth and death.

  6. I am wondering how validating the HCN and CRN data over the time span of the CRN data applies to the past, especially when considering Time Of Observation ‘corrections’. Also, how does it validate the HCN data for station moves and changes? I’m just wondering?

    • Gary,

      If homogenization can pick up divergent trends at nearby stations (as seen in our figure 2) in recent years, it’s a reasonable assumption that it can also pick up things like TOBs changes that also create localized breakpoints. However, we also have good evidence from experiments using blinded synthetic data like Williams et al (2012).

      • Reasonable? Assumption? Come on, that is just a fancy way of saying you guess it works. How many moves and changes occurred during your test period? TOBS changes during that time? You are describing a second order comparison not a direct comparison of two data sets when you reference those other studies.

      • There were some TOBs and instrument changes between 2004 and present, though certainly less than during the 1980s and 1990s.

        https://curryja.files.wordpress.com/2014/07/slide11.jpg

        It would certainly be nice to have had the CRN back then, but at least for the moment the best we can do with this particular approach is to continue to compare the records going forward.

      • I presume the TOBS adjustments are still done from metadata, and before homogenisation. But TOBS changes aren’t an issue for MMTS, which must surely be dominant for USHCN post-2004?

      • Alas, MMTS is still a min/max thermometer, and only records the min/max temperature between the last reset. You’d think the government would have installed electronic instruments that would record hourly temperature, but I suppose that would have cost more…

        http://www.srh.noaa.gov/srh/dad/coop/mmts.html

        So even post MMTS installation, the time of observation matters. Thankfully the CRN has instruments that report every two seconds, so there at least TOBs is irrelevant.
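
        A small illustration of why the reset time matters for a min/max instrument, using synthetic hourly data (toy numbers only; this is not an estimate of the actual TOB adjustment):

```python
import numpy as np

def mean_daily_max(hourly, reset_hour):
    """Mean of the maxima a min/max instrument would record if it were
    read and reset once per day at reset_hour (0-23)."""
    obs = hourly[reset_hour:]
    days = obs[:len(obs) // 24 * 24].reshape(-1, 24)
    return days.max(axis=1).mean()

rng = np.random.default_rng(4)
n_days = 365
hours = np.arange(n_days * 24)
diurnal = 10 * np.sin(2 * np.pi * (hours % 24 - 9) / 24)  # peak mid-afternoon
weather = np.repeat(rng.normal(0, 3, n_days), 24)         # day-to-day swings
hourly = 15 + diurnal + weather

# An afternoon reset re-records a hot afternoon on the next observational day,
# biasing the mean maximum warm relative to a midnight reset (roughly +1 C
# with these toy numbers).
print(mean_daily_max(hourly, 17) - mean_daily_max(hourly, 0))
```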

      • David Springer

        The problem is that homogenization picks up abrupt anomalies but not slow ones. An example of a slow anomaly is a cotton-region shelter exterior darkening with age, causing a gradual warming trend. An example of an abrupt anomaly is that same shelter getting cleaned or painted.

        The slow warming anomaly is incorporated into the adjusted record while the abrupt cooling is not. Both should be either rejected or accepted otherwise you get all warming and no compensatory cooling.

      • Pretty good point that, David.

      • Zeke,
        “So even post MMTS installation, the time of observation matters.”
        I’m curious why you have apparently not read the Nimbus manual. While it is possible to read the unit manually at any random time, the unit records the midnight values. It is those midnight values that are reported. TOBS changes are eliminated with MMTS stations. So there are none to correct for.

        Had this been an engineering project, once the algorithm was chosen, a comparison between raw and adjusted values for each site would be made. Those variations beyond some chosen value would be checked for a match with actual physical changes at the site. Changes to individual site raw data should then be verified against site records for physical changes at that site. It is especially important to justify inflection points in the adjustments. Sure, that would be difficult, and some small percentage of changes might not be documented, but you are dealing with only a decade’s worth of site data.

        Insufficient funds or personnel to do that work? If that was the case, then don’t make such sweeping claims for your results. Engineering is an exacting job.

      • Gary,

        I’d have to check but I thought that observers using MMTS were instructed to continue to reset at the NWS recommended times (~7 AM). After all, we don’t see a dramatic shift toward midnight observation times in the 1980s when most MMTS instruments were installed.

        https://curryja.files.wordpress.com/2015/02/figure-1.png

    • “It is those midnight values that are reported.”
      Actually, it also allows automated reading at any preferred times (“Global” below). That looks designed for USHCN work. The operator could simply set to suit his existing agreed time. That would explain Zeke’s graph of times, and remove any desire for TOBS changes.

      “The Nimbus produces two maximum/minimum data sets, Global and Daily. The Global maximum/minimum allows the user to define the time period over which the maximum/minimum is recorded. The user “resets” the Nimbus at the beginning of a new time period and recalls the data at the desired end time. The Global maximum/minimum data set is independent of the internal clock. The Daily maximum/minimum is dependent on the internal clock using midnight to midnight at the standard time period. The Daily maximum/minimum is recorded with the actual time of occurrence within the Nimbus memory.”

      • David Springer

        It appears Stokes misread the Nimbus manual. Hausfather is correct. The Nimbus PL min/max can either be read and reset manually at any time (Global mode), or a daily min/max can be retrieved, with the reset occurring automatically at midnight. The user “defines” his own min/max time period by manual reset at the desired time. In “Daily” mode the min/max period is predefined to reset at midnight, and 35 days of these readings can be stored.

        There is a way the automatic reset time in Daily mode could be changed, but that would mean setting the internal clock to a time that is not the real time, i.e., setting the time on the instrument so that “midnight” to the instrument is actually 7 AM (or whatever) in the real world. It probably wouldn’t occur to the average observer to trick the instrument in that manner. Time stamps on temperature recordings would all be wrong too, so they couldn’t be used for reference without manual adjustments to compensate.

        Further to Hausfather’s point the instructions to the observer say to record and reset manually at the agreed-upon time just as if the instrument were a LIG setup.

        Stokes 0, Hausfather 1

      • “It appears Stokes misread the Nimbus manual.”
        It’s possible. But it doesn’t make sense to me for them to provide and describe such a system and not make it automatic. They call Global a dataset; that doesn’t make sense if it’s just going to provide a single pair of numbers. They say ‘the user “resets”’. Why in quotes, if they mean it literally? Still, maybe that is what they did.

        It seems the instrument will remember 35 days data, with times of max and min. But yes, it does seem geared to midnight for normal operation, and you’d have to rig the clock for other times. But then why would anyone use it manually when you can just set it to midnight operation and collect the data at end of month?

      • David Springer

        Rain gauge still needs frequent attention, so even if daily min/max were reset at the correct time the rain gauge also needs to be reset. Midnight operation in daily mode introduces TOB error. But you’re right that some clever observers might figure out they could set the clock 7 hours behind so that daily mode would reset min/max at 7 AM instead of midnight. Where I live I could go for days or sometimes weeks without needing to check the rain gauge, but that’s not typical for most locations.

        Why didn’t NOAA choose an instrument that allowed the user to set the daily-mode reset time? I’d guess either stupidity or someone in a position to spec the product has a nephew who owns Nimbus.

  7. For the period in question, the data do appear to indicate the adjustments are working. I haven’t dug into the paper, which would be necessary in order to validate, but OTOH, I don’t believe Zeke is lying. So, I take the evidence presented at face value. I have no doubt climate has warmed over the past 200 years, even longer than that.

    • Agreed, jim2.

    • Zeke is not lying. Saying the adjustments are working does not say much though does it.
      He could hardly present adjustments that did not work.
      The problem is the lack of sufficient stations, the lack of a long enough comparative time base, and the stated fact that to make TOBs/sensor-change adjustments work he has had to artificially lower the historical temperatures at all comparative USHCN stations in the past.
      [“Fixes to errors in temperature data have effectively doubled the amount of U.S. warming over the past century compared to the raw temperature records.”]
      Now a TOBs adjustment 20 years ago is 0.2C, 50 years ago 1.0C and 100 years ago 2.0C.
      You see the problem?
      Same thermometer 50 and 100 years ago, but the 100-year-old temp is discounted by twice as much even though it recorded the same error.
      Great science and why the temp is always presented as an anomaly, not a true baseline reading, right Nick?

  8. Thank you Zeke for your post. I have long wondered about the challenges of having a large network of stations with inherent inconsistencies due to the factors you have listed, versus a much smaller pristine network devoid of problems from various changes. Is it possible to select a very small subset of strategically located stations that need minimal adjustments? If the goal is to understand temperature trends over 100 years or so, how different would those trends be in Kansas vs Alabama or Ohio? Simply put, are we really gaining greater insights into long-term climate trends by insisting on having more (problem) stations rather than fewer (pristine) stations, or are we losing accuracy because of it?

    Could you discuss the tradeoffs between more and less sites?

    • Cerescokid,

      Selecting a subset of stations that require minimal adjustments is challenging in practice, as almost all USHCN stations have changed instruments and times of observation since the 1950s. Anthony is trying to do something like this for the new paper they are working on, but even they have to add in an adjustment for instrument changes (CRS mercury thermometers to MMTS). Part of the problem is that the weather network was not initially intended for long-term climate monitoring, and until the last few decades preserving the long-term consistency of the records was not a priority for the National Weather Service.

      Thankfully we have the U.S. Climate Reference Network now. What we really need going forward is a Global Climate Reference Network…

      • “What we really need going forward is a Global Climate Reference Network…”
        Correct. Well said.

      • We have one. Sparse, but reasonably uniform coverage including oceans. It is the radiosonde network, about 85 stations, four different versions. And the results agree quite well with the three satellite records. Results back to about 1970 are quite robust because of painstaking instrument bias calibrations and instrument change calibrations.

      • David Springer

        “What we really need going forward is a Global Climate Reference Network…”

        We have it. Please see the global UAH and RSS satellite temperature records, available since 1979 and calibrated against independent radiosonde data for different levels of the atmosphere. For the oceans, see global ARGO buoy data, available since 2003 for all depths from 0 to 2000 meters.

        Unfortunately over half the volume of the ocean remains unsampled (the half below 2000 meters) and a good fraction of the surface near coastlines and also that covered by ice.

        Land surface temperatures remain problematic even today. CRN’s coverage is limited to just a few percent of the earth’s surface. Satellites have a difficult time measuring surface temperature as well because the readings must be taken using infrared soundings (unlike atmosphere columns which are microwave soundings) and are confounded by clouds and ice.

        The bottom line is you don’t study climate change with the data you wish you had you study it with the data you do have and that data is simply not reliable enough to establish global micro-trends before about 1980.

        The good news is we now have 35 years of adequate data from MSU instruments aboard satellites. A 30-year record length is considered the minimum to distinguish climate from weather. The tentative results of global warming science are now in and they are 0.14C/decade which is something of concern but not alarming nor cause for draconian measures to reduce it.

        A major problem with policy action is there is no credible evidence that the harm caused by 0.14C/decade warming exceeds the benefit of fossil fuel use, atmospheric fertilization, and warmer winters in colder climates. Indeed business as usual in regard to CO2 emission from fossil fuel consumption appears to be the most beneficial path when all things are considered with contrary actions being highly counterproductive on many levels.

        My conclusion is that climate change alarmism is a manufactured excuse for global social engineering. There is not a doubt in my mind about that.

        I’m not saying that global social engineering is not desirable or needed but we need to be honest about it and not corrupt the science establishment, undermine public trust in same, to get it done. That too is counter-productive. Honesty is the best policy.

      • We have a global Climate Reference Network. This is the lower tropospheric temperature measured radiatively from space using the same methodology for 35 years.

  9. For the last 160 years, England was ”the globe” now for Zeke US became the GLOBE… Welcome to the circus…!!! Zeke is best example; why public flogging should be reintroduced…

    • I really doubt that flogging those who espouse viewpoints that you disagree with would constructively add to the scientific process. That said, peer review could be considered a form of intellectual flogging at times…

      http://scienceblogs.com/startswithabang/files/2013/06/PeerReview.jpeg

      • WRONG again; when one discusses ”global” temp and analyzes ”local” is the mother of all contemporary CON… #2: for ‘Pal reviewed papers” regarding the phony global warming/ renamed ”climate change” to confuse the ignorant — we had a saying in the old country: -” if you don’t believe him, ask his brother, the other liar” Cheers! Zeke I hope the new year is kind to you – for flogging you have to wait a bit longer…

      • McIntyre has flogging down to an art form.

      • Zeke,

        Unless those doing the peer reviewing have been infected by the activist science bug, in which case the whole exercise becomes a sham.

      • Thank you for providing these comparisons Zeke. The cartoon you posted of the peer-review process was also on the wall in my cubicle, one of my favorites. There are other forms of flogging, hirings, promotions, shunning, etc. But that’s an issue for another time. Maybe by the end of this decade some apologies will be in order.

        It’s interesting that the USCRN plots in Fig. 1 show the exceptionally warm year in the U.S. in 2012, but it does not appear in the USHCN plots. Also, the tmax trend in the USHCN is slightly lower while the tmin data show a slightly higher trend. I think that tmax is a more accurate representation because tmin can contain contamination from urbanization, disruption of the normal nocturnal boundary layer. The USCRN is a great program and I hope more stations can be added to the network. The trend over the past decade is not significantly different from zero over the conterminous U.S. But I’m worried about global data sets and usage, GHCN, ERSST.v?, Karl’s “hiatus buster,” etc. We need more objective analyses and I think what you have provided here is a great example.

        JerryG∞

  10. Thanks for this, Zeke. It’s an interesting and informative post.

    There are many skeptics who assert that adjustments have lowered the earlier part of the temperature record significantly while raising the more recent trends slightly, or not lowering them as much as earlier years. Those inclined to a conspiratorial viewpoint say this is creating a spurious high level of warming.

    To me your post is an effective argument against such claims. How would you respond (or how have you responded in the past)?

    • Hi Tom,

      I’d respond that the real adjustments that matter are those to the global temperature, not just the land. If that’s our measure, we’ve actually raised the earlier part of the record significantly, something that would not be very smart if we were nefariously plotting to cook the proverbial temperature books.

      • Hi Zeke, I’m afraid most people aren’t foolish enough to believe things like:

        we’ve actually raised the earlier part of the record significantly, something that would not be very smart if we were nefariously plotting to cook the proverbial temperature books.

        This is actually exactly what one ought to do to improve the case for AGW as human forcings in the earlier times were not large enough to account for the large amounts of warming shown in the unadjusted record. That is why people, including our hostess, have the natural reaction of blaming a meaningful portion of the earlier warming on natural variability.

        That exact sort of variability has been widely downplayed in the discussions of global warming (though somewhat less so due to the “pause”) as it calls into question how much influence humans have actually had on the more modern portions since if natural variability could have significant effects in one period, it could have significant effects in the current period.

        I don’t know why you believe changes which would ultimately downplay the role of natural variability and remove warming which couldn’t be explained by anthropogenic activity would “not be very smart if we were nefariously plotting to cook the proverbial temperature books.” It is, in fact, exactly what nefarious minded people would do. Nefarious minded people with any intelligence would know increasing the amount of warming in the past, when that extra warming couldn’t be explained by anthropogenic activity, would be stupid.

        Your position here is completely backwards. I get some people with little understanding and loud mouths have talked about how adjustments always cool the past, as though that would enhance the case for global warming, but intelligent people have always known the more warming there was in periods prior to ~1950, the greater the mystery of just what caused that warming. Removing that mystery just makes it easier to say humans are to blame for everything.

      • Brandon S said:

        This is actually exactly what one ought to do to improve the case for AGW as human forcings in the earlier times were not large enough to account for the large amounts of warming shown in the unadjusted record.

        My thoughts exactly.

        How in the Sam Hill does one explain this?

        http://imgur.com/StXINDB

      • Glenn Stehle, one strange thing is this exact point has been discussed a number of times over the years, and I’ve explained the problem to a number of people who make this argument. They still seem to act like they don’t get it.

        I don’t understand that. It’s not a complicated point. They’re basically over-simplifying the topic to an extreme degree while ignoring what tons of people have said over the years. Heck, Hausfather’s argument runs contrary to things our hostess has said on this site for years.

        How do they actually come to believe this argument, and why would they expect other people to believe it?

      • Brandon S,

        It really is pretty amazing.

        As a person who spent his lifetime in the oil and gas business, analyzing oil and gas production curves (oil and gas production plotted vs. time), when I look at the graph I see it as I have pictured below.

        Each change in inflection of the curve has a cause.

        With an oil well, the change may be caused by putting the well on artificial lift, changing the artificial lift, working the well over to improve production (or, if the workover is a failure, it can decrease production), initiating water flooding or enhanced recovery in the field, etc.

        Changes in the inflection curve from the natural decline rate of the well don’t just happen by magic. There is always a cause.

        Anybody who could look at that curve and come to the conclusion that it is a straight-line, linear curve has to be out of their friggin mind, or intentionally trying to fit the data to some preconceived theory.

        http://imgur.com/jHc7yNf

      • Zeke it would be interesting to see your response to Brandon as his points were largely the same as my reaction to your response to Tom.

        More warming = better for alarmism is just so wide of the mark that I find it hard to understand why somebody with your understanding would say it. Maybe you just fired off a flippant remark with little thought. As Brandon hinted, more warming in the earlier decades, when there is little increase in external forcing, clearly favours a JC-type position that is looking for a greater role for unforced change.

        Do you agree that reducing the warming rate in the earlier part of the record actually favours the consensus view that change on >decadal time scales is largely externally forced?

      • “How in the Sam Hill does one explain this?”

        Why can’t people give their sources? Who calculated that NOAA raw graph? That might provide the explanation.

      • Nicky, ask Zeke to post his famous graph that is in a twitter tweet. It shows the same thing with more easily discernible detail. Would you trust Zeke?

      • Nick Stokes, while I agree with your sentiment, I feel I should point out that graph was provided by Zeke Hausfather, the author of this post. Other people here have just been discussing it since he posted it.

        So if you do have any questions or concerns about it, you know who to direct them to now.

      • “you know who to direct them to now”
        Actually, I fell into the same trap I chided elsewhere – I missed the decimal points on the y-axis, and found it unbelievable. Now that I see that, and that it is obtained by integrating gridded NOAA data, it makes sense.

      • This stuff looks like Keynesian climanomics. An application of ex post facto temporal smoothing.

      • intertemporal-smoothing

      • Nick: for reference, the NOAA raw/adjusted global data plotted in my figure is available here: ftp://ftp.ncdc.noaa.gov/pub/data/scpub201506/

        Unfortunately they haven’t updated their unadjusted global values since July or so. I could make my own version using the raw station data, but comparisons would be complicated due to methodological differences.

      • Glenn wrote: “As a person who spent his lifetime in the oil and gas business, … Each change in inflection of the curve has a cause.”

        In the field of climate (and weather), deterministic chaos rules. Every change does not have an obvious “cause”, particularly since outcomes can be sensitive to unobservable differences in starting conditions. See this simulation of a double pendulum.

        https://www.youtube.com/watch?v=QXf95_EKS6E

        In the case of climate, chaotic fluctuations in ocean currents can affect the exchange between the deep ocean and the warmer surface. El Nino, for example, represents a slowing down of upwelling of cold water in the Eastern Equatorial Pacific and downwelling of warm water in the Western Equatorial Pacific (among many other changes). You can separate climate change into naturally forced variability (solar and volcanic), anthropogenically forced variability, and unforced (or internal) variability. It is the latter that makes climate science particularly challenging.

    • Brandon S,

      I should have said that each change in inflection in the curve has a cause, or causes.

      There can be several causes operating simultaneously.

  11. Homogenizing is a process of forcing raw data to conform to a predetermined standard – i.e., eliminating noise and introducing consistency to the point of redundancy, such that the adjusted data become an artifact of the parameters that were used.

    • In that case, why would adjustments cause records to be more similar to nearby unadjusted (but homogenous) CRN stations, especially since the adjustment process has no knowledge of the temperatures at CRN stations?

      I think our Figure 4 gives a good indication that adjustments are making the consistency of stations much more similar to reality, as indicated by the CRN.

      http://s28.postimg.org/6bi3v2ea5/Figure_4.png

      • Trend differences between the two appear to become uniformly smaller over time after ~1000, but why should there be any difference between the two before ~1000?

      • So the raw USHCN data reads 0.05C higher than the USCRN raw data consistently no matter what the distance?
        They both use modern electronic sensors and the same TOBs?
        No, “Alas, MMTS is still a min/max thermometer, and only records the min/max temperature between the last reset [USHCN]. So even post MMTS installation, the time of observation matters. Thankfully the USCRN has instruments that report every two seconds, so there at least TOBs is irrelevant.”
        The answer,
        [Do I get an acknowledgment in your paper correction?]
        is that the TOBs adjustments between the two thermometer sets cause a 0.05C difference between USHCN and USCRN readings.
        Who needs an algorithm?
        Heuristically you can just add 0.05C.
        Thanks, angech.

        “The reference network is intended to give us a good sense of changes in temperatures going forward, largely free from the issues that plagued the historical network.”
        “since we know that the USCRN is largely free of issues, we can see if adjustments to USHCN are making station records more similar to nearby USCRN stations.”
        “We know that the reference network is largely free of measurement problems, this increases our confidence that adjustments are effective at finding and removing problems in the historical network.”
        Um, what problem?

      • I’m suggesting that if it looked more like this:

        https://evilincandescentbulb.files.wordpress.com/2016/02/rehomogenized.png

        …and then, if it was adjusted for the UHI effect, which wouldn’t exist prior to, e.g., 1000, it would make more sense.

    • Wagaton, there is no such thing as ”raw data”; all data has been cooked. Nobody knows what was yesterday’s ”global” temperature, BUT: everybody ”pretends” to know the correct GLOBAL temp for 1000 years; PRETENDING cannot be ”raw data” but harvested from thin air… manure for the mushrooms… Zeke is a good bullshine merchant – regular supply for the addicts…

      • stefan your analysis seems to lack any depth or understanding of the subject, I’m going to rate it as unconvincing.

      • stefanthedenier has it right.

        Zeke can make colored squiggly lines, but he has difficulty with discussion that doesn’t toe the warmer line.

        Andrew

      • human1ty1st | February 10, 2016 at 1:49 pm | said: ”stefan your analysis seems to lack any depth or understanding”

        Sweetheart, what part don’t you understand? Or do you ”prefer” not to understand, because it doesn’t suit your LIES? What part?

  12. Geoff Sherrington

    Zeke,
    Thank you for telling this recent part of the US temperature story.
    Most past concern has been with the decades before this. You note that TOBS and sensor types are the 2 biggest error factors of which you are aware.
    TOBS has always been a conceptual problem, because a correction can be applied to an observation in the absence of evidence that it is needed. Do you know of any papers where the need for a TOBS correction has been confirmed/denied by comparison with nearby stations? Or is it more often the case that stations close together in space have been read close together in time and so do not allow much of that approach?
    Geoff.

  13. Hi, Zeke

    Thank you for an informative session.

    What are your thoughts about the value of Anthony Watts’ identification of class 1 and 2 sites?

    Dave Fair

    • I’ll definitely be doing some of my own analysis with Watts’s new site classifications once he releases them. However, until the paper comes out and the data are released, speculation is likely premature.

      • An Excel spreadsheet with all the rated sites, the site rating, and whether urban, suburban, or rural has been available for some time. I used it to do an analysis of class 1 (best) sites comparing GISS raw to homogenized for urban, suburban, and rural. Urban sites had UHI corrected, how well dunno. But all but one of the suburban and rural sites had a spurious warming trend injected. Posted the analysis last year on AW’s WUWT. Ron Klutz in the UK did a similar analysis and came to the same conclusion.

      • rud, thanks for the plug, though I live in Montreal. I prefer the name spelling I got at birth, though I have a life-long friend of Austrian descent who invariably puts in a “K” instead of a “C”.
        Anyway, here is the analysis and the spreadsheets:

        https://rclutz.wordpress.com/2015/04/26/temperature-data-review-project-my-submission/

  14. This is weird. The CRN temperature anomaly spends the vast majority of the time well inside the -2 to 2 deg C range. The HCN(raw)-CRN and HCN(adj)-CRN differences frequently exceed +2 or 2 deg C. IOW just the difference between the two frequently exceeds the normal range! I find it hard to accept the assertion that “adjustments are effective at finding and removing problems in the historical network”.

    I’m also unhappy with the concentration on trends. Given that temperatures can easily vary by 10 degrees C between summer and winter months at most locations, it is surely inappropriate to use trends over barely more than a decade to assess whether temperature adjustments have been reasonable, when those trends are typically of a fraction of a degree C over the whole period. It seems entirely reasonable to suppose that on that timescale two locations 50, 100 or 150 miles apart could easily have opposite trends.

    Maybe it is just fanciful to suppose that meaningful trends can be seen in any of this data.

    • “differences frequently exceed +2 or 2 deg C”
      I think you need to look at the y-axis labels more carefully. Those should be 0.2, not 2.

      • Indeed, it’s .2, not 2. My graphing software doesn’t like to put 0s before decimal points, but it might be worth changing in the final proofs of the paper to make the figures easier to read.

      • Then change your “graphing software”.

        I would recommend gnuplot: plot "datafile.txt" with lines will get you a very presentable graph (with zeros before the point). Getting deeper knowledge will let you do nearly any config, format, or whatever.

        Output to png, jpeg, pdf, eps, whatever you like.

        gnuplot.info ;)

  15. So you admit that you have adjusted US warming up by 100% compared to raw data. Here’s your quote:
    “Fixes to errors in temperature data have effectively doubled the amount of U.S. warming over the past century compared to the raw temperature records.”
    So how does that compare to the Watts stations used in their study? Dr Spencer said that the Watts study reduced US temps by 50%.

  16. Thank you for the essay, and the non-pay-walled document.

  17. Quite a cool day here on the NSW midcoast, at least for summer. Temp just dawdled up to 27.9 around peak time. But with a good old ENE breeze one would hardly suffer anyway.

    Of course, the sun didn’t come out (that oceanic ENE doing its thing)…so I really couldn’t say how hot it might have been. There was something between me and the sun…it was white, woolly stuff…and it wouldn’t get out of the way.

    Still, once someone feeds 27.9 into a computer and mashes it all up with other numbers, nobody’s going to notice, or care about some white, woolly stuff. They’ll just care about the mash of numbers, won’t they?

    Number mash. Mmmm.

  18. General point: it would make it far, far easier to read this sort of paper if the authors dropped the US- prefix from USHCN and USCRN.

    E.g.: ‘…our volunteer-run Historical Climatology Network (hereafter ‘HCN’) stations…’.

    • Our original paper used CRN and HCN acronyms, but we added the US prefix at the request of one of our reviewers who pointed out that there are other historical climatological networks in the world and wanted to avoid confusion. I rather preferred the old naming convention, but so it goes.

  19. Stupid question, I know: in Figure 1, why is the right side so radically different from the left side?

  20. Zeke

    As always, a good article. I do appreciate your efforts. You said;

    ‘Stations have moved to different locations over the past 150 years, most more than once. They have changed instruments from mercury thermometers to electronic sensors, and have changed the time they take temperature measurements from afternoon to morning. Cities have grown up around stations, and some weather stations are not ideally located.’

    To this, add errors by observers, poorly calibrated instruments and a host of other valid concerns, and the difficulty in determining what has happened to the climate over an extended period, and over an extended area, becomes clear.

    The printed US weather review of the mid and late 1800s has some very interesting background on the vast differences between the two different types of thermometers commonly used at the time (3 degrees difference) and on the transportation of equipment. One observer thanks his friend for bringing his new barometer hundreds of miles on the back of a mule. Calibration anyone?

    However, it was this point I found most interesting;

    ‘To detect and deal with these errors, NOAA uses a process called homogenization which compares each station to its neighbors, flags stations that show localized changes in longer-term temperatures not found in nearby stations, and removes these local breakpoints’

    Localised changes are often very real, compared to neighbouring stations. It is clear that the predominant wind direction does change over time, then can change back again. Wind direction has a dramatic effect on the temperature profile; for example, a predominantly easterly will make the east of the UK significantly colder, but the effect is often partially lost by the time you get to the west, north or south. Similarly with westerlies.

    Each introduces different weather patterns (rainier, drier, cloudier, sunnier), all of which fundamentally affect temperature day and night. That change in wind direction will not only affect regions but also towns quite close to each other when there are substantial hills or mountains involved, so one town on one side becomes wetter and the one on the other sunnier.

    Collectively, all these elements have an impact.

    I am currently studying Hubert Lamb’s records of winds back to 1340, and the Met Office have been helpful in trying to fill in the modern gaps.

    It's not a perfect index, but it is enough to know that wind direction, and the associated changes in weather, have an impact. How great an impact is difficult to tell, but presumably someone (funded) is studying this.

    best regards on a fine article

    Tonyb

    • David Springer

      Localized changes are sometimes real and sometimes not. For instance rejuvenating the exterior of a cotton-region shelter (CRS) by a good scrubbing or painting causes a localized cooling. This will be detected by homogenization algorithms and rejected because it happens quickly and doesn’t happen simultaneously with nearby stations. However, the long slow warming induced by gradual deterioration of the shelter exterior is not detected and not rejected. The end result is all the artificial warming gets incorporated into the record while all the compensatory artificial cooling gets rejected.

    • nobodysknowledge

      Tony B
      “Localised changes are often very real, compared to neighbouring stations.”
      I think you have some good points. And I don't even think that old measurements are so terrible. People with knowledge can adjust for the equipment used, and for changes in the location and surroundings of the stations. So most of the data can be good enough to look for temperature change.
      I think it is a problem when international or great national organisations go into the data of a country to clean it up. Then it can end up with two datasets, one national and one international. It happened in Iceland, where national scientists didn't see the reason to clean temperature records. (Ref. mails between Paul Homewood and Trausti Jonsson.) I am sure that local meteorologists know best how to adjust their data.

    • I agree that localized changes can occur, but for the most part they are transient. That’s why we generally only flag breakpoints when we detect persistent differences between neighbors that span years (and why homogenization of daily data is so difficult). There are some cases with real longer term localized climate changes (building reservoirs comes to mind), and those might cause issues, though even in the case of a real local climate change it still shouldn’t be “spread” around when trying to estimate regional temperature if the changes are not seen at any proximate station. Similarly, UHI is a real localized climate change, but it’s not something we want to extrapolate to nearby areas when creating a gridded temperature product.
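
      To make the persistence point concrete, here is a toy R sketch (made-up numbers, and not the actual NOAA pairwise algorithm): a short-lived excursion in a station-minus-neighbour difference series barely moves a windowed breakpoint score, while a persistent shift dominates it.

      set.seed(1)
      n <- 240                                            # 20 years of monthly anomaly differences
      diff_series <- rnorm(n, sd = 0.3)                   # station minus neighbour, no real break
      diff_series[121:123] <- diff_series[121:123] + 1.5  # 3-month transient local excursion
      diff_series[181:n] <- diff_series[181:n] + 0.8      # persistent shift (e.g. a station move)

      # crude breakpoint score: difference in means over 24-month windows either side
      break_score <- function(x, i, w = 24) {
        mean(x[i:(i + w - 1)]) - mean(x[(i - w):(i - 1)])
      }
      scores <- sapply(25:(n - 24), function(i) break_score(diff_series, i))
      which.max(abs(scores)) + 24   # lands near month 181 (the persistent shift),
                                    # not at the short-lived excursion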

  21. nobodysknowledge

    Just a question of trend difference. I wonder if it will always be a mark of quality control to reduce this difference. With some climate change there could be different trends at different locations, even if they are not so remote from each other. I found a very good example of this in an Icelandic blog, from the Icelandic meteorologist Trausti Jonsson. It is very illustrative, as it shows that at one place 2015 is almost the warmest year since 1870, with a clear upward trend (Stykkisholmi), and at another place it is the coldest year, with a downward trend (Vestmannaeyar-Grimsey). There you cannot homogenize and adjust trends without destroying data. And destroying data would make you miss some very important information, and lead you away from posing some interesting questions.
    http://trj.blog.is/blog/trj/entry/2163417/

    • Global, or even regional, average trend estimates are really only useful in policy debates. Not so helpful when you are trying to figure out when to plant your rutabagas. For that you need localized information.

  22. Another stupid question:
    If the homogenization doesn’t affect trends, what is the use of the exercise?
    Perhaps the knowledge that it doesn’t affect trends?
    And what’s the use of supposing the “real” temperature at some station might actually have been one or three tenths of a degree lower or higher on some day?

    • Homogenization will affect trends if inhomogeneities result in a systematic trend bias. Things like station moves will often have mostly stochastic effects; sometimes the new location will be warmer, sometimes it will be cooler, and these factors will tend to cancel out. Other things, like instrument changes (to MMTS) or time of observation changes (afternoon to morning), will almost always introduce a negative bias relative to temperatures before the change. When these sorts of issues predominate, as was true in much of the 1980s and early 1990s, homogenization will have a noticeable trend impact.

      It just so happens that over the past 10 years at least, most of the biases in raw data seem to roughly cancel out in their trend impacts.
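
      A minimal R sketch of that asymmetry (toy numbers, not the actual USHCN processing): random-sign breaks roughly cancel in a network-mean trend, while one-sided breaks such as MMTS or time of observation changes do not.

      set.seed(42)
      n_yrs <- 50; n_stn <- 500
      true_trend <- 0.02                             # deg C per year
      sim_network <- function(one_sided) {
        t <- 1:n_yrs
        trends <- replicate(n_stn, {
          series <- true_trend * t + rnorm(n_yrs, sd = 0.3)
          brk   <- sample(10:40, 1)                  # year of an instrument / TOB change
          shift <- if (one_sided) -abs(rnorm(1, 0.3, 0.1)) else rnorm(1, 0, 0.3)
          series[brk:n_yrs] <- series[brk:n_yrs] + shift
          coef(lm(series ~ t))[2]                    # per-station fitted trend
        })
        mean(trends)                                 # network-mean trend, deg C per year
      }
      sim_network(one_sided = FALSE)   # close to 0.02: random-sign breaks roughly cancel
      sim_network(one_sided = TRUE)    # noticeably below 0.02: one-sided breaks bias the raw trend low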

  23. “Since we know that the reference network is largely free of measurement problems” – comment made by first group using mercury thermometers
    ” since we know that the —– is largely free of issues”
    similar comment made by the people who took temperature measurements in the afternoon instead of the morning.
    The reference network is not free of problems and there are unknown problems awaiting in the future.
    “While the impact of these adjustments on temperature records is relatively small globally, in the U.S. it has a much larger effect. Fixes to [errors in] temperature data have effectively doubled the amount of U.S. warming over the past century compared to the raw temperature records.”

  24. “A picture of the three redundant temperature sensors at a U.S. Climate Reference Network station.”
    Is this caption correct?
    From the tenor of the article you might mean redundant USHCN sensors.
    If they are 3 USCRN sensors then the word redundant seems, well, redundant.

  25. David Springer

    While this is valid for assessing performance of adjustments to recent decades in the USHCN record it isn’t valid for the more distant past because the USHCN network has changed a lot since 1950 while it has changed little since 2000.

    • David, there is a funny little quirk about the USHCN*.
      Although they “use” 1018 stations for the raw data collection, the actual figure given for the US temperature anomaly is worked out from a subset of about 137 [?] stations [adjusted USHCN] which we are told are valid for the more distant past, including back to 1950.
      * Not many people know that.
      Zeke would.
      Perhaps he could enlighten you:
      raw USHCN, adjusted USHCN, how many stations go into the graphs he has given?
      It is a Michael Mann grafting trick [Oh, everyone who uses it knows we only give a small subset. Did you not know?]

      • “The United States Historical Climatology Network (USHCN) is a high quality data set of daily and monthly records of basic meteorological variables from 1218 observing stations across the 48 contiguous United States”. not 1018
        “This initial USHCN data set contained monthly data and was made available free of charge from CDIAC. Since then it has been comprehensively updated several times [e.g., Karl et al. (1990) and Easterling et al. (1996)]. The initial USHCN daily data set was made available through CDIAC via Hughes et al. (1992) and contained a 138-station subset of the USHCN. This product was updated by Easterling et al. (1999) and expanded to include 1062 stations. In 2009 the daily USHCN dataset was expanded to include all 1218 stations in the USHCN.
        Note: A related product using a subset of the USHCN-Daily data is now available”

  26. Do not understand how one gets a single USHCN or USCRN historical temperature anomaly baseline.
    Nor how one gets a “temperature of the USA”.
    Obviously if you had an unchanged set of sites anywhere you could get an average of the temperatures at those sites over the years.
    It would be entirely site specific and bear no real relationship to the true temperature of the USA as a whole.
    People are not going to live on mountain tops or in super hot areas like Death Valley and take temperatures.
    The best way would be to grid the area and arrange stations, or homogenize areas where stations would be, to give a spread-out pseudo average.
    If you use USCRN to readjust the USHCN as a historical group of temperature sites, you have already tampered with and maladjusted the readings.
    i.e. if they are site specific [which they are], with no gridding value [as I have been informed], then taking information from these other sites to work up an algorithm is wrong, wrong, wrong.
    You know a 10 or 20 year database is absolutely useless in getting a true comparison between different sites, and the number of sites is too small with too great an error bar.
    It is like 6 people in New Hampshire voting for Rubio and extrapolating it to say he will win the presidency [I guess he could].
    You need hundreds of years [which you have not got] or a hundred thousand stations [which you have not got] to make any halfway reasonable valid comparison.
    Saying it is so may give you a warm glow but cuts no statistical ice.

    • The implication of your comment is that temperatures measured at a particular site are intrinsic to the terrain and weather dynamics in that locale. I understand the interest in identifying and removing non-climatic changes affecting readings, but the fact that one site shows differences from another nearby does not require homogenizing. Temperature values should not be altered or infilled from one place to another. Rather, the analysis should compare the patterns of change (the derivatives) to see regional, national, continental or global tendencies.

    • I wonder how entertaining (or meaningful) it would be to see an average national temperature for the US on the nightly weather forecast?

  27. Zeke, I’m curious – what does adjusting the HCN values based on the CRN measurements do to the overall calculated accuracy of the HCN data?

  28. Here you can interrogate the interactive Google Earth map; just select decade, year and month.
    http://www.moyhu.blogspot.com.au/p/blog-page_24.html

  29. Zeke, I have a few questions regarding just how raw is the USHCN raw data?
    1) In your processing or in the original data processing were there any adjustments made to the original data collected?
    2) Since TOBs is a ubiquitous problem with USHCN data was any TOBs adjustment made before or during your processing?
    3) In the gridding step(s) were they exactly the same steps for the CRN and HCN stations?

    Thanks in advance.

    • Hi Corev,

      The raw HCN data we are using has been subject to quality control (which flags and removes values where the min is higher than the max, or values are well beyond records for that location) but apart from that is not subject to any adjustments (TOBs or homogenization).

      The adjusted HCN data we use has been corrected for TOBs as part of the adjustment.

      In gridding we only used grid cells that contained both CRN and HCN stations for a given month, so the spatial coverage is exactly comparable and the same gridding process is used.
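
      For readers who want the flavour of that gridding step, a minimal R sketch follows (this is not the paper's code; the data frames, the column names lat/lon/anom and the cell size are all made up for illustration).

      grid_mean <- function(df, cell_deg = 2.5) {
        df$cell <- paste(floor(df$lat / cell_deg), floor(df$lon / cell_deg))
        aggregate(anom ~ cell, data = df, FUN = mean)     # one value per occupied grid cell
      }
      compare_networks <- function(crn, hcn) {
        g_crn <- grid_mean(crn)
        g_hcn <- grid_mean(hcn)
        common <- merge(g_crn, g_hcn, by = "cell", suffixes = c("_crn", "_hcn"))
        # average only over cells sampled by both networks in a given month,
        # so the spatial coverage of the two estimates is identical
        c(crn = mean(common$anom_crn), hcn = mean(common$anom_hcn))
      }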

  30. stevefitzpatrick

    Very nice paper Zeke.

  31. A good post, and good work by Zeke, but, as many of the commenters point out, we are still largely guessing what the actual temperature record really is, as well as the trends, given all the adjustments, station moves, refurbishments (CRS – are there other examples?), and homogenizations (which may be invalid anyway given grid scale, localized weather differences, refurbishments, wind direction, etc.). So, is it wise or in any way valid to use such a questionable record to set policy for CO2 emissions?

    • I forgot to include clouds – apologies to ATTC.

    • Barnes said:

      So, is it wise or in any way valid to use such a questionable record to set policy for CO2 emissions?

      This post deals only with whether global warming exists.

      It does not explicitly delve into:

      1) What caused the global warming,

      2) The consequences of that global warming, whether they be benign or harmful in the aggregate,

      3) If global warming is deemed to be harmful, whether it can be mitigated or not, and

      4) If it can be mitigated, would the cost of doing so outweigh the potential benefits?

      As can be seen, the debate gets stuck on whether global warming exists or not, before we even get to the other plethora of questions.

      • Glenn – agreed. Unfortunately, that does not stop the warmunistas from pursuing a political agenda to stop all CO2 emissions based on a very questionable temperature record. Fortunately, SCOTUS put a stay on Obama’s Clean Power Plan, but that is only a temporary setback, and we still have Sheldon Whitehouse & Co. looking to introduce legislation to criminalize fossil fuel companies based on some imagined harm with no consideration of the benefits. The simple answer to my question is an emphatic NO.

  32. “Measuring global temperatures in the U.S. no easy task.”

    I can’t get past the first sentence. It’s nonsense.

    Andrew

    • We can study the CET because the records still exist. We can study the MWP because records still exist. Lamb has written at length about the anecdotal records of man’s recent history. Yet no AGW scientist seems to care about what it was that Phil Jones ‘dumped’. Why not is what I need to know. What written records, how many 5.25″ hard drives and what was on them? How many floppy discs and what was on them? Saving ‘space’ works only if you are a fool. Everybody kept the records until Phil. Now no AGW scientist seems to care. Did he delete or wipe the old data he and HAD/CRU collected for this endeavor? We know how NASA chose to degauss the Moon footage to save even more space for mankind. Trust but verify, we were told.

  33. “weather stations were never set up to consistently monitor long-term changes to the climate”

    This is where the party ends.

    Andrew

  34. After getting to the end of the Methods section, I saw this line:

    Code used in performing these analyses is available in the supplementary materials.

    I looked in the Supplementary Materials, and saw the line:

    Annotated Code: http://www-users.york.ac.uk/~kdc3/papers/crn2016/

    That took me to an Overview page for the paper which had the line:

    If you would like to access the data and methods from the paper, follow the Methods and data link.

    But when I went to that page, I couldn’t find any code posted. I found a link back to the Supplementary Materials, links to four data files and links to two locations where the most recent USHCN/USCRN data can be found. I don’t see any other links on that page though. Did I somehow manage to get to the wrong page, or is the code not currently posted?

  35. David Springer

    Answer the following question.

    Is an unreliable temperature record the same, better, or worse than no temperature record at all?

    [ ] same
    [ ] better
    [X] worse

    • An unreliable temperature is better than none at all.
      It tells us about the people collecting the record.

    • David Springer: Is an unreliable temperature record the same, better, or worse than no temperature record at all?

      It depends on how unreliable it is, a problem that a number of people are actively working on. The work of Zeke Hausfather et al, presented here and elsewhere, shows that the record we have is better than no record at all. Note well, one use of unreliable data, illustrated here, is to motivate and guide the collection of more reliable data in the future, as when a series of poor estimates of the speed of light led eventually to Michelson’s accurate estimate.

      On the other hand, your question may contain a tautology, in which the word “unreliable” is nothing quantitative like a mean square error, but is by definition a record that is worse than no record.

  36. ZH:

    Thank you for posting this interesting and informative discussion. And especially for linking to a non-paywalled version. Good work.

    One question: Did you produce any graphs that display the confidence intervals, or is that restricted to Table SM 1?

  37. Here’s an interesting thing about the tests done in this study. USHCN only uses ~1,200 stations for averaging to create a US temperature record. However, when doing its homogenization, it uses ~10,000 stations to create an “adjusted” version of those 1,200 stations. This means the raw USHCN record actually uses far less data than the adjusted one.

    This post, however, says:

    While the USCRN will provide excellent data on the changing U.S. climate in the future, in the past we are stuck with the historical network. What we can do, however, is use the USCRN to empirically assess how well our adjustments to the historical network are doing. Specifically, since we know that the USCRN is largely free of issues, we can see if adjustments to USHCN are making station records more similar to nearby USCRN stations. That is the focus of our study

    Even though we don’t actually know that any adjustments to the records are having the desired effect because the methodology is sound. An alternative possibility is that the ~1,200 stations used in the USHCN data set aren’t representative of the full ~10,000 stations.

    If the full list of ~10,000 stations has a different signal in recent times than the ~1,200 stations used in the USHCN data set, then any form of homogenization would necessarily cause the USHCN data set’s results to shift more toward those of the full ~10,000 data set. There’s no telling what effect that might have on the sorts of comparisons done in this study. It could well be if one compared the same unadjusted data to the same adjusted data, the homogenization process would give less impressive results. Or even bad ones.

    I don’t see any discussion of this concern in the paper. The authors acknowledge the topic by creating an adjusted data set which doesn’t use the CRN stations in the homogenization step (they are included in the larger ~10,000 station network), showing they’re well aware of the issue. Given that, I’d have expected them to at least talk about the topic.

    As it stands, the authors seem to be rather disingenuous in their public statements. I mean, how many people hearing about these results will realize the authors didn’t just look at what effect adjusting the data has, but also massively changed the amount of data being used? That may not actually affect their results, but it’s certainly a huge caveat people should be made aware of.

    • Berkeley Earth’s CONUS record uses all of the station data (not just USHCN). The results are quite similar to adjusted USHCN data and different from raw USHCN data.

      http://s24.postimg.org/e5ozv5oit/us_raw_adj_berkeley.png

      • David Springer

        I love how the adjustments cool the past and warm the present. Nothing else is quite so clear and convincing of the pencil whipping’s real purpose. LOL

      • Could you explain what point your comment is trying to make? Your comment says that the results Berkeley Earth gets when it uses more data are similar to the results you get when you use more data. That would seem to tell us nothing useful about the issue at hand: what effect does homogenization have on any given data set?

        The raw USHCN data set uses only ~1,200 stations. It shows a particular signal. The adjusted USHCN data set uses over 10,000 stations. It shows a different signal. Pointing out when Berkeley Earth uses those same 10,000 stations, as well as some additional ones, it gets the same results as when you use those 10,000 stations to create the adjusted USHCN record tells us nothing about the appropriateness of the homogenization process being used.

        What I pointed out is simple. Adjusted USHCN uses a different data set than raw USHCN. It is impossible to know which differences between the two arise from the homogenization process itself as opposed to the massive change in data being used.

        Pointing out another group which uses a massively different data set gets similar results to when USHCN uses a massively different data set doesn’t tell us much of anything about the homogenization process. For all we know, the homogenization process may not really be good, but anything which lets you use the extra ~9,000 stations in your calculations is.

      • Zeke, why do you have year to year variability remaining after applying a “5 year smooth”? Don’t tell me you are still using running means as a low-pass filter.

        Recommended reading:

        http://climategrog.wordpress.com/2013/05/19/triple-running-mean-filters/

        recommended , general purpose filter:
        https://climategrog.wordpress.com/2013/12/08/gaussian-low-pass-script/
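
        For what it's worth, a cascaded running mean is only a few lines of R. This is just a rough sketch of the idea behind the linked posts; the window ratio used here is illustrative rather than necessarily the one climategrog recommends.

        triple_rm <- function(x, w) {
          rm1 <- function(x, k) as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))
          w2 <- max(3, round(w / 1.34)); w3 <- max(3, round(w2 / 1.34))
          rm1(rm1(rm1(x, w), w2), w3)   # three passes suppress the side-lobes of a single running mean
        }
        # e.g. triple_rm(monthly_anomalies, 60) instead of a single 5-year running mean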

    • I don’t see your point at all. USHCN is what it is. It has a core set of stations, that you can use for a “raw” average. Then some processing is done, which generates the published result. Zeke’s point is that the processing brings the resulting estimate of CONUS average closer to the USCRN estimate that his estimate based on “raw”, and improves some measures of correlation.

      There is a small sub-point that data from the CRN stations is used in homogenisation, which gives some circularity to using agreement with CRN as the criterion. Zeke has answered that by saying that the result still holds if CRN stations are not used in homogenising.

      In any case, I think the influence of the other stations on the homogenised average is exaggerated. Homogenising identifies non-climate changes, and stations in the average are adjusted to remove the change. They aren’t generally moved to match the other data, except to the extent that the other data helped to estimate the change amount.

      • Nick Stokes:

        I don’t see your point at all. USHCN is what it is. It has a core set of stations, that you can use for a “raw” average. Then some processing is done, which generates the published result.

        The point is found in the note you’ve left off here. You say “some processing is done,” making it sound like some processing is done on the USHCN data, but that’s not the case. What actually happens is that the USHCN data is processed with a bunch of other data.

        I’ll repeat, the ~1,200 USHCN stations are processed with approximately ~9,000 other stations. That after using these other ~9,000 stations one can look only at the results for the 1,200 USHCN stations doesn’t change the fact the other ~9,000 stations were used.

        Given the processing done on the USHCN data used a great deal of other data, one cannot tell if changes were caused by the processing algorithm or the use of the other data.

      • “Given the processing done on the USHCN data used a great deal of other data, one cannot tell if changes were caused by the processing algorithm or the use of the other data.”
        USHCN had a set of 1218 stations that it chose for continuity of record etc. They used those as primary reference points for the temperature averaging. They didn’t use the other stations as primary mainly because of the continuity issue. But in the end, what NOAA posts is its best estimate of CONUS temperature. And I don’t see any basis for criticism if they got benefit from the extra data, in addition to what homogenisation provides directly.

        I use past tense because it’s all different now with nClimDiv, which doesn’t make this distinction in the same way.

      • Nick Stokes, you say:

        And I don’t see any basis for criticism if they got benefit from the extra data, in addition to what homogenisation provides directly.

        Nobody has criticized NOAA for using the extra data. What I have said, and what you seem to be trying very hard not to understand, is you can’t prove an algorithm is good by comparing results before and after using the algorithm if when using the algorithm you also add a ton of extra data.

        There is simply no way to know what effect the homogenization algorithm has as opposed to the effect using the extra data has. You don’t solve for one parameter by changing two variables and assuming all changes in the results come from the changes to one variable.

        This isn’t a difficult concept to grasp, and you pretending anyone has criticized NOAA or anyone else for using extra data to try to produce a better result is absurd.

      • David Springer

        Homogenization detects abrupt changes in a single station where that change isn’t reflected in neighboring stations and also where that change persists for a long period of time.

        Notably what it won’t detect is a gradually increasing very slight warming signal caused by wooden Stevenson screen (cotton region shelter) exteriors getting darker with age. Notably it will detect the abrupt cooling when the shelter is cleaned and painted.

        This is a critical failure as it incorporates all the cumulative false warming due to deterioration of the highly reflective shelter exteriors and rejects the compensatory cooling that comes along when the exterior is rejuvenated. It’s nefarious and it’s worldwide.

  38. Zeke –
    I know it’s minor, but could you re-label the trend difference (x-axis in figure 3, y-axis in figure 4) correctly? The units can’t be degrees C; presumably it’s degrees C per year. [Or K/a if you want to go pure SI.]

    • Thanks Harold, those changes are already slated for the final proofed version of the paper.

      • That was the first thing that struck me, glad you picked it up. Also “Degree C” should have a small d. If that is your plotting software again changing what you told it was the x axis label, I again suggest checking out gnuplot:

        set xlabel “trend / degree C”
        will get you exactly that. Default number formatting will have a zero before the decimal point. ;)

      • Zeke Hausfather: Thanks Harold,

        As of my writing, that is the last of the responses by Zeke Hausfather, so let me say now “Thank you, Zeke Hausfather, for your responses to comments”. I’ll be checking back later to see if you post more.

      • Thanks Zeke!

  39. It’s worth noting that the “et al” here are Menne and Williams, two of the people primarily responsible for the USHCN adjustments in the first place. It’s therefore a self-audit, and very unlikely to convince anyone who is sceptical of the adjustments.

    • ==> “and very unlikely to convince anyone who is sceptical of the adjustments.”

      Meaning that you think that “anyone” who is “skeptical” of the adjustments doesn’t evaluate the evidence on its own merits but rather evaluates the evidence based on identity politics?

      • Or Joshua, that people who either don’t have the time, capability or interest in verifying the accuracy of a study’s results are likely to use other things to gauge its potential reliability.

        It’s quite common for people to be skeptical of claims made in defense of work by the authors of that work. That’s not something most people would call identity politics.

  40. I feel privileged to be on a blog that champions data as presented here by Zeke Hausfather. I am inclined to accept his explanation of adjustments to temperature while still awaiting Watts’ take.

    Now Berkeley Earth shows this correlation of temperature and CO2 not just for the late 20th century but going back to the LIA. Having pondered this question previously (i.e. Vaughn Pratt’s demonstration*), I am also inclined to accept this notion. Although I still have a hard time believing that the attribution goes entirely to CO2.

    http://berkeleyearth.org/wp-content/uploads/2015/02/annual-with-forcing-small.png

    * https://judithcurry.com/2015/11/03/natural-climate-variability-during-1880-1950-a-response-to-shaun-lovejoy/#comment-740616

    But again many thanks to Zeke Hausfather for sharing this valuable information on this blog and thanks to Judy for posting it.

    • The oceans are giant carbonated drinks. It would defy the laws of physics if the vapor pressure of CO2 in the atmosphere did not go up and down as temperature goes up and down. The CO2 is a result and not a cause. This is simple stuff, people. There is this correlation throughout much of history; sometimes CO2 does not follow temperature, but more often it does.

  41. Is integration over a digital elevation model included in the spatial averaging? You know, 4 degrees per 1000 feet of elevation?

    • nickels

      Interesting question. I have been informally studying this over the last year and have rarely seen the temperature obligingly drop by the amount expected.

      tonyb

      • It seems like a bias could be introduced by moving a station in elevation. Also, one has to then question how all the operations involved commute.

      • “rarely seen the temperature obligingly drop by the amount expected.”

        I’m curious, if you have time to expound a little…

      • nickels

        We live at sea level in the UK and often drive up to nearby Dartmoor where there are convenient markers for 1000 and 1500 feet.

        I therefore calculate what the temperature ‘ought’ to be.

        It is often much warmer than expected, sometimes (but rarely) much colder, and sometimes (but rarely) about what was expected.

        There is often a temperature inversion, with various staging posts being warmer than the sea level reading (via a car temperature gauge which measures in half degrees).

        I also travel to Austria a lot and knowing altitude markers can again calculate what the temperature ‘should’ be.

        Again, it rarely matches the theory, although in this case a surprisingly modest elevation can drop the temperature to much colder than expected. Again, temperature inversions are common.

        I don’t know if there has ever been a scientific study done regarding altitude, type of topography, tree cover, humidity etc. according to latitude, but certainly the theory doesn’t work out in my experience, although I suppose everything would be ‘averaged’ out.

        tonyb

      • climatereason,

        I see what you mean. I’ll bet the various humidity effects are a major player with your results.

        Here in Colorado it’s pretty much a hard and fast rule amongst us mountain climber types to respect the 4 degree rule. But, of course, there isn’t a drop of moisture to be found.

    • Temperature is modelled as a function of elevation. That function changes with the season. It’s an empirically derived lapse rate, not a theoretical number.

      • And as irradiance increases by around 10% per 1000 metres, you have an inverse problem that is poorly constrained in so-called global sat data sets.

        The altitude effect of direct irradiance is considerably higher than that of global irradiance at all measured wavelengths.

        http://www.sciencedirect.com/science/article/pii/S1011134496000188

      • It seems that some sort of elevation normalization is necessary?
        My first instinct would be to integrate the average over a DEM. Or one would need to make some estimate showing it is a minor effect…
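
        One crude way to make the “integrate over a DEM” idea concrete, as an R sketch only: it uses the fixed 4 F per 1000 ft rule of thumb (real analyses use empirically derived seasonal lapse rates, as noted above), and the station data frame, its column names and the DEM grid are all hypothetical.

        lapse_f_per_kft <- 4                        # the "4 degrees per 1000 feet" rule of thumb
        to_sea_level <- function(temp_f, elev_ft) temp_f + lapse_f_per_kft * elev_ft / 1000
        area_average <- function(stations, dem_elev_ft) {
          # reduce every station reading to a common sea-level reference first...
          t0 <- mean(to_sea_level(stations$temp_f, stations$elev_ft))
          # ...then map that reference back onto each DEM cell and average over the terrain
          mean(t0 - lapse_f_per_kft * dem_elev_ft / 1000)
        }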

      • Nickels

        Elevation normalisation? The trouble is that to get that, all sorts of averages derived from all sorts of different conditions would have to be created.

        maksimovich1 has posted a good link on this. Would creating averages from data that seems to vary wildly according to a dozen important factors be of any use? I dunno.

        It becomes another fruit salad, doesn’t it, rather than a specific single fruit.

        However, bearing in mind the temperature difference between a standard thermometer at 2 metres above the ground and one measured ten feet higher, I think this altitude factor is an important one.

        tonyb

  42. Glen Stehle:
    Re your post of 8:45 a.m. where you ask “How the Sam Hill does one explain this?” (circled area on graph)

    Actually, fairly easily explained.

    Anthropogenic emissions of dimming SO2 aerosols decreased by approx. 29 million tonnes during the Depression era, primarily due to decreased industrial activity. This atmospheric cleansing allowed sunshine to strike the earth with greater intensity, resulting in greater insolation and increases in average global temperatures.

    This period is also known as the “era of early brightening” due to the atmospheric cleansing.

    (See the graph of anthropogenic SO2 emissions in the paper “Anthropogenic Sulfur Dioxide Emissions: 1850-2005” by Smith, S.J. et al (2011).)

    • If that were true, then the adjustments removed a real signal present in the data which could be of great importance in understanding some aspects of the global warming topic. Which would be a very bad thing.

    • BH:

      Not sure how you draw that conclusion (even looking at your citation). US SO2 emissions have multiple peaks, including the post-WWII era. US emissions peaked in the 1970s, along with global emissions.

      Also need to consider that industrial emissions in the US were concentrated in the east. So, if aerosols act at the source, there should be a measurable difference from West Coast to East Coast at station sites.

      • The primary source for the highlighted portion is actually the ocean record, which raises additional questions about how one would explain it.

      • Opluso:
        Re your 2:13 pm post:

        Whenever SO2 emissions decrease, temperatures will rise for the reason given. The referenced graph showed decreased emissions in the 1930s.

        SO2 sources (factories, power plants, etc.) are spread across the country, so there might not be much of a measurable difference West to East, but it would be interesting to compare the measurements.

        Globally, the anomalous temperature rise is approx. 0.02 deg C for each net megatonne of reduction in SO2 emissions.

    • http://www.mpimet.mpg.de/en/communication/news/focus-on-overview/new-study-cooling-by-aerosols-weaker-and-less-uncertain/

      Fun study. Contends aerosol forcing is -0.3 to -0.9 W/m2 and the IPCC central estimate -0.9 W/m2 is on the edge of what is plausible.

      Low aerosol forcing means the models are overheated.

      • PA

        Thank you for the link.
        Yes, the models are being overstated.
        But what is even worse is the fact that the IPCC diagram of radiative forcings has only negative forcings for aerosols.

        Warming due to the reduction in SO2 emissions from clean air efforts results in a positive forcing, which they totally ignore, and which is easily equivalent in magnitude to that now attributed to CO2.

  43. Valuable paper. CRN also confirms the sat (I have seen UAH) CONUS pause/cooling.
    But it doesn’t solve three global problems. 1. Insufficient land data for Africa, Siberia and portions of South America. 2. Until ARGO, insufficient ocean data. 3. Growing divergence between land/sea surface estimates and satellite observations since 2000. I suspect it is the methods used to ‘correct’ 1 and 2 that cause 3. Karlization being but one current example.

    • Ristvan, Zeke and Tonyb
      Great to have data and science, plus a non-paywalled paper.

      This is valuable.
      But still, adjustments overwhelm the signal so with the short record of 1979 to now, how can anyone call skeptics names?

      Plenty to do and let the data and observations lead the way to model improvements.

      How did it become the hottest ever by 0.001 °C with a lack of data from the Arctic, Antarctic, Africa, Siberia and South America, plus the middle Pacific ocean?

      Zeke, is this just in the top 10 hottest in the series of the long slow thaw?

      Scott

    • richardswarthout

      Rud

      Concur, and thank you for answering a question I was about to ask: how does CRN data compare to the satellite data? Would also like an answer from Zeke. His fellow Berkeley guy (Mosher) insists that the satellite data is unreliable and that Christy failed when correlating radiosonde to satellite (I believe Christy used the 85 site radiosonde network of which you speak and compared those sites to limited satellite data, limited to comparable grids and comparable times of day).

      Richard

      • Richard, I do not know which of the four radiosonde data sets UAH used, but I do know they ‘spot checked’ sat to sonde at specific locations and times, as you surmised. Christy wrote a paper on this, as I recall.
        One thing the UAH team did was march up the North American west coast from southern Mexico to Barrow Alaska, checking that their satellite interpretation algorithm matched the local sonde readings at the different latitudes and over different seasons. That is solid validation work.

      • richardswarthout

        Rud, Thank you for the information. Writing this comment reminds me of the few days spent in Islamoranda last summer, especially the evening with my daughter, dining and watching the sunset over the gulf; met some fine people there too. Richard

      • John Christy’s testimony Fig 2 refers to temperature variation measured at 59 radiosonde stations in the US and Australia.
        https://science.house.gov/sites/republicans.science.house.gov/files/documents/HHRG-114-SY-WState-JChristy-20160202.pdf

      • It’s not that satellite is unreliable.
        It’s rather this.
        You have three data sets measuring different things in different ways and relying on only one of them is not a good skeptical practice.

      • mosher, “You have three data sets measuring different things in different ways and relying on only one of them is not a good skeptical practice.”

        A good scientific practice would be to show all three and the associated uncertainties. Politically, you would pick the one that best makes your case. Career wise, promoting your product and tweaking your product for strategic deadlines with plenty of media exposure could be in order.

        I think the pure science for science sake is the unicorn.

  44. Zeke, I’m struggling to understand what this study is supposed to be telling me. In part that’s because it seems to be saying that homogenization, adjustment and de-biasing are all the same thing, when it seems that they’re not. I’ll try to explain my brain-fart with reference to fig 2.

    So for me the homogenization process in fig 2 is represented by the difference between the blue and red curves. If the intent is to make close stations more similar and you apply an adjustment that removes the variance, then hey presto, the graphs change shape the way they do. No problem with that; individual sites more closely resemble their neighbours.

    But that’s not ‘removing bias’ from the data set as a whole. So again taking the premise of this study that CRN = reality, bias is represented not by the shape of the curve but by whether the mean of the data set equals reality. So for the ten years of overlap here, the Tmin HCN raw data shows no bias (the mean is about 0) and adjustment doesn’t change that. The Tmax HCN raw data is biased toward a slower warming rate than reality (mean < 0) and adjustment doesn’t correct this.

    So again working on the premise that CRN is reality, why isn’t the conclusion of this study that, taken as a whole dataset:
    1) For Tmin, homogenization makes close-by sites more homogeneous but no bias exists in the raw data.
    2) For Tmax, a bias exists in the raw data which is uncorrected by adjustment.

    I don’t know why we should feel more certain about the homogenization process after this study; it failed to correct the bias in Tmax over your study period.

  45. Willis Eschenbach

    Zeke, many thanks for a most interesting analysis. A couple of comments.

    First, I was very interested in your analysis of the trends of paired stations. It shows that while the correlations of paired stations are strong out to large distances, it appears that the correlation of trends is not as strong. It would be interesting to see an analysis comparing the two.

    Second, let me quote David Springer from above, viz:

    In particular I believe the methodology used in the adjustments is messed up for cotton-region shelters (CRS) which dominated the network in the more distant past. CRS exteriors darken with age. White paint fades and those closer to sources of soot darken from soot deposition. That causes a CRS to have a false, very gradual warming trend. Because the false trend is so slow it doesn’t trigger any adjustment to compensate for it.

    The case is not the same when a CRS is restored by cleaning and/or painting. Because the cleaning or painting happens in a single day the temperature measured by the box cools literally overnight and because neighboring stations don’t get cleaned or painted at the same time this sudden cooling of one station triggers an adjustment in the record. It triggers it in both NOAA and BEST methodologies.

    The end result is that false slow warming from CRS aging gets baked into the record while the sudden cooling from CRS exterior restoration gets rejected.

    I’ve raised this question before. As David correctly points out, the method used by both BEST and NOAA is guaranteed to convert a sawtooth wave into a spurious trend. I’ve asked both Mosh and yourself to what extent this biases the BEST data towards exaggerated warming. I’ve been assured it has been looked at … but I’ve never gotten a link to such an analysis.

    The simplest analysis would be the sum of all of the “inhomogeneities” discovered in all of the records. If the total absolute size of the downward jumps is greater than the total of the upwards jumps, it means you are baking in a warming trend that may have no basis in reality.

    Now, it may well be a “difference that makes no difference” … or not. But we need the analysis to determine that.

    Best regards, thanks for all your work and for an interesting analysis, and I do hope we can put this sawtooth issue to bed,

    w.
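
    As a purely illustrative aside (not BEST's or NOAA's actual code), the sawtooth concern and the up-versus-down jump tally suggested above can be demonstrated in a few lines of R:

    set.seed(7)
    yrs <- 1:100
    drift <- 0.02                                   # deg C per year of gradual shelter darkening
    repaint <- c(25, 50, 75)                        # years the shelter gets cleaned or repainted
    station <- numeric(100)
    for (s in c(1, repaint)) {
      seg_end <- if (any(repaint > s)) min(repaint[repaint > s]) - 1 else 100
      seg <- s:seg_end
      station[seg] <- drift * (seg - s)             # warms slowly, resets to zero at each repaint
    }
    station <- station + rnorm(100, sd = 0.03)
    coef(lm(station ~ yrs))[2]                      # raw sawtooth: trend near zero

    # a breakpoint routine sees only the abrupt repaint drops and removes them
    jumps <- diff(station)
    breaks <- which(jumps < -0.3)
    adjusted <- station
    for (b in breaks) adjusted[(b + 1):100] <- adjusted[(b + 1):100] - jumps[b]
    coef(lm(adjusted ~ yrs))[2]                     # close to +0.02: the slow aging drift is baked in

    c(up = sum(jumps[jumps > 0.3]), down = sum(jumps[jumps < -0.3]))  # the asymmetry tally suggested above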

    • Hi Willis,

      I agree that sawtooth patterns are a particularly interesting case. As for CRS darkening, the only empirical studies I’ve seen (e.g. Doesken’s work) suggest that the effect is rather small (on the order of ~0.06 C) but still important. I’ll check with the ISTI group creating homogenization benchmarks to see if they are including sawtooth patterns in their synthetically perturbed data, as it would be interesting to see how the different homogenization algorithms perform in practice in the presence of gradual trend biases followed by sharp corrections.

      • Willis Eschenbach

        Thanks, Zeke, but I was looking for an answer to the question, not an acknowledgement that it is “particularly interesting”.

        I look forward to your response from the ISTI group, but I’m astounded that after both David and I have been asking this same question for over a year, you have no answer yet.

        w.

  46. Why weren’t MMTS sensors originally installed at the USCRN locations?

    If the MMTS sensors had been co-located, that would have had a couple of benefits, including a much better test of the homogenization algorithm.

    • PA, there were a good number of sites that had both for comparison, and a handful even today for continued comparison.

      See comparing MMTS to LIG

      http://www.srh.noaa.gov/srh/dad/coop/mmts.html

      • All studies and results agree on one thing. The change from LIG to MMTS did introduce a detectable and not fully correctable inhomogeneity to the U.S. long-term temperature time series.

        From the very first year when the MMTS was installed at the Fort Collins weather station back in 1984, MMTS has consistently measured lower daily maximum temperatures with the largest differences occurring in winter.

        It is possible that with aging and yellowing of the MMTS radiation shield that there is slightly more interior daytime heating causing recent MMTS readings to be more similar to LIG temperatures.

        The use of multiple sensor/shelter types for “climate measurement” is just indefensible. It is bad enough that both CRS and MMTS shelters get warmer with time.

        I don’t hear much about the “aging” correction for shelters, a negative adjustment based on the age of the shelter. Further the MMTS correction is weather dependent and varies with time of year. This means the correction is site specific.

        Homogenization (as someone else noted) is just an excuse to jack up modern temperatures. The homogenization compensates for discontinuities (painting the shelter, changing instruments) but doesn’t address aging. Pre-homogenization, that would tend to wash out, since the station would get warmer and warmer, then get maintained and cool off again.

        Since BEST branches on breaks, they capture all the aging-related warming.

        Since all this presumably is well known, I am surprised that the correction for this doesn’t get more discussion.

      • PA, “Since all this presumably is well known, I am surprised that the correction for this doesn’t get more discussion.”

        If you did that and considered how 1950s pollution would accelerate the darkening of shelters you would be messing with the almost unbelievable confidence intervals you get with big numbers and assuming normal distribution. That would just make things complicated.

      • The unweighable lightness of being.

      • “Since all this presumably is well known, I am surprised that the correction for this doesn’t get more discussion.”

        Well known? This is just someone from Colorado State University saying “it is possible that…”. Quoting in context helps:

        “The MMTS-LIG daily maximum temperatures differences are smaller now than they were in the early years of the intercomparison, but the average monthly change has been less than 0.1 deg F. It is possible that with aging and yellowing of the MMTS radiation shield that there is slightly more interior daytime heating causing recent MMTS readings to be more similar to LIG temperatures. But in a larger perspective, these changes are very small and would be difficult to detect and explain, except in a controlled co-located environment.”

        We’re in the sub 0.1°F range.

      • Quoting in context helps:

        “…the average monthly change has been less than 0.1 deg F. …”

        We’re in the sub 0.1°F range.

        More context might help even more…

        MMTS daily maximums were cooler than LIG every month with most months showing a mean difference between 0.4 to 0.7 degrees F. Occasionally larger differences appear such as the -1.4 deg F difference in November 2002. … The difference in mean annual temperature measured with MMTS compared with LIG for the 2002-2004 period was -0.38 deg F.

      • “More context might help even more”
        But you’ve quoted irrelevant stuff about the discrepancy between LiG and MMTS. Yes, that’s a significant and well-known difference, but it’s something else. PA is claiming that the effect of MMTS aging is well-known. That’s what the part I quoted is about.

  47. Consider the data with no adjustments. Consider the data with all the adjustments. Both ways, all the data is still inside the bounds of the past ten thousand years. There is no problem that needs fixing while we are inside the bounds. Understand what causes the bounds. Climate is chaotic while inside the bounds. The limit of the bounds is not chaotic. When oceans get warm, polar oceans thaw and snowfall increases and the upper bound is enforced. When oceans get cold, polar oceans freeze and snowfall decreases and the lower bound is enforced. Look at ice core temperatures and ice accumulation rates for the SH and NH.

    Adjustments have not pushed data out of bounds. The adjustments do not matter.

    • Ice volume on land increases during warm times with thawed oceans. Ice volume on land decreases during cold times with frozen oceans. It always gets colder after warm times. It always gets warmer after cold times.

  48. Little scientific light can be shed upon the [in]efficacy of “homogenization” techniques when there is a chronic lack of recognition that it is intrinsically a low-frequency signal problem, not a simple statistical one. Paired comparisons with decadal-length CRN records cannot provide useful information about the discrepancies that may or may not exist vis-à-vis multidecadal and longer signal components. That’s the crux of the whole “homogenization” issue!

    The discrepancies addressed here are merely in the high-frequency fluctuations, which, given the different sensors and shelters used in the USHCN network, are not the components of climatological interest. Given our knowledge of the spectral characteristics of vetted century-long records, “trends” of decadal or shorter proportion are a useless metric.

  49. Willis Eschenbach

    One final question, Zeke. In figure 3, you show the difference in trends between pairs of CRN and USHCN stations, with values that range from -.3°C (presumably per decade??) to +3°C. To get these you’ve subtracted CRN from USHCN.

    My question is, in the green part of the graph you’re showing pairs of CRN stations … so if you subtract the first of the pair from the second you get a negative value, and if you subtract the second from the first you get a positive value. And there is no way to determine which is first or second.

    So … why do you have both negative and positive values for the differences between CRN pairs? How have you determined in which direction to do the subtraction?

    Finally, you say:

    Trend confidence intervals for the resulting CONUS records are calculated using an ARMA[1,1] model to account for autocorrelation in the data.

    Give me a minute to digitize your results …

    OK, 12 minutes to screenshot and digitize the data, 140 points. I digitized Figure 1, upper panel right, tmax difference, USHCN – CRN. I get a trend of the tmax difference of -0.23°C per decade.

    When I did an ARMA(1,1) analysis, it gave me ar=.9951,ma=-.9093 for the tmax difference dataset. So I generate a thousand examples of an ARMA process with those AR and MA coefficients. Here’s a sample of my pseudodata, all to the same scale.

    https://dl.dropboxusercontent.com/u/96723180/zeke%20data%20and%20pseudodata.jpg

    One of those is the actual tmax difference dataset. As you can see, the pseudo-data is very lifelike.

    The problem is, I get a standard deviation of the decadal trends of that same pseudodata of 0.89°C/decade, about three times the absolute trend in your figure.

    Which, if correct, would mean that that trend is far from significant.

    I’m happy to provide my R code. To ensure that the pseudo-datasets are comparable with the actual data, I adjust their amplitude individually so that each one has the same average month-to-month step length as the actual data. This is preferable to adjusting the pseudodata to match the standard deviation of the actual data. Unlike standard deviation, step length includes the temporal information. In addition, it is independent of the trend of the data.

    To calculate the time step length so it is comparable between datasets, I assign to the full time scale the length of six standard deviations of the data. I append the function below.

    Anyhow, that’s how it looks right now, although I’m still pondering it.

    My best to you,

    w.

    steplength=function(x){
      # treat the full record as spanning six standard deviations horizontally,
      # so each time step has the same nominal width regardless of record length
      thestep=rep(6*sd(x)/length(x),length(x)-1) # horizontal length of one time step
      sqrt(mean(diff(x)^2 + thestep^2))  # root mean square of the step lengths
    }

    This returns the mean step length of a numeric vector. I adjust the proxy data so it all has the same average step length as the step length of the data.

    Having said all of that, the average step length is generally not a whole lot different from the standard deviation of most data (0.15 for step length of actual data, 0.13 for standard deviation). So feel free to use either one.

    Oh, yeah. The actual data in the figure above is Series 8.
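
    For anyone who wants to reproduce the gist of this Monte Carlo check, a minimal R sketch follows. It uses the ARMA(1,1) coefficients quoted above and omits the step-length rescaling described there, so the spread comes out in innovation units rather than degrees.

    set.seed(123)
    n_months <- 140
    trend_per_decade <- function(x) coef(lm(x ~ seq_along(x)))[2] * 120
    sims <- replicate(1000, {
      x <- arima.sim(model = list(ar = 0.9951, ma = -0.9093), n = n_months)
      trend_per_decade(as.numeric(x))
    })
    sd(sims)   # spread of decadal trends produced by ARMA(1,1) noise alone;
               # rescale each series to the data's step length before comparing to the observed trend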

    • “presumably per decade??”

      Trend units in the paper are erratic. They should of course be given properly. Fig 2 has °C/yr, which I find unbelievable. I think they should both be °C/century.

    • Hi Willis,

      Since the trend differences between pairs within a network are indeed symmetric, I randomly discard one of the pairs when the same two stations are matched twice. There are enough station pairs and the dropping is random so the resulting PDF of trend differences is still symmetric.

      Regarding the ARMA(1,1) analysis, here is the process I’m using (apologies for the poor formatting):

      . arima adj_minus_crn date, arima(1,0,1)

      (setting optimization to BHHH)
      Iteration 0: log likelihood = 104.45123
      Iteration 1: log likelihood = 107.10796
      Iteration 2: log likelihood = 107.62312
      Iteration 3: log likelihood = 107.71933 (backed up)
      Iteration 4: log likelihood = 110.15665
      (switching optimization to BFGS)
      Iteration 5: log likelihood = 111.06529
      Iteration 6: log likelihood = 112.35947
      Iteration 7: log likelihood = 113.87148
      Iteration 8: log likelihood = 114.28374
      Iteration 9: log likelihood = 114.57421
      Iteration 10: log likelihood = 114.67839
      Iteration 11: log likelihood = 114.94352
      Iteration 12: log likelihood = 114.94762
      BFGS stepping has contracted, resetting BFGS Hessian (0)
      Iteration 13: log likelihood = 114.94893
      Iteration 14: log likelihood = 114.94893 (backed up)
      (switching optimization to BHHH)
      Iteration 15: log likelihood = 114.94895 (backed up)
      Iteration 16: log likelihood = 114.94898 (backed up)
      Iteration 17: log likelihood = 114.94898 (backed up)
      Iteration 18: log likelihood = 114.94898 (not concave)
      Iteration 19: log likelihood = 114.94898
      (switching optimization to BFGS)
      Iteration 20: log likelihood = 114.94898 (backed up)
      Iteration 21: log likelihood = 114.94899
      Iteration 22: log likelihood = 114.94899
      BFGS stepping has contracted, resetting BFGS Hessian (1)
      Iteration 23: log likelihood = 114.94899
      Iteration 24: log likelihood = 114.94899

      ARIMA regression

      Sample: 2004m1 – 2015m8                  Number of obs = 140
                                               Wald chi2(2)  = 2095.73
      Log likelihood = 114.949                 Prob > chi2   = 0.0000

      --------------------------------------------------------------------------------
                    |       OPG
      adj_minus_crn |     Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
      --------------+-----------------------------------------------------------------
      date          | -.0019116   .0000418   -45.74    0.000    -.0019935   -.0018297
      _cons         |  1.134083   .0250893    45.20    0.000     1.084909    1.183257
      --------------+-----------------------------------------------------------------
      ARMA          |
      ar  L1.       |  .7431432   .0593853    12.51    0.000     .6267502    .8595362
      ma  L1.       |        -1           .        .        .           .           .
      --------------+-----------------------------------------------------------------
      /sigma        |  .1052958   .0064135    16.42    0.000     .0927255    .1178661
      --------------------------------------------------------------------------------
      Note: The test of the variance against zero is one sided, and the two-sided confidence interval is truncated at zero.

    • Willis Eschenbach

      Thanks, Zeke. I assume these numbers are for the tmax CRN vs USHCN adjusted. It appears that the AR value is .7431432, but I can’t read the MA value.

      If you could verify the AR, the MA, and the dataset (tmax difference CRN vs USHCN adjusted), that would be great.

      Finally, an oddity that affects all of this—the difference data (tmax diff CRN vs USHCN adjusted) contains a strong annual signal, which I find surprising. It seems to me that this signal might be evidence of a UHI effect … but what do I know?

      In any case, the range of the detrended difference data is about 0.6°C … and the range of the seasonal component of that is a full third of the signal, 0.2°C. Now, two tenths of a degree is quite large … I’d be interested in your comments on that. Why should the USHCN minus CRN data contain such a strong seasonal signal?

      w.

  50. Nick Stokes
    “USHCN had a set of 1218 stations that it chose for continuity of record etc. They used those as primary reference points for the temperature averaging. They didn’t use the other stations as primary mainly because of the continuity issue.”
    1218 or 1219 stations listed.
    Less than 450 original stations.
    Only 670 active in winter, 850 in summer. [approx]
    Rest are infilled to maintain the “historical nature”
    The United States Historical Climatology Network (USHCN) webpage says
    “The raw database is routinely reconstructed using the latest version of GHCN-Daily, usually each day. The full period of record monthly values are re-homogenized whenever the raw database is re-constructed (usually once per day)”
    So the raw data is not raw.
    It changes value [downwards] every day,
    i.e. it is routinely reconstructed every day,
    all the way back to the start of records over a hundred years ago.
    Every “raw” monthly value is changed and is already homogenized.
    This happens daily.
    So, Zeke, with your study did you label which day you did the graphs on?
    Because they will be different if done the day before or after.
    Do you point out that the “raw” data is not really raw data?
    I.e. raw data should have one original unchanging value, which USHCN raw data does not have.
    Since it is already homogenized, why doesn’t it match perfectly with your other data anyway?
    [I gave one explanation above]
    [I gave one explanation above]

    • “So the raw data is not raw.”
      No, you’ve just got it muddled (again). Reconstructed means they gather the latest data from the global collection GHCN Daily. Raw data is unaltered.

      And I’ve no idea where you get your station numbers from. Citation needed.

      • Citations needed.
        Try “How not to calculate temperature” at the Blackboard, 5 June 2014 (14:04), Data Comparisons, written by Zeke:
        1218 stations were drawn from a larger population of 7000-odd cooperative network stations.
        Since the late 1980s a number of stations have closed or stopped reporting, due to the nature of the instruments; many USHCN stations are manned by volunteers (and are not automated instruments), and these volunteers may quit or pass away over the decades.
        The number of reporting USHCN stations has slowly declined from 1218 in the 1980s to closer to 900 [6/5/2014 figure].

        USHCN infills missing stations based on a spatially-weighted average of surrounding station anomalies (plus the long-term climatology of that location) to generate absolute temperatures. This is done as a final step after TOBs adjustments and pairwise homogenization, and results in 1218 records every month.
        -So much for raw data.

        http://moyhu.blogspot.com/2014…..alies.html (fraction of stations reporting raw) shows only 23% of stations were reporting raw data.
        Did you forget this?
        Sorry for cherry picking.
        – Zeke’s explanation of the ever-cooling past:
        “The reason why station values in the distant past end up getting adjusted is due to a choice by NCDC to assume that current values are the “true” values. Each month, as new station data come in, NCDC runs their pairwise homogenization algorithm which looks for non-climatic breakpoints by comparing each station to its surrounding stations. When these breakpoints are detected, they are removed. If a small step change is detected in a 100-year station record in the year 2006, for example, removing that step change will move all the values for that station prior to 2006 up or down by the amount of the breakpoint removed. As long as new data leads to new breakpoint detection, the past station temperatures will be raised or lowered by the size of the breakpoint.”
        They do this daily, by the way, so there is never any permanent raw data record.
        Each day it is a new, lower figure.

        Zeke again,
        “An alternative approach would be to assume that the initial temperature reported by a station when it joins the network is “true”, and remove breakpoints relative to the start of the network rather than the end. It would have no effect at all on the trends over the period, of course, but it would lead to less complaining about distant past temperatures changing at the expense of more present temperatures changing.”
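        To make the mechanics of those two conventions concrete, here is a minimal sketch with made-up numbers (this is not NCDC’s pairwise homogenization code): a single detected step is removed either by shifting everything before the break or by shifting everything after it, and the fitted trend comes out the same either way.

        # Minimal sketch of single-breakpoint removal (made-up data; not NCDC's
        # pairwise homogenization code).
        import numpy as np

        temps = np.linspace(10.0, 11.0, 100)   # a 100-year record with gradual warming
        temps[60:] += 0.3                      # an artificial 0.3 C step in "year 60"

        break_year, step = 60, 0.3             # pretend the algorithm detected this break

        # Convention 1 (described above): treat current values as "true",
        # so everything BEFORE the break is shifted by the step size.
        anchor_present = temps.copy()
        anchor_present[:break_year] += step

        # Convention 2 (the alternative quoted above): treat the earliest values as
        # "true", so everything AFTER the break is shifted instead.
        anchor_past = temps.copy()
        anchor_past[break_year:] -= step

        # Both conventions remove the same step, so the fitted trends are identical.
        years = np.arange(100)
        print(np.polyfit(years, anchor_present, 1)[0],
              np.polyfit(years, anchor_past, 1)[0])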

        angech,
        As I mentioned in the original post, about 300 of the 1218 stations originally assigned to the USHCN in the late 1980s have closed, mostly due to volunteer observers dying or otherwise stopping reporting. No stations have been added to the network to make up for this loss, so there are closer to 900 stations reporting on a monthly basis today.
        .
        – Less than 450 original stations? See the new WUWT paper:
        Press Release – Watts at #AGU15 The quality of temperature station
        Anthony Watts / December 17, 2015
        Using NOAA’s U.S. Historical Climatology Network, which comprises 1218 weather stations in the CONUS, the researchers were able to identify a 410-station subset of “unperturbed” stations that have not been moved. As Mosh says, each break point is a new station.
        Hope this helps.
        I cannot find your old graph of USHCN stations which showed the low of 670 stations, and you have forgotten our discussion in which Zeke confirmed a figure of 650(?) active stations over winter, possibly in a blog post with your bogey man.

      • After all that, I still have no idea where you got your station numbers from.

      • angech: “USHCN infills missing stations based on a spatially-weighted average of surrounding station anomalies (plus the long-term climatology of that location) to generate absolute temperatures. This is done as a final step after TOBs adjustments and pairwise homogenization, and results in 1218 records every month”

        Otherwise known as MAKING STUFF UP.
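        Mechanically, the infilling step quoted just above is a weighted average of neighbouring stations’ anomalies added to the target station’s own long-term climatology to produce an absolute value. Here is a minimal sketch with made-up numbers and simple inverse-distance weights; it is an illustration of the idea, not NOAA’s actual infilling code or weighting scheme.

        # Minimal sketch of infilling a missing monthly value from neighbouring
        # station anomalies plus the target's long-term climatology (made-up data;
        # not NOAA's actual infilling code).
        import numpy as np

        neighbour_anoms   = np.array([0.42, 0.35, 0.51])    # neighbours' anomalies (C)
        neighbour_dist_km = np.array([40.0, 85.0, 120.0])   # distances to the target station

        weights = 1.0 / neighbour_dist_km                   # simple inverse-distance weights
        weights /= weights.sum()

        target_climatology = 3.1                            # target's long-term mean for this month (C)
        infilled_anomaly   = np.dot(weights, neighbour_anoms)
        infilled_absolute  = target_climatology + infilled_anomaly
        print(f"infilled value: {infilled_absolute:.2f} C")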

  51. The United States Historical Climatology Network (USHCN) webpage says
    “The raw database is routinely reconstructed using the latest version of GHCN-Daily, usually each day. The full period of record monthly values are re-homogenized whenever the raw database is re-constructed (usually once per day)”
    Can we have the real raw data talked about, please?

  52. Nick Stokes: “min-max is an adequate daily temperature measure, which has the critical virtue of being supported by historic information.”
    Can you explain to me the rationale for taking the maximum from the 24 hours preceding 9.00 am and the minimum from the 24 hours after 9.00 am?
    The average temp is then the average of these two different days.
    [BOM, Australia]. Why is this done? What makes it right?
    “The rationale is convenience.”
    That explains to me at last why you can have a minimum greater than a maximum. So why does the algorithm correct this non-anomaly?

    Zeke Hausfather (@hausfath) | February 10, 2016

    “The raw HCN data we are using has been subject to quality control
    (which flags and removes values where the min is higher than the max).”
    Even though you know they occur on different days and are therefore “correct”????
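    That quality-control step is just a consistency check on each day’s pair of values. A minimal sketch of that kind of flag-and-remove rule, with hypothetical column names (this is not the actual HCN quality-control code):

    # Minimal sketch of a flag-and-remove QC rule for min > max days
    # (hypothetical column names; not the actual HCN quality-control code).
    import numpy as np
    import pandas as pd

    daily = pd.DataFrame({
        "tmax": [21.3, 18.0, 15.2],
        "tmin": [12.1, 19.5, 9.8],   # second row: reported min exceeds reported max
    })

    bad = daily["tmin"] > daily["tmax"]          # the inconsistency being flagged
    daily.loc[bad, ["tmax", "tmin"]] = np.nan    # remove the inconsistent pair
    print(daily)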

  53. Z uses confirmation bias to confirm his findings and remains blissfully or deliberately unaware of it.
    He happily calls modified data raw data and seems to believe that it is OK to call the modified data raw data.
    He admits that the system deliberately lowers USA temperatures from 100 years ago by over a degree, but does not engage with the point that the thermometers, however much error the older ones had, were never all a full degree out merely because the readings were taken a hundred years ago, while the same thermometers are treated as only 0.5 degrees out when the readings were taken 50 years ago.
    In fact if you used them today they would match very closely with all the electric sensors now used.
    Not perfectly but certainly within a tenth of a degree.
    How dare they describe/make the past artificially so much colder.
    Taken from a post at Lucia’s on Lewandowsky.

  54. Dear Zeke.
    Thanks for a good article and your answers to questions.
    I hope you can shed a little light on the changes Ole Humlum has documented.
    http://climate4you.com/images/NCDC%20Jan1915%20and%20Jan2000.gif
    Could you explain why the anomalies for January 2000 and January 1915 could change that much in just the 8 years from 2008 to 2016?
    I would think that by 2008 they had already adjusted whatever there was to adjust in these older measurements.

  55. Willis Eschenbach

    David Springer | February 11, 2016 at 12:46 am |

    Mosher, the tests you described do not catch the problem with slowly darkening CRS stations. I know how the algorithms work. They won’t catch a slight warming trend caused by paint deterioration and soot accumulation that happens over many months or years. They will catch an abrupt cooling event from a shelter that gets cleaned and painted. The CRS is supposed to be painted once every two years. In a volunteer network, do you think that regimen is followed with any discipline?

    The end result is the slow warming gets baked into the record and the cooling gets adjusted out of it (NOAA) or for BEST the sudden cooling triggers the start of a new record. In either case the false warming trend is retained while the compensatory cooling is rejected.

    David Springer | February 11, 2016 at 9:50 am |

    Yeah that’s what I thought Mosher and Hausfather’s response would be: crickets chirping.

    Yeah, crickets. Well, I’ll ask the question again.

    Zeke and Mosh, despite being asked repeatedly, you have NEVER answered David’s question. He’s asked it. I’ve asked it. I don’t know about him, but I’ll continue to do so. Are you ready?

    The “scalpel” algorithm has been demonstrated to convert a sawtooth wave (such as is caused by occasional maintenance of the CRS shelters that David describes) into a totally bogus trend. So I have two questions:

    1. How much does this affect the Berkeley Earth results, and

    2. Why is it like pulling teeth to get you to answer this simple question? You’ve been dodging it for over a year now.

    w.

    • Well, the first challenge is figuring out how sawtooth waves are dealt with by homogenization. The best way to do this would be through testing on synthetic data with sawtooth patterns added, which is why I suggested the ISTI effort (since they are the only group I know of actively working on testing the effectiveness of homogenization approaches under different types of inhomogeneities).

      An alternative is to go through Berkeley stations and try to identify cases where bogus trends have been added, to get a qualitative sense of the potential impact. However, in most stations I’ve found with sawtooth patterns the Berkeley approach seems to pick out both the gradual rise and the quick drop, e.g. in the Bakersfield case: http://berkeleyearth.lbl.gov/auto/Stations/TAVG/Figures/161526-TAVG-Alignment.pdf

      Let me know if you can find any cases where it looks like the sawtooth was not picked up and a spurious trend was added.
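      The synthetic-data test mentioned above is easy to sketch: build a sawtooth series (slow drift up, sharp reset), cut it at the drops, and compare the segment trends with the full-record trend. This is only an illustration of the concern being discussed, not Berkeley Earth’s actual scalpel or homogenization code.

      # Minimal sketch of the sawtooth concern: slow drift up, periodic sharp drop.
      # Illustration only; not Berkeley Earth's scalpel/homogenization code.
      import numpy as np

      years = np.arange(60)
      sawtooth = 0.05 * (years % 15)          # +0.05 C/yr drift, reset every 15 years

      # The full-record trend is near zero...
      full_trend = np.polyfit(years, sawtooth, 1)[0]

      # ...but if the record is cut at each drop and only the segments are kept,
      # every segment shows a +0.05 C/yr warming trend.
      segment_trends = [np.polyfit(years[i:i + 15], sawtooth[i:i + 15], 1)[0]
                        for i in range(0, 60, 15)]

      print(f"full-series trend: {full_trend:.3f} C/yr")
      print("segment trends:   ", [round(t, 3) for t in segment_trends])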

      • Looking at a few more, here is one with a sawtooth-like pattern early in the record that might only be partially caught: http://berkeleyearth.lbl.gov/auto/Stations/TAVG/Figures/163204-TAVG-Alignment.pdf

      • Willis Eschenbach

        Thanks for the thoughts and examples, Zeke. What I have been looking for is not individual stations, but a project-wide analysis of just what is cut by the scalpel. Does it find (and throw away) more upwards jumps, or more downwards jumps? Is the difference significant? If so, what are possible explanations for the differences?

        For example, a common situation is a temperature station that starts out on the outskirts of town. The town grows up around it, leading to a gradual increase in temperature due to UHI. At some point, they do an undocumented station move to the local airport. This leaves a sawtooth wave.

        Of course, the airport becomes busier and busier, and development continues around it, asphalt and buildings, so the airport is gradually warming from UHI as well.

        The final record then appears to show a steady warming interrupted by a downwards jump, which would be interpreted by the algorithm as an “empirical break” … and after being subjected to the “scalpel”, the inconvenient jump of the “empirical break” is now removed and we show steady warming.

        So my question is, year by year, what is the sum of all the upwards “empirical breaks”, and what is the sum of the downwards “empirical breaks”? Because the difference in these is the net amount of information being discarded, and it behooves the developers of the dataset to look at that both temporally and spatially.

        The part I don’t get is, this issue has been out there for a while. What are the Berkeley Earth folks waiting for? If someone were questioning my algorithm and put forth a very specific and very believable objection to it, I’d run some tests and publish the results, and I’m just a guy.

        Berkeley Earth gets funding to put out a well tested dataset, people depend on their work, and they are not performing their due diligence.

        w.
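        The project-wide tally asked for above is mostly bookkeeping once a table of detected breakpoints exists. A minimal sketch, assuming a hypothetical table with columns “year” and “break_size” (made-up values; not Berkeley Earth’s actual output format):

        # Minimal sketch of the up-vs-down breakpoint tally (hypothetical table
        # layout and made-up values; not Berkeley Earth's actual output format).
        import pandas as pd

        breaks = pd.DataFrame({
            "year":       [1965, 1965, 1982, 1990, 1990, 2003],
            "break_size": [0.4, -0.6, 0.3, -0.2, 0.5, -0.1],   # in degrees C
        })

        up   = breaks.loc[breaks["break_size"] > 0].groupby("year")["break_size"].sum()
        down = breaks.loc[breaks["break_size"] < 0].groupby("year")["break_size"].sum()

        summary = pd.DataFrame({"up": up, "down": down}).fillna(0.0)
        summary["net"] = summary["up"] + summary["down"]   # net size of discarded jumps
        print(summary)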

      • Is there any relatively agreed-upon definition of what would count as a sawtooth pattern? I obviously know what the shape would look like in a toy example, but I’m curious how that applies to real data. I assume it would still be gradual warming followed by a sharp drop in temperatures (or potentially the inverse, if the sawtooth pattern were upside down). I’m just wondering how long a period, and how big the changes, there would need to be before most people agreed it qualified as an example.

        If one could come up with a definition that could be tested for programmatically, it’d be quite possible to find out how BEST winds up handling sawtooth patterns for its empirical breakpoints. That wouldn’t give the full story since there are other steps which could be relevant, such as the iterative weighting process BEST uses based upon presumed station quality, but it might be something worth doing.
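        One crude way to make that testable programmatically: call a segment a sawtooth candidate if it has a sustained positive drift followed by a single drop much larger than the per-year drift. A minimal sketch follows; the thresholds are arbitrary and purely illustrative.

        # Crude sketch of a programmatic "sawtooth candidate" test: sustained positive
        # drift followed by one sharp drop.  Thresholds are arbitrary and illustrative.
        import numpy as np

        def is_sawtooth_candidate(series, min_years=10, drop_factor=5.0):
            """Flag a segment whose final step down dwarfs its annual drift."""
            series = np.asarray(series, dtype=float)
            if len(series) < min_years:
                return False
            years = np.arange(len(series) - 1)
            slope = np.polyfit(years, series[:-1], 1)[0]   # drift before the final step
            drop = series[-2] - series[-1]                 # size of the final step down
            return slope > 0 and drop > drop_factor * slope

        example = list(0.05 * np.arange(12)) + [0.0]       # 12 years of drift, then a reset
        print(is_sawtooth_candidate(example))              # True under these thresholds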

      • Unfortunately Berkeley Earth no longer gets much funding for climate work; almost all in the past year or so is for air pollution research. We have enough to support updates to the dataset, but not too much bandwidth for side projects. That’s why I suggested the ISTI process: if they produce the benchmarks, it should be relatively easy to run them through the Berkeley algorithm.

      • Willis Eschenbach

        Zeke Hausfather (@hausfath) | February 11, 2016 at 11:55 pm |

        Unfortunately Berkeley Earth no longer gets much funding for climate work; almost all in the past year or so is for air pollution research. We have enough to support updates to the dataset, but not too much bandwidth for side projects. That’s why I suggested the ISTI process: if they produce the benchmarks, it should be relatively easy to run them through the Berkeley algorithm.

        Thanks, Zeke. I’d forgotten that they are now pollution experts; I see they are claiming that 17% of all deaths in China are from air pollution …

        Berkeley Earth released today a study showing that air pollution kills an average of 4000 people every day in China, 17% of all China’s deaths.

        My point is that it would not be all that difficult to see, for example, whether there are more upward jumps or downward jumps detected by the “scalpel” method, and just where they are located. It may be, for example, that there are a lot of downwards jumps in a certain area, which could mean that the temperature field calculations in that region had problems. If I had designed the dataset and the algorithms, it wouldn’t take me more than an hour or two to do it … but they’ve had a couple of years to do it and they’ve done nothing.

        Instead they are spending their time inflating Chinese claims for some clients or other … in fact, all Chinese deaths related to lungs in any infectious or irritative form (influenza, plus the general category “lung diseases”, plus pertussis, plus pneumonia, plus asthma) only add up to 14% of the total deaths … so even if every single lung-related death were directly caused by pollution with no other possible cause, there still wouldn’t be as many deaths as Berkeley Earth claims.

        And of course, we know that it is not true that all cases of pneumonia or influenza are due to pollution, that’s not possible. So they are well off into computer model world, with no grounding in reality.

        In addition, while the deaths from all forms of lung related illnesses (except cancer) are 14% of all Chinese deaths, they are also 13% of all deaths globally … meaning that despite all of Berkeley Earth’s moaning about the terrible Chinese air pollution, they are dying at just about the same rate from lung disease as the rest of the planet.

        So are we to believe that smog is as bad everywhere as in China?

        Sadly, it seems that Berkeley Earth is now doing the same thing to the Chinese death data that they did with climate data—they are not doing the due diligence needed to ground-truth their results and check for errors. I’m sorry, but the number of Chinese people dying from air pollution cannot be bigger than all Chinese lung-related deaths, that’s crazy talk that is designed to convince people that It’s Worse Than We Thought™ … except in this case it’s worse than is physically possible.

        But even with their pressing need to produce more and better alarmist Chinese pollution claims, I’m still not buying that they don’t have enough time to take a few hours to verify that their temperature work might contain a major error … and I am very concerned about the fact that two years after the nature of the error was reported, apparently nothing has been done. That kind of a delay makes me … well … inquisitive as to what might be going on behind the scenes.

        My best to you, and after two years, I’m not holding my breath expecting Berkeley Earth to do anything about this matter except avoid it. Ever since Richard Muller trampled Anthony in his mad dash for the microphone in the Senate hearing, I’ve expected the worst of ethics, morals, and actions from them, and I’ve never yet been disappointed. The Chinese claims are a perfect example, as is their avoidance of this error issue.

        w.

      • Have you looked into non-lung related deaths caused by air pollution, Willis? Heart disease, stroke, etc. The 1.6m/yr of China deaths doesn’t seem out of line with other estimates I recall seeing on all deaths attributed to air pollution. That would put them about in the middle of other studies. They get paid for this?

  56. Zeke wrote: “While the USCRN will provide excellent data on the changing U.S. climate in the future, in the past we are stuck with the historical network. What we can do, however, is use the USCRN to empirically assess how well our adjustments to the historical network are doing.”

    Your adjustments to 20th-century records doubled the rate of warming, but there was documentation and research supporting the need for, and magnitude of, about half of this correction (TOB, for example). Pairwise homogenization at undocumented discontinuities added roughly 0.2 degC to 20th-century warming. In the 21st century, pairwise homogenization at undocumented discontinuities didn’t change the amount of warming that was observed. Clearly, the nature of these discontinuities was different before and after the turn of the century. That means you have learned absolutely NOTHING about the reliability of the methods used for undocumented 20th-century discontinuities by comparing USCRN with raw and adjusted USHCN in the 21st century. The raw USHCN data had nearly the same trend as USCRN and was not improved by adjustment.

    OK. Homogenization did reduce the scatter in the data. Any homogenization methodology – right or wrong – will reduce scatter in the data. And the correct answer may be that stations without forced ventilation naturally exhibit a much wider range of trends than the newer systems used by the USCRN. The important issue is whether the adjustments at undocumented discontinuities in the 20th century provide a better estimate of global warming in that century.

  57. Don’t argue with Zeke. After all, he drank the Kool-Aid a long time ago. I’d just say that most of the temperature rise is human-caused – it lies in the adjustments!