Emergent constraints on TCR and ECS from historical warming in CMIP5 and CMIP6 models

By Nic Lewis

This is a brief comment on a new paper[i] by a mathematician in the Exeter Climate Systems group, Femke Nijsse, and two better-known colleagues, Peter Cox and Mark Williamson. I note that Earth System Dynamics published the paper despite one of the two peer reviewers recommending against acceptance without further major revisions. But neither of the reviewers appears to have raised the issue that I focus on here.

“Emergent constraints” methods relate observable climate trends, variations or other variables to climate system properties of interest, such as equilibrium climate sensitivity (ECS), in an ensemble of models. They then use the observed values of the variable(s) involved to estimate ECS or the other properties of interest. I’m not a great fan of emergent constraints studies, the results of which are often sensitive to the model ensemble used. Here the emergent constraint is the relationship, assumed linear, between transient climate response (TCR) and global warming from 1975 onwards.
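The general recipe can be sketched in a few lines of Python. This is only an illustration of the generic idea, with a free intercept and simple least squares, and is not the hierarchical Bayesian method used in the paper. The function name `emergent_constraint` and the synthetic ensemble values are assumptions made up for the example.

```python
import numpy as np


def emergent_constraint(observable, target, observed_value):
    """Fit a straight line across the model ensemble, then evaluate it
    at the real-world observed value of the observable."""
    slope, intercept = np.polyfit(observable, target, 1)
    return slope, intercept, slope * observed_value + intercept


# Synthetic ensemble (NOT CMIP data): an observable quantity and the
# climate system property each hypothetical model is assumed to have.
rng = np.random.default_rng(0)
observable = rng.uniform(0.0, 1.0, size=30)
target = 1.0 + 2.0 * observable + rng.normal(0.0, 0.05, size=30)

slope, intercept, estimate = emergent_constraint(observable, target, 0.5)
print(f"slope {slope:.2f}, intercept {intercept:.2f}, estimate {estimate:.2f}")
```

The estimate depends entirely on the fitted across-model relationship, which is why the choice of ensemble and of regression form matters so much.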

The authors thereby derive, from a 26-model CMIP6 ensemble, a TCR estimate of 1.68 K (16–84% ‘likely’ range 1.29–2.05 K, 5–95% range 1.0–2.3 K) using warming up to 2019.

The paper states that using warming up to 2014 instead, which enables use of a larger set of CMIP6 models, reduces the TCR estimate to 1.54 K (5–95% range 0.76–2.30 K), but the authors do not mention that in their abstract or conclusions.

The authors also derive an estimated likely range for ECS of 1.9–3.4 K (5–95% range 1.5–4.0 K) from CMIP6 models, median estimate 2.6 K, from warming to 2019. Based on warming up to 2014 in the larger ensemble of CMIP6 models, the median ECS estimate is 1.9 K (5–95% range 1.0–3.3 K). Again, the result from the larger ensemble is not mentioned in the abstract or conclusions.

I am very doubtful about estimating ECS by comparing observed and simulated historical warming. Without also using observational data on ocean heat uptake, to estimate changes in the Earth’s energy imbalance, it is impossible to distinguish satisfactorily between ECS and ocean heat uptake both being high and them both being low – either combination can produce the same historical warming. So I would not place any reliance on their ECS ranges, even if they don’t look unreasonable.
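The degeneracy can be illustrated with a deliberately simplified energy balance sketch. It assumes the common approximation that transient warming is ΔT ≈ ΔF / (λ + κ), where λ is the climate feedback parameter and κ the ocean heat uptake efficiency, and that ECS = F2x / λ. All numerical values below (F2x, ΔF, λ, κ) are hypothetical and chosen only to make the point.

```python
F2X = 3.7      # W/m^2, assumed forcing from doubled CO2 (illustrative)
DELTA_F = 1.5  # W/m^2, hypothetical historical forcing change


def transient_warming(lam, kappa, delta_f=DELTA_F):
    """Warming under slowly ramped forcing: dT = dF / (lambda + kappa)."""
    return delta_f / (lam + kappa)


def ecs(lam):
    """Equilibrium warming for doubled CO2: ECS = F2x / lambda."""
    return F2X / lam


def energy_imbalance(lam, kappa, delta_f=DELTA_F):
    """Heat flux into the ocean: N = kappa * dT."""
    return kappa * transient_warming(lam, kappa, delta_f)


# (lambda, kappa) in W/m^2/K for two hypothetical climate systems.
cases = {
    "high ECS, high ocean heat uptake": (1.0, 1.0),
    "low ECS, low ocean heat uptake": (1.5, 0.5),
}

for name, (lam, kappa) in cases.items():
    print(
        f"{name}: ECS = {ecs(lam):.2f} K, "
        f"warming = {transient_warming(lam, kappa):.2f} K, "
        f"imbalance = {energy_imbalance(lam, kappa):.3f} W/m^2"
    )
```

Under these assumptions both cases give identical warming, and only the energy imbalance, which is what ocean heat uptake observations constrain, tells them apart.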

On the other hand, one would expect historical warming in climate models to have a close to linear relationship with their TCR, since pre-1975 ‘warming in the pipeline’ is fairly negligible and post-1975 forcing is reasonably close to the quasi-linear ramp forcing used to measure TCR, and similarly of multidecadal length. The existence of episodic volcanic forcing, to which models and the real climate system may respond differently, is a possible confounding factor, although use of the difference between the 1975–1985 and 2009–2019 means to measure warming excludes years affected by the 1991 Mount Pinatubo eruption. There is also the issue that the mean change in effective radiative forcing (ERF) in climate models between those two periods may not equal the ERF change in the real climate system. For CMIP5 models at least, I suspect that their mean ERF change falls somewhat short of the actual change. That would induce an upwards bias in the emergent constraint TCR estimate.

Regardless of the above considerations, there is a fatal problem with the regression method used to relate TCR with warming. If a model has a TCR of zero, then it would be expected to show zero historical warming. The authors appear to recognise this, writing “As no warming would be expected if climate sensitivity were zero, we expect the regression to pass through the intercept”. They actually mean pass through the origin (have a zero y-intercept), as their equations (A3) and (A4) make clear. And their equation (3) theoretical relationship between TCR and warming, TCR = s ΔT, has no offset term. It is therefore physically inappropriate to use regression with a y-intercept term being estimated.

However, despite admitting that a zero y-intercept is physically appropriate, the study estimates a regression fit using a y-intercept as well as a slope coefficient parameter. Moreover, the resulting best-fit line does not pass at all close to the origin. Their estimate implies that climate models with a TCR of ~0.7 K would have simulated zero post-1975 warming. Their Figure 4(a), reproduced as Figure 1 below, shows this.

Figure 1. Reproduction of Figure 4(a) of Nijsse et al. (2020). Emergent constraint on TCR against historical warming ΔT . ΔT is calculated from the difference between 1975–1985 and 2009–2019 of a time series of GMSAT. Linear regression is performed with all CMIP5 and CMIP6 simulations. Shaded areas indicate a 90% prediction interval. The vertical dashed line is the mean value of the observations, and the y axis shows the probability distribution of both generations of ensembles.

A quite involved and not very clearly described hierarchical Bayesian model regression method is used, which makes it difficult to reproduce exactly the study’s results. Remarkably, the numerical value and uncertainty range of the observed warming estimate are nowhere stated. I therefore measured the estimate off their Figure 4(a), as 0.606 K, and took the shading as showing the 5–95% range of a normal distribution, width 0.225 K. Based on a simple ordinary least squares (OLS) with-intercept regression of TCR on model-ensemble mean simulated historical warming, across the multimodel combined CMIP5 and CMIP6 ensemble, I estimate a median TCR estimate of 1.62 K (or 1.66 K using the CMIP6 ensemble alone – marginally lower than their 1.68 K).
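For readers who want to turn the measured range into a standard deviation, here is a short sketch. It assumes that 0.225 K is the full width of the 5–95% range and that the distribution is normal, as described above; the variable names are illustrative.

```python
from statistics import NormalDist

obs_mean = 0.606     # K, observed warming as measured off Figure 4(a)
range_width = 0.225  # K, assumed full width of the 5-95% range

# For a normal distribution the 5-95% range spans about +/-1.645 standard deviations.
z95 = NormalDist().inv_cdf(0.95)
obs_sd = range_width / (2.0 * z95)

lower = obs_mean - range_width / 2.0
upper = obs_mean + range_width / 2.0
print(f"sd = {obs_sd:.4f} K, 5-95% range = {lower:.4f} to {upper:.4f} K")
```

On these assumptions the observational standard deviation is roughly 0.068 K.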

If I repeat the exercise but without estimating a y-intercept, thereby forcing the regression fit to match a zero TCR with zero historical warming, the emergent constraint gives a TCR best estimate of 1.43 K. The regression fit is very good (R² = 0.97). There is little regression dilution when no y-intercept is estimated. Regressing warming on TCR rather than vice versa gives an emergent constraint TCR best estimate of 1.47 K. Regressing TCR on warming just across the CMIP5 ensemble gives a slightly lower TCR estimate of 1.37 K (R² = 0.97). Doing so across the CMIP6 ensemble alone gives a TCR estimate of 1.50 K (R² = 0.98). The slope coefficient standard errors imply that there is only a 3.1% chance that the TCR–warming relationship is the same in the CMIP5 and CMIP6 ensembles.
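A minimal sketch of zero-intercept regression in both directions follows. It uses a synthetic ensemble rather than the CMIP5/CMIP6 values, which are not listed in this article, so its output will not match the numbers above. The helper names are hypothetical, and the slope comparison shown is just one simple normal-approximation approach, not necessarily the calculation behind the 3.1% figure.

```python
import math
from statistics import NormalDist

import numpy as np


def fit_through_origin(x, y):
    """Least-squares fit of y = b * x with no intercept; returns (b, standard error)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b = np.sum(x * y) / np.sum(x * x)
    resid = y - b * x
    # One fitted parameter, so n - 1 residual degrees of freedom.
    sigma2 = np.sum(resid ** 2) / (len(x) - 1)
    return b, math.sqrt(sigma2 / np.sum(x * x))


def slope_difference_p_value(b1, se1, b2, se2):
    """Two-sided p-value (normal approximation) for two independent slopes being equal."""
    z = (b1 - b2) / math.hypot(se1, se2)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Synthetic stand-in for an ensemble (NOT CMIP data): warming and TCR in K.
rng = np.random.default_rng(42)
warming = rng.uniform(0.4, 1.2, size=40)
tcr = 2.4 * warming + rng.normal(0.0, 0.15, size=40)

obs_warming = 0.606  # K, observed warming as measured off Figure 4(a)

# Direction 1: regress TCR on warming, then scale the observed warming.
b, se_b = fit_through_origin(warming, tcr)
tcr_forward = b * obs_warming

# Direction 2: regress warming on TCR, then invert the fitted slope.
c, se_c = fit_through_origin(tcr, warming)
tcr_reverse = obs_warming / c

print(f"TCR on warming: slope {b:.3f} +/- {se_b:.3f}, estimate {tcr_forward:.2f} K")
print(f"warming on TCR: slope {c:.3f} +/- {se_c:.3f}, estimate {tcr_reverse:.2f} K")
```

With positively correlated data, the estimate from regressing warming on TCR can never be lower than the one from regressing TCR on warming, which is consistent with the ordering of the 1.43 K and 1.47 K estimates.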

It is unclear which regression method is more accurate. Using the geometric mean of the estimates regressing each way, as is sometimes recommended, gives a TCR best estimate of 1.45 K from the combined CMIP5/CMIP6 ensemble. A crude estimate of uncertainty can be obtained by using, for each type of regression, the standard error in the slope estimate and the observational uncertainty to form a large number of randomly sampled TCR estimates, combining the two resulting sets of estimates and computing quantiles. Doing so gives a median TCR estimate of 1.45 K, with a 16–84% range of 1.29–1.62 K and a 5–95% range of 1.18–1.74 K. However, this does not account for all sources of uncertainty.
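The sampling procedure just described can be sketched as follows. The two best estimates and the observed warming come from the text. The observational standard deviation assumes the 0.225 K width is a normal 5–95% range. The slope standard errors are not given in this article, so the values used here are hypothetical placeholders and the printed ranges should not be expected to reproduce those quoted above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

obs_mean, obs_sd = 0.606, 0.0684  # K; sd assumes a normal 5-95% range of width 0.225 K
tcr_fwd, tcr_rev = 1.43, 1.47     # K; the two best estimates quoted in the text

# Geometric mean of the two regression-direction estimates.
geo_mean = float(np.sqrt(tcr_fwd * tcr_rev))

# Slopes implied by those estimates; the standard errors are HYPOTHETICAL placeholders.
b, se_b = tcr_fwd / obs_mean, 0.05    # TCR regressed on warming
c, se_c = obs_mean / tcr_rev, 0.009   # warming regressed on TCR

# Sample slope and observed warming for each regression type, then pool.
samples_fwd = rng.normal(b, se_b, n) * rng.normal(obs_mean, obs_sd, n)
samples_rev = rng.normal(obs_mean, obs_sd, n) / rng.normal(c, se_c, n)
pooled = np.concatenate([samples_fwd, samples_rev])

q05, q16, q50, q84, q95 = np.quantile(pooled, [0.05, 0.16, 0.50, 0.84, 0.95])
print(f"geometric mean {geo_mean:.2f} K, median {q50:.2f} K")
print(f"16-84% range {q16:.2f} to {q84:.2f} K, 5-95% range {q05:.2f} to {q95:.2f} K")
```

Pooling the two sample sets treats each regression direction as equally plausible, which is a simple design choice rather than a rigorous uncertainty model.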

Interestingly, when fitting a relationship between ECS and post-1975 warming, and between ECS and TCR, the authors didn’t use a y-intercept term, resulting in those fits passing through the origin.

My key point is that an analysis method that results in a physically reasonable estimated relationship between the variables being studied should be used. An estimated relationship that implies zero warming with a positive TCR, and significant cooling with a zero TCR, is unphysical. Therefore, the results of the Nijsse et al. paper are unreliable and should be discounted.
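The size of the implied cooling can be gauged with rough arithmetic. The sketch below assumes, purely for illustration, a straight line that has the roughly 0.7 K y-intercept noted earlier and passes through the point (0.606 K warming, 1.62 K TCR) from the simple with-intercept OLS fit. The slope this implies is a back-of-envelope figure, not a value reported in the paper.

```python
intercept = 0.7      # K, approximate TCR at zero warming implied by the fitted line
obs_warming = 0.606  # K, observed warming as measured off Figure 4(a)
tcr_at_obs = 1.62    # K, with-intercept OLS estimate at the observed warming

# Slope of the illustrative line TCR = intercept + slope * warming.
slope = (tcr_at_obs - intercept) / obs_warming

# Warming the same line implies for a model with zero TCR.
warming_at_zero_tcr = -intercept / slope

print(f"implied slope: {slope:.2f}")
print(f"implied warming at zero TCR: {warming_at_zero_tcr:.2f} K")
```

On these assumptions a model with zero TCR would show a cooling of a little under half a kelvin since 1975.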

Nicholas Lewis                                                           19 August 2020

Update 20 August 2020

Thanks to a commenter pointing out that it was erroneous, in the third paragraph I have removed the statement ‘Unfortunately, the study does not provide a results table for their TCR estimates and the 5-95% TCR range is not stated.’, stated that range and also corrected the reference to the model ensemble involved. In the fifth from last paragraph I have added the TCR estimate that I derive using the CMIP6 ensemble on its own.

[i] Nijsse et al., 2020: Emergent constraints on transient climate response (TCR) and equilibrium climate sensitivity (ECS) from historical warming in CMIP5 and CMIP6 models, Earth Syst. Dynam., 11, 737–750, https://doi.org/10.5194/esd-11-737-2020

Originally posted here, where a pdf copy is also available

65 responses to “Emergent constraints on TCR and ECS from historical warming in CMIP5 and CMIP6 models”

  1. Reblogged this on Climate Collections.

  2. Lance Wallace

    “The authors also derive a…5–95% range (1.5–4.0 K) from CMIP6 models.”
    Kind of humorous that we go through an involved mathematical effort only to end up with (almost) the same 1.5-4.5 range from the NAS meeting 41 years ago.

    • David Wojick

      Yes, except this contrasts strongly with the much higher values coming from the CMIP6 models alone. My guess is that this is part of some of the modeling community’s negative reaction to the dramatic new hotness of many of the CMIP6 models. The IPCC is struggling with how to deal with this suddenly increased model sensitivity. It presents a deep dilemma. If it is right then the models have been wrong for 40 years. If it is wrong then the models are now no good.

      • Gerald Browning

        There is a mathematically rigorous manuscript that will appear in the September issue of DAO that proves that all climate models are based on the wrong atmospheric system of equations so any conclusions based on those models are nonsense.

        Gerald Browning

      • Gerald Browning | August 23, 2020 at 7:21 pm |

        There is a mathematically rigorous manuscript that will appear in the September issue of DAO that proves that all climate models are based on the wrong atmospheric system of equations so any conclusions based on those models are nonsense.

        Looking forward to it … should be fun.


  3. Alasdair Fairbairn

    I would like to see this graph zoomed out to include negative TCR figures on the Y axis. I have a feeling that it would get into a bit of a mess.

  4. If I repeat the exercise but without estimating a y-intercept, thereby forcing the regression fit to match a zero TCR with zero historical warming, the emergent constraint gives a TCR best estimate of 1.43 K.
    This paper and Nic’s recalculated TCR estimate both falsely assume that almost all of the warming since 1975 was caused by greenhouse gases (GHG). There are several hundred technical papers that show that a significant amount of the recorded global warming was caused by natural climate change and the uncorrected urban warming included in the land temperature record. There is an obvious 60-year temperature oscillation related to the AMO that started increasing around 1975. There is also an obvious millennium scale temperature oscillation, which began to increase at about 1700, which was also the end of the Maunder Minimum. Solar activity increased to a maximum at 1992 which would cause a maximum temperature response two or more decades later. Several studies show that the urban heat island effect (UHIE) caused about half of the warming from 1980 over land. The non-GHG warming 1975 – 2019 due to the 60-year cycle, the millennium cycle and the UHIE, is 0.204 °C, 0.037 °C and 0.176 °C, respectively. The sum of 0.417 °C of non-GHG warming must be removed from the temperature record for an appropriate comparison to the climate model results.

    The Lewis & Curry 2018 paper calculated a TCR best estimate of 1.20 °C, which is far less than the 1.43 °C Nic calculated above using the assumption that TCR would be 0 when the temperature change is 0. The LC18 paper effectively accounted for the 60 year cycle by using a long time period (136 years between the midpoints of the start and end periods, which were at similar parts of the oscillation).

    Unfortunately, the LC18 analysis was deficient in that the natural climate change from the base to final periods was not considered and no correction was applied to remove the urban heat island effect (UHIE) from the temperature record. This study https://friendsofscience.org/index.php?id=2519 presents corrected estimates of ECS and TCR with uncertainty estimates by including the UHIE and natural warming. The median (best estimate) values of ECS and TCR are estimated at 1.04 °C and 0.83 °C, respectively. Global average temperatures are forecast to increase by 0.63 °C from 2019 to 2100, assuming the GHG concentrations in the atmosphere increase exponentially and no natural climate change. The FUND economic model, using updated energy impacts and CO2 fertilization effects and assuming an ECS of 1.0 °C, calculates that a 2 °C GMST rise from 2000 would increase global wealth by 1.45% by 2147, equivalent to 2019US$1.26 trillion.

    • I agree that the influence of the AMO in particular on post-1975 warming is a distorting factor. It very likely increased warming significantly, thereby increasing these emergent constraint TCR (and ECS) estimates. I wanted to keep this article brief and focus on the statistical estimation issue, so I didn’t attempt to make it comprehensive.

      There are also, as you say, issues with measured warming being biased relative to that in climate models, although there are also claims that the temperature datasets used understate global mean surface air temperature warming.

      • Nic,
        Can you please provide a link to a paper that makes the claim that temperature datasets understate the global mean surface temperature?
        I am quite sure we can find obvious errors in it.

    • Steven Mosher

      There is no measurable UHIE in the global surface temperature record.

      Psst, McKitrick 2007 (which you refer to) is basically junk, as he calculated population growth
      incorrectly, and it’s his most important regression variable.
      He also calculated GDP growth incorrectly.

      • Steven Mosher
        “McKitrick 2007 (which you refer to) is basically junk as he calculated population growth incorrectly, and it’s his most important regression variable. He also calculated GDP growth incorrectly”

        Although I didn’t refer to McKitrick 2007, what is the evidence for your above-quoted statement?

      • You give no evidence that the population or GDP growth rates are wrong. There is no error in the paper and several other papers give similar results. Your mistaken belief that the temperature record isn’t contaminated by UHIE is likely due to noting that the rate of temperature rise at low population density is similar to that at high population density. But the highest rate of temperature rise is at the lowest population density. So comparing the rate of temperature rise in regions of high and low population densities tells you nothing about the UHIE. The statistical correlation between indicators of economic development and temperatures is a correct method to measure the UHIE.

      • As usual, SM has nothing.

      • Ken Gregory wrote:
        “Solar activity increased to a maximum at 1992 which would cause a maximum temperature response two or more decades later.”

        The maximum for the solar wind in the space age was in the early to mid 1970’s, that drove a colder AMO, and so did fairly strong solar wind states in the mid 1980’s and early 1990’s. Post 1995 the solar wind weakened, and drove a much warmer AMO. And then North Atlantic cold blobs showed up 2000-2001, 2014-2015, and 2017-2018 when the solar wind strength was up a bit. The biggest lag is from negative NAO/AO and associated El Nino episodes, which drive major warm pulses to the AMO around 8 months later.

    • Ken, the IPCC would agree. On page 17 of AR5 Physical Science volume, it states they are confident that more than half of the post-1951 warming was due to human forcing, that is half of the 0.5 degC.

  5. Dang … that’s gotta sting. However, it’s totally accurate and a lovely proof—to be physically congruent with theory, a TCS of zero must yield zero warming. Well spotted, Nic.

    As a consistent heretic, I have to point out that the assumption at the base of this whole discussion is that all other variables somehow magically cancel out, and that at the end of the day,

    ∆T = λ ∆F

    where T is temperature, F is forcing, ∆ is “change in”, and λ is climate sensitivity.

    I am unaware of any rigorous examination of evidence for this claim. It obviously is trivially true for say a block of steel.

    But for complex systems like the climate or the human body, it may not be true at all. For example, if I walk out in the sun, ∆F changes by hundreds of W/m2, and my core temperature ∆T barely moves … what is my “climate sensitivity”? Pretty near zero.

    And if you don’t think that happens in the climate, here’s a clear example. CERES satellite data lets us calculate the relationship between downwelling radiation (∆F) and the surface temperature (∆T). And for most of the world, we find that indeed, they are positively correlated—when forcing goes up, temperature goes up in a roughly linear fashion.

    But we also find in a large expanse of the equatorial oceans that the exact opposite is happening. Temperature and forcing not only decouple, but they move in opposite directions—forcing is increasing as the temperature goes down!

    (hopefully that shows up … if not I’ll repost a link)

    Perhaps someone can tell me … in the blue areas in the graph, where ∆T and ∆F are negatively correlated … just what is the TCR and the ECS? Minus 1.5?

    My best to all, and once again, nice work, Nic.


    • Thanks, Willis. Good to hear from you.

      Nice chart. But I don’t think one can assume that correlations in monthly data, which are driven mainly by natural fluctuations, ENSO being a large contributor, tell one much about long term relationships that reflect a forced response to changing atmospheric composition.

      Because of the ocean mixed (surface) layer’s large heat capacity and its mixing timescale of > 1 month, one would expect fluctuations in absorbed radiation to be negatively correlated with SST in the absence of it affecting clouds (which it does, but I’ll ignore that for a moment). That’s because a colder surface will reradiate less strongly. However, low cloud amount is generally negatively correlated with local SST, so a lower SST will reduce absorbed solar radiation. The balance between these two effects will vary, so I don’t find it particularly surprising that the sign of the correlation in your chart varies geographically.

      • Thanks, Nic. I never considered that the negative correlation in the tropical Pacific might be a result of a lag between forcing and response as you say. Hang on, that can be checked … OK, I got the data. Seems like that is NOT the case. Here’s the cross-correlation of total absorbed radiation (downwelling LW + SW – upwelling SW) with the surface temperature.


        As you can see, the negative correlation is NOT the result of any lag or delay in the response …

        I was also curious about your claim that “low cloud amount is generally negatively correlated with local SST”. Do you have a link for that? Seems to me that in the tropics that wouldn’t be true … oh, man, now I have to go look at that. Hey, the fun never ends.

        Best regards to you and that good lady who puts up with you,


      • OK, took a look. Median cloud top height is around 6 km, so I used that to divide clouds into high and low. CERES data sez you are correct that the area of low cloud is often negatively correlated with local SST, although the global average correlation is near zero … but not in the blue areas where temperature rules absorbed radiation. In those areas, either the correlation is positive or there’s very little low cloud.

        So I fear that both legs of your conjecture, that 1) there is a lag effect going one way and 2) a negative SST-low cloud correlation going the other way, are not supported by the data in the area in question, the blue areas of negative correlation above. In those areas 1) there’s no lag between absorption and SST and 2) there’s a positive low cloud/SST correlation.

        My best to you, and again I have to commend your analysis in the head post. It’s not often that you can actually prove something is wrong in climate science, but you’ve managed it most excellently.


    • Willis: Where temperature doesn’t vary with total radiation, I’d look for another source of energy – in this case latent heat. One possibility is that trade winds sweep moist air and latent heat from the subtropics to the ITCZ, where it is convected aloft, converting latent heat to sensible heat. Although that sensible heat is released aloft, it could warm the surface through the adiabatic lapse rate by reducing upward convection of surface heat.

      When we have a hypothesis that causes us to look for a correlation between A and B, it is always worth remembering that correlation is not causation. A correlation between A and B could mean that: A causes B, B causes A, or that both are responding to a third factor C. In this case, C (imported latent heat) might be correlated with A (local temperature) and D (cloudiness) and D is inversely correlated with B (absorbed radiation).

  6. The illusion that ∆T = λ ∆F may arise from the fact that the extratropical land ∆T is indeed strongly positively correlated with ∆F, and that’s where most people live.

    But the oceans and the tropics tell a different story. Average ocean correlation is only 0.44, and as mentioned, large areas are negatively correlated …

    This means that while forcing controls the temperature many places, in the blue areas the only conclusion possible is that the temperature is controlling the downwelling radiation … which kinda knocks a hole in the underlying equation that claims that forcing roolz temperature …


  7. When CO2 went from 300 to 400 parts per million, that is one more molecule added to ten thousand in the atmosphere. When people study and model TCR and ECS, they are spending their time on almost nothing.

  8. Geoff Sherrington

    You write “CO2 is a remarkably powerful absorber of infra red”.
    That is in turn a remarkably powerful statement of dogma, quite unexpected from you. How is it remarkable, how is it powerful, when in reality it is what it is, within known physics?
    Here is an exam question for you, though some question its assumptions:
    What is the minimum number of CO2 molecules in the atmosphere able to cause a temperature change of 0.1 degrees C? This resembles the need in your critique to consider the zero-zero point intercept. As in, one CO2 molecule is intuitively too minor to cause a detectable effect, ditto 2, 4, 8, 16, … a million molecules … more?
    When is there enough CO2 to have that effect? The modellers use ratios like a doubling to avoid this question.
    Geoff S

  9. “Unfortunately, the study does not provide a results table for their TCR estimates and the 5-95% TCR range is not stated.”

    Did you even read the paper? It’s 1.0 – 2.3 K according to Table 3.

    • Peter East,
      Thank you for pointing out that the paper does provide a 5-95% range using the CMIP6 ensemble and warming to 2019. My bad (in mitigation, Table 3 isn’t actually a results table – it is part of the Discussion and conclusion section and provides comparisons with other studies, in the case of the CMIP5 ensemble over a different period than that used in this study’s results). I have now corrected the article and stated the 5-95% range. If you’d like me to credit you by name at the end of the article as well as here for pointing this error out, I’d be happy to do so.

      • stevefitzpatrick

        Maybe the 0.6 to 0.7 intercept at zero transient sensitivity just is telling us that there was “background warming” of 0.6 to 0.7 due to the AMO and other long term cyclical processes between 1975 and today.

      • stevefitzpatrick,
        I don’t think the non-zero TCR intercept can be anything to do with natural internal variability, since both the post-1975 warming and the TCR values plotted are those in global climate models, where AMO etc. internal variability is not synchronised in any way with that in the real world.

      • Calling the AMO internal variability is the greatest error in the global models. It acts as a negative feedback to changes in the solar wind strength, and with considerable overshoot.
        The global warming from 1975 onward started from very strong solar wind states driving cold ocean phases, including multi-year La Nina, and shifted to weaker solar wind states driving warmer ocean phases, particularly after 1995.
        The AMO phases control changes in low cloud cover, the warm phase reduces it, and it also increases lower troposphere water vapour, in part due to increased surface wind speeds over the oceans since 1995.


  10. The problem with Bayesian statistics when it comes to weather prediction, let alone climate prediction, is that the results have more to do with how you would like them to come out than how they actually will come out, e.g., Bayesians might say we know Hillary Clinton actually won the presidential election in 2016, based on the evidence in the polls, because that’s what people actually want to know.

  11. Pingback: Emergent constraints on TCR and ECS from historical warming in CMIP5 and CMIP6 models |

  12. Jupiter emits 2-3 times more energy by radiation than it receives from the sun.

    • Jupiter emits exactly what it receives from the sun and not more.


      • No it doesn’t. That’s the point. Read the question again and have another try.

      • The jovian planets get their heat from the Sun and from their interiors. Jupiter creates a lot of internal heat and releases this heat by emitting thermal radiation. In fact, Jupiter creates so much internal heat that it emits almost twice as much energy as it receives from the Sun. The only reasonable explanation is that Jupiter is still slowly contracting, almost as though it has not quite finished forming.


      • Thank you for an interesting link.
        It gives interesting data about the jovian planets densities.
        Also it says:

        “Internal Heat
        The jovian planets get their heat from the Sun and from their interiors. Jupiter creates a lot of internal heat and releases this heat by emitting thermal radiation. In fact, Jupiter creates so much internal heat that it emits almost twice as much energy as it receives from the Sun. The only reasonable explanation is that Jupiter is still slowly contracting, almost as though it has not quite finished forming.

        Saturn and Neptune also appear to be emitting more energy than they receive from the Sun. While we are certain Saturn is not still contracting, it seems clear that Neptune is still contracting. Uranus is the only jovian planet not emitting excess internal energy.”

        But how do we know? Jupiter receives 50 W/m2 solar flux. And the mean temperature of Jupiter at 1 bar level is measured to be T = 165 K.
        And Jupiter has albedo a = 0.503.
        How do we know Jupiter emits almost twice the amount of energy that it receives?


  13. Patterns are unlikely to be the reason for “low biased estimations of the sensitivity from observations”. New paper by Nic in collaboration with Thorsten Mauritsen (also a “heavyweight climate researcher”) released yesterday: https://journals.ametsoc.org/jcli/article/doi/10.1175/JCLI-D-19-0941.1/354283/Negligible-unforced-historical-pattern-effect-on .
    Looking forward to a blog post by Nic with some helpful illustrations of this very technical stuff. Congrats on this new reviewed paper!

  14. Richard Greene

    After 14 years of study I have calculated the TCR to be 1.175, however I am not sure of the second and third decimal places. I’ll be back in 14 years with the precise answer.

  15. Richard, 14 years? Really? Reading ( and understanding of course) one paper is enough: https://niclewis.files.wordpress.com/2018/04/lewis_and_curry_jcli-d-17-0667_accepted.pdf

  16. Nic: Sherwood et al. (2020) gives me the feeling that this publication may represent justification for AR6 to reach the conclusion that ECS is higher. Are you going to comment on this subject?

    We assess evidence relevant to Earth’s climate sensitivity S: feedback process understanding, and the historical and paleo-climate records.
    ● All three lines of evidence are difficult to reconcile with S 4.5 K.
    ● A Bayesian calculation finds a 66% range of 2.6-3.9 K, which remains within the bounds 2.3-4.5 K under plausible robustness tests.

    doi: 10.1029/2019RG000678

    (I’d say a pattern effect in the output of AOGCMs causes them to over-estimate warming and reduce cloudiness in the Eastern Pacific, thereby inflating model ECS.)

  17. What is the current best estimate of ECS and TCR derived from empirical data, not models?

  18. Pingback: Weekly Climate and Energy News Roundup #420 | Watts Up With That?

  19. Test.

  20. Hi Nic,
    Like you, I am not a fan of many papers that claim to match observational data using “emergent constraints”. When a straightforward approach demonstrates that the AOGCMs are failing to match observational data, it seems that you can always find a climate scientist somewhere who will produce a Byzantine, error-prone method in order to prove that the models are compatible with said observational data.

    In this instance, there are several criticisms which can be leveled against the Nijsse et al. paper, the most important of which IMO relates to the choice of the post-1975 period. During this period, predictably recurrent natural variation (the quasi 60 year cycle) would be expected to contribute to an upswing in temperature above the low frequency trend, irrespective of the addition of the usual basket of external forcing drivers recognized by the AOGCMs. Attribution of the entire temperature gain to just these radiative drivers must of necessity lead to overestimation of TCR, or an estimate of the TCR upper bound.

    If I suspend my disbelief on the above issue, then one of the things of note before any complicated statistical analysis is that only about one third of the selected 26 CMIP6 models actually manage to predict a temperature gain over the chosen period that falls within the estimated range of the 0.5 – 0.7 “observed” temperature gain. We note also that 16 of the 26 models predict values above the upper bound of the temperature range. If one accepts the argument of the authors that this chosen period represents the most robust interval in terms of signal-to-noise ratio, then this surely represents dispositive evidence that the inclusion of these 16 models in any ensemble averaging of 21st century temperature prediction will yield a strong bias to over-prediction. In a very short time, no doubt I will hear the clarion call of honest climate scientists in street protests and special reports on the subject led by the BBC and the Guardian.

    It is worth running a measure over the sample statistics of the 9 CMIP6 models which do manage to land in the “observed” temperature range for the post-1980 period. Importantly, the TCR values for this subset display no obvious correlation with temperature gain. They have a mean value around 1.55 with a 90% CI of (1.14, 1.96). The TCR samples all lie in the range 1.32 to 1.64, except for one outlier – CNRM-ESM2-1 with a value of 1.92. Since this model has 5 repeat runs to arrive at its values, I suspect that the reason it is an outlier is that the ratio of its change in ERF over the selected period divided by its ERF for 2xCO2 must be anomalously low relative to the other models in this subset. The alternative possibilities are that its PIControl has a persistent long term decline in temperature, or it has lurched into an unstable (and unbelievable) solution for one or two runs. There do not appear to be any other obvious explanations for a low predicted temp gain averaged over 5 runs with such a high deterministic TCR.
    An important question then is whether the above simple calculation procedure is enhanced or damaged by making use of the “additional information” contained in the correlation between the temp gain and the TCR. In this instance, I would argue that the use of any univariate linear regression is more likely to damage the estimate than to improve it. There are two reasons and I will try to keep them distinct.

    First, in an idealized world, this TCR to predicted temp relationship should pass through the origin, true. However, the theoretical relationship, while being close to linear, is not actually linear. It is a curve. With TCR as Y-axis, it should display a gradually decreasing gradient. Given that the actual data carries most of its density in values of predicted temperature above the range of interest, then forcing a linear fit through the origin introduces a small low bias in the estimate of TCR. Conversely, allowing a free OLS fit with intercept introduces a high bias into the estimate of TCR.
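
    A minimal sketch of that first point, under stated assumptions: the quadratic curve, its coefficients, the sampled warming range and the 0.6 K "observed" value below are all hypothetical, chosen only to give a concave curve through the origin with the data sitting above the range of interest. It is not the relationship in the paper or in any model.

```python
# Hypothetical concave TCR-vs-warming curve through the origin (illustrative only).
def true_tcr(warming):
    return 2.6 * warming - 0.6 * warming ** 2

# Synthetic "model" points concentrated above the observed warming range.
xs = [0.7 + 0.5 * i / 19 for i in range(20)]
ys = [true_tcr(x) for x in xs]

def fit_through_origin(xs, ys):
    """Least-squares slope for y = b*x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_with_intercept(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

x_obs = 0.6  # assumed observed warming, below most of the synthetic points
truth = true_tcr(x_obs)
pred_origin = fit_through_origin(xs, ys) * x_obs
intercept, slope = fit_with_intercept(xs, ys)
pred_free = intercept + slope * x_obs

print(f"curve: {truth:.2f}  through-origin: {pred_origin:.2f}  free OLS: {pred_free:.2f}")
```

    With these made-up numbers the through-origin line lands below the curve at the observed warming and the free OLS line lands above it, with a positive intercept. The size of either bias depends entirely on how curved the true relationship is, which this toy does not attempt to estimate.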

    For the specific data here, however, there is a second problem evident, as you have noted. The intercept value based on free OLS is not credible. It is far too high. I ran some numerical experiments, using a two-body, constant linear feedback model tuned to match C&W temperature profile, and a recent ocean heat construction. I varied TCR by changing climate sensitivity for fixed ocean parameters and ocean parameters for fixed climate feedback. The resulting curve passes through the origin, but the intercept from OLS TCR vs Temp gain using only the upper values of this plot (TCR range 1.72 to 2.7) yields a positive intercept around 0.1. An intercept of 0.2 might just be credible, but the high value of the intercept from the paper’s data suggests that there is a major confounding factor in these data. I suspect that it is the model-specific ratio of ERF change over the period to forcing from a doubling of CO2.

    I don’t think that you can crash through this problem by the simple expedient of substituting a zero intercept line. This just leaves unresolved problems to be unpicked from the data, and the high likelihood that you are using a mis-specified model in the estimation process. Either the data needs to be carefully censored or, perhaps, a multivariate approach is required to squeeze out any useful information. Personally, I don’t think the game is worth the candle in this instance. There seems to be only one certain conclusion to be drawn, namely that the CMIP6 dataset is heavily biased to overpredicting temperature gain. Quantification of the uncertainty in TCR from the univariate correlation risks being a clever illusion IMHO which adds little to understanding.

  21. Thanks, Paul. I agree with almost all your comments. I perhaps should have mentioned the expectation that low frequency natural variability in the form of the AMO is thought to have contributed to the post-1975 warming, resulting in a high bias in TCR estimation based on warming over that period. But I wanted to keep the article brief and the focus on the statistical model.

    On that point, I think that – as your own results suggest – their assumed linear relationship [Nijsse Eq. (3)] between TCR and post-1975 simulated warming is a reasonably accurate and appropriate assumption to make if one believes that there are no factors that confound the TCR – post-1975 warming relationship. If such confounding factors do exist, then the whole approach is unsatisfactory and allowing an offset cannot be expected to fix the problems.

    Moreover, allowing an offset (estimating a y-intercept) to allow for model misspecification risks worsening estimation, since we know that a zero TCR must correspond to zero warming. It would IMO be more logical to estimate a zero-intercept quadratic fit if one wanted to allow for possible model misspecification. But I think doing so would also worsen estimation.

    The authors’ other stated reason for including a fitted offset, to allow for regression dilution (caused by noise in the simulated post-1975 warming), is nonsense. It does exactly the opposite: it allows regression dilution to bias estimation of the regression slope downwards, resulting in a positive y-intercept – exactly the result they obtained. With no intercept term, the estimated slope coefficient is barely affected by regression dilution.
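
    A small simulation can illustrate the regression dilution point. This is a sketch with invented numbers, not the paper's data: the true slope, the spread of warming across "models" and the noise level are all assumptions, and the sample is made very large so that sampling error does not obscure the effect.

```python
import random

random.seed(0)

TRUE_SLOPE = 2.0  # hypothetical TCR per K of post-1975 warming
N = 20000         # far more points than any real ensemble, to suppress sampling error

# True (noise-free) warming per synthetic model, and TCR exactly proportional to it.
true_warming = [random.gauss(0.9, 0.2) for _ in range(N)]
tcr = [TRUE_SLOPE * w for w in true_warming]

# Simulated warming as actually diagnosed: true value plus internal-variability noise.
noisy_warming = [w + random.gauss(0.0, 0.1) for w in true_warming]

def fit_through_origin(xs, ys):
    """Least-squares slope for y = b*x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_with_intercept(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

free_intercept, free_slope = fit_with_intercept(noisy_warming, tcr)
origin_slope = fit_through_origin(noisy_warming, tcr)

print(f"free OLS: slope {free_slope:.2f}, intercept {free_intercept:.2f}")
print(f"through origin: slope {origin_slope:.2f} (true slope {TRUE_SLOPE})")
```

    Under these assumptions the free fit's slope is attenuated by roughly the ratio of true variance to total variance in the regressor (0.04/0.05), and the shortfall reappears as a positive intercept. The zero-intercept slope is attenuated only by the ratio of mean squares, which stays close to one whenever the mean warming is large compared with the noise.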

  22. Climate science is flawed in treating the far-from-equilibrium climate system as if it were in equilibrium. It’s not. The work of Ilya Prigogine establishes the big difference between these two types of system:

    Russian-Belgian physical chemist Ilya Prigogine, who coined the term dissipative structure, received the Nobel Prize in Chemistry in 1977 for his pioneering work on these structures, which have dynamical regimes that can be regarded as thermodynamic steady states, and sometimes at least can be described by suitable extremal principles in non-equilibrium thermodynamics.
    In his Nobel lecture,[4] Prigogine explains how thermodynamic systems far from equilibrium can have drastically different behavior from systems close to equilibrium. Near equilibrium, the local equilibrium hypothesis applies and typical thermodynamic quantities such as free energy and entropy can be defined locally. One can assume linear relations between the (generalized) flux and forces of the system. Two celebrated results from linear thermodynamics are the Onsager reciprocal relations and the principle of minimum entropy production.[5] After efforts to extend such results to systems far from equilibrium, it was found that they do not hold in this regime and opposite results were obtained.

    One way to rigorously analyze such systems is by studying the stability of the system far from equilibrium. Close to equilibrium, one can show the existence of a Lyapunov function which ensures that the entropy tends to a stable maximum. Fluctuations are damped in the neighborhood of the fixed point and a macroscopic description suffices. However, far from equilibrium stability is no longer a universal property and can be broken. In chemical systems, this occurs with the presence of autocatalytic reactions, such as in the example of the Brusselator. If the system is driven beyond a certain threshold, oscillations are no longer damped out, but may be amplified. Mathematically, this corresponds to a Hopf bifurcation where increasing one of the parameters beyond a certain value leads to limit cycle behavior. If spatial effects are taken into account through a reaction-diffusion equation, long-range correlations and spatially ordered patterns arise,[6] such as in the case of the Belousov–Zhabotinsky reaction. Systems with such dynamic states of matter that arise as the result of irreversible processes are dissipative structures.

    Clouds, ocean and atmospheric circulation patterns are dissipative structures. They are constantly changing and don’t obey rules of linear behaviour.

    Ignoring Prigogine doesn’t make him wrong.

    Hint – it’s not just noise. Dissipative spatio-temporal structures can have millennial timescales.

  23. A series of observations and requests.

    1. models with more simulations have a better-constrained post-1975 warming. This results in a set of 127 simulations from 26 different models.

    Should they just have analysed 26 simulations, one per model?
    Adding repeated runs of favourable models drowns out the idea of weighting each model equally.

    2. They did not use the actual models but “models with data added to them”?

    “We extend the historical simulations from 2014 to 2019 using the shared socioeconomic pathways (SSPs) scenario runs.
    the larger set of models that have historical simulations up to 2014 but no future scenarios”

    3. RCP 8.5
    Emissions – the ‘business as usual’ story is misleading
    Stop using the worst-case scenario for climate warming as the most likely outcome — more-realistic baselines make for better policy.
    Zeke Hausfather & Glen P. Peters