Does the Aliasing Beast Feed the Uncertainty Monster?

by Richard Saumarez

Many continuous signals are sampled so that they can be manipulated digitally.  We assume that the train of samples in the time domain gives a true picture of what the underlying signal is doing, but can we be sure that this is true and the signal isn’t doing something wildly different between samples?  Can we reconstruct the signal between samples and, more important, can we tell if the signal has been incorrectly sampled and is not a true representation of the signal?

This is one of the most basic ideas in signal processing and is worth discussing because it is regularly abused. A number of us have been caught out by this problem, when it occurs rather subtly, so this is a cautionary tale. I will illustrate it using temperature records and suggest that it is not a trivial problem.

A common method of expressing climate temperature data is to take monthly averages.  This seems a perfectly reasonable thing to do; after all, an average is simply an average.  If we wanted to compare the mean July temperature in Anchorage with that in Las Vegas using conventional statistics, it presents no problems.  But when the averages are treated as a time series, or as the representation of the underlying continuous signal, problems can occur.

The number of samples required to describe a varying signal is determined by the Nyquist sampling theorem, which states that the sampling frequency must be at least twice the highest frequency in the Fourier series representation of the signal.  If this is done, the signal is correctly sampled and in principle the intermediate signal between samples can be reconstructed perfectly. (In practice there are some limitations and trade-offs that stem from estimating the Fourier series of an arbitrary signal.)

However, if the signal is under-sampled, it is irretrievably corrupted and is said to be aliased.  This is well understood in many fields and is absolutely basic (Signals 101).  If one has, say, an audio signal that we wish to record digitally, we would wish to resolve ~20 kHz, the highest audible frequency, and so we would record at a sampling frequency of at least 40 kHz.  There may be higher frequencies around during the recording, although we can’t hear them, or artefacts such as clicks, so the analogue signal is filtered with an anti-aliasing, low-pass filter to remove high-frequency components before sampling, or is sampled at a very high frequency, filtered digitally to simulate the anti-aliasing filter and finally re-sampled, or “decimated”, at 40 kHz. Although this discussion is based on signals in time, the concept applies to any sampled system: images, computed tomographic imaging, temperature over the surface of the Earth and so on.
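The effect of skipping the anti-aliasing step is easy to show numerically. The sketch below (illustrative only; the frequencies are arbitrary choices, not from the post) decimates a signal containing a tone above the new Nyquist frequency and watches it fold down to a spurious low frequency:

```python
import numpy as np

fs = 1000.0                          # original sampling rate, Hz
t = np.arange(0, 1, 1/fs)
# wanted 10 Hz component plus 480 Hz interference
x = np.sin(2*np.pi*10*t) + 0.5*np.sin(2*np.pi*480*t)

# naive decimation to 100 Hz (Nyquist = 50 Hz), no anti-alias filter
x_naive = x[::10]
spec = np.abs(np.fft.rfft(x_naive))
f = np.fft.rfftfreq(x_naive.size, d=10/fs)

# the 480 Hz tone folds down: 480 mod 100 = 80 Hz, reflected to 20 Hz
peaks = sorted(round(float(v)) for v in f[np.argsort(spec)[-2:]])
print(peaks)                         # [10, 20]
```

The decimated record now contains an apparent 20 Hz component that was never in the original signal, and nothing in the decimated data itself reveals the corruption.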

Difficulties arise when either one can’t do this because the measurement system doesn’t allow it or the problem of aliasing isn’t recognised. In this case one can be led, unsuspecting, up a long and tortuous garden path.

Sampling a signal is equivalent to multiplying a continuous function by an equally spaced train of impulses (figure 1).

Since the Nyquist theorem is stated in terms of a frequency, we have to consider what sampling does to the spectrum of the signal, and I am taking a short cut by going straight to the discussion of a sampled signal, rather than through the long-winded formal theory.

 Figure 1 Sampling a signal

The spectrum of a signal is represented in both positive and negative frequency. Computation of Fourier series coefficients is a correlation with a sine wave and a cosine wave of a particular frequency and

cos(wt) = ½[exp(jwt) + exp(-jwt)]   and   sin(wt) = ½[exp(jwt) - exp(-jwt)]/j

Therefore the spectrum of a cosine wave with an amplitude of 1 and frequency wf is (½, 0) at a frequency of wf and (½, 0) at -wf. Similarly, a sine wave of the same frequency has a spectrum of (0, -½) at wf and (0, ½) at -wf, i.e. the negative-frequency component of the spectrum is the complex conjugate of the positive-frequency component.
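This conjugate symmetry is easy to verify with a discrete Fourier transform (a numerical check added for illustration, not part of the original post):

```python
import numpy as np

n = np.arange(8)
X = np.fft.fft(np.cos(2*np.pi*n/8)) / 8    # normalised DFT of a cosine
Y = np.fft.fft(np.sin(2*np.pi*n/8)) / 8    # ... and of a sine

# cosine: +1/2 at +f and +1/2 at -f (bin -1 is the negative frequency)
assert np.allclose(X[1], 0.5) and np.allclose(X[-1], 0.5)
# sine: -j/2 at +f and +j/2 at -f
assert np.allclose(Y[1], -0.5j) and np.allclose(Y[-1], 0.5j)
# negative-frequency part is the conjugate of the positive-frequency part
assert np.allclose(Y[-1], np.conj(Y[1]))
print("conjugate symmetry holds")
```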

The spectrum of the sampling process itself is an infinite train of impulses in frequency domain spaced at intervals equal to the frequency of the sampling process (figure 1).

Since sampling is multiplication of a continuous signal by a train of impulses, we can obtain the sampled signal spectrum by convolving their two spectra.  The spectra of a correctly sampled signal and an aliased signal are shown below in figure 2:

The highest frequency that can be determined is half the sampling frequency, known as the Nyquist frequency. In the case of monthly temperature series, the highest frequency that can be resolved is 1/(2 months), i.e. one cycle every two months.

Although this is the theoretical minimum, in practice one generally samples at a higher rate. Therefore in most correctly sampled signals there will be a gap in the spectrum around the Nyquist frequency, as in the upper drawing of figure 2. The spectrum is then resolvable because there is no overlap between its representation at w=0 and its reflection about the sampling frequency.  In an aliased signal, high-frequency components of the signal overlap and are summed (as complex numbers) with low-frequency components, so the spectrum becomes irresolvable and the signal is corrupted.  This has two rather important implications:

a)     One cannot interpolate between samples to get the original signal.

b)    A high-frequency aliased component in the signal will appear as a lower-frequency component in the sampled signal.

Figure 2 Spectrum of a correctly sampled signal (upper) and an aliased signal (lower).
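Point (b) can be made concrete with a small helper (a hypothetical function written for this post, not a library routine) that computes where a given frequency lands after sampling:

```python
def alias_freq(f, fs):
    """Apparent frequency, after sampling at fs, of a tone at f.

    Anything above the Nyquist frequency fs/2 is folded back down.
    """
    f = f % fs
    return fs - f if f > fs / 2 else f

# monthly sampling: fs = 12 cycles/year, Nyquist = 6 cycles/year
assert alias_freq(5, 12) == 5                    # below Nyquist: unchanged
assert alias_freq(7, 12) == 5                    # 7/yr masquerades as 5/yr
assert abs(alias_freq(12.1, 12) - 0.1) < 1e-9    # just above the yearly
                                                 # harmonic -> a slow drift
```

Note the last case: a component just above a multiple of the sampling frequency reappears as a very low frequency, which is exactly the mechanism for spurious slow trends discussed later in the post.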

Figure 3. Mean amplitude spectrum of 10 yearly records from Bergen, Norway. The 1-year component is suppressed, as it is huge compared to the other components

Are temperature records aliased?

Out of curiosity I looked at the HADCRUT series.  I extracted intact 10-year records from the series that contained no missing data and calculated the amplitude spectrum (after trend removal and windowing), for individual stations (Figure 3) and the mean of all valid records, shown in figure 4.

Figure 4 Ensemble amplitude spectrum of 5585 ten-year monthly temperature records.

These certainly look aliased at first sight, but without access to daily records it is impossible to be sure that aliasing is occurring; the temperature might be fortuitously sampled at exactly the correct frequency.

This raises two questions:

a)     Can one explain how the temperature record has become aliased?

b)    Does it really matter?

[Or, in translation: is this simply a load of pretentious, flatulent, obfuscating, pseudo-academic navel-gazing? I will show that, yes, aliasing is likely to be present and it may matter.]

Why are temperature records aliased?

Temperature records are constructed by taking daily, or more frequent, observations and taking their average over each month.  In signal processing terms this is a grotesque operation.

Forming the average of a train of samples is a filter.  One has convolved the signal with an impulse response that consists of 30 equal-weight impulses, and this filter has an easily calculable frequency response, shown in figure 5:

Note that the first zero is at 1/month.  If this filter were run over every daily observation, one would get a low pass filtered version of the daily signal.  Clearly, this does not attenuate all the high frequency components in the daily signal.
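The response in figure 5 is the Dirichlet kernel of a 30-tap equal-weight average. A few lines confirm both the zero at 1 cycle/month and the substantial leakage between the zeros (a numerical check for illustration; frequencies are in cycles/day):

```python
import numpy as np

N = 30                                   # 30 equal-weight daily samples
f = np.linspace(1e-6, 0.5, 5001)         # cycles/day, up to the daily Nyquist
H = np.abs(np.sin(np.pi*N*f) / (N*np.sin(np.pi*f)))

# first zero at 1/30 cycles/day, i.e. 1 cycle/month
assert H[np.argmin(np.abs(f - 1/30))] < 0.01
# but the first sidelobe (around 1.5/30 cycles/day) still passes ~21%
# of the amplitude - far from the stop-band of a proper low-pass filter
assert H[np.argmin(np.abs(f - 1.5/30))] > 0.2
print("boxcar response checked")
```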

The filtered daily signal is then sampled at monthly intervals, so the Nyquist frequency is six cycles per year (a period of two months, or 1/6 of a year).  If components of the signal with frequencies higher than the Nyquist frequency exist and are not attenuated by the averaging filter, these will become aliased components.

Figure 5 Frequency response of 30-day average.

To investigate this I have modelled a daily temperature record with the following components:

1)    A basic sinusoid, -cos(2 pi t), where t is years starting on January 1st, to model the yearly cycle with an amplitude of 30 °C.

2)    This modified by a random amplitude modulation of 2.0 °C to simulate variability of peak summer and minimum winter temperatures.

3)    A 15-day random modulation of phase so that spring and autumn can come a bit early or late.

4)    A heat wave during summer that can occur randomly from the beginning of June until the end of August and last a random length of between 5 and 15 days.  Its amplitude is random, between 0.5 and 5 °C.  A similar “cold snap” is added during December and January.

5)    A random, normally distributed measurement error with a standard deviation of 0.25 °C.

6)    Rounding of the temperature reading to the nearest 0.1 °C.

One would not claim this is an exact model of a temperature station, but it contains features that are slowly and rapidly varying that would give the signal some of its properties.  Any rapidly moving components, for example short temperature excursions, will generate high frequencies, while the modulations will generate harmonics of the yearly cycle.
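A minimal sketch of such a simulation is below. It is not the author's code: the 360-day year (for clean 30-day months), the random-number seeds, and the omission of the winter cold snap are simplifications of mine, but it implements components 1–3, the summer part of 4, and 5–6 of the list above.

```python
import numpy as np

rng = np.random.default_rng(0)
YEARS, DPY = 10, 360                 # 360-day "years" give clean 30-day months
days = np.arange(YEARS * DPY)
t = days / DPY                       # time in years

temp = np.empty(days.size)
for yr in range(YEARS):
    sel = slice(yr*DPY, (yr+1)*DPY)
    amp = 30.0 + rng.normal(0.0, 2.0)           # 2) amplitude modulation
    shift = rng.normal(0.0, 15.0) / DPY         # 3) ~15-day phase jitter
    temp[sel] = -amp * np.cos(2*np.pi*(t[sel] + shift))   # 1) yearly cycle

    start = yr*DPY + int(rng.integers(150, 240))          # 4) summer heat wave
    temp[start:start + int(rng.integers(5, 16))] += rng.uniform(0.5, 5.0)

temp += rng.normal(0.0, 0.25, temp.size)        # 5) measurement noise
temp = np.round(temp, 1)                        # 6) 0.1 C resolution

monthly = temp.reshape(-1, 30).mean(axis=1)     # the monthly averaging step
print(monthly.size)                             # 120
```

The short heat waves are the important ingredient: a 5–15 day excursion puts energy well above the monthly Nyquist frequency, where the 30-day average only weakly attenuates it.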

The spectrum of a 10-year record is shown in figure 6, with the frequency response of the averaging filter.

Figure 6 Spectrum of a 10-year record of simulated temperature with the amplitude response of the averaging process superimposed upon it.

Figure 7 The spectrum after applying a 30-day averaging filter.  Note: the spectrum of this signal sampled at 1/month is obtained by reflecting this spectrum around 1/month (red line) and adding it (as a complex number) to the original.

Once this spectrum is filtered, as shown in figure 7, the spectrum of the monthly signal is obtained by convolving it with the spectrum of the monthly sampling process, which is a train of impulses spaced at intervals of 1/month along the frequency axis.

This results in a severely aliased spectrum, which is shown in figure 8, and this suggests that the HADCRUT data is aliased.

Figure 8 The calculated spectrum after sampling at 1-month intervals.  Note that this is aliased. The simple yearly cycle has been subtracted; the components shown around 1 year-1 are due to modulation.

Does it matter?

One important feature of aliasing is that it creates spurious low frequency signals that can become trends.  Using 150-year model records, this can be examined by creating yearly anomaly signals.

Figure 9 Spurious trends in the error between the true integrated yearly signal and the value obtained by taking monthly and then yearly averages.

The simulated daily signals (I have constructed them carefully to ensure that they aren’t aliased) are reduced to monthly averages and then used to form yearly errors between the processed signal and the integral of the daily signal, which is trend-free.  A particularly bad example of this is shown in figure 9.

Note that the magnitudes of these trends are significant in terms of the variability of temperature records and that they are artefacts created by processing records where there are no trends.
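The mechanism behind such drifts can be isolated in a toy construction (mine, not the author's): a component just above the yearly harmonic survives the 30-day average weakly, and after monthly sampling it reappears at 0.1 cycles/year — one full cycle across a ten-year record, i.e. an apparent trend.

```python
import numpy as np

DPY = 360                                   # 360-day year, 30-day months
t = np.arange(10 * DPY) / DPY
# a component just above the yearly harmonic, at 12.1 cycles/year
x = np.sin(2*np.pi*12.1*t)

# 30-day averaging, then monthly sampling (fs = 12/year, Nyquist = 6/year)
monthly = x.reshape(-1, 30).mean(axis=1)

# the surviving residue reappears at |12.1 - 12| = 0.1 cycles/year
spec = np.abs(np.fft.rfft(monthly))
f = np.fft.rfftfreq(monthly.size, d=1/12.0)  # cycles/year
print(round(float(f[1 + np.argmax(spec[1:])]), 3))   # 0.1
```

The amplitude of the residue depends on how close the offending component sits to a zero of the averaging filter, but its frequency — and hence the decade-scale character of the artefact — is fixed by the aliasing alone.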

Repeating this process for 100,000 records, the distribution of the magnitude and duration of these spurious trends can easily be estimated; it is shown as a 2-D histogram in figure 10.  The green area is the empirical 95% limit.  The marginal distributions, shown in red, are not on the same vertical (probability) scale but merely indicate the shape of the distribution.

Figure 10 Histogram of trend magnitude and duration.

Therefore one can conclude that aliasing in temperature records may be important.

Given daily records, it is preferable to integrate them in order to get average temperatures, paying careful attention to the effects of the finite precision of a thermometer, which is probably a maximum of 1:150 over most temperature ranges, but is more typically 1:100.  To get a monthly time series, one should filter the daily record with a carefully designed filter, which will inevitably have a long impulse response, to obtain a smoothed daily record.  This can then be sampled at monthly intervals.
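That procedure — smooth the daily record with a carefully designed low-pass filter, then sample monthly — might be sketched as follows. The windowed-sinc design, tap count and cutoff here are illustrative choices of mine, not a prescription from the post:

```python
import numpy as np

def lowpass_sinc(cutoff, numtaps):
    """Windowed-sinc low-pass FIR; cutoff in cycles/sample."""
    n = np.arange(numtaps) - (numtaps - 1) / 2
    h = 2*cutoff * np.sinc(2*cutoff*n) * np.hamming(numtaps)
    return h / h.sum()                 # unity gain at DC

rng = np.random.default_rng(1)
t = np.arange(3600) / 360.0            # 10 "years" of daily data
daily = -15*np.cos(2*np.pi*t) + rng.normal(0, 3, t.size)

# cutoff at the monthly Nyquist (1/60 cycles/day); note the long
# impulse response (here 181 days) that such a filter needs
h = lowpass_sinc(1/60, 181)
smoothed = np.convolve(daily, h, mode='same')
monthly = smoothed[::30]               # now safe to sample monthly

print(monthly.size)                    # 120
```

The long impulse response is the price of a sharp cutoff: unlike the 30-day boxcar, this filter genuinely suppresses the frequencies that monthly sampling would otherwise fold down.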

There is a further problem in using data that may be aliased as inputs to models. The effects will clearly depend on the model, but as an example I made a simple linear model of a system with three negative feedbacks with different feedback gains and time constants.  This is driven with low-pass-filtered broadband noise (not aliased), and also with the same input decimated to create an aliased input.  The results are shown below.  The true output is shown in black; green is the input sampled 25% below the Nyquist frequency, and red 50% below.  If you wanted to extract the parameters of the model from the aliased input, they would of course be wrong.

Predicting what would happen in more complex situations, for example using principal component analysis when some components were aliased and some were not, is difficult (even a nightmare).  However, any model that is constructed using sampled data should be viewed critically.

Figure 11 Output of a simple feedback model driven with a correctly sampled signal (black) and increasingly aliased versions as inputs (green and red).

Aliasing is a problem that rears its ugly head in many different fields.  In terms of pure analogue time-domain signal conversion, the procedure to prevent it is well-known and straightforward – anti-aliasing filters.  Problems occur when you can’t simply filter out high frequency components.  For example, this was a problem in early generation CT scanners where abrupt transitions in bone and soft tissue radio density caused aliasing because they could not be sampled adequately by the x-ray beams used to form the projections.

The key to dealing with aliasing is to recognise it and given any time series one’s first question should be “Is it aliased?”

JC comment:  My concerns regarding aliasing relate particularly to the surface temperature data, especially how missing data is filled in for the oceans.  Further, I have found that running-mean/moving-average approaches can introduce aliases; I have been using a Hamming filter when I need one for a graphical display.  This whole issue of aliasing in the data sets seems to me to be an under-appreciated issue.

271 responses to “Does the Aliasing Beast Feed the Uncertainty Monster?”

  1. How many will “get it”? Perhaps a detailed example could be of help.

    • Not detailed, but here is a simple demonstration

    • In electronics this is resolved easily by placing an “anti-aliasing” filter before the ADC (analog-to-digital converter) which performs the interval sampling. This is simply a low-pass filter that removes any part of the signal spectrum above half the sampling rate of the ADC.

      This is a proper and necessary step to trust the data you analyze.

      If you have a signal with a lot of variation, you should average it with a moving-window filter (which is a low-pass filter), or a low-pass filter of your choice, if you want to decimate it (reduce the data set size).

      Reading a rain gauge once a day is actually a low-pass filter. It gives you the average rain per day instead of, say, checking it every hour. Obviously if you summed 24 hourly readings you would get identical results to the daily gauge reading. However, the classic aliasing mistake would be to take just one of the hourly readings, say 2 pm, and multiply it by 24 to get a daily total. This is clearly not equivalent and is subject to error.
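      The rain-gauge point can be checked in a couple of lines (hypothetical numbers, just to make the contrast concrete):

```python
import numpy as np

rng = np.random.default_rng(42)
hourly = rng.exponential(0.5, 24)      # a day's hourly rainfall, mm

daily = hourly.sum()                   # same as reading the gauge once a day
naive = hourly[14] * 24                # "one 2 pm reading times 24"

# the sum reproduces the gauge exactly; the single-sample estimate does not
print(abs(naive - daily) > 1e-9)       # True
```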

      All things being equal, it is usually best to not throw away data and analyze what you have in full.

      I believe, and in my experience find to be true, that many (most?) scientists are much more lacking in basic signal processing skills than the public would assume. In fact it was Mann’s lack of math skills, so carefully identified by McIntyre, that led me down the skeptic trail, and revealed to me how sloppily some academic research is done, and how defensive they are about admitting basic errors, to the detriment of the whole of science.

  2. Stern Caution:
    Aliasing doesn’t necessarily imply sampling error.

    Earth itself aliases.

    Vaughan, P.L. (2011). Shifting Sun-Earth-Moon Harmonies, Beats, & Biases.

    To make it more complicated, the aliasing integrates (via circulation).

    Don’t forget about spatial aliasing. Too many signal processing expert contributors appear hardwired to the time-only dimension.

    Remember that marginal temporal & spatial summaries differ fundamentally from joint spatiotemporal summaries.

    There’s a spectacular logjam on the mainstream river to climate enlightenment, caused by failure to recognize the spatiotemporal version of Simpson’s Paradox.

    Might as well call it Simpson’s Logjam.

    Best Regards.

    • Don’t forget about spatial aliasing. Too many signal processing expert contributors appear hardwired to the time-only dimension.

      There’s a spectacular logjam on the mainstream river to climate enlightenment, caused by failure to recognize the spatiotemporal version of Simpson’s Paradox.

      Might as well call it Simpson’s Logjam.

      Well said.

      I think that the term “aliasing” dramatically understates the problem.

    • I think what you meant to say is that obtaining contradictory results does not always imply erroneous results due to sampling error.

  3. Richard S,
    Excellent post. Shifting the view to signal processing refocuses the discussion on real data and algorithm shortcomings.

  4. –>”… I will illustrate it using temperature records and suggest that it is not a trivial problem….”

    Ditto re the measurement of atmospheric CO2 at Mauna Loa, the site of an active volcano where the concentration can swing 600 ppm in a single day–e.g., Tim Ball: “Elimination of data occurs with the Mauna Loa readings, which can vary up to 600 ppm in the course of a day. Beck explains how Charles Keeling established the Mauna Loa readings by using the lowest readings of the afternoon. He ignored natural sources, a practice that continues. Beck presumes Keeling decided to avoid these low level natural sources by establishing the station at 4000 meters up the volcano. As Beck notes “Mauna Loa does not represent the typical atmospheric CO2 on different global locations but is typical only for this volcano at a maritime location in about 4000 m altitude at that latitude.” (Beck, 2008, “50 Years of Continuous Measurement of CO2 on Mauna Loa” Energy and Environment, Vol 19, No.7.) Keeling’s son continues to operate the Mauna Loa facility and as Beck notes, “owns the global monopoly of calibration of all CO2 measurements.” Since Keeling is a co-author of the IPCC reports they accept Mauna Loa without question.” (“Time to Revisit Falsified Science of CO2,” December 28, 2009)

  5. Judith said

    ‘My concerns regarding aliasing relate particularly to the surface temperature data, especially how missing data is filled in for the oceans’.

    Using the word ‘missing’ hardly does justice to the way that some grid cells in SSTs are occupied by a handful of observations in a year which then fills in data for that cell for the entire year.
    Or by missing did you mean ‘made up?’


    • Yeah yeah complain away, but I bet many of the skeptics that will scream bloody murder about missing grid cells in the temperature records will nevertheless be referencing hadcrut warming trends in the early 20th century in argument…

      You never hear a skeptic wonder if the early 20th century warming perhaps never happened….

      • Nullius in Verba

        Hmm. I wonder if the early 20th century warming never happened, and it actually cooled half a degree to the 1940s? That would mean we had just returned to normal…


      • Well, this agnostic has said several times that the data are so dodgy that one cannot confidently say anything about global warming, other than it might have occurred. And if that is the case, we cannot say that whatever happened in the 20th century was ‘unprecedented’, either.

      • Well it may be that some of the apparent cooling post-1940 never really happened…I seem to remember a series of posts at CA and maybe here about bucket adjustments…

  6. There are indeed spurious trends in published surface temperature series, but they aren’t introduced by aliasing. They’re introduced by “corrections” applied to the raw data. Details on request.

  7.  “… is this simply a load of pretentious, flatulent, obfuscating, pseudo-academic navel-gazing…”

    Rather than unabashed Omphaloskepsis, it is far more likely that wilful ignorance and unconscious incompetence underlies most of what we observe in the climatology of global warming alarmists, which outside Western civilization has been likened to the science of ancient astrology.

  8. Rather than looking at simulated data the obvious thing to do is to look at some real daily data and see if this actually matters in practice. You can get daily data from

    Of course there are subtle issues about how daily data is acquired, but that’s to some extent a different question.

    • Richard Saumarez

      Thanks. I did this after you suggested it, but I have not done an exhaustive study as the data is difficult to access in a convenient form (or at least I haven’t found out how to do it). One difficulty is dealing with drop-outs, and I have simply linearly interpolated across them.

      I agree that this is important when looking at trends, but to do this for the whole US data set would be a major undertaking.

      As one might expect, the effects vary from station to station. Wapato shows this effect with a 0.35-degree excursion between the data (as integrated) over a 20-year period. One problem is that there is a bias in the data, which is recorded at 1-degree intervals, and the integral drifts over time – this is, I suppose, inevitable when the data has been rounded, truncated(?).

      • Nullius in Verba

        For simple dropouts, you can construct an indicator series that is 1 where you have data and 0 where you don’t. The transform of that gives you something that gets convolved with your signal. It’s a generalisation of the finite sample length problem, and if serious might be addressed by some of the same methods, like tapering.

        A bigger issue is when dropouts are due to things like station moves, when the error is a step function or something worse. The sharp edges of the step contribute high frequencies, and its persistence low ones. As you’ve already done, you can Monte Carlo possible distributions (many of them are known or obvious) and look at the uncertainty this adds.

        Quantisation error is also tricky. I’m not convinced that the usual assumptions of independent, uniform distributions are valid – particularly after taking anomalies. But I don’t know what you could do about that.

      • I seem to recall that Lucia Liljegren took a long hard stare at quantisation error and decided that it probably didn’t matter? But I may be misremembering to the point of making this up…

      • steven mosher

        Yep. it doesnt matter either. That was a nice piece of work by Lucia

      • Richard Saumarez

        Yes, I agree it doesn’t matter. It was simply an observation.

      • Nullius in Verba

        “I seem to recall that Lucia Liljegren took a long hard stare at quantisation error and decided that it probably didn’t matter?”

        That is one thing that particularly irks me over the way climate scientists behave – and Steve Mosher on this thread is demonstrating it again – and that is this idea that you can say “the errors don’t matter” and everything is alright again.

        Even if they don’t make a difference to the conclusion, the errors do matter. It matters if you haven’t checked, and just got lucky. It matters if the data is OK for one purpose but not another, and this isn’t made obvious. It matters if you leap at the throat of everyone who tries to check these things for themselves, or even asks the question without checking, as if it were the conclusion rather than the method that mattered.

        It’s interesting and potentially educational to ask the question. If it’s already been thought about and there’s an easy answer, then a shortcut or guide would be gratefully received. But the only time errors “don’t matter” is when they had already been quantified and everybody made aware of it.

        I don’t know if quantisation errors matter. You’re combining flaky and intermittent data rounded to the nearest 1 C, putting it through the statistical meat grinder, and deriving results to 0.1 C precision or better. How is that possible? I know there are people who believe that if you just take enough quantised data to average you could resolve individual atoms, but real world errors don’t work that way. Mathematical assumptions and approximations that are good enough for most purposes are not perfect and cannot be pushed indefinitely. How far can you go? How close are you to the edge?

        While I tend to agree that the issues raised here are not a big deal for calculating trends (and disagree profoundly with the claim that such trends are necessarily meaningful), trends are not the only thing people are interested in. The same thing happened with Watts’ paper on station siting – everybody leapt on the fact that various large errors contributing to the trend happened to cancel, and ignored everything else. And then said “the errors didn’t matter”.

        Have we made sure everybody knows *why* they don’t matter?

      • steven mosher

        Lucia published her code.

        rather than argue whether it mattered or not she tried to prove that it didnt.

        So, go get her code, have a look. Maybe there is a problem. Maybe not.
        But simply because you have an argument ( words on a page) means nothing. It means less than nothing when people have given you the tools and you refuse to use them.

        What bothers me is that people forget why we fought for data and code. We fought for it so that people like you could have the tools to either prove your points or not. we fought to give you power. the power to look at the data yourself and process it yourself. Instead, you ignore the tools that people produced for you and resort to words on a page.

        man up. write some code and prove Lucia wrong. prove that it does matter and why? I used to think it mattered. hell go back to CA and see the idiot mosher blather on about all these things. working through the problem for yourself changes things. It makes you less tolerant of people who refuse to do the work, even when you make it easy for them.

      • Brandon Shollenberger

        Mosher, could you provide a link to where Lucia did this? I tried searching her blog, and I didn’t find it.

      • Richard Saumarez

        I think there are two issues here, which I alluded to, but didn’t make clear.

        1) If you want to compute a “global mean temperature” during 1960-1970, aliasing of the data doesn’t matter a fig.

        2) If you want to use data as an input to drive a model or calibrate it, this is a more subtle problem, which concerns me much more.

        Suppose you have a GCM; these models, I believe, are run at 1-hour steps. If you are going to drive them, you need data sampled at intervals of the order of 30 minutes. No data we have had, historically, is sampled at anything like that rate. If you are going to use it, you have to filter it heavily and interpolate.
        Daily data, unless gathered instrumentally with an anti-aliasing filter, is also aliased, quite badly in fact, because it is full of jumps and rapid processes. Turning this into a mean over a month, or even a year, gets rid of the problem as a statistical observation. However, if you use this data to drive a short-term model, you will encounter problems unless you are very careful.

      • Nullius in Verba

        “could you provide a link to where Lucia did this?”

        The best I can find is here.

        There’s a spreadsheet, where Lucia models rounding errors with the usual perfect mathematical properties (0.499999 gets rounded down, 0.500001 gets rounded up) and a link to Chad who does something similar applied to some real data.

        They appear to be arguing about something slightly different – the idea that with 1 C quantisation, you cannot get an average to *any* better than 1 C or 0.5 C accuracy. As they say, that’s not true. You *can* improve accuracy by averaging, but not indefinitely.

        Maybe there’s something else at Lucia’s I’ve missed?

      • Richard,
        In what circumstances do you think temperature data are used to “drive a model”?

      • Brandon Shollenberger

        Nullius in Verba, surely there is some other thread or source where Lucia actually examines the problem you discussed. Otherwise, Mosher would have been rudely challenging you based upon a complete fabrication.

      • Brandon Shollenberger

        Nullius in Verba, a quick addendum. That spreadsheet made me interested in the effect of rounding, so I decided to check it out. Unless my math is off (which it very well could be), rounding values which are normally distributed will actually introduce biases in your results. The magnitude of the biases will cancel out, but it will skew the distribution of your results.

      • Richard Saumarez

        @Nick Stokes.

        SB2011 and D2011 use monthly intervals and HADCRUT to drive their “models”. They calculated results at monthly intervals although the processes involved may, I repeat may, have a shorter time scale than that. This is my interpretation of what they said and I may be wrong. As I pointed out in the main post, this will produce some rather odd results, even if you are postulating a lowpass filter as suggested by their first order ODE.

      • Nullius in Verba

        “Unless my math is off (which it very well could be), rounding values which are normally distributed will actually introduce biases in your results.”

        The usual analysis takes the distribution of the data (pdf of a Normal, say), chops it into blocks one quantum thick, and then adds all the blocks together. When the pdf changes only slowly compared to the quantisation size (i.e. you quantise at a much finer resolution than the spread of the variable) and it’s not too skewed or spiky, each block is fairly flat and the result is approximately uniform on the interval (-1/2,1/2). You get from that the usual basketful of statistical parameters – zero mean, 1/12 variance, etc. – that allows you to draw conclusions.

        But as you indicate, if the distribution changes quickly compared to the quantisation, you can get a significantly non-uniform distribution, with a non-zero mean. And if you examine it closely enough, it will *never* be exactly uniform.

        In the case of temperatures (not temperature anomalies) you get a distribution that looks more like sec(T-T_av) than a bell-shaped normal, with peaks at the ends where the temperature hangs around for a while near the maximum or minimum. You might expect some sharp changes near the ends. However, the middle of the distribution is fairly flat, and even the ends are blurred out, so it’s unlikely to be very obviously non-uniform. You most commonly get problems when you quantise *twice* at different resolutions. (e.g. round to the nearest degree F, then convert to C and round again.)

        But even with this analysis, it assumes that humans reading a thermometer round according to some exact, constant error distribution. That you can in principle get more and more accuracy by averaging more and more data. Eventually it becomes like the Emperor of China’s nose. You wind up measuring the measurers, and tracking fashions in the metrology instead of the meteorology.
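        The standard analysis described above, and its breakdown when the spread is small compared with the quantum, can both be seen numerically (a toy check of my own, not Lucia's calculation):

```python
import numpy as np

rng = np.random.default_rng(3)
true_mean = 12.34

# spread (2 C) >> quantum (1 C): the rounding error is nearly uniform
# on (-1/2, 1/2) and averages out over many readings
wide = np.round(rng.normal(true_mean, 2.0, 100_000))
print(abs(wide.mean() - true_mean) < 0.05)    # True

# spread (0.05 C) << quantum: nearly every reading rounds to 12,
# and no amount of averaging recovers the 0.34 C bias
narrow = np.round(rng.normal(true_mean, 0.05, 100_000))
print(abs(narrow.mean() - true_mean) > 0.2)   # True
```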

      • SB2011 and D2011 use monthly intervals and HADCRUT to drive their “models”. They calculated results at monthly intervals although the processes involved may, I repeat may, have a shorter time scale than that. This is my interpretation of what they said and I may be wrong. As I pointed out in the main post, this will produce some rather odd results, even if you are postulating a lowpass filter as suggested by their first order ODE.

        It’s clear that using monthly data it’s not possible to study processes of shorter time scale, and that using such data also influences results with a time scale of about one month. That has, however, nothing to do with aliasing unless there’s a periodic phenomenon of shorter period that is strong enough to make the average dependent on the relative timing of this phenomenon and the cutoff points of the months used in calculating averages (more generally, the effect is present if there’s an autocorrelation that extends significantly to a time lag of one month, but which varies at a faster rate).

        These kinds of problems are possible in principle, but very unlikely to be significant enough in comparison with other issues to warrant special handling. Using a filter to smooth the data before it’s used in the analysis, and in particular using it before another filter is introduced in calculating the average, would only lose information.
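        The condition described above can be made concrete. A small Python sketch (idealised 30-day months and a deliberately strong 29-day oscillation, purely for illustration) shows how a sub-monthly period that is incommensurate with the month length folds into a slow apparent cycle in the monthly means:

```python
import numpy as np

days_per_month = 30                       # idealised calendar, for illustration
n_months = 290
t = np.arange(n_months * days_per_month)
daily = np.sin(2 * np.pi * t / 29)        # a strong 29-day oscillation (invented)

# Monthly averaging = boxcar filter + decimation to one sample per month.
# A 29-day cycle cannot be represented at that rate, so it folds (aliases)
# into a slow oscillation of the monthly means.
monthly = daily.reshape(n_months, days_per_month).mean(axis=1)

spec = np.abs(np.fft.rfft(monthly))
peak = spec[1:].argmax() + 1              # ignore the DC bin
alias_period_months = n_months / peak
print(round(alias_period_months, 1))      # ~29 months: a spurious multi-year cycle
```

        The folded oscillation is weak (the boxcar attenuates it heavily), which is consistent with the point that such effects are real in principle but rarely dominant.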

      • Pekka,

        Please comment on what I wrote below at 9:40, if you’re interested.


      • That has, however, nothing to do with aliasing unless there’s a periodic phenomenon of shorter period that is strong enough to make the average dependent on the relative timing of this phenomenon and the cutoff points of the months used in calculating averages

        Very true. If there was, say, a spike every July 21st where everything momentarily got hotter and the sampling did not pick it up, then it would be an issue. But these are what are referred to as pathological or degenerate cases. One can spend a lot of time tracking down these phantoms if you are so inclined.

      • Brandon Shollenberger

        Nullius in Verba, thanks for your response. It’s good to know I understood the situation properly. Now if only Mosher could clarify his comment about what Lucia has actually done.

      • “I agree that this is important when looking at trends, but to do this for the whole US data set would be a major undertaking.”

        Judith – sounds like a job for the BEST Team

      • steven mosher


        The data for 26,000 daily temperature stations is easily downloadable.
        It takes a while, but the code is all written for you. And it’s free.

        It’s as easy as this, using the DailyGhcn package in R:

        library(DailyGhcn)   # provides the directory constants and download helpers

        #### Downloads all the data that has either Tmin or Tmax
        if (!file.exists(DAILY.QA.DIRECTORY)) dir.create(DAILY.QA.DIRECTORY)
        if (!file.exists(DAILY.DATA.DIRECTORY)) dir.create(DAILY.DATA.DIRECTORY)
        if (!file.exists(DAILY.FILES.DIRECTORY)) dir.create(DAILY.FILES.DIRECTORY)
        if (!file.exists(MONTHLY.DATA.DIRECTORY)) dir.create(MONTHLY.DATA.DIRECTORY)

        cntryFilename <- downloadCountry()
        inventoryFilename <- downloadDailyInventory()
        metadataFilename <- downloadDailyMetadata()
        MinInv <- readDailyInventory(elements = "TMIN")
        MaxInv <- readDailyInventory(elements = "TMAX")
        all <- merge(MaxInv, MinInv, by.x = "Id", by.y = "Id", all = TRUE)
        dlist <- makeDownloadList(all)


        Takes a day to download 26,000 daily stations from around the world.
        It’s raw data with 14 quality-control flags.

        The next step is removing data that fails QA.

        Doing trends for the whole US is maybe a day’s work unless you want custom work. Not a big job at all.

      • Richard Saumarez

        Actually, I don’t think that this is quite as simple as you think. Had you noticed aliasing? Daily data is aliased.

        If you want to drive or calibrate a model, this is a problem. If you want to integrate the data over time and space, it doesn’t matter.

  9. There are minute-by-minute and hour-by-hour temperature data available by anonymous ftp. I have a program to extract specific data. It’s not refined or turn-key, and requires hand-crafting and a (shudder) Fortran compiler to be useful. Let me know if I can assist with analyses of the data.

    • Dan,

      I managed to get to the website you linked by typing in the address.

      Interesting. I don’t know anything about signal processing, but what if you took one of the longest hourly series, Mauna Loa or Barrow Alaska, and did successive filtering on it to try to find the highest relevant frequency, and then compared it to a monthly average?

      I guess this is what Richard is talking about above at 12:18 pm but it’s too brief, I don’t know….

      Could minute-by-minute observations possibly matter in this context (for T, w/r/t signal analysis)? I can see them mattering for some variables… is it reasonable to say a priori that the daily T cycle is the one with the highest frequency?

      • Richard Saumarez

        Replying to your post below.

        Let me say that I found aliasing very difficult to understand when I first encountered it. The fine details can be very subtle, even for people who have been trained in the subject.

        Aliasing is an unpredictable, non-linear transformation of a continuous signal into a sampled signal. The way this is normally avoided is to use a low-pass filter to remove high-frequency components in the signal before sampling it. If you have an analogue electrical signal, this makes the problem easy.

        However, if there were an impulse between samples, you would see evidence of this in the sampled signal, because the signal going into the analogue-to-digital converter would have the impulse response of the anti-aliasing filter imposed upon it.

        If you have a signal that is “sampled” by some other means, i.e. reading a thermometer and recording the result, this safeguard is lacking. You may catch the physical effect of the impulse creating a slow increase and then a decrease in temperature in, say, 6 hours. Now imagine that the next day the same thing happens, but at a different time. You will capture a different part of the impulse response. The problem is that if you don’t sample this adequately, you can’t tell that the temperature has risen and fallen twice, but you will record what appears to be a slowly moving trend in the daily temperature.

        Therefore, in daily readings, you will get a spurious impression of what is happening between samples. If there is activity at a frequency higher than 1/(2 days), you will not capture it with daily sampling. Features such as impulses or step changes have a high bandwidth, so in spectral terms they can be aliased.

        One quick and dirty way to determine the frequency content of a signal, and this is a variant on what you are suggesting, is to take a signal and decimate it. You then interpolate the decimated signal optimally, which is a convolution of the samples with sin(t)/t, and see if there are systematic differences between the original and reconstructed versions.
        You can’t tell whether a signal is aliased by filtering with increasingly severe low-pass filters, because the theory of the filter explicitly assumes that the signal is not aliased. If, however, the signal is genuinely not aliased, using filters of different bandwidths allows one to tell what features are located in which part of the spectrum. This is a valuable technique since, if, say, you decide that the process you are interested in is a slow-moving trend, or alternatively rapid excursions, this enables you to isolate the features of interest in the signal.
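        The decimate-and-reconstruct check described above can be sketched in a few lines of Python; the test signals and the decimation factor are invented for illustration:

```python
import numpy as np

def sinc_interp(samples, factor):
    """Whittaker-Shannon reconstruction of intermediate points from samples
    taken every `factor` steps (convolution with the sin(t)/t kernel)."""
    n = np.arange(len(samples))
    t = np.arange(len(samples) * factor) / factor   # fine time axis, in coarse-sample units
    return np.array([np.sum(samples * np.sinc(tt - n)) for tt in t])

t = np.arange(4000)
slow = np.sin(2 * np.pi * t / 200)                # safely below the decimated Nyquist limit
fast = slow + 0.5 * np.sin(2 * np.pi * t / 7)     # adds a component far above it

results = {}
for name, sig in (("slow", slow), ("fast", fast)):
    dec = sig[::10]                               # decimate by 10 (Nyquist period = 20 steps)
    rec = sinc_interp(dec, 10)
    # compare away from the edges, where the truncated sinc sum is inaccurate
    results[name] = np.sqrt(np.mean((sig[500:-500] - rec[500:-500]) ** 2))
    print(name, round(float(results[name]), 3))
# The adequately sampled signal reconstructs almost perfectly; the aliased one cannot.
```

        Systematic residuals between the original and the reconstruction, as in the "fast" case, are the tell-tale that the decimated version is aliased.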

      • I used to use a very good analog to digital converter and always found very strange harmonics. The stirring bar of my oxygen electrode was a good source. I eventually found that I could get rid of a huge amount of cyclical noise by sampling at a prime frequency, then filtering at a different prime.
        I really like this article.

      • If you know the frequency of your interference and the frequency of your signal of interest, you can usually select a sampling rate that moves all the energy of the interference away from your signal of interest.

        It does get complicated in that anything but a sine wave will have harmonics, and you must make sure the harmonics of both signals do not conflict. Using primes is a cool idea, although you have to be able to control the interference and signal source, which is rare in the real world in my experience.

        Usually my problems are related to 50/60 Hz electrical interference, or similar with light sources.

      • “Let me say that I found aliasing very difficult to understand when I first encountered it.”

        Here’s a homely analogy that I found helpful. I know you don’t need it, but some might. Suppose you see a snapshot of a long distance running race between two runners on a 400m circular track. One is leading by 40m.

        Or seems to be. But he could be 360m behind. Or 760m, or 440m ahead. You don’t know, but you probably think the 40m lead is most likely, as lapping is not common.

        Whenever you observe a frequency in a regularly sampled signal, there is a similar ambiguity. Corresponding to the lap length is the sampling frequency. Signals separated by the sampling frequency are indistinguishable. Usually, you expect that the lowest possibility is the one you want. So if the sampling frequency is high, that enhances the contrast.

        As you approach the Nyquist frequency, the distinction between lowest and next lowest fades. It’s like seeing that race with the runners 200m apart, on opposite sides of the track. You can’t guess who’s in front. The Nyquist frequency is half the sampling freq, just as the point of max ambiguity is half the lap length. But there’s some ambiguity at any frequency level.

        But the notion of aliasing results from your wish to interpret frequencies as being the lowest. Sometimes you don’t. If you’re timing an engine, you have a strobe light running at the target frequency. You see a fan blade slowly rotating. But if you know what you are doing you interpret this as the next frequency up, and adjust accordingly. If you don’t know, you risk injury – the lowest frequency interpretation is wrong.
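        The “lap” ambiguity is easy to demonstrate: two sinusoids separated by exactly the sampling frequency produce identical samples. A minimal Python sketch (the frequencies are chosen arbitrarily):

```python
import numpy as np

fs = 100.0                                 # sampling frequency, Hz (arbitrary)
t = np.arange(50) / fs

low = np.sin(2 * np.pi * 10 * t)           # a 10 Hz "runner"
high = np.sin(2 * np.pi * (10 + fs) * t)   # a 110 Hz runner, exactly one "lap" ahead

# At the sample instants the two are indistinguishable:
print(bool(np.max(np.abs(low - high)) < 1e-9))   # True
```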

      • One way to detect aliasing is to change the sampling rate, and then examine the spectrum to see if it changes “as expected”.

        If you slow the sampling rate down to, say, 90% of the data rate, and one of your signal peaks in the spectrum jumps, you can be fairly confident that the signal is aliased. You can even calculate the signal’s actual frequency with some ugly math and enough sampling-rate changes.
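        A hedged Python sketch of that test (the 70 Hz tone and the two sampling rates are arbitrary): the apparent spectral peak of an aliased tone moves when the sampling rate changes, where an unaliased one would stay put.

```python
import numpy as np

def apparent_freq(f, fs, n=2048):
    """Location (Hz) of the spectral peak of a sinusoid of true frequency f
    sampled at rate fs -- i.e. the frequency you would naively report."""
    t = np.arange(n) / fs
    spec = np.abs(np.fft.rfft(np.sin(2 * np.pi * f * t)))
    return float(spec.argmax() * fs / n)

f_true = 70.0   # Hz, above the Nyquist frequency of both rates below

print(round(apparent_freq(f_true, 100.0)))   # 30 -- folded about 50 Hz
print(round(apparent_freq(f_true, 90.0)))    # 20 -- the "peak" jumps with the rate
```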

  10. Norm Kalmanovitch

    Finally some real science instead of the usual arm-waving conjecture that has kept this ludicrous AGW alive for so long even though it is refuted by all physical data.
    Global temperature is in fact nothing more than a time series and therefore is best evaluated through Fourier analysis, and aliasing is the least of the problems that those pushing AGW must reckon with.
    This is a bread-and-butter issue of my profession as an exploration geophysicist, and with over 40 years of signal processing of seismic data under my belt I can point to several areas of failure of AGW apart from the alias problem of this article.
    Reflections on the seismic record section are limited to the frequency spectrum of the input seismic source wavelet, which means the time series must be limited to the frequency spectrum of the driver to claim that a particular driver is predominantly responsible for the time series.
    The HadCRUT3 global temperature dataset represents a 150-year time series with a dominant 65-year period, as demonstrated by the 50-year moving average applied to this data shown under the heading “Global temperature” and the sub-heading “cyclic air temperature changes” (9th in the listing of contents).
    The increase in atmospheric CO2 concentration, which is another time series, shows a smooth accelerating increase ending in a linear trend of about 2 ppmv/year over the past decade. This time series does not contain this predominant 65-year period, so the increase in CO2 concentration cannot be the driver of observed global temperature change; full stop!

    • Interestingly, Frank Lemke’s post on CE uses no theory or filters, and comes to the same conclusion:
      “The atmospheric CO2 at a time is described very well by the CO2 concentration observed 12 months before, exclusively (auto-regressive model). This model has been posted earlier. However – and this is a most important finding -, CO2 does also not influence any other of the system variables including global temperature. It remains completely autonomous.”
      See Figure 2, here.

  11. There is no limit to the number of errors that can be made in data analysis, but I fail to see how the aliasing problem discussed in this post could be relevant.

    Judith mentioned the issue of filling missing data, where errors related to aliasing are certainly also possible. On that point I have been wondering whether it’s really too difficult to develop methods that would not be dependent on filling the missing data, but would use only the existing data directly in estimating the values that are being searched for. Methods based on the concept of maximum likelihood often work well also in the presence of missing data points.

    • Pekka,

      What do you think of what BEST says about their methods?

      From the FAQ page,

      “What is new about the statistical approach being used?

      The central challenge of global temperature reconstruction is to take spatially and temporally diverse data exhibiting varying levels of quality and construct a global index series that can track changes in the mean surface temperature of the Earth. This challenge presents no easy solution and we believe that there is inherent value in comparing different approaches to this problem as well as understanding the weaknesses intrinsic to any given approach. Thus, we are both studying the existing methodologies for averaging and homogenizing data as well as looking for new approaches whose features seem to incorporate valuable alternatives to the existing methods.

      The statistical methods that we use have been developed by Robert Rohde in close collaboration with David Brillinger, a Professor of Statistics at the University of California at Berkeley, and the other team members. They include the statistical approach called Kriging (a process which allows us to combine fragmented records in an optimum way), the scalpel (which identifies discontinuities and cuts the data at those points) and weighting (in which the program estimates numerically the reliability of a data segment and applies a weight that reduces the contribution of the poor samples). The methods all use raw data as input. There are no manual corrections applied; all the weights and scalpel points are determined using automated and reproducible methods.

      Our algorithms aim to:

      Make it possible to exploit relatively short (e.g. a few years) or discontinuous station records. Rather than simply excluding all short records, we prefer to design a system that allow short records to be used with a low – but non-zero – weighting whenever practical.
      Avoid gridding. All three major research groups currently rely on spatial gridding in their averaging algorithms. As a result, the effective averages may be dependent on the choice of grid pattern and may be sensitive to effects such as the change in grid cell area with latitude. Our algorithms seek to eliminate explicit gridding entirely.
      Place empirical homogenization on an equal footing with other averaging. We distinguish empirical homogenization from evidence-based homogenization. Evidence-based adjustments to records occur when secondary data and/or metadata is used to identify problems with a record and to then propose adjustments. By contrast, empirical homogenization is the process of comparing a record to its neighbors to detect undocumented discontinuities and other changes. This empirical process performs a kind of averaging as local outliers are replaced with the basic behavior of the local group. Rather than regarding empirical homogenization as a separate preprocessing step, we plan to incorporate empirical homogenization as a process that occurs simultaneously with the other averaging steps.
      Provide uncertainty estimates for the full time series through all steps in the process.
      The equations that provide a schematic outline of the approach we are currently pursuing are described in a summary document available here. Our ultimate algorithm will require additional features and modifications to address statistical and observational problems.”


      • The issue is too complex for me to make any specific proposals, but it’s clear that the text that you picked from the BEST FAQ is written in the same spirit as what I have in mind.

        The basic idea is to develop methods that are capable of using the existing data directly, avoiding all intermediary steps that would lose or distort information. The text also mentions the important point that data should be weighted based on accuracy and information content. Doing all that involves technical risks, as new tools must be developed and errors may be introduced in the process. Even so, the results should ultimately be more accurate and reliable.

        One essential point is, however, whether all that extra effort is justifiable. It’s not, if it can more easily be shown that the improvements will be too small to have significance for any final conclusions. That’s quite possible, but often proving that is more difficult than doing the better analysis and finding out that nothing changed at a relevant level.

        A better methodology for handling the surface data may well have benefits by producing better data on the local and regional level, even if it turns out that nothing essentially better is obtained for the global average temperatures.

        One methodology where the aliasing errors are certainly a real problem concerns spatial analysis over the whole globe using some set of orthogonal functions. The details of geography may lead to significantly erroneous results, because continents and other specific features form too-strong gradients to be handled with a reasonable number of orthogonal functions.

      • Pekka,

        Thanks. I didn’t expect a detailed analysis. :)

        “One essential point is, however, whether all that extra effort is justifiable. It’s not, if it can more easily be shown that the improvements will be too small to have significance for any final conclusions. That’s quite possible, but often proving that is more difficult than doing the better analysis and finding out that nothing changed on relevant level.”

        Especially in the current context….

      • Does anyone know why remote sensing or thermal imaging can’t fill in the gaps on the globe?

      • Yes, it can help, but the satellite record only began in the 70’s. There are studies which collect all data for a specific point. This includes lidar and cloud radar measuring from below, satellites measuring from above, radiosondes, and sometimes in-situ flights in between.

      • steven mosher

        Currently, using the BEST-type methods (methods closely related), I can pretty much say that the BEST methods don’t give you significantly different results when performed on the same data source. What the methods do allow you to do is use fragmentary records, so you get some better coverage (which doesn’t change the answer); you also get the standard errors, which helps with better CIs. So better uncertainty measures and some minor perturbations in trend estimates. Guess what?
        The warming in the 1930s doesn’t go away. Neither does the warming of the past 50 years.

        go figure this: If I take the current data and use different methods, including BEST type methods I get the same answer.

        Go figure this: If I take the all those methods and cut my data in half…
        I get the same answer

        Go figure this; If I select 200 stations that have the longest records..
        I get the same answer

        Go figure this: If I add MORE data ( say from GCOS, or ghcn daily)…
        I get the same answer.

        So against the very real theoretical concerns stand the very practical results. The theoretical concerns are interesting for the technically inclined. But WRT the numbers that really matter… not so interesting.
        At least they haven’t been proven to be scientifically interesting.

      • I have no doubt the pattern is in the data. This misses the point entirely. The uncertainties are with the data.

        By the way, I don’t believe there is any way to compute true confidence intervals in the area averaging method. Not CIs that capture the underlying averages used to compute the averages used to compute the average global temperature. You can treat the cell averages as measurements but this too misses the entire point. Every layer of averages has its own CIs, which you would have to combine and aggregate somehow. Nor can there be CIs when you do extensive interpolation and extrapolation. Area averaging on a grid is not statistical sampling. The math of statistical sampling just does not apply.

      • steven mosher

        There is no gridding in the BEST approach.

        Any prediction on how the mean from a gridded approach will compare with non gridded approaches?

        Any prediction on how the BEST confidence interval will compare with say Jones?

        Now’s the time to make a prediction about how important your concern is.

      • What “warming of the last 50 years” are you referring to? First of all, HadCRU only shows a roughly 20 year warming spurt, from 1978-98. According to UAH the only warming during that period was a jump during the 1998-2001 ENSO cycle. There was no warming from 1978-1997. There was no warming after 2001, but the flat line is higher than before the ENSO. See:
        So what is this 50 years of warming?

      • You don’t get the “same answer”. Your answer is just not meaningfully different for its intended purpose. I know what you mean, and agree with you, and get your point. There is enough data, and removing parts of it introduces little error.

        But I agree with a comment above that this tends to generalize too much with these kind of statements and can be misleading sometimes.

        What would be more complete and useful is if you told us at what point of cutting your data (2, 4, 8, 16?) you did not get the “same answer”. Even a general statement like “things tend to fall apart at 64 stations” gives us a better feel for your hard work on the data.

        A visual example is a picture of a dog at say 2048 x 2048 resolution. When you cut the resolution to 1024 x 1024, you get the “same answer”, a dog. A meaningful data point would be to determine at what point it cannot be differentiated from a cat reliably.

      • Hmmm. So no matter how you measure the temperature (terrestrial or satellite), and no matter how you process it, you get the same result.

        Probably time to stop arguing about it then.

    • Richard Saumarez

      I would suggest that if you are attempting to validate a model, as in the discussion of my last post, and the input data is aliased and the output data isn’t, or vice versa, you are going to get some pretty odd results.

      Aliasing is the most basic concept in signal processing and ensuring that the data is not aliased is fundamental in processing any time series as it represents a highly non-linear, and unpredictable, transformation of the data.

      The difficulty arises when you don’t realise that a time series is a signal.

      • Norm Kalmanovitch

        A time series is a signal and signals have signatures, so this might be a way to expose “Mike’s Nature trick” of adding thermometer data to the proxy data in the zone of overlap to make the proxy data perfectly match the thermometer data.
        There is about an 80-year overlap of proxy and thermometer data. If there was no fiddling with the proxy data, the spectral signature of the 80 years of proxy data before the overlap should be identical to the spectral signature of the proxy data in the 80-year overlap; if there was some mischief, the spectral signature of the thermometer data will show up in the 80-year overlap proxy data but not in the proxy data from the previous 80 years.
        Perhaps Michael Mann didn’t realize that a time series is in fact a signal, and this signal may have left fingerprints at the crime scene.
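        For what it’s worth, this kind of segment-to-segment spectral comparison is straightforward on synthetic data. The sketch below (Python; the red-noise model and the spliced-in component are entirely made up, not actual proxy data) shows how an added component leaves a detectable excess in one segment’s spectrum only:

```python
import numpy as np

rng = np.random.default_rng(2)

def red_noise(n, a=0.9):
    """AR(1) 'red' noise, a crude stand-in for a proxy record."""
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = a * x[i - 1] + rng.normal()
    return x

n = 1024
seg_a = red_noise(n)                                               # "before the overlap"
seg_b = red_noise(n) + np.sin(2 * np.pi * 300 * np.arange(n) / n)  # "overlap", with a spliced-in component

f = np.fft.rfftfreq(n)
p_a = np.abs(np.fft.rfft(seg_a)) ** 2
p_b = np.abs(np.fft.rfft(seg_b)) ** 2

# The spliced-in component leaves a spectral fingerprint in segment B only:
band = (f > 0.28) & (f < 0.31)
print(bool(p_b[band].max() > 10 * p_a[band].max()))   # True
```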

      • Any systematic procedure or “intervention” with the data would impose a signal, and leave a “fingerprint”. Adventitiously or by selection or by design, this might be (like) a signal you are hoping/trying to detect in the data. And therein lies the rub.

      • Norm Kalmanovitch

        A seismic record is the convolution of a wavelet with a reflectivity sequence, and the objective of seismic data processing is to reduce the wavelet to a spike, which would reveal the reflectivity sequence perfectly. This of course can’t be done, because the wavelet is band-limited and typically minimum phase, complicated by various phase distortions and amplitude variations.
        If we know the source signal, we can extract the wavelet with a process called signature deconvolution, which is essentially an operator designed on the phase and amplitude spectra of the source wavelet which, convolved with the wavelet, will produce a zero-phase wavelet with a “boxcar”-shaped amplitude spectrum.
        Typically we do not know the wavelet signature; we determine it through various assumptions and use various deconvolution algorithms to replace the wavelet with as close to a spike as possible.
        If the data was “fiddled with”, this type of analysis would show different phase and amplitude spectra for the proxy data before and after the overlap, and this can be compared to the measured data, which will have its own spectral signature.
        The only rub is that the input data for the hockey stick is not made readily available as a time series for this type of analysis, so I can only speculate as to its outcome.
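        As a rough illustration of the idea (not actual seismic practice: the wavelet, the reflectivity sequence, and the water-level constant below are all invented), dividing out a known wavelet spectrum with a stabilising “water level” recovers the spike positions:

```python
import numpy as np

n = 256
reflectivity = np.zeros(n)
reflectivity[[40, 90, 150, 200]] = [1.0, -0.7, 0.5, -0.4]   # sparse reflector series (invented)

t = np.arange(64)
wavelet = np.exp(-0.1 * t) * np.sin(2 * np.pi * t / 12)     # a causal, decaying wavelet (invented)

trace = np.convolve(reflectivity, wavelet)[:n]              # the recorded "seismic" trace

# Signature deconvolution: divide out the known wavelet spectrum, with a
# "water level" to stabilise the division where the wavelet has little energy.
W = np.fft.rfft(wavelet, n)
T = np.fft.rfft(trace, n)
water = 0.05 * np.abs(W).max()                              # stabilisation constant (assumed)
r_est = np.fft.irfft(T * np.conj(W) / (np.abs(W) ** 2 + water ** 2), n)

print(int(np.argmax(np.abs(r_est))))                        # 40: the strongest reflector
```

        The recovered spikes are blurred by the band-limitation of the wavelet, which is the point made above: the spike can only be approximated, never fully restored.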

  12. Can’t aliasing be simply avoided by just using the daily temperatures? Or, if you check your monthly-averaged graph against a graph that only uses the raw daily data, it should be possible to see if aliasing is an issue. For example, it would be nice to see a graph of averaged monthly or yearly surface temperatures plotted against a graph using non-averaged data, and then to compare the underlying trends of both.

    • Daily high or daily low? What about hourly?

      The problem with samples is that they are only samples.

      • steven mosher

        How about 5-minute data? I’ve got hourly data; I can request 5-minute data. In the end you’ll see that concerns about aliasing are overblown. Having compiled hourly data into daily data, daily data into monthly data, monthly data into yearly data, yearly data into decadal data… there is nothing there of SCIENTIFIC note. They are technically interesting, but scientifically unimportant. Simply, the estimates of warming trends are relatively insensitive to

        1. sampling period
        2. spatial coverage

        It’s still warming. GHGs play a role in this. The question is how much?

        Caveat: Dan Hughes’ work on Hurst and daily data IS interesting, but for an entirely different reason.

      • Richard Saumarez

        I’m sure that you are correct in what you say – but this is missing the point. We are presented with datasets in which every elementary theorem of signal processing has been violated. These are the datasets on which we have to draw conclusions.

        “We are presented with datasets in which every elementary theorem of signal processing has been violated.”
        No, I think that is missing Steven’s point. A superficial investigation sees the data presented in a certain way. But it has been sampled monthly in what you see for convenience, mainly to reduce the size of data files. More frequently sampled data is available and has been analyzed. The importance of aliasing is known.

      • Nick,

        So if “The importance of aliasing is known”, it must be documented somewhere?

      • Richard Saumarez

        If the problem of aliasing is well known, may I ask why those producing the datasets didn’t apply a carefully designed digital filter to the data before decimating it?

        This comes back to my original point. If you are going to apply standard statistics to averages of sets of data obtained at different places, this is fine. If you are going to treat the averages as a time series, then it may not be.

        This is qualified by the context of the question. If you are going to integrate the observations in time and space, aliasing effects will be negligible. If you are going to use them as model inputs (i.e. SB2011), the effects may be important.

        there is nothing there of SCIENTIFIC note. They are technically interesting, but scientifically unimportant. Simply, the estimates of warming trends are relatively insensitive to

        1. sampling period
        2. spatial coverage

        It’s still warming. GHGs play a role in this. The question is how much?

        The question of how much is a scientific question, and I think it likely that it can only be answered with data collected at least hourly on small grid scales. It has certainly not been shown to be independent of grid and time scales, and the potentially negative feedback effects of clouds happen on hourly scales. If an increase in CO2 were to cause an acceleration of daytime cloud formation and increase the duration of afternoon cloud cover and rainfalls in the tropics or the American Midwest, then this effect would be totally missed by averaging over space and time. Not a lot of people think this matters, AFAIK, but no one has shown by data and analysis that it is irrelevant. I think almost everyone agrees that the role of clouds is the biggest unknown, and they are mostly transient phenomena, accumulating and dissipating over epochs measured in hours, and spatially spotty besides.

        Matt – As I think Andy Lacis has mentioned in one of the other recent threads, these processes are evaluated at the grid-scale level at less than hourly intervals – the need you mention is not neglected. The results for clouds come out from the GCM models as a net positive feedback, long term, in the longwave (greenhouse) component, and in some cases the shortwave (albedo) component as well, with an overall net positivity. HIRS and ISCCP cloud observational data are consistent with this, although the trends are not long enough to assure us that other factors are not also operating. In any case, those data tend to rule out a substantial negative feedback. Climate sensitivity can also be evaluated by methods independent of specific feedbacks by looking at forcing/temperature relationships, wherein the final results implicitly include the feedbacks even if they are not evaluated separately, obviating the need to ask what clouds are doing, or water vapor, lapse rate, ice-snow melting, etc.

      • steven mosher

        It would be silly to look for CO2 effects on an hourly or grid-scale basis.
        In fact the theory can tell you why you would not see anything.

        But if you want to look at hourly data it exists. Knock yourself out. Code is written and freely distributed to make that job easier for you. Support for that software is free. Now, arguing is easier than proving. Given that there is data, given that the software exists to get that data for you, given that it was done for free, I’d argue that I made proving your case easier. So, go prove it.

      • Stephen Mosher: But if you want to look at hourly data it exists.

        I am looking into it.

        It would be silly to look for CO2 effects on an hourly or grid-scale basis. In fact the theory can tell you why you would not see anything.

        The theory is incomplete and inaccurate. The effect of CO2, like the effect of clouds, is not constant throughout the day-night cycle.

      • The effect of CO2, like the effect of clouds, is not constant throughout the day-night cycle.

        Matt – Of course it isn’t, but the radiative calculations are time stepped to address the changing dynamics. If you claim that this isn’t done with complete accuracy, you would be right, but if you claim that it isn’t done at all, or is far off the mark, you need to justify that claim by going to the heart of the models and pointing out exactly where you think the mismatches are. David Young has actually criticized models in regard to time stepping and suggested possible improvements. That doesn’t mean that their current performance is poor regarding radiative transfer. It’s probably pretty good, and there are greater problems elsewhere.

      • Also, I’m not disagreeing with Steven Mosher on this point. I interpret him to mean that you won’t see CO2-mediated climate change on an hour to hour basis. You will see changes in radiative up and down fluxes. You will also certainly see cloud changes, some of which are entirely unrelated to the concurrent changes in CO2.

      • Fred Moolten: You will see changes in radiative up and down fluxes. You will also certainly see cloud changes, some of which are entirely unrelated to the concurrent changes in CO2.

        Remember what it is that the CO2 does: it absorbs the upwelling radiated energy and transmits it via collisions to the adjacent atmosphere. Otherwise the N2 and O2 components of the atmosphere would not warm up. Since CO2 is densest near the surface, the effect of increasing CO2 will be to slightly increase the disparity between lower troposphere temp and upper troposphere temp, and increase the intensity of thermals and cloud formations. This could have the effect of increasing the speed at which the lower troposphere achieves its maximum temperature, increase the total duration of cloud cover, and increase the total mass of water that ascends, cools, and descends. The effect could be a net reduction of the daily mean temperatures.

        Not by much, of course, but the estimated (from equilibrium assumptions and calculations) change in the near surface Earth temperature is only 1% of the baseline, and the estimated transient climate response (Padilla et al, over 70 years) is half of that. The standard theory omits a mechanism that is potentially potent enough to reverse the estimated short-term and long-term equilibrium-based projections.

        You are repeatedly asserting that what you know from equilibrium models means that these non-equilibrium transients are negligible, but they are not.

      • the effect of increasing CO2 will be to slightly increase the disparity between lower troposphere temp and upper troposphere temp, and increase the intensity of thermals and cloud formations

        Matt – Yes, convective changes and their consequences are accommodated in the GCMs, but they are not hourly consequences of increasing CO2 at current rates of increase; they are far more gradual. Even for a small sudden CO2 increase, they would be measured more in days and weeks, although cloud changes unrelated to hourly increases in CO2 do occur in times measured in hours. To say that increasing CO2 will increase clouds is an enormous oversimplification of the CO2 forcing/cloud relationship and is an inaccurate portrait of what happens. None of this is “missed” by the models, though. It seems to me that if you want to challenge how these are handled, you need first to find out how they are handled. Your assumption that they are overlooked because they are ultimately incorporated into averages is incorrect.

        Your statement that an effect could be a reduction in daily mean temperature is also unjustified by your speculations about heat transfer mechanisms and not easily reconciled with the relationship between temperature and heat loss from the surface via radiation, conduction, and latent heat transfer – at least for anything resembling our current climate. That the effect should be an increase based on known geophysics principles has been worked out quantitatively and is of course supported by observations (including radiative flux measurements) although not proved by them. The Trenberth/Fasullo/Kiehl energy budget diagram gives some clue to the relative strength of the individual phenomena (radiative, latent heat, thermals), but the important data are in the references. The general relationships aren’t controversial, although there are uncertainties about the exact quantitation.

        I think this is an area that you need to know much more about before you decide what is or isn’t being done, and at what timescales the different phenomena operate.

      • There’s also a subtle point regarding your claim that disproportionate warming of the lower troposphere via a CO2 increase could result in a mean temperature reduction because of increased upward movement of heat in various forms. The statement is wrong in its final effects, but it is true that when the lower troposphere is warmed disproportionately, the eventual convective adjustments remove some of that excess warmth, and so the final result is a lower temperature than if the adjustment had not occurred. It is still a higher temperature than before the change in CO2 – a net warming, but readjusted downward to maintain an adiabatic lapse rate.

      • Fred, I am flattered that you have read my posts. I do need to write something longer on these issues sometime, but alas I still have a day job, at least for a few more years. I’ll try to summarize some points that are amply documented in the literature. I apologize for not including links. You can look up some of my publications on fluid dynamics if you want.

        1. Time stepping is just the tip of the iceberg. Usually the larger errors lurk in spatial discretization, the methods used to convert the partial differential operators to discrete representations on a finite grid. GCMs use an essentially uniform grid and, so far as I can tell, finite differences. There are much more modern methods, called finite element methods, that enable solution-adaptive error control through grid refinement and derefinement. We are not talking about factors of 2 here but orders of magnitude. This is critical for a problem as complex as the atmosphere or the ocean. For the ocean, the bottom profile is very complex and has a lot to do with the circulation patterns. Without grid adaptivity, you can get answers that are not just wrong quantitatively, but qualitatively. The shape of the shore likewise has a huge effect.

        2. It’s a coupled system. The dynamics and the radiative forcing are strongly coupled. Just think of cumulus convection. Yet convection is a very complex process. To say that “the details may be wrong, but the overall radiative balance is right” is pure conjecture until proven and, in my experience, is just wrong. If you get the details of the flow over an airplane wrong, the far-field effects will also be wrong, such as downstream vortices, and you will miss critical unsteady effects that can overwhelm the “long term” statistics.

        3. A common source of error in simulations is dissipation. Now, there is real dissipation, because the fluid is viscous. The problem is that naive numerical schemes, like leapfrog with the Robert-Asselin filter, introduce additional viscosity. This viscosity damps the real dynamics and washes it out. Basically, it can convert everything of real meaning into entropy, and the result is totally wrong, not just off by a factor of 2. This dissipation is often associated with the spatial operators as well.

        4. The doctrine of the modelers is generously stated thus: the attractor is the climate, and if it’s strong enough, the trajectory you take to get there doesn’t matter. You will get sucked into the long-term statistics. This is based SOLELY on the fact that the models seem to produce “reasonable” patterns run after run. It seems to have no theoretical basis whatsoever. Excessive dissipation would produce EXACTLY the SAME result. With the grid spacings being used, there is a lot of dissipation.

        5. There are simple numerical checks that most people do with simulations that don’t seem to be common in climate such as refining the grid and seeing if the answer changes, increasing the accuracy of the distribution of the forcings (they are not uniform you know), increasing the fidelity of boundary conditions, etc. and trying to get asymptotic convergence. There are also usually simple cases where analytic solutions are known that you can test your model against.

        6. The issue of subgrid models is also a problem that has tormented fluid dynamics for 100 years. Basically, turbulent processes are tremendously complex and impossible to resolve accurately with current computers. The attempts to model these things are always based on fitting special cases and often are just wrong for other situations. Trust me on this, we can’t even model a turbulent boundary layer in a pressure gradient even with very sophisticated subgrid models that have been worked on intensively for 50 years. Clouds or convection are much more complex.

        7. I like Andy Lacis and he is a very good scientist. However, several of his statements raise big red flags. I’m paraphrasing here; if Andy wants to restate these things, I’ll stand corrected.
        a. “The numerical methods for climate models are a field in themselves.” If I had a nickel for every time I’ve heard this assertion about some field of computational physics, I’d be a wealthy man. My experience is that this usually means that the modelers have been too busy with modeling issues to upgrade their methods and that the methods are not very good.
        b. “The methods used tend to be constrained by computer speed.” Another red flag. The way to do this is to first find a method that is stable and accurate. Then, you speed it up. Andy says that various ad hoc filters are needed for example in the polar regions. That means the underlying methods are very fast but not very accurate.
        c. “Internal variability will be modeled poorly for the foreseeable future.” This means that the models in fact DO have LARGE ERRORS in the dynamics. To assert that the long term statistics or the radiative balance is nonetheless anywhere near the correct result is merely a naked assertion of experience with the models and not of anything that has been verified by rigorous testing.

        Andy, I love you, but I still think you need to rewrite your methods from scratch.

        Anyway, the kind of assertions you are making about the models require rigorous testing, and I have yet to see any evidence whatsoever of this testing. You know, in CFD this was the situation for 30 years, and finally in the last 10 years NASA started doing this kind of thing. When I write something longer, I’ll give references, but the results shocked people. The subgrid models were worse than people thought, and the prediction of small effects was very bad. Incidentally, since this work was done, there has been little significant improvement. By the way, there is a NASA Langley site on validation of turbulence models and it’s quite interesting.

        Anyway, my only point is that there are rigorous criteria by which to judge these things. They are well documented and used in many other fields of computational physics. In a field where policy is at stake, I would come down very hard on people who hadn’t done it. Certainly in structural design or aircraft design, the standards are much more rigorous, and the consequences of missing some critical output such as climate sensitivity by 40%, as Hansen did in 1988, are much more serious.

        Sorry for such a long post. I’ll try to carve out some time to do this right, maybe at Christmas.
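
        Point 3 can be seen in a toy integration. The following is only a sketch (not any GCM’s actual scheme, and the parameter values are made up): a leapfrog step for the neutral oscillator du/dt = -iwu conserves amplitude exactly, but adding a Robert-Asselin time filter of strength nu damps the physical mode along with the spurious computational one.

```python
# Numerical dissipation from a Robert-Asselin (RA) time filter on a
# leapfrog scheme. Test equation du/dt = -i*w*u has exact |u(t)| = 1.
import numpy as np

w, dt, steps = 1.0, 0.1, 5000   # made-up oscillator frequency and step size

def leapfrog(nu):
    """Integrate with RA filter strength nu (nu=0: unfiltered)."""
    u_prev = 1.0 + 0.0j
    u_curr = np.exp(-1j * w * dt)   # exact second starting value
    for _ in range(steps):
        u_next = u_prev + 2.0 * dt * (-1j * w * u_curr)
        # RA filter smooths the middle time level: it suppresses the
        # spurious computational mode, but also damps the physical one.
        u_filt = u_curr + nu * (u_next - 2.0 * u_curr + u_prev)
        u_prev, u_curr = u_filt, u_next
    return abs(u_curr)

print(leapfrog(0.0), leapfrog(0.1))   # ~1.0 unfiltered, far less filtered
```

        Roughly, the filtered amplitude shrinks by about nu*(w*dt)^2/2 per step, so over thousands of steps even a “small” filter coefficient erases most of the signal.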

      • David – Thanks for your long comment. It requires a response from someone more qualified than I am to match your criticisms against the details of model construction and performance. Perhaps Andy will have a chance to respond, as well as others more directly involved in the numerics and/or the fluid dynamics.

        By “models”, we are essentially referring to GCMs. However, it is possible with much simpler models to arrive at similar conclusions about global temperature change and climate sensitivity, as discussed in various threads including the one on transient climate sensitivity. In these cases, if I understand the data correctly, the models are simply physics-based mathematical relationships between heat flow into and out of the climate system and surface temperature, and don’t involve the complex GCM-type simulations of individual climate elements where errors in the numerical solutions to differential equations would be particularly problematic. They are based primarily on physical laws about temperature/radiation relationships and on observational data on climate forcings. They don’t address feedbacks because those are encompassed within the forcing/temperature relationship. With the GCMs, however, I would mention parenthetically that it is also possible to confirm some feedback estimates from observations.

        What I’ve concluded from all this is that we are left with a range of climate responses about which we can be reasonably confident. The range is too wide, and the suggestions you make should be seriously considered by the professional modelers in terms of narrowing the range – I can’t judge that. The GCMs also have important potential roles regarding phenomena other than global temperature change – precipitation, ocean and atmospheric circulation, tropical cyclones, extreme weather events, regional and short term projections, etc. Here, the simpler models are inapplicable, and since these phenomena are of great practical importance, improving the GCM simulations will be critical.

        Ultimately, I agree the models need improvement, and this will help greatly to improve all types of future projections as well as the use of the models to understand climate dynamics. However, I think we can even now make fairly solid judgments about climate behavior, including responses to anthropogenic greenhouse gases, based on current tools. Improvements will be desirable, but don’t contradict the conclusion that we already know a good deal about global temperature change.

      • Fred Moolten: Yes, convective changes and their consequences are accommodated in the GCMs but are not hourly consequences of increasing CO2 at current rates of increase but are far more gradual.

        Let’s leave it at that for now. You are saying that the increase in temperature toward the new equilibrium can’t be occurring in any particular hours. If the 1% change happens at an even rate over 70 years, then the hourly effect can’t be detected.

        Trenberth/Fasullo/Kiehl has holes, as they admit.

        We agree that I need to know more. Where are the references that describe the heat transfer in the summer squalls and such that I have been writing about?

      • Richard Saumarez

        I agree hourly data exists now. I presume that there are anti-aliasing filters, so the data won’t be aliased and will behave properly. That is not the point. Older data was sampled at daily intervals, and this is aliased. If monthly averages are used improperly, this will lead to errors.

        The question is one of analysis. How much does this matter and in what context?

        Fred, I hope this ends up in the right place, since there was no reply button on your post. Anyway, I guess I agree that we already know quite a bit about the influence of GHG on climate. But you know, as an outsider who has been out of the field for 30 years, it impresses me as a field where rigor is lacking compared even to fluid dynamics. I am really trying to convey a desperate plea for the team to get busy and really do fundamental research, rigorous verification and validation, and error control. It seems to me that the current pathetic state of climate science can only be addressed by doing that. I’m quite concerned that since a lot of these guys work for political scientists like Trenberth and Hansen, they may have little choice about what they work on. Hope I’m wrong.

        The other troubling thing is just the complete reluctance to admit uncertainty and possible errors. It’s bizarre and must be chalked up to the “leaders” in the field.

        David Young

      • Except that saying it is still warming and GHGs play a role is really not informative either. People want to know about discrete events, both causes and predictions.

    • Richard Saumarez

      Yes, you should use data sampled at the highest frequency possible. Of course, even daily data is aliased, but selective filtering enables one to remove most of the aliased components and give a reasonably accurate monthly or yearly signal. With electronic thermometers, this can be corrected easily with a digital filter: oversample, filter, and then resample to give a properly sampled signal.
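
      The oversample, filter, resample sequence can be sketched in a few lines (synthetic data and a hypothetical 24-samples-per-day rate; `scipy.signal.decimate` stands in for the anti-aliasing filter plus resampler). The 0.96 cycles/day term plays the role of sub-daily variation: the filtered path removes it, while naive once-a-day sampling folds it into a spurious slow oscillation.

```python
# Anti-aliased resampling vs. naive once-a-day subsampling.
import numpy as np
from scipy import signal

fs = 24                               # hypothetical samples per day
t = np.arange(0, 365, 1 / fs)         # one year, time in days
annual = 10.0 * np.sin(2 * np.pi * t / 365.0)
subdaily = 5.0 * np.sin(2 * np.pi * 0.96 * t)   # above the 0.5 cycle/day limit
temp = annual + subdaily

# decimate() low-pass filters below 0.5 cycles/day before downsampling,
# so the sub-daily component is removed rather than aliased.
daily_filtered = signal.decimate(temp, 24, ftype="fir")

# Naive alternative: keep one raw sample per day. The 0.96 cycles/day
# term folds down to a spurious 0.04 cycles/day (~25-day) oscillation.
daily_naive = temp[::24]
```

      The filtered series tracks the annual cycle; the naive one carries a false month-scale wobble of several degrees that no later processing can remove.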

      • Richard,

        If you are interested in answering beginner-level questions can you respond to what I wrote to Dan Hughes above at 12:33pm?


    • A better answer is not to “sample” but to “accumulate”. If you took the temperature every minute and averaged the result over 24 hours, it would likely be better than any sampling at a specific time per day.

      Using a type of “rain gauge” for temperature.

      But I’m with Mosher on this one: I doubt very seriously that any of this matters much. That’s what data analysis is for, determining what matters and what doesn’t. I think we can trust this data (since 1850). Not so sure about the poles, though.

      What I don’t understand is why they haven’t placed more instruments in the areas of poor coverage in the past twenty years?
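
      The accumulate-versus-sample point is easy to illustrate on a made-up asymmetric diurnal profile (all numbers hypothetical): the mean of minute-by-minute readings and a single fixed-time reading differ systematically.

```python
# Accumulating (dense averaging) vs. sampling once at a fixed time.
import numpy as np

minutes = np.arange(24 * 60)
# Synthetic day: 15 C base plus a sharp afternoon warm peak around 14:00
temp = 15.0 + 8.0 * np.exp(-(((minutes - 14 * 60) / 180.0) ** 2))

true_mean = temp.mean()       # the accumulated ("rain gauge") daily mean
noon_only = temp[12 * 60]     # a single reading taken at noon

print(round(true_mean, 2), round(noon_only, 2))
```

      A max/min average, as many historical records used, carries its own bias; the point is only that any fixed-time rule bakes in a systematic offset that dense averaging avoids.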

  13. John Vetterling

    I am glad someone is bringing this up. But, in my opinion, an even worse aliasing problem is created by the use of “anomalies.” As I understand this process, all of the monthly averages for a 30-year period are used to produce a reference value for each month, which is then subtracted from the monthly average each year. This is such an absurd procedure from a statistical basis that it seems inconceivable that it is still used.

    Let’s destroy 90 percent of the original detail and then attempt to reconstruct the signal from what is left over?

    • I confess I don’t see why the calculation of anomalies by this method is worse than the calculation of absolute temperatures by this method. Changes in long-term averages (non-stationary series) will only shift anomalies up or down by a constant, correct? I can however see that it adds one more step which must be properly recorded.

      • John Vetterling

        It’s not anomalies v. absolute that is the issue. The issue is artificially imposing a 30-year reference cycle. Doing so risks introducing spurious cycles that are harmonic to 30 years, and masking cycles that are not.

        For example, if you detect a 10- or 15-year cycle, is that actually present in the underlying temp data or just an artifact of the 30 years? Conversely, a 17- or 18-year cycle is likely to be suppressed, since you cannot get an exact number of cycles into the reference period.

      • steven mosher

        The simple fact is that there are methods which do not rely on a 30 year reference period. They give you the same answer as methods relying on a reference period. The selection of a reference period changes a couple things:

        1. the stations you can use
        2. the uncertainty

        Having tested CRUTEMP using a variety of reference periods, I can tell you that the reference period doesn’t matter, except to the extent mentioned above. I can pick almost ANY reference period and the trends don’t change. I can break it by doing stupid things, like a 5-year reference period. I can generate a noisier signal by picking a 100-year reference period, but in general the reference periods people use (1951-1980 or 1961-1990) don’t matter; that is, the factual answer does not change.
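
        The invariance described here can be checked on a synthetic station (hypothetical numbers; real anomaly methods work per calendar month, but the constant-offset argument is the same): a baseline only subtracts a constant, so the fitted trend is untouched.

```python
# Trend of anomalies is independent of the chosen reference period.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1900, 2011)
# Synthetic station: 0.007 C/yr trend plus weather noise (made-up values)
temps = 14.0 + 0.007 * (years - 1900) + rng.normal(0.0, 0.3, years.size)

def baseline(lo, hi):
    """Mean over a reference period, e.g. 1951-1980."""
    return temps[(years >= lo) & (years <= hi)].mean()

def trend(series):
    return np.polyfit(years, series, 1)[0]   # slope, deg/yr

t_a = trend(temps - baseline(1951, 1980))
t_b = trend(temps - baseline(1961, 1990))
print(t_a - t_b)   # zero to floating-point precision
```

        Only the vertical offset of the anomaly series changes with the baseline; the slope cannot.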

      • That’s something, but you cannot show that you have estimated the effect of CO2 accumulation accurately. I don’t think you can even eliminate the possibility that, starting from where we are today, an increase in CO2 will result in a slight average cooling (or reduced heating).

      • steven mosher

        Different point entirely, Matt.

        My point is simply about common misconceptions people have about the temperature series. There are some things that matter, and some things that don’t. The reference period doesn’t matter, except in the minor ways I’ve detailed. Anybody who thinks differently has all the tools to prove otherwise. I provide them for anyone who wants to prove differently. Most people DON’T want to prove differently; they want to ARGUE that it should make a difference or might make a difference.

    • Norm Kalmanovitch

      There is a variation of absolute global temperature of approximately 3.9°C over the course of a year, due to the seasonal changes imposed by the much larger northern-hemisphere temperate landmass, but the year-to-year variations in global temperature are only in the order of 0.006°C, and this would be lost in the noise if the seasonal variation were not removed.
      There are five global temperature datasets, three surface-based and two satellite-based, that all use some sort of this correction to eliminate the seasonal variation, and all five global temperature datasets show identical trends, with the only difference being in the fine details.
      All five datasets show no global warming for at least the past nine years; all show the effect of the 1991 Mt Pinatubo volcanic eruption, all show the 1998 El Niño temperature spike, and all show the 2008 La Niña temperature low. With the data being so consistent temporally, Fourier analysis is in no way affected by the method used to eliminate the seasonal effect.

      • John Vetterling

        The seasonal and latitudinal effects need to be preserved, but the method used is, at best, an archaism from pencil-and-paper days.

        There is a known annual cycle that can be accounted for, but you don’t do that by using a completely arbitrary 30-year frame. By doing that you risk introducing spurious cycles that are harmonic to your frame length and masking cycles that are not harmonic. The correct way is simply to include a cyclic term in your regression to account for the annual variation. Ideally this would also incorporate the latitudinal effect. This is fairly elementary signal analysis/statistics.
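
        A minimal version of that regression, on a made-up monthly series: fit the linear trend and an annual sine/cosine pair jointly by least squares, instead of subtracting 30-year monthly normals first.

```python
# Joint least-squares fit of a linear trend plus an annual cycle.
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(600) / 12.0                     # 50 years of monthly samples, in years
series = (10.0 + 0.01 * t                     # trend: 0.1 C/decade (made up)
          + 4.0 * np.sin(2 * np.pi * t)       # annual cycle, 4 C amplitude
          + rng.normal(0.0, 0.5, t.size))     # weather noise

# Design matrix: intercept, trend, annual sine and cosine
X = np.column_stack([np.ones_like(t), t,
                     np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
coef, *_ = np.linalg.lstsq(X, series, rcond=None)

trend_per_year = coef[1]
annual_amp = np.hypot(coef[2], coef[3])
print(trend_per_year, annual_amp)   # near 0.01 and 4.0
```

        No arbitrary reference period appears anywhere; the annual cycle is estimated from the whole record at once, and extra harmonics or latitude terms could be added as further columns.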

      • Norm Kalmanovitch

        You seem to forget that the projections for AGW are for less than 2°C by 2050, but the annual seasonal variation over land is close to 14°C, and averaged over the entire world it is still about 3.9°C.
        The seasonal variations are not the same month to month, which is why a 30-year average is applied to the monthly values to get the best possible differential from the norm. These are what is posted as monthly anomalies, and these can then be averaged into yearly or 37-month or decadal or whatever average you want. Since all five datasets do this more or less the same way, and the different datasets produce more or less the same overall picture, this method is about as robust as we can make it.

    • steven mosher

      John, there are methods that do not rely on anomalies. You can find them in my R package Rghcnv3. These methods use least-squares approaches and do not create anomalies. One was written by Tamino, one by the statistician formerly known as RomanM (hehe), and the third by Nick Stokes.
      Not an anomaly in the lot!

      The proof of course is in the pudding. How do the answers generated by anomaly methods compare with non anomaly methods?

      Having tested that exact question, I can say that there is no significant difference in the metric we are interested in: trend.

      That’s not to say there aren’t interesting little wiggles here and there. Those wiggles are technically interesting to someone like me, but really unimportant in the debate over the size of the CO2 effect.

      • I agree, Steve. There are other ways to compute global temperature trends than anomalies. Anomalies are a convenience, though, because they allow us to see how individual locations are changing compared with a baseline period – the actual baseline chosen makes only a small difference. One of the conveniences of anomalies is that by comparing each location with itself on a monthly basis, they subtract out the fairly large variation in mean global temperatures that occurs over the course of each year due to differences in the land/ocean ratios between the Northern and Southern hemispheres as well as other Earth/sun relationships. In fact, it is conceivable that in a stable climate with no forced or unforced trends or oscillations (except for seasonal variations), the mean global temperature would vary over a year by 3 deg C or more (highest in NH summer), while all the anomalies would be zero at every time point – of course, something that extreme would not happen in reality.

        I have the sense that the aliasing Richard describes will in fact corrupt some of the anomaly data, and might generate false trends on a short term regional basis. It seems less likely to me that this would have much effect on long term global trends, although I would be interested in other opinions on how that might be quantified.

      • Above, I should have said “all the changes in anomalies with be zero at every time point”.

      • “will be zero” – I’ll get it right yet.

      • I also notice that Norm Kalmanovitch made some of the same points before me about seasonal variation, but I hadn’t noticed them – my apologies for the repetition.

      • steven mosher

        Ya, for people interested, they can have a look at some of the aliasing effects you get with anomaly methods. You can find regional effects, or effects in certain time periods, at small scale. But in the end, when you create a “global average”, all that melts away.

        It’s very hard for people to understand. From the technical perspective I find this stuff fascinating: a little change here, a little change there, 5 hundredths here, 7 hundredths there. But in the end you calculate a trend for the last thirty years and… meh! All that detail amounts to bupkis.

        It’s still fun to see the detail, and a nice puzzle to keep an old brain active, but as Gavin told me back in 2007 (and I agreed), you won’t discover anything of scientific merit. That is, you won’t discover that the laws of radiative transfer are wrong. You MIGHT make the models look a little better or a little worse; you might see more natural variability; but overturn an established science? Nope. I can say, 4 years later, thousands of LOC later, billions of data records later, 5 separate analytical approaches later: GHGs still warm the planet, and no amount of fiddling around with the temperature record could possibly change that.

    • steven mosher

      Since there are roughly 7000 stations in GHCN, 10% of that would be 700 stations.

      Suppose I picked 700 stations at random? What is your prediction?

      Suppose I pick 100?

      Suppose I pick 60!

      Yes, let’s pick 60 stations. Now, part of the reason we pick 60 is that the literature suggests that 60 optimally placed stations are all we need to reconstruct the global average (Shen).

      What’s your prediction?

      Suppose that GISS has an average of .081C per decade (1939-2009).
      Suppose that CRU has an average of .12C per decade.

      And they use thousands of stations. I’m gonna use 60. Only 60.

      Man, will my answer suck or what? Predict: how badly will my answer suck?

      Actually, it’s Nick’s answer. And this is the horrible effect you get by decimating the signal.

      Oops! Not so horrible. Maybe people need to understand that the signal we are interested in is the global trend.
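
      The subsampling experiment can be mimicked with synthetic stations (all numbers made up apart from the 7000 and 60 above, and the noise is independent per station, which ignores the spatial correlation that makes 60 well-placed real stations work):

```python
# Trend from 60 randomly chosen synthetic stations vs. the full network.
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(1939, 2010)
shared = 0.01 * (years - 1939)          # common signal: 0.1 C/decade (made up)

# 7000 "stations": the shared signal plus independent local weather noise
stations = shared + rng.normal(0.0, 0.5, (7000, years.size))

def trend(series):
    return np.polyfit(years, series, 1)[0]

full_net = trend(stations.mean(axis=0))
subset = rng.choice(7000, size=60, replace=False)
small_net = trend(stations[subset].mean(axis=0))
print(full_net, small_net)   # both close to 0.01 C/yr
```

      Because the trend is common to every station and the noise averages out, even a small subsample recovers it; station-to-station detail is what suffers, not the global slope.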

      • Oops! Not so horrible. Maybe people need to understand that the signal we are interested in is the global trend.

        We are interested in at least 2 other signals: the contribution of CO2 to the trend, and the contribution of clouds to the trend. Probably you should add the effects of different attributes of the constantly changing sun. When I write “interested” I refer to the recommendations that large amounts of energy investment be channeled away from fossil fuels and toward renewable sources – immediately. It isn’t just a nerdy specialist interest in the faintest squiggles.

  14. Nullius in Verba

    Interesting post.

    I recall reading somewhere of how the cycle of leap years introduced a gradual shift and then jump in the relationship of calendar dates to the astronomical year. This would appear to be somewhat related.

    Another interesting question would be weekends and holidays. Do gaps in the data occur on some days of the week more than others? For sea surface temperature records taken from ships, are sailing times and schedules regular, with periods indivisible by calendar months?

    And it seems to me that the problem would be made worse by taking anomalies. Subtracting a large fixed signal is OK if the relationship is exact, but if the variation shifts slightly in time, part of the seasonal signal that taking anomalies is supposed to remove leaks through. I have wondered about that a couple of times with the Arctic sea ice extent series – after a certain point, it seemed the anomaly had acquired a bit of an annual cycle.

    The set of normals based on some past interval has a spectrum of its own – the annual frequency and its harmonics – which are subtracted. If the sample does not give an exact baseline expectation, which of course it won’t, some of the aliasing will find its way into the normal, and some of the harmonics of the error in the normal could cause further aliasing.

    I never did like anomalies – subtracting one big number from another, both of them subject to error… And there’s too much temptation to treat anomalies as if they were the data itself. Interesting problem.

    • Maybe I don’t understand everything about calculating anomalies, but if you subtract a constant, it shouldn’t introduce artifacts, and it has nothing to do with the topic of aliasing. What it does do, though, is somewhat like presenting a graph with a nonzero origin: it leads people who aren’t paying attention to think that the effect is bigger than it really is. If we really want to maintain perspective, all temperatures should be presented in kelvin.

      • Nullius in Verba

        Up to a point, yes – but remember it’s not just one constant; it’s a different constant for each month, and you got them from a finite sample.

        If you’ve got some high-frequency signal cycling slowly through the months, and you take your set of constants from one extreme of the cycle and apply it to data at the other, what happens? The weather varies with the day of the week, and some months have more weekends than others. If you take your normals from years where the extra weekends fall in some months, and subtract them from data for years where the pattern is completely different, you’ll get extra noise in the later part of the record compared to the baseline period. You get more extremes, but it’s not a change in the weather; it’s because of something silly, like having more weekends in months towards the end of the year compared to your baseline.

        I’m not sure if you would. It’s just the sort of thing that *might* happen, if you don’t check.

    • steven mosher

      That’s not how anomalies work. The anomaly method calculates a monthly anomaly during a reference period. Like so: take a station; pick a month.

      station 1, month is jan.
      collect all januaries during your reference period. say 1961-1990.
      12,13,14,12,13,12,11,10,15,12,11 etc

      Average all the januaries; 12.25

      Subtract that january figure from ALL januaries for that station.
      That gives you the “departure” from the mean 1961-1990 january

      If you don't like anomalies you can use a least-squares approach where anomalies are not even calculated. How do they compare? Same damn answer.
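      In code, the recipe above comes out to a few lines (a sketch: the station values are invented, and `jan` stands in for a real January series):

```python
# Sketch of the monthly-anomaly ("departure from normal") method
# described above, with invented January temperatures for one station.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1950, 2011)
jan = 12 + rng.normal(0, 1.5, size=years.size)  # fake January means, deg C

# the "normal" is the mean over the 1961-1990 reference period
ref = (years >= 1961) & (years <= 1990)
jan_normal = jan[ref].mean()

# anomaly = departure of every January from that normal
jan_anomaly = jan - jan_normal
```

      By construction the anomalies average to zero over the reference period; outside it, they are departures from the 1961-1990 baseline.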

      • Nullius in Verba

        What if I don’t like a least squares approach, either?

      • steven mosher

        Well, you can take it up with the nobel prize winner

      • Nullius in Verba

        Which one? Al Gore or Barack Obama?

      • Oh, that was a good one. ROFL. Man he walked right into that one. I bet you have been saving that line for a while. Ha. ha. This place gets pretty boring without a little humor.

        Snappy answer to appeal to Nobel authority, check.

      • Nullius in Verba

        Well, I mean really! He wasn’t very specific, was he?

        Least squares is an appropriate method with Gaussian, zero-mean, heteroscedastic, independent errors, and in a few other circumstances. It’s less than ideal when the error distributions are not Gaussian. But it’s a case of the hammer problem – students are taught least-squares in school because it’s simple to do and commonly applicable, and likewise with linear fits. It’s taught mainly because it’s teachable. But because it’s the only tool they’ve got, they assume it’s ok to apply it to everything. If all you’ve got is a hammer, everything looks like a nail.

        But if you’ve got a dirty great seasonal oscillation in your data, the errors aren’t Gaussian and least squares isn’t appropriate. That’s why they’re normally removed first.

        I have to say, I didn’t think there was anything much here; it was an interesting post with a few interesting technicalities to poke around in, but it obviously wasn’t going to be any sort of grand debunking triggering the collapse of the entire edifice of climate science. But when the authorities instantly show up saying “Move along. Move along. Nothing to see here,” with such enthusiasm it makes me want to stop and see what’s going on.

        It’s probably nothing, but what could reduce the Mosh to invoking the Authority of (unnamed) Nobel prize winners?

      • Nullius in Verba

        Tch. Now I’m doing it. Should have said “homoscedastic”. Apologies.

      • Least squares is really a pet peeve of mine. The math involved actually weights the outliers much more heavily than points closer to the median. This drives me crazy because, when doing a curve fit on experimental (lab-measured) data, the source of the “error” in many cases is measurement error, not a valid data point.

        So you must throw out the outliers to get the best “true” fit when using this method. Judgment must be used. Try explaining the intricacies of how this is proper to a FDA reviewer. Does not compute.

        Too many people see this as a magic black box that regurgitates the mostest bestest goodest answer. It is not in many cases.

        I think I would really prefer a “least square roots” method in most cases that minimizes outliers instead of enhancing their effect on the final curve fit.

      • Nullius in Verba

        “Try explaining the intricacies of how this is proper to a FDA reviewer. Does not compute.”

        :-) I’ve been in a similar situation.

        You probably know all this already – but it’s interesting so I’ll talk about it anyway.

        As I said, least squares is based on the measurement errors having independent Gaussian distributions. The idea of the method is to find the line such that the probability of the set of errors implied is more likely than for any other line. The probability of a particular set of errors is the product of the probability for each individual error. We can simplify by taking the logarithm which converts the product to a sum. So we want to maximise the sum of the logarithms of each error probability. If the distribution is Gaussian, then the logarithm of the probability is minus the square of its deviation from the mean (divided by twice the variance and plus a messy function of the variance, which if it is a constant we can ignore). So to maximise minus the sum of these squares, we minimise the sum of squares. Hence least squares.
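        Written out symbolically (standard notation, not from the comment itself: e_i is the i-th residual and sigma the common error standard deviation):

```latex
L(\theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}
            \exp\!\left(-\frac{e_i^{2}}{2\sigma^{2}}\right),
\qquad
\log L(\theta) = -\frac{1}{2\sigma^{2}} \sum_{i=1}^{n} e_i^{2}
                 \;-\; n \log\!\left(\sqrt{2\pi}\,\sigma\right).
```

        With sigma constant the second term is fixed, so maximising log L is the same as minimising the sum of squared residuals: least squares.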

        If the errors really are Gaussian, then least squares does actually give the best answer. But if you can show that your error distribution is not Gaussian, because it’s got a bunch of fantastically unlikely extreme outliers, least squares is clearly not correct. That’s not to say, though, that simply dropping the outliers is always the right answer either.

        One approach that comes close to this is to assume the distribution is a mixture of Gaussian and uniform, estimate the density of the uniform distribution from the outliers you can see, and then knock out the outliers and a few others picked uniformly so the remainder will be something Gaussian-like. (This is a fudge, but it’s quick and easy and vaguely justifiable. If the ones removed are not the actual outliers they at least ought to have roughly the same effect on the result.) Or if the sample size is too small for that to be safe, you can take the sum of messy logarithms and minimise numerically.

        But certainly I’d agree with the reviewers that outliers should not be dismissed casually. You need to confirm they’re to be expected and explainable, check their distribution if you can, and consider whether they might be telling you that something is not as you thought.

      • Richard Saumarez

        If I can just add a point about least squares, which stems from making adaptive filters to get the noise out of cardiac signals from x-ray sets.

        Least squares is a convenient measure because it linearises the problem: the objective function, as a function of the parameters, is approximately parabolic. This is why you can use the Gauss-Newton method in many situations and get rapid convergence to a solution.

        Unfortunately, a least squares measure, i.e. power, is not a good measure for x-ray noise, and one has to use a different set of measures, based partly on the magnitude of interference with a particular time-domain characteristic. This results in a non-linear adaptation of the filter; the geometry of the parameter/objective-function surface is awful, and one has to use either steepest descent or a simplex method.

        My impression is that least squares is a useful measure when you can linearise the problem, which is rather tautologous. With very odd signal features, least squares breaks down rapidly.

  15. This is truly an advanced statistical issue, but even my 40-year out-of-date university-level mathematics can recognize that the real temperature time series is a daily cycle between highs and lows. Taking that series and averaging it into monthly values, then treating those monthly averages as a time series, leads to answers that may not represent the actual temperatures experienced at the site – especially if one then uses that graph of monthly averages to develop a time series of yearly averages and calls it The Temperature Record for Las Vegas, Nevada. Worse yet when all the yearly records for hundreds of sites are then combined to arrive at a continental average… eh?
    I have read in some analyses that this is a perfectly valid way of looking at things…as ‘the errors average out and disappear’.
    This piece says to me that this type of sampling actually may introduce false trends in the resultant time series.
    Have I got that right?

  16. Arcs_n_Sparks

    Dr. Saumarez,

    Thank you for another excellent post. This electrical engineer appreciates you bringing this most fundamental element of sampled data analysis to bear.

  17. Interesting refresher on the topic, although no real new results yet obtained.

    Frequency-domain analysis of climatic data is something that many from an engineering background, like me, will find interesting. My impression is that classic signal-processing tools are not part of the standard training in, e.g., physics and meteorology, which are probably the most common backgrounds in this discipline, and that’s why the frequency-domain view of the data is not commonly presented. There are, of course, also other reasons for this.

    Like Pekka Pirilä and others above, I’m also doubtful whether aliasing is a real issue with temperature data.

  18. steven mosher

    If you like you can use my R package GhcnDaily to download 26,000 files of daily data and have at it. You can also download hourly data from 200 stations in the US ( with triple redundant sensors) by using my package “crn”. any questions or problems, just mail me.

    Warning: 26K files takes a long time.

  19. Judith Curry wrote:

    Further, I have found that running mean/moving average approaches can introduce aliases; I have been using a Hamming filter when I need one for a graphical display.


    Calculating an average of data is not the same thing as taking samples of data. In the first case one is characterizing data that is already known by calculating a statistical parameter, in the second case one is sampling a parameter at a given frequency. Aliasing is caused by under-sampling – it is impossible to introduce aliasing by averaging.

    In fact, the simplest anti-aliasing filter is a low-pass filter – the effect of which is precisely what taking a running average does to data.
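    To put a number on that: the magnitude response of a length-N running mean is a Dirichlet kernel, which rolls off but has sidelobes (a sketch; N = 30, a rough “days per month”, is an arbitrary choice):

```python
# Frequency response of a 30-point running mean: it is a low-pass
# filter, but its first sidelobe is only ~13 dB below the passband.
import numpy as np

N = 30
f = np.linspace(1e-6, 0.5, 5001)   # frequency in cycles per (daily) sample
H = np.abs(np.sin(np.pi * f * N) / (N * np.sin(np.pi * f)))

# first null at f = 1/N; peak of the first sidelobe just beyond it
sidelobe_db = 20 * np.log10(H[(f > 1 / N) & (f < 2 / N)].max())
```

    That residual sidelobe (roughly -13 dB) is why a running mean attenuates high-frequency content rather than removing it.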

    What is the reason for choosing a Hamming filter? This filter is optimized to minimize the nearest side lobes in frequency-space. Why is this not biasing the output?

    • This is correct. Averaging is basically a low pass filter, and cannot introduce higher frequencies.

  20. Richard Saumarez

    When confronted with a problem there are two approaches:

    1) I’ve got a lot of data, I’ve analysed it with “R” packages and I don’t think there is a problem.

    2) You analyse the problem from fundamentals.

    Aliasing is a non-linear transformation of a continuous signal into a sampled signal.

    I have to say that many of you sound very confident that you have the answer without a rigorous analysis. I was brought up in a hard school of engineering mathematics, and I am not convinced that you have made a serious analysis of the problem.

    • Richard Saumarez

      I should add that hindcasting is a common method of validating climate models. If the data is of low quality, can we be certain that this is a valid procedure?

      • Norm Kalmanovitch

        There is actually no problem with the global temperature data aliased or not. The problem is with the computer models which to date have yet to properly predict temperature based on the relationship used for CO2 forcing.
        The 1988 Model of Hansen perfectly matched the data from 1960 to 1988 but has failed to predict the global temperature even once in the last 23 years. The driver for the model is a relationship that projects increased global temperature from increased atmospheric CO2 but over the past 9 years CO2 has increased and the temperatures have dropped indicating that the basis for the model temperature projections is simply wrong; i.e. CO2 whether human or naturally sourced is not the prime driver of global temperature change.
        The nice thing about signal theory is that it is pure mathematics and leaves no room for the statistical biases that are the prime driver of the AGW conjecture.

      • observations are consistent with Hansen’s 1988 projection if climate sensitivity is about 3C/2xCO2

      • Not since 1998. It was 360 ppm then and it’s 390 ppm now, and cooler now than it was then.

      • The only scenario in Hansen’s 1988 published model results that came close to matching (then) future temperatures was “scenario C [which] assumes a rapid curtailment of trace gas emissions such that the net climate forcing ceases to increase after the year 2000.”

        The effect of the actual increase in CO2 since 1988 matches closely what the model predicted if there were a reduction in emissions. In other words, Hansen’s model was accurate only if CO2 had at most a negligible effect on global average temperature.

        I don’t know why CAGW fanatics love to misrepresent Hansen’s 1988 model and its predictions.

      • “Not since 1998. It was 360 ppm then and now its 390 ppm and cooler now than it was then.”

        Correct for ENSO and it is warmer now than it was then

      • SkepticalScience explains it as:

        “Total Scenario B greenhouse gas radiative forcing from 1984 to 2010 = 1.1 W/m2

        The actual greenhouse gas forcing from 1984 to 2010 was approximately 1.06 W/m2 (NASA GISS). Thus the greenhouse gas radiative forcing in Scenario B was too high by about 5%”

        “In other words, the reason Hansen’s global temperature projections were too high was primarily because his climate model had a climate sensitivity that was too high. Had the sensitivity been 3.4°C for a 2xCO2, and had Hansen decreased the radiative forcing in Scenario B slightly, he would have correctly projected the ensuing global surface air temperature increase.”

      • Lolwot, 0.19 degrees/decade is hardly a prediction to be alarmed with since that is near the rate it has been for the last 100 years.

      • Norm Kalmanovitch

        Hansen not only predicted the temperature wrong he was wrong about CO2.
        In his 1988 paper he projected CO2 emissions to increase by 1.5% per year, with CO2 concentration increasing accordingly.
        In 2007 his projection for CO2 emissions for his most dire scenario A works out to 29.262gt but actual emissions were 31.641gt.
        This would have led to a prediction of CO2 concentration of 466.50ppmv but the actual concentration in 2007 was 383.71 (MLO)
        CO2 is increasing at a near perfect linear rate of 2ppmv/year and not at Hansen’s geometric progression of 1.5%/year.
        Hansen got both the concentration wrong and the temperature projection wrong, so the only time his projection even came close was for a few months in 1998 when the temperature peaked. You seem to have a funny perception of what the word consistent means

      • “CO2 is increasing at a near perfect linear rate of 2ppmv/year and not at Hansen’s geometric progression of 1.5%/year.”

        Hansen’s scenario A was that the increase in CO2 per year would go up by 1.5% per year, not the level of CO2 increasing by 1.5% (the latter would mean CO2 increasing by over 5ppm per year in the 1990s)

        The concentrations for the scenarios are here:

        hansen’s scenario A CO2 for 2007 was 385ppm

        “the actual concentration in 2007 was 383.71”

        Scenario B has it at 383.41

    • steven mosher

      basically I did #1 after seeing the concerns of #2.

      But the data is all out there. 5 minute data, hourly data, daily data, monthly data. Anybody who has a thesis can test it. Anybody who thinks there are fundamental issues can put that to the test.

      We kinda fought to get the data free and the code free so that people could
      look at these very technical concerns that have been around for years.

      Now that the data is free and the code is free.. heck I spent a year or more making it easy for anyone to do.. the theoretical concerns can be put to a test. no reason not to.

      • Richard Saumarez

        I am not sure what you are implying. If you are saying that good data can be analysed in a way that stands up, then obviously it will.

        If you have access to the daily data underlying the HADCRUT series, I would be interested to see it. At Professor Jonathan Jones’ suggestion I looked at some daily temperature series, and the effects that I have described are present in some of the records.

        The question is much deeper than one might think at first sight. Historical records have been analysed in a way that seems to violate every principle of signal processing. I am not sure what the overall significance of this is, and it certainly depends on the time scales (or bandwidth) that you think is significant.

        Nevertheless, to anyone who uses signal processing, and is aware of the errors that aliasing can introduce, this is potentially a real problem. I encountered this particular problem in an entirely different field, obstetrics. Nevertheless the same principles apply.

        If we are going to validate models by hindcasting, I would suggest that a knowledge that there are artefactual low frequency components in the data used to hindcast models is a non-trivial problem. If we are going to use a potentially aliased data set to determine parameters of a “model”, (SB2011, D2011), this is an important problem.

        I find the comments that aliasing doesn’t matter and it comes out in the wash fantastic. I agree that one can analyse, up to a point, the effect of aliasing on the low frequency components of a signal.

        I would suggest that this is a more serious problem in climatic datasets than might be supposed. The answer to this problem is analysis.

      • I think he is saying, that he has analyzed the data, and he has concluded it does not matter. The data does stand up. He is giving you the opportunity to do so yourself, or check his results. Throwing theoretical rocks at it does not stand up when there is no reasonable argument that aliasing in this case leads to important errors.

        What high frequency temperature events are we missing here and why does it matter?

        Now, hindcasting is complete garbage as a validation mechanism IMHO. I suspect there is quite a bit of tuning going on with forcings to accomplish this. A 40% tweak to aerosols here, a 10% tweak there. Good results!

      • I am very glad this issue about frequency content has been made into a Climate Etc. post.
        Richard Saumarez raises excellent points about aliasing. Not to put too fine a point on it, the Nyquist criterion of 2 samples per cycle of the maximum frequency is a bare minimum, because the theorem assumes an infinite time domain. With finite time domains, I would not trust frequency content with periods shorter than 4·dt – four samples per cycle of the highest trustworthy frequency. Under-sampling, I believe, is a huge problem in the argument over feedbacks. The sample dt for feedback processes needs to be not in months, but might have to be in quarter-hours.

        What might be a bigger problem is what temperature record processors might have done and are doing to the Low Frequency content of the data. I made a case for this point in
        Rasey (WUWT 3/31/11, Expect the BEST…)
        Posit: The whole issue of Global Warming is to be found in the low frequency part of a temperature time series.

        … Suppose now that we take temperature records and, using a scalpel of any kind, we take N*dt and make it into n1*dt and n2*dt, where n1+n2 = N and n1 < N, n2 < N. Then each of the parts now has a LARGER dw, a higher minimum frequency, which means a SHORTER resolution time per cycle than the original. The lowest dw from the original series is now in the bit bucket.

        [When you splice the series back together, the lowest] frequency data you see are purely contributed by the mental model of how you spliced them together. You have chased your tail. In the use of scalpel and suture you have thrown away the important stuff and substituted your preconceptions of how they should fit [as the sought-after GW signal.]

        We have seen comments like
        Thus, although poor station quality might affect absolute temperature, it does not appear to affect trends, and for global warming estimates, the trend is what is important. – Dr. Muller quoted in WUWT 3/31/11 “Expect the BEST…”

        A saw-blade has lots of little trends that bear little resemblance to the long term trend. Without long, un-interrupted, unspliced, records, the low frequency content to uncover the long term trend does not exist.
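        The dw point can be checked directly (a sketch; the 1200-sample record length is arbitrary):

```python
# Cutting a record of N samples in two doubles the lowest resolvable
# (non-DC) frequency: each half spans half the time of the original.
import numpy as np

N, dt = 1200, 1.0                          # e.g. 100 years of monthly samples
f_full = np.fft.rfftfreq(N, d=dt)[1]       # lowest nonzero frequency, full record
f_half = np.fft.rfftfreq(N // 2, d=dt)[1]  # same, for either spliced half
```

        Here f_full = 1/1200 and f_half = 1/600 cycles per sample: the deepest cycle the unbroken record could resolve is simply gone from the pieces.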

        Richard, do you share concerns about the low side of the spectrum?

      • I hope this is the corrected html link to WUWT 3/31/11, Expect the BEST…)

      • Richard Saumarez

        Yes, I do.

        The issue is what you use the data for.

        For calculating mean global data for 1950-1970 and comparing it with mean global data for 1990-2010, it doesn’t matter at all.

        If one is trying to validate models over a short period with potentially aliased data, then this is one hell of a problem.

        I was struck when I wrote my “feedback” post, using a highly advanced and sophisticated model, that SB2011 and D2011 appeared to be using data that were sampled at different frequencies. (If I have got this wrong, I apologise, but the details are opaque.) If they genuinely used monthly data sets, they were trying to work out short-term dynamics with an aliased model.

      • Rather than try to correct for the discontinuities in the records, we simply sliced the records where the data cut off, thereby creating two records from one. Dr. Muller, WSJ Eur 10/20/20.

        Low frequency data content into the bit-bucket. If you want to tease out long term trends out of noisy data, this is wrong, wrong, wrong.

  21. Richard – I would be curious to know your response to the quantitative issue. For global temperature trends, as I understand the process, each of thousands of locations is compared each month with baseline data for the same month to see how much has changed from one year to the next for that month and location. The anomalies (comparisons with baseline) are averaged globally. Monthly changes can be averaged into annual changes, and these can be smoothed into changes over longer intervals to develop a trend. At the same time, we know that real (non-artefactual) climate noise occurs that causes dips and bumps in the climate record punctuating long term trends.

    In light of this, what level of aliasing occurring concordantly at how many stations would be likely to create spurious trends on a global level over multiannual intervals? My intuition tells me that there is no likely physical mechanism capable of creating sufficient distortion to make much difference, but I may be underestimating that possibility.

    • Richard Saumarez

      I think that is a very interesting problem and I did look at it briefly. The question is: do you think that mean global temperature is a useful measure? If you integrate the problem over increasing time scales and globally, assuming that this is done accurately, the problem becomes less significant, to the point of being lost in errors and many other effects.

      If you want to drive a model with aliased data and use unaliased data as the output, in order to test assumptions about what is going on in the model, then you will run into problems and, since aliasing is “an unpredictable non-linear transformation of a continuous signal into a sampled signal”, you will immediately conclude that any system is non-linear.

      My basic point is that climate science should be rigorous. If there is aliasing in data and this is not recognised, I would suggest that this is not a rigorous approach. The effects have to be analysed in the context in which the data is used.

      From a basic signal processing standpoint, one would expect classical datasets to be aliased. If one is looking at dynamic situations, this is potentially serious. If one is looking at situations involving long-term integration of the data, one would surmise that the effects would be less serious. I would predict that with modern systems the aliasing problem will disappear, because everyone knows about it and will eliminate it. However, in the case of drawing conclusions and validating models with questionable historical data, the problem may be real. I have merely pointed out that this is a potential problem; it should be recognised as such and its effects should be analysed.

      • Can’t anomaly computations produce aliasing? The monthly constants added to the signal impose a grid, but in the time domain rather than the spatial domain. The constants are a cyclic step function. Assuming anomalies are a good idea, it’s unclear (to me) that twelve partitions aligned with the Gregorian calendar is optimal.

    • The location anomalies are not averaged globally. They are averaged within grid cells. Thus the weight of each location depends on how many are in each cell. Also, cells differ in size. Cells without location data are estimated by interpolation. Also, in many locations the data is SSTs, not thermometer readings. Then the grid cell averages are averaged globally. In no way is this ordinary statistical averaging, nothing like it.

      • David – I used the ambiguous term “locations” rather than “stations” to avoid the issue of within-grid averaging. I agree that this introduces weighting uncertainties, and as I understand it, BEST is trying to avoid the gridding process to circumvent these problems. I didn’t see this issue though as making much difference regarding the consequences of aliasing for deriving long term temperature trends from monthly data.

  22. My instincts are pretty much with Fred Moolten and Steven Mosher, that this will all cancel out in the wash. Still a nice little question though.

    • Richard Saumarez

      The question is what do you want to use the data for?

      1) You want to calculate the mean “global temperature” 50 years ago and compare it with today – it’s a non-problem.

      2) You wish to validate models over a 60 year period? Then it may be a problem

      3) You wish to look at short term climate dynamics (<10 years)? I would suggest that it is definitely a problem

      • Richard thanks for answering my question above.

        As Major Tom said elsewhere above:

        “Calculating an average of data is not the same thing as taking samples of data. In the first case one is characterizing data that is already known by calculating a statistical parameter, in the second case one is sampling a parameter at a given frequency. Aliasing is caused by under-sampling – it is impossible to introduce aliasing by averaging.”

        I think I agree, inasmuch as I understand.

        So if we have a given temperature station, and we first calculate monthly averages from hourly data, a straight arithmetic mean, I don’t see how that counts as sampling of the type that can cause aliasing as described here.

        However, if large periods of the “record” (HadCRUT for one example) have “raw” data collection that is MUCH MORE SPARSE, then it could be a big deal, and I have no idea right now how much that is. By much more sparse I am thinking of, say, once a day or something.

      • Richard Saumarez

        Aliasing is not caused by averaging. It is caused by undersampling. The problem is that averaging over one month does not remove all high frequencies, so when you decimate at one sample per month, you get aliasing.
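        A minimal sketch of that mechanism (the 21-day cycle and the 30-day “months” are invented for illustration): a cycle above the monthly Nyquist frequency survives the monthly average in attenuated form and reappears at a spurious low frequency.

```python
# Monthly averaging followed by monthly decimation aliases a 21-day
# cycle: 1/21 - 1/30 = 1/70 cycles/day, so it reappears at ~70 days.
import numpy as np

days = np.arange(360 * 30)                # 30 "years" of 360-day years
x = np.sin(2 * np.pi * days / 21.0)       # daily signal with a 21-day period

monthly = x.reshape(-1, 30).mean(axis=1)  # 30-day block averages, one per "month"

# dominant frequency of the monthly series (excluding DC)
spec = np.abs(np.fft.rfft(monthly))
freqs = np.fft.rfftfreq(monthly.size, d=30.0)        # cycles per day
period_apparent = 1 / freqs[spec[1:].argmax() + 1]   # ~70 days, not 21
```

        The averaging only attenuated the 21-day cycle (the imperfect low-pass again); decimating to one value per month folded what was left down to a fictitious ten-week oscillation.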

    • Jonathan Jones,
      “Cancel out in the wash” sounds a whole lot like a distraction.

  23. I have not looked at this entire thread and have not fully parsed your post Richard, but I am really intrigued by your contributions here. I am a professional composer and work with DAWs (Digital Audio Workstations), so aliasing and sample rates are very familiar to me from the technical side of my work. In general (Nyquist himself said as much) the optimum sample rate for audio, to take account of high-frequency artefacts, is around 60 kHz wrt the 20 kHz upper audible limit. We normally sample at 44.1 kHz (CD quality), 48 kHz (broadcast), and 88.2 and 96 kHz (hi-def). Putting in low-pass filters to counter aliasing is also familiar to me at the AD-DA end, but aliasing also occurs with sample rate conversion. Good sample-rate-conversion algorithms are expensive but do make a perceivable difference (just). They are judged by the amount of aliasing that occurs in the conversion process – less is better.

    One of the most important differences we consider for audio quality is bit depth. CD quality at 44.1 kHz is sampled at 16 bits, and when we record or sample a signal we choose higher bit depths (24-bit) in order to improve the resolution of the sample. This has a much, much more profound effect on the audio quality.

    Taking the analogy further, since you are dealing with signal processing in much the same way, what would you regard as the equivalent of bit depth? The amount of data per sample? So, for example, rather than merely taking the Tmax and Tmin, you take more data for the day – perhaps hourly – and increase the depth from 2 readings to 24?

    I have long wondered about this. Recently in Perth Western Australia (from where I originate) they have had one of their hottest summers on record. Yet I don’t think they had a single day over 40 Celsius. Most unusual. It was the night time temps that were so elevated, and the Indian dipole bringing very warm water to the north of WA and providing northerly winds and suppressing the Fremantle doctor (sea breeze).

    It makes me wonder just how informative an average of Tmax and Tmin really is. For the purposes of broader climate change, does it matter?

    • Resolution in audio relates to the number of possible sample values. 8-bit provides a range of 256 possible values, 16-bit gives 65,536 and 24-bit gives 16,777,216. Larger ranges provide scope for greater detail and clarity in the audio.

      I would think the equivalent in climate data would be the amount of significant figures for each data point/sample.

    • If the data has a spread of, say, 64 C from max to min and is recorded in integral units, then there are 6 bits of resolution (2^6 = 64).

      However things get more complicated when averaging over time or area. You can achieve “processing gain” during averaging. In short form, you gain half a bit of resolution every time you double the number of samples you average, because the noise amplitude falls as the square root of the sample count.

      So if I averaged 16 of these 6-bit values together, the resolution improves by log2(sqrt(16)) = 2 bits, to 8-bit resolution.

      Of course it is never that easy, and the specific characteristics of the signal matter in how much gain you can achieve. Other operations, such as FFTs, have inherent averaging and processing gain.
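      The square-root behaviour is easy to demonstrate with synthetic noise (a sketch; the sqrt(N) reduction in noise amplitude is the standard result for averaging independent samples):

```python
# Averaging 16 independent unit-sigma readings cuts the noise standard
# deviation by sqrt(16) = 4.
import numpy as np

rng = np.random.default_rng(1)
readings = rng.normal(0.0, 1.0, size=(100_000, 16))  # 100k blocks of 16 readings
block_means = readings.mean(axis=1)

single_sigma = readings.std()       # close to 1.0
averaged_sigma = block_means.std()  # close to 0.25
```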

  24. Willis Eschenbach

    First, outstanding post, my thanks to you, Judith, and to the author.


    …Problems occur when you can’t simply filter out high frequency components. For example, this was a problem in early generation CT scanners where abrupt transitions in bone and soft tissue radio density caused aliasing because they could not be sampled adequately by the x-ray beams used to form the projections.

    The key to dealing with aliasing is to recognise it and given any time series one’s first question should be “Is it aliased?”

    JC comment: My concerns regarding aliasing relate particularly to the surface temperature data, especially how missing data is filled in for the oceans. Further, I have found that running mean/moving average approaches can introduce aliases; I have been using a Hamming filter when I need one for a graphical display. This whole issue of aliasing in the data sets seems to me to be an under-appreciated issue.

    The problem is that nature is not continuous. It is discrete. There is no gradual transition between water and air, between land and ice, between cloud and clear, between the condition inside and outside the thunderstorm. Nature exists in what Gerard Manley Hopkins called “pied beauty” in reference to horses with splotches of color. So as Richard says, because of these abrupt transitions, you can’t just filter out the high frequency components.

    But all of our attempts to model climate assume smooth continuous distributions of temperature and humidity and pressure and all of the other variables of interest, with nothing pied or dappled about them at all. Then the models are tweaked until they give the best possible fit to raw, jumpy, discontinuous nature.

    The crowning touch, the final indication of chronic modeler dissociation syndrome, is when the result is called “data”. People discuss using the “NCAR Reanalysis Data” in capital letters, when it is 100% model output.

    This is especially a problem, as you point out, Judith, when engaging in the dubious practice of “in-filling” data. All such data should be hunted down and painted dayglow green so it is never mistaken for the real thing.

    In any case, Richard, thanks for a very interesting piece.


    • For starters though — before we bother looking for more accurate methods of modeling the climate over the last 50 years — we must first assume that Mann’s graph is correct and that there has been no global warming or cooling over the last 900 years.

    • The problem is that nature is not continuous. It is discrete. There is no gradual transition between water and air, between land and ice, between cloud and clear, between the condition inside and outside the thunderstorm.

      An interesting take, Willis. Where Nature is discrete, the models are continuous. Where Nature is continuous (temperature, pressure, concentrations) the models are discrete.

      What a stunning asymmetry!
      Black is White. Up is Down. Hot is Cold. War is Peace, Freedom is Slavery, and Ignorance is Strength.

  25. Another outstanding post by Richard. Judith, you are bringing to the table all of the issues this non-scientist has wondered about in climate science since I began my interest in the subject.

    Sensitivity – check
    Non-linearity – check
    Spatial and Temporal aliasing – check

    Thank you again.

  26. Courtillot has commented on the information loss incurred by averaging.

    • gyptis444

      This isn’t so much an analysis of aliasing as introducing the Yule-Simpson Effect to an unsuspecting audience.

      Dr. Courtillot takes two sets of data which have underlying mechanisms that are quite clearly present (for what are after all two very small parts of the whole planet) and different, and shows that the signal for each of these mechanisms overwhelms any other signal.

      He then argues that these two individual cherry-picked cases are more important than the grouped data for the entire planet — some twenty times the total area of the two cherry-picked regions — without acknowledging that the confounding variable underlying both his smaller datasets is exactly temperature increase due to rising CO2.


      How could anyone be fooled by that?

      • Ah, yes; “grouped data”. Consisting of a pot-pourri of exactly the kinds of regional effects he speaks of, all blended and truncated and thereby more information-rich and reliable?

        “And why the sea is boiling hot, …”

      • How could anyone other than Brian be fooled by that?

  27. So.. Will BEST speak to aliasing?

  28. From the earlier screed:

    As Professor Curry asked me to give some biographical detail, I should explain that after medical school, I did a PhD in biomedical engineering, which before BME became an academic heavy industry, was in an electrical engineering department.

    How many of these electrical engineers pretending to be scientists is Dr. Curry going to promote? (See also
    Lemke, Frank). It’s becoming quite a specialty at Climate Etc. Is the pseudoskeptic bench really so thin that even a popular blog like this one cannot attract actual scientists?

    • Richard Saumarez

      Thank you for that kind comment. I spent 25 years using engineering techniques to investigate the mechanisms of sudden cardiac death. There was a body of opinion that held that this was science.

      If you have an argument to make, I suggest you make it.

      • Richard Saumarez

        Thank you for this well argued piece. This post is about aliasing – and its potential application to climate data. This is a subject on which electrical engineers tend to be well informed.

        Should we have a point of view? You apparently think not. Interesting. I wonder how satellites work, how measuring instruments work, how computers work, how the internet works? I wonder how the lights turn on when you throw a switch? I wonder how many of the mathematical techniques used in climate modelling stem from engineering analysis of physical problems. Clearly engineers have had no input in this and are unqualified in every respect.

        You may be unaware that the UK is facing an energy crisis. 25% of the population are predicted to be in fuel poverty this winter. We are likely to face energy blackouts. Taxes are going up throughout Europe. Biofuels are creating starvation and ecological damage.

        All these stem from the predictions of climate science. Are you really saying that no educated person is allowed to question this? I will predict that an increasing number of properly trained scientists will start to scrutinise climate science with a rigour that should have been applied to it long ago.

        You can think what you like and say what you like in ad-hominem attacks, which you do not post under your real name. If you have got anything of substance to add to this argument on this post, I suggest you do so. If not, I suggest you simply go away and leave adults in peace.

      • Amen to that!

      • Richard,
        You are properly showing the results of the AGW social mania, but the maniacs are not really in a receptive mode.
        The interesting thing to me is how did AGW become such a powerful social force so as to shut down the thinking of its believers so well?

      • “The interesting thing to me is how did AGW become such a powerful social force so as to shut down the thinking of its believers so well?”

        Some people think that embracing stupidity (and then spreading it around) is going to confer a political advantage. True in some circles, not in others.


      • All these stem from the predictions of climate science.

        Yes. Especially the increased European taxes. Nothing at all to do with trillions of Euros worth of bad lending that national governments have covered, at taxpayer expense, in order to save their friends in the financial sector. And the USA has it worse, because they listened to the predictions of climate science more than anyone in Europe. The predictions of climate science definitely caused the collapse of Bear Stearns and Lehman Brothers. In fact, the predictions of climate science caused everything that’s wrong with the world today, including Al Gore.

      • Richard Saumarez

        I started this post on a technical thread and I definitely regret having, in a moment of irritation, responded outside the topic.

        I would simply say that climate science has widespread economic and societal implications.

      • At the risk of offending delicate sensibilities – so do epidemiology, robotics, and the internet.

        Are you trying to make some sort of cogent argument by pointing out these implications?

      • Major Tom,
        The social idiocy that led to accepting AGW also led to the idea that governments could borrow their way to wealth.
        AGW is merely the part of the catastrophe we are discussing (more or less) here.

      • Richard, Your posts and analysis are highly valued by me and most of the denizens here. The peanut gallery is not worthy of your or my time!!

      • Richard Saumarez

        Thank you. I very much appreciated your post on the problems of CFD. As a strictly part-time PDE solver in a very non-linear system, I have sensed some of the problems that you have outlined with great authority. I’ve reluctantly come to the conclusion that in my field, modelling cardiac excitation, the solutions are not as accurate as one would like to believe. I think the concept that they “seem all right” or “coincide with reality” is very important and perhaps one, meaning myself, should be a lot more critical of the results that have been obtained so far. One difficulty is generating “true” test cases.

        I wonder how satellites work, how measuring instruments work, how computers work, how the internet works? I wonder how the lights turn on when you throw a switch? I wonder how many of the mathematical techniques used in climate modelling stem from engineering analysis of physical problems. Clearly engineers have had no input in this and are unqualified in every respect.

        Shades of the Apology. How little things have changed in 2,500 years:

        At last I went to the artisans, for I was conscious that I knew nothing at all, as I may say, and I was sure that they knew many fine things; and in this I was not mistaken, for they did know many things of which I was ignorant, and in this they certainly were wiser than I was. But I observed that even the good artisans fell into the same error as the poets; because they were good workmen they thought that they also knew all sorts of high matters, and this defect in them overshadowed their wisdom . . .

      • Nebuchadnezzar

        “Thank you for this well argued piece. This post is about aliasing – and its potential application to climate data. This is a subject on which electrical engineers tend to be well informed.”

        Hi Richard,

        Thanks for taking the time to post. Sadly, I don’t follow what it is you are trying to say. It seems to be an introductory post pitched at people who already know what aliasing is. A gentler, jargon-free intro would be much appreciated (by me anyway).

        I would also really like to see a practical demonstration that aliasing is a real problem in real climate data. The list of ‘potential’ problems in climate data is practically infinite, so it’s hard to get too excited about any particular one when no evidence is given for it being an actual problem.



      • This is probably going to show up out of order, but it was in reply to Nebuchadnezzar | October 19, 2011 at 2:41 pm :


        Visual demonstration of aliasing, a.k.a. beat frequencies, a.k.a. heterodyne. The video image does this because of the framed sampling that all video does.
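
        The folding arithmetic behind those beat patterns is easy to check numerically. A minimal sketch with made-up frequencies: a 90 Hz tone sampled at 100 Hz yields exactly the same sample values (with reversed phase) as a 10 Hz tone, so the two are indistinguishable after sampling.

```python
import numpy as np

fs = 100.0                   # sampling frequency, Hz
f_true = 90.0                # tone frequency, above Nyquist (fs/2 = 50 Hz)
f_alias = abs(f_true - fs)   # folds down to 10 Hz

n = np.arange(200)
t = n / fs
x_true = np.sin(2 * np.pi * f_true * t)
x_alias = np.sin(2 * np.pi * f_alias * t)

# sin(2*pi*90*n/100) = sin(2*pi*n - 2*pi*10*n/100) = -sin(2*pi*10*n/100),
# so the sampled sequences coincide up to a sign flip.
print(np.allclose(x_true, -x_alias))  # True
```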

    • I should just speak up for electrical and electronics engineers. I happen to be one myself. We aren’t all right wing Libertarians who froth at the mouth when Al Gore’s name is mentioned.

      Engineers are usually such cautious people. Engineers like to have everything under control. Ask them, normally, if it’s safe to double any particular parameter in a complex system and they’d almost certainly say definitely not. At least not until conclusive testing and a rigorous theoretical evaluation had been conducted and it was safe to conclude otherwise.

      Unfortunately on the CO2 issue, many engineers tend to throw caution to the winds. A type of carelessness seems to take over. Double CO2 levels? Sure, “she’ll be right, mate”!

      It’s all very odd.

      • I should just speak up for electrical and electronics engineers.

        I have nothing against electrical engineers as people. But just look at this quote:

        As Professor Curry asked me to give some biographical detail, I should explain that after medical school, I did a PhD in biomedical engineering, which before BME became an academic heavy industry, was in an electrical engineering department.

        He is practically begging us to confuse him with a scientist. The awkward phrase, “electrical engineering,” is there and gone in a flash. It just doesn’t inspire trust, does it?

        And I have no problem with non-scientists having opinions on climate change — I have a few myself, and I’m not a climate scientist. What I find offensive is the effort to shroud their mumbo-jumbo in a cloak of borrowed authority — i.e., the full Monckton.

      • Well, Robert, I haven’t really interpreted Richard’s remarks as “begging to confuse him with a scientist”. As he actually is one, unlike you and me.

        Engineers study and apply mathematics, physics, and of course specialized topics like the signal processing discussed in this thread – on which you don’t seem to have anything to say. Of course engineering emphasises real-world applications rather than theory. And as you probably didn’t know, engineering is taught in the same universities as the “real sciences”, like maths and physics, quite often by the same lecturers and from the same books. Polytech engineers are a different bunch, at least where I come from (this is not to downplay them at all).

        And finally, unlike me (and most obviously also you), Richard has a PhD, which has exactly the same – or even stricter – scientific standards as the “traditional” sciences have for the degree. A climate scientist, no; nor does he claim to be.

      • Many engineers start off having studied Physics, like myself, and have then applied that scientific knowledge in an engineering discipline. The UK’s Royal Society says ” We aim to expand the frontiers of knowledge by championing the development and use of science, mathematics, engineering and medicine for the benefit of humanity and the good of the planet.”

        So, there is no intrinsic difference between engineering and science. One is the application of the other. Whether the climate is analysed from an engineering or a scientific viewpoint the conclusion is still the same. CO2 is an important GH gas. Varying its concentration will change the Earth’s temperature.

        But you’re right. Many engineers who spout forth on climate just don’t know what they are talking about. In many cases they’ve made up their minds first and looked at the evidence later. They’re the worst of all, especially when as you say they falsely claim to have used their scientific knowledge to reach an impartial conclusion.

      • “… Many engineers who spout forth on climate just don’t know what they are talking about. In many cases they’ve made up their minds first and looked at the evidence later. They’re the worst of all, especially when as you say they falsely claim to have used their scientific knowledge to reach an impartial conclusion.”

        If you replace the word “engineers” with “IPCC climate scientists” I think you would have a point with your gross generalisation.

        This thread seems to show that there is a need for much more rigorous interpretation of data and how it is presented, and from what I see on this site in general, most people have drawn their conclusions after reviewing the shoddy evidence provided by the IPCC and their cohorts.

      • I suppose you are a good person, but you don’t seem to have anything to say that is on point.

  29. Does Aliasing feed the Monster?
    Aliasing only undersamples the Monster.
    It camouflages its true size.
    The Uncertainty Monster is BIG.
    It lurks and slinks between the sparsely sampled data points.

    “Just when you thought it was safe to go into the models.”
    “I think we are going to need a bigger computer.”

  30. Tomas Milanovic

    This is an excellent post and an interesting discussion.
    Not only because of the technical point of aliasing but also because extracting a discrete series from a continuous process which happens both in the spatial and temporal domain is an important research field.
    The point that I would add is that it is true as Richard says that this problem exists both in the spatial and in the time domain.
    Typically in the time domain one deals with the extraction of time series and generally with spatially uncorrelated processes.
    In the space domain one deals with stationary waves.
    However, in the atmosphere/ocean system we deal with an additional problem – wave propagation. The processes are not steady and while the time processes are relatively continuous with qualitatively identified periodicities, the space processes present more discontinuities (ice/water, continent/ocean etc).
    That’s why classical signal theory, which is a discipline dealing with the spectral behaviour of a set of functions Fi(t), becomes much more complex when we deal with fields Fi(x,y,z,t), which is the case with the Earth system.
    David Young has written a very relevant post touching on this issue and I agree with every word in his post.

    Judith, on a more site management issue :

    As I said above I, like many other participants, have found this post and the discussion interesting.
    At least until the appearance of a “Robert” who barged in with nauseating, trolling posts like:
    I don’t have any burning desire to enter into an argument with yet another electrical engineer with delusions of grandeur. Your lack of qualifications speaks for itself. Perhaps what I can add in the political bias that informs your pseudoskepticism
    How many of these electrical engineers pretending to be scientists is Dr. Curry going to promote?

    In France, a country I know well, the system of Grandes Ecoles which is parallel to the University is issuing “Engineering degrees”.
    Among others electrical engineering degrees.
    The training in math and physics of these Engineers is of an extremely high level and in fact equivalent if not superior to the training given to University students following the classical way (PhD, post doc etc) .
    It is only a matter of taste whether a French engineer (electrical or otherwise) decides to do scientific research or work in industry. There is no difference in scientific skills .
    There have been and will be French physics Nobels and math Fields Medals who are “mere” engineers.
    Seen from Europe, the statements of this Robert sound like arrogant, irrelevant and nauseating ad homs.

    As they only pollute and disrupt the thread, I would suggest that you ban him or at least issue a warning that if he continues to disrupt the threads he will be banned.
    I am sure that Climate Etc will be a much more interesting place with less trolls.

    • Tomas, I will clean the thread, I haven’t been keeping up overnight

      • Norm Kalmanovitch

        Signal processing is essentially the “bread and butter” of my profession as a geophysicist specializing in seismic data. (All digital seismic data recorders, which have been around for over 40 years, have always had a built-in anti-alias filter set to the recording sample rate.)
        This gives me a rather interesting perspective on the commentary in this thread, because it is quite simple to determine which of the commentators have actual hands-on experience or just academic knowledge of the process based on what they read or have been taught, as well as those who have absolutely no clue about signal theory but make absolute statements based strictly on sheer ignorance.
        The fundamental premise of AGW is that CO2 concentration has increased and atmospheric temperature has increased over the 20th century. This is a 100 year period defined by just two points: “then”, when the temperature was lower and the CO2 concentration was lower, and “now”, when the temperature is higher and the CO2 concentration is higher. This two point correlation is near perfect and forms the case for AGW, but signal theory requires two samples to define the shortest period, and these two points can therefore define a period of no less than 200 years, making this correlation invalid for a 100 year period.
        On the other hand, sampling the past 130 years since 1880 on a yearly basis reduces the minimum period down to two years, but when this is done we find a steady increase in CO2 concentration since 1880 but a cyclic overall increase in global temperature, with cooling to 1910, warming to 1942, cooling to 1975, warming to 2002 and cooling since.
        This correlation between CO2 concentration and global temperature falls well below the point where R is sufficiently high to constitute a meaningful correlation.
        Spectral analysis of the past 130 year of temperature shows that there is one period related to the recovery from the Little Ice Age which represents the overall global temperature increase and on top of this is superimposed a 65 year period cycle likely driven by solar cycles. There is no temperature cycle that in any way correlates to either CO2 emissions or CO2 concentration other than the recovery from the Little Ice Age with its associated ocean heating causing increased outgassing of CO2 from the oceans in response to the lowered saturation point from increased ocean temperatures.
        Fourier analysis is free of any ideological bias and provides incontrovertible proof that there is no possible significant relationship between CO2 and global temperature as claimed by the IPCC.
        This thread is based on the same signal theory that refutes AGW and the contributors to this thread can easily be identified as those who know what they are talking about and those who don’t on the basis of their belief in AGW.
        One does not have to make personal disparaging remarks; signal theory does that for us.
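
        Whatever one makes of the climate inference, the narrow sampling point is easy to illustrate with toy numbers (both observation values below are invented): two samples alone cannot distinguish a pure linear trend from that same trend plus any cycle that happens to vanish at both sample times.

```python
import numpy as np

t = np.array([1900.0, 2000.0])   # two observation times, 100 years apart
obs = np.array([13.8, 14.6])     # hypothetical temperatures at those times

def linear(tt):
    # One candidate signal: a straight line through the two samples.
    return np.interp(tt, t, obs)

def cyclic(tt, period=40.0):
    # Another candidate: same endpoints plus a cycle that is zero at both sample times.
    return linear(tt) + 0.5 * np.sin(2 * np.pi * (tt - t[0]) / period)

tt = np.linspace(1900.0, 2000.0, 1001)
print(np.allclose(cyclic(t), obs))              # True: both candidates hit the samples
print(np.max(np.abs(cyclic(tt) - linear(tt))))  # ~0.5: yet they differ in between
```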

      • The Fourier analysis proves only the absence of a dependence that nobody among the mainstream scientists expects to exist in the first place. The main effect is expected to be in the trend, with a little deviation from that at multidecadal scale. The Fourier analysis is totally powerless in testing these expectations.

        It’s amazing that this kind of totally wrong argument is repeated here time after time – or is it just what we should expect?

      • Norm Kalmanovitch

        The expectations are based on what Fourier analysis demonstrates does not exist, which is why AGW supporters expected warming from 2002 to today because of increased CO2 emissions but have only been presented with cooling for the past nine years and counting.
        If the expectation is that CO2 emissions will cause catastrophic global warming when is this warming expected to resume ?
        In 2006 the Hathaway NASA expectation for solar cycle 24 was that it would be greater than cycle 23 allowing for the dismissal of solar influence as being the prime driver of climate change.
        The Hathaway NASA 2011 prediction now shows that cycle 24 is less than half the cycle 23 amplitude.
        It seems that expectations are wrong when they are improperly based on fabricated computer models and non-existent correlations, so what exactly do you mean by “this kind of totally wrong arguments”?

    • At least until the appearance of a “Robert” who barged in with nauseating, trolling posts like:

      We all know that 95% of the skeptic commenters here are the real trolls and Robert is doing his best to stem the tide of the stupid. I only have a PhD in Electrical Engineering and have a thick enough skin to understand that what Robert is suggesting is that you have to earn your worth. I don’t see Saumarez doing anything like getting his hands dirty in real exploratory data analysis here, instead satisfied with making broad-brush assertions and acting like a concern troll.

      When Saumarez then said this today:

      You may be unaware that the UK is facing an energy crisis. 25% of the population are predicted to be in fuel poverty this winter. We are likely to face energy blackouts. Taxes are going up throughout Europe. Biofuels are creating starvation and ecological damage.

      All these stem from the predictions of climate science.

      which is ridiculous when you consider that the North Sea is well past peak in oil and natural gas is falling with it, and this has absolutely nothing to do with climate science. That an engineer who lives in that country can’t figure that one out indicates that Robert has a point.

      • Richard Saumarez

        Oh Really?

        Wind power? Sound, cost effective, reliable energy production?
        Heavy industry up in arms over carbon reduction policies? Major industries threatening not to invest in the UK because of excessive carbon reduction policies?
        This is off the thread, and yes, I agree that we are having to import gas because of the run-down of North Sea gas, and yes, the price has risen. This hides the elephant in the room: the “green economy” now accounts for 17% of fuel bills, which will have to rise even further to pay for the massive infrastructural costs of an 80% (by law) reduction in CO2 emissions. Planning of the UK’s energy system has been completely skewed by the low carbon agenda and, to my mind, unrealistic targets for renewable energy production. I am struck by the irony of saying that we have to import gas to make up the shortfall of North Sea gas depletion when large deposits of shale gas have been discovered that can be exploited more cheaply than North Sea gas. This will not be permitted because of the low carbon agenda and fuel costs will rise even more sharply (C Huhne).

        If you really believe that the cost of the low carbon agenda, put, optimistically, at £18bn/year, is not having economic consequences and that the bill for this will be paid by a burgeoning “green” economy, as opposed to increasing energy bills and green taxation, I think you are mistaken.

        Do you imagine that a substantial portion of the World’s food production would be diverted into biofuels without the low carbon agenda? Do you not feel that this policy might be mistaken? Are you completely happy that this is a proportionate response to unimpeachable science?

        These policies stem from the predictions of climate science, whether you like it or not. If it were not for climate science predictions, we would not be going down this route. Whether these predictions are correct and whether the policy response is correct, I cannot say. However, given the huge economic burden that CO2 reduction is going to place on the World’s economy, I believe that the assumptions on which they are based should be challenged, as would any other proposition of this scale. This is certainly no right-wing polemic from a denier; it is a rational response to a policy that is based on a prediction that may not be justified.

      • These policies stem from the predictions of climate science, whether you like it or not. If it were not for climate science predictions, we would not be going down this route.

        What a bizarre notion. The reality is that climate science concerns are just a smokescreen for the outcome that people do not want to face. Most people in the know understand that portraying climate change as a concern works as a great placebo for the masses. That outcome is exemplified by what is happening in Pakistan with rolling blackouts. If you think for a moment that those countries have concerns about climate change or green energy above that of not totally regressing economically, I don’t know what to say. You have a very provincial outlook.

      • Richard,
        You are getting close to the heart of the AGW madness, so do not expect rational responses from the believers.

      • Richard Saumarez

        These comments are rather strange. All I did was make a post on aliasing and I seem to have been sucked into something that one would rather avoid. I will not reply to any further comments by these two individuals.

      • What you’ve just witnessed is the interface between climate science and social science. Dr. Curry has dabbled in that topic, too. It’s fascinating, but as you say, strange.

      • Off topic

      • Norm Kalmanovitch

        You will notice that there was no response to my commentary above that the two point correlation of CO2 and global temperature is limited to a minimum period of 200 years, and since we are only concerned with CO2 emissions since preindustrial times, which is only since 1880 (1880 to 1980 being the 100 year comparison used by Hansen in his 1981 paper), this, according to signal theory, is not valid.
        By bringing up aliasing you have exposed the fundamental weak point of AGW that cannot be explained away by any so-called peer reviewed study, so the only thing left is personal attack, or some attempt to dissociate the temperature and CO2 time series from climate science by stating that the aliasing problem is irrelevant.
        You have not stated anything about climate one way or another; all that you have done is expose a mathematical weakness that those pushing the AGW conjecture must address, and they are unable to do so.
        You have nothing to defend because everything that is presented in your posting is perfectly correct, so any who attempt to show that you are wrong are merely exposing themselves as being driven by either ignorance or some agenda, and in all likelihood both.
        It is best not to respond to these commentaries and just let those who make these ludicrous comments expose their ignorance, for those of the same bent to see as support and for knowledgeable people to see it for what it is.
        Note that most of these people do not post comments using their full name and in most cases some made up name.

  31. Richard,

    as a fellow engineer, a couple of comments:

    1) doesn’t the fact that individual stations will report measurements at the same time each day remove the 24 hr cycle from the figures?

    2) doesn’t the use of anomaly remove the annual cycle?

    If there was a sampling frequency completely decoupled from these, then I think the analysis would be interesting, but I suspect that these two remove it as a real issue?

    Have I missed the point somewhere?

    • VTG and Richard,

      I want to tack my latest thoughts/questions on here, this is responding to VTG and Richard’s reply to my last above.

      -Sure, averaging doesn’t cause aliasing. If the original data source reports an hourly temperature measurement, it is my understanding that this will in most cases adequately protect against aliasing IF the signal is on the order of 1 cycle per day. Obviously it won’t be adequate for components with periods shorter than 2 hours, and maybe not for those shorter than 4-6 hours, if they exist. From what some of Steve Mosher says above, I would conclude that very high frequencies (several cycles per day) don’t result in aliasing in temperature data. If, however, a temperature station records only at 12 noon every day, there is a very real possibility that the signal is aliased.

      What I don’t know is how much of the record is in which type of condition? Here I suspect that at any given location, as we go back in time, we are more likely to find the latter condition.

      At any rate, monthly averaging and calculation of anomalies will neither remove nor introduce aliasing, the aliasing either is or isn’t present in the original signal.

      I don’t know how the use of multiple individual stations reporting in sync in different time zones helps this cause, if that’s what you mean by #1, VTG, because now we’ve introduced a spatial averaging issue that requires different analysis methods.

      -Finally, w/r/t Spencer and Dessler, I note this from Wikipedia regarding the satellite data: “During any given 24-hour period there are approximately 16 orbits. Almost the entire globe is observed in either daylight or nighttime mode, many in both. Polar regions are observed nearly every 100 minutes.” So with the temperature soundings at least, we are sampling about every 1.5 hours. Hopefully this is adequate. To me, this would suggest that if the other data used in these analyses is sampled at a similar interval, and then the monthly averages are computed simply by taking
      Sum(month)/#(month), aliasing should be avoided. I wouldn’t think this could be referred to as “decimating”. However, I don’t think this addresses possible issues with the comparison of different data sets (T, Rf) as in Spencer and Dessler because they could be operating on different frequencies.
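
      The noon-only worry is easy to simulate with a made-up hourly temperature series (the 25-hour component below is purely illustrative): any variability whose period sits just off the 24 h sampling interval folds down to a slow, spurious oscillation in the daily record.

```python
import numpy as np

days = np.arange(365)
hours = np.arange(24 * 365)

# Hypothetical hourly temperature: an annual cycle plus a 25-hour component,
# a stand-in for any variability just off the 24 h sampling period.
annual = 10.0 * np.sin(2 * np.pi * hours / (24 * 365))
fast = 2.0 * np.sin(2 * np.pi * hours / 25.0)
temp = annual + fast

# Record only at 12 noon each day: sampling interval = 24 h, Nyquist = 1/2 cycle/day.
noon = temp[12::24]

# The 25 h component (24/25 cycles/day) folds to |24/25 - 1| = 1/25 cycles/day:
# a spurious 25-DAY oscillation, indistinguishable from real low-frequency weather.
alias = 2.0 * np.sin(2 * np.pi * 12 / 25.0 - 2 * np.pi * days / 25.0)
residual = noon - annual[12::24]
print(np.allclose(residual, alias))  # True
```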

      • Narsh. I see where I am wrong above w/r/t satellite. We’re not sampling the same locations every 1.5 hours, just the same latitudes. So aliasing is a possibility with the “raw” satellite data!

      • & finally, I see this on Spencer’s website:


        Since I have been doing similar computations with the CERES satellite data, I decided to do my own analysis of the re-calibrated ERBE data that Lindzen and Choi analyzed. Unfortunately, the ERBE data are rather dicey to analyze because the ERBE satellite orbit repeatedly drifted in and out of the day-night (diurnal) cycle. As a result, the ERBE Team advises that one should only analyze 36-day intervals (or some multiple of 36 days) for data over the deep tropics, while 72-day averages are necessary for the full latitudinal extent of the satellite data (60N to 60S latitude).

        Lindzen and Choi instead did some multi-month averaging in an apparent effort to get around this ‘aliasing’ problem, but my analysis suggests that the only way around the problem is to do just what the ERBE Team recommends: deal with 36 day averages (or even multiples of that) for the tropics; 72 day averages for the 60N to 60S latitude band. So it is not clear to me whether the multi-month averaging actually removed the aliased signal from the satellite data. I tried multi-month averaging, too, but got very noisy results.

      • This thread is about aliasing, and the opening post specifically about aliasing in the processing of time-series information, which is not likely to be important for many issues in climate science. But aliasing is just one problem arising from sampling, and other sampling-related problems are essential problems for climate science. Many of the comments have, indeed, discussed such other problems that I would not classify as aliasing, while others might include more under that concept.

        Observing the temperature at one or two fixed times every 24 hours is not likely to have problems of aliasing, but it may certainly provide an erroneous time series for the annual variations of daily average temperatures at that location, because the relationship between the temperatures at those specific times of day and the 24 hour average calculated from data measured every minute may vary from season to season and from weather pattern to weather pattern. Similarly using a few points of measurement to estimate temperatures for a cell in collecting gridded data may have many different types of errors.

        Estimating SST variations from data collected by ships of varying number and using routes that cover the ocean sparsely creates even more difficult problems of sampling.

        It is, however, unclear to me how far and where other fields of application can provide useful tools and approaches to climate science. We have seen on this site a large number of posts by people from various fields. There’s certainly something valuable in each of these postings, but unfortunately the discussion and sometimes even the posts themselves get interpreted by many as proofs of additional uncertainty in climate science, while they should be read as a source of inspiration that may lead to some useful applications rather than as any kind of statement on the present climate science. Most of the authors of these posts don’t know enough about the state of the art of climate science to make such statements, but many skeptics find evidence for their views in everything. I’m sometimes a bit disturbed by this effect, but I don’t agree with those critics of Dr Curry who don’t see at all the positive side of such posts. (I think this is one of the main reasons for the harshest criticism by some.)

      • Richard Saumarez

        Without wishing to be provocative, and responding to your earlier post about monthly data in SB2011:

        I am not a denier and I am not a whole hearted believer. I am not a climate scientist. However, if one has used certain tools, one can see the application to other fields. You will please note that I have never ventured an opinion on CO2 and its feedback effects.

        Turning to my earlier post. I wrote that after reading the Spencer/Dessler controversy, and I thought that a) the mathematics was wrong and that what was being referred to as feedback could not possibly be feedback in their equation. I produced an absolutely minimalist model of feedback, which was linear, as were the assumptions of S and D, and analysed it to show that the feedback term in S, D could not possibly be feedback. I also thought that the analysis of their data did not make sense within the mathematical framework in which they were operating.

        This is not being a denier and it is not being an AGW promoter. It is simply scientific criticism. I was struck by the number of people who told me that I shouldn’t model climate as black boxes, that the whole thing is incredibly non-linear, etc. I have no intention of modelling climate at all; it is not my field. What I was saying is that if you wish to establish the presence of a feedback from the observation of inputs and outputs (and assume linearity), this places restrictions on the type of model you use and the way you analyse data. You may remember that I emphasised that I was not building a climatic model but putting up the simplest possible model that did contain feedback. I also expressed considerable scepticism about whether the approach to parameter fitting of the model was possible in light of the data.

        I was struck at the time by (and did allude to) the potential aliasing problem in using monthly data to analyse a model, as I felt that it would be aliased. I have therefore analysed the potential for aliasing in the processing commonly used to turn daily data into monthly averages, which shows that if one is going to do this and use it to drive a model such as has been discussed, there is a potential problem, and in fact that is what figure 11 shows. I am not denying anything and I am not proposing anything about the generality of climate science. I have commented on the analysis that has been used in a particular instance to reach a conclusion. Whether this conclusion is correct or not, I cannot say, because my feeling is that it would require a different experiment to test the hypothesis.

        I have simply said in this post that the methods used to decimate time series are important; if they are ignored they may lead to trends in decimated data, and this is a problem that should be carefully considered and analysed. I have said nowhere that I think the whole temperature record is wrong, and I have stated that if one is considering the global mean temperature, the effects are small, to the point of negligible. I have pointed out that if you attempt to characterise a system and you use aliased data, you will draw false conclusions.

        If you feel that these posts are inappropriate because of my lack of climate science, I would respond that I have made posts about scientific logic and procedures that have a strictly limited scope. If you feel that they are fuel for “deniers”, then is there any scope for any publication that does not echo what you believe to be the truth? If what I am saying is wrong, it should be refuted as in any normal scientific dialogue.

      • I am not a denier and I am not a whole hearted believer. I am not a climate scientist. However, if one has used certain tools, one can see the application to other fields. You will please note that I have never ventured an opinion on CO2 and its feedback effects.

        That’s not true.

        Comment by Richard Saumarez on December 28, 2009 at 11:42 pm

        Dear Dr Clark,
        I am sure that you are now developing a very serious climate policy. The problem that I have is with the AGW hypothesis. While this is regarded as settled science, it is now clear that the scientific basis for AGW, apart from a modest increase due to CO2 forcing, is becoming increasingly uncertain.

        You remember that comment; it is the same one in which you praised the Wegman Report as “a model of objective analysis” and begged God for a Conservative victory in the coming elections.

        Given your record, your claim that you merely, as if by chance, discovered “certain tools” with applications in “other fields” — “tools” that you claim cast doubt on science you have vilified pretty regularly for years — is just not credible.

      • Robert,

        Stop it.

      • If you feel that these posts are inappropriate because of my lack of climate science, I would respond that I have made posts about scientific logic and procedures that have a strictly limited scope. If you feel that they are fuel for “deniers”, then is there any scope for any publication that does not echo what you believe to be the truth? If what I am saying is wrong, it should be refuted as in any normal scientific dialogue.

        In a normal scientific dialogue, ideas may be supported by evidence, undergo peer review, be published, and then be critiqued, commented on and expanded upon, challenged, and if possible replicated.

        An argument that has been through that is part of the normal scientific discourse, and should be engaged with on that basis. An argument that is put forth as a blog post has a different set of standards to meet. Does the author have any credentials in this area? Have they published in the field? Are they reasonably unbiased, or are they seeking to justify a preexisting political agenda? (I recommend the Climate Etc post on “Meta-expertise” as a good discussion of these and other tests of “experts.”)

        You fail on all counts. Hence, there is no need to refute you. Rather, you have the burden of making the case that your argument deserves the time and attention a formal refutation would require. Given that you have been less than honest about your previous statements on climate change, I’d say you have a tall hill to climb before anyone who does not already agree with you takes you seriously.

      • @BillC

        What specifically do you object to and why?

      • I object to your whole line of discussion on this thread because it is off topic. Perhaps it is relevant elsewhere, like on some other thread on this blog where we discuss motivations. This one is about signals and aliasing.


      • @BillC

        I suggest you review the thread on Meta-expertise.

        The qualifications, prior work, and potential for bias are all very relevant to evaluating any source, particularly when they are presenting unpublished and non-peer-reviewed critiques of the science.

        Now, I apologize for my unnecessary snarkiness about electrical engineers. Fine people, one and all. But the credibility of the source is relevant, especially when there have not been other forms of quality control (like peer review).

      • OK, but – you’re presumably keeping Richard from answering technical questions.

        I read your link, and I don’t care. And I don’t care that Roy Spencer is a creationist. I care what they have to say about technical matters. I reserve my right to judge based on that.

        If a non-peer reviewed critique by a climate science outsider on a climate science blog has flaws they will be found. Witness the dragon slayers thread.

      • OK, but – you’re presumably keeping Richard from answering technical questions.

        How on earth have I done that? As the blog rules state “Only respond to comments that you feel are deserving of your attention, and ignore the rest.” Richard is free to ignore me.

        I read your link, and I don’t care. And I don’t care that Roy Spencer is a creationist.

        Which of course is your prerogative. Personally, I care that he claimed never to have expressed an opinion on CO2 forcing, when in fact he had. I care when people aren’t honest in how they represent themselves. The fact that you don’t care doesn’t mean no one cares or that it isn’t relevant.

        If a non-peer reviewed critique by a climate science outsider on a climate science blog has flaws they will be found.

        That sounds like a lot of investment of time and energy for very little payoff. And the dragon threads are a perfect example of that. I think evaluating the credibility of the source is a necessary part of evaluating an argument.

      • So everyone’s favorite doofus, Robert, has taken a break from his travails at “The Idiot Tracker”, Robert’s loser-blog that no one reads, and gone on one of his usual tears through this blog (hey, Robert!–thanks for the spam-link to your loser blog that no one reads in your opening comment).

        And, as we’ve come to expect from Mr. Screw-Up, himself, Robert’s blunder-buss pot-shot at electrical engineers inadvertently ended up nailing two of his buddies–WebHub and tempterrain. Which then prompted an indignant tempterrain (temp’s always at his best when in an indignant snit) to reveal the critical, “meta-expertise” criterion by which the worth of electrical engineers can be judged–the good ‘uns don’t “froth at the mouth when Al Gore’s name is mentioned.” (St. Al’s B. A. was in Government Studies, incidentally, for you “meta-expertise” mavens, out there.)

        And to give the flavor of Robert’s loser blog, that no one reads, one may profitably consider the title of the somewhat disturbing post which Robert linked to in his initial comment on this thread. Robert’s blog post, of which he is the author, is entitled, “Frank Lemke Mathturbates in Public–Judith Curry Watches”. Quite the snappy title! And that (heh-heh) image of Judith Curry watching Frank “mathturbate” publicly–well, that’s the sort of edgy snark that thrills Robert’s greenshirt comrades (while leaving the rest of us wondering just where the lefties dug-up this very strange Robert idiot). And that, in turn, allows one to reflect on the salient feature of the “meta-expertise” of CAGW advocates–they are almost all really, really weird, creepy people. Like Robert.

      • Richard Saumarez

        Yes, I have found the Idiot Tracker. Interesting.
        How do you kill a troll?
        You raise them to the height of their egos and let go. When they impact at the level of their IQ, death is usually instantaneous.

      • Pekka, Your doctrine could be summarized just as the Penn State committee summarized its findings on Mann. It was treated as a joke in the Atlantic. It runs like this: Mann gets research grants and gets his stuff published ergo there is no misconduct.

        I personally believe that climate science is correctly characterized by Lindzen as “a small primitive field beset by immense uncertainty.” You have too much respect for authority. In any case, this is a policy issue of huge importance and more mature fields have huge contributions to make. More selfishly for the team, if they can’t convince people like myself and Richard, they will fail in their crusade.

      • The difference between me and many others seems to be that I judge independently whether I believe in some particular theory or not. I don’t summarily think that everything the climate scientists do is bad or that they are all crooks.

      • There’s certainly something valuable in each of these postings, but unfortunately the discussion and sometimes even the posts themselves get interpreted by many as proofs of additional uncertainty in climate science, while they should be read as a source of inspiration that may lead to some useful applications rather than as any kind of statement on the present climate science. Most of the authors of these posts don’t know enough about the state of the art of climate science to make such statements, but many skeptics find evidence for their views in everything

        I am at least half guilty of that charge, or guilty of a reduced variation. I do think that some of what I have studied casts doubt on the reasonableness of the claim that the equilibrium climate sensitivity and the transient climate sensitivity can be accurately known. On the other hand, I took some inspiration from the Padilla et al. article posted here a few weeks ago, and I have undertaken two projects for next year’s Joint Statistical Meetings in San Diego, one a data analysis and the other (hoped for) a session of invited papers.

        As everyone knows: (1) it is easier to point out a problem than to solve it; (2) experts in a field almost never recognize a problem that is pointed out to them by experts in other fields; (3) with many more ways to be wrong than to be right, many efforts to correct a problem, when it is properly identified, will themselves be problematical. Here, Richard Saumarez has presented aliasing as a problem. In the frequency-domain analysis of stationary time series, undersampling results in too much power attributed to low frequencies; from an autoregressive point of view (mine and Padilla et al’s), undersampling results in failure to identify important covariate relationships (linear or nonlinear). If I understand the posts of Steven Mosher, he claims that undersampling in the time and spatial domains is not a problem. I don’t believe that the analysis of within-daily measurements has ever shown that he is correct. Previously I identified the equilibrium approximations as sources of model inaccuracy, and I think that the majority response is that the inaccuracies are too small to matter, though that has never been shown either.

        Right now, the GCM-based predictions of temperature increase are running too high. It could be that the inaccuracies are due to something unimportant (and the 50-year prediction will somehow turn out to be accurate), but with aliasing and equilibrium assumptions having been pointed out as potential sources of the error, it should at least be admitted that the evidence to date is insufficient to show that those potential sources of error are in fact negligible.
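
        The autoregressive point above can be sketched with a toy AR(1) process (invented numbers, standing in for “daily data”, not a claim about real temperatures): heavy subsampling collapses a strong lag-one dependence to nearly zero, so an identification procedure run on the coarse series would miss a relationship that is obvious at the original resolution.

```python
import numpy as np

rng = np.random.default_rng(0)
phi = 0.9                        # strong "day-to-day" persistence
n = 200_000
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):            # simulate the AR(1) process
    x[i] = phi * x[i - 1] + rng.standard_normal()

def lag1(s):
    # sample lag-1 autocorrelation
    s = s - s.mean()
    return (s[1:] * s[:-1]).mean() / (s * s).mean()

print(lag1(x))        # close to 0.9
print(lag1(x[::30]))  # close to 0.9**30 ~ 0.04: the dependence has vanished
```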

      • Richard Saumarez

        No, the point is that if you average and then take samples at one-month intervals, this will alias the signal if you use those monthly samples to represent the underlying continuous signal. I think that this may be potentially important if you are using this data for system identification as per SB2011 and D2011.
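
        A small numpy illustration of that point (using a made-up 45-day oscillation, not real data): a 30-day average attenuates but does not remove a period shorter than two months, so keeping one value per month folds the residual down to a false, longer period.

```python
import numpy as np

days = np.arange(720)                      # two years of daily values
x = np.sin(2 * np.pi * days / 45.0)        # 45-day cycle (period < 2 months)

# monthly mean, one value per 30-day "month": average, then sample
monthly = x.reshape(-1, 30).mean(axis=1)   # 24 monthly values

# the boxcar average attenuates the cycle but does not remove it...
print(monthly.std())                       # clearly non-zero (~0.29)

# ...and monthly sampling folds its true frequency (2/3 cycle per month)
# to an alias at 1/3 cycle per month, i.e. a spurious 90-day period
freqs = np.fft.rfftfreq(len(monthly))
f_alias = freqs[np.argmax(np.abs(np.fft.rfft(monthly)))]
print(f_alias)  # -> 1/3 cycle per month
```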

      • Whether there is a problem in that depends on the use of the averages. To claim that there is a problem implies that the data would be used for something it doesn’t fit for. Whether the problem can be called aliasing is even then a matter of interpretation.

        Personally I cannot see any problem there that would be described well by stating that it’s aliasing. I don’t claim that there isn’t anything that can be called aliasing, but I don’t see that using the concept would help in any way in either recognizing the problem or solving it.

      • Richard and Pekka, you guys seem to be disagreeing here:

        Let’s say I have one temperature station that records every hour. Every month I go collect the hourly measurements, add them and divide by the number of hours in a month, around 720 depending on the month. I report that number as a monthly mean. I then use that monthly mean to compare against the same number, calculated the same way, for each occurrence of that month in the past XX years I have been running my station, in order to determine the long-term trend at my station.

        What have I done wrong?

      • Pekka is afraid that we ignorant Internet masses will lose faith in science and civilization if you start talking about ways that errors manifest in systems

      • Kermit – no, I don’t think so. But I would add that I think I disagree with Pekka in the sense that if satellites sample each point about 1.5x daily, as seems to be the case at least for the tropics, from the “16 orbits in 24 hours” (32 longitudinal slices) that in the context of Richard’s post, it does not exclude potential aliasing.

      • BillC, in Pekka’s 10:27 am post he says that people are going to take this as proof of additional uncertainty, but Richard is clearly saying this is a potential area worth considering. I don’t get Pekka’s point.

        Steve has been the most helpful and says he couldn’t find any evidence of this in his experience.

        Nick said this was considered but never explained further.

        Robert hates creationists and EEng PhDs.

      • BillC,

        If you check all my comments in this thread, you’ll notice that I made a similar comment on the possibility of aliasing in satellite data.

        Some satellites have orbits that are synchronized to pass the same location at the same time precisely to avoid that problem. Thus solving the problem is not done only when the data are processed; knowledge of it has been the basis for choosing the orbit as well.

        These issues are so obvious that I doubt that there are any serious scientists working with the satellite data who would not be fully aware of them and whose methods would not take the issue into account to the extent possible. With asynchronous orbits it may be impossible to remove all the spurious effects without giving up valuable information as well. That leads to making some compromises in data handling.

  32. Threading’s broken. Maybe a continuation thread is in order?

  33. Aliasing is a problem when analysing data from polar orbiting satellites, like the NOAA satellites that house the AVHRR and MSU instruments. These cross the equator at “fixed” local times separated by 12 hours. However, the orbits drift slowly so the “fixed” equator crossing changes slowly over the lifetime of the satellite. This means that the satellites sample different points in the diurnal cycle at different points in their lifetime.
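
    The drift effect can be caricatured in a few lines of numpy (all numbers here are invented for illustration): observing a site whose true daily-mean temperature never changes, but at a local time that slips by a quarter of an hour per month, manufactures a trend that is purely a sampling artefact.

```python
import numpy as np

def temp(local_hour):
    # hypothetical diurnal cycle: constant 15-degree daily mean,
    # 5-degree amplitude, warmest in mid-afternoon
    return 15.0 + 5.0 * np.sin(2 * np.pi * (local_hour - 9.0) / 24.0)

months = np.arange(60)            # five years of monthly observations
crossing = 14.0 + 0.25 * months   # local crossing time drifts 15 min/month

obs = temp(crossing % 24.0)       # one observation per month

slope = np.polyfit(months, obs, 1)[0]
print(slope)  # a spurious negative "trend", although the climate is constant
```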

    (This image shows the local equator-crossing time for NOAA satellites, but may not be visible to all; sorry.)

    These effects are known and accounted for (not necessarily perfectly) in things like the UAH and RSS lower tropospheric temperature series. The same problem affects sea-surface temperature retrievals from the AVHRR instruments (amongst others) so a lot of work has gone into understanding the diurnal cycle in near-surface sea temperatures. e.g.

    The GCOS climate monitoring principles list constancy of diurnal sampling as one of the key principles for satellite monitoring:

    The ENVISAT satellite housing the AATSR instrument (Advanced Along Track Scanning Radiometer, also used to retrieve SST) is in a much more carefully controlled orbit (as were its predecessors) so the equator crossing time doesn’t drift by more than a few minutes.

    I’m not sure how much of a problem spatial aliasing is for SST measurements given the quasi random nature of the sampling. Ships take a measurement wherever they happen to find themselves at fixed times relative to GMT so high frequency spatial info turns up as noise. Comparing satellite (integrated) and in situ data (point measurements) which give similar results suggests it’s not a huge problem. There are sampling issues with in situ SST data that are likely to be more important.

    Someone mentioned leap years above. There’s this paper which looks at the effect of the shifting of the calendar relative to the seasons:

    Cerveny et al looked at “Gregorian calendar bias in monthly temperature databases”:

    (Cerveny et al., 2008, Geophysical Research Letters, 35, L19706, doi:10.1029/2008GL035209)

  34. Arcs_n_Sparks

    @ Major Tom,

    I would merely point out that I left California (among other reasons) when my marginal electricity costs exceeded 39 cents/kWh. Apparently, a lot of heavy industry doesn’t like it either. Climate change, at least as perceived by the California legislature, governor, and regulatory bodies, has substantial impact in that state. Much more so than epidemiology, robotics, and the internet.

    However, this is all off-topic for a post intended to bring a questioning attitude to how sampled data is treated, with aliasing being something that should be handled with understanding and care. I think that is the simple proposition from the original author.

    • Totally agree that aliasing is something that should be handled with understanding and care. Easier said than done, however.

      In case you missed my first comment on this thread:

      • I’ve read them all; very fascinating. Although something may be easier said than done (and this is a tough problem), it needs to be done. If nothing else, to lessen doubt or uncertainty in the data & analysis. In the case of California, the cost has been enormous.

      • Actually, I think aliasing is pretty well understood. The common MP3 player would not be possible without a sound theory of the phenomenon.

        As to the (off topic) topic of costing climate change, “enormous” would have to be compared to costs incurred if we do nothing.

        This economic ‘costing’ problem involves far more uncertainties and hand-waving ‘analysis’ than the mathematics of aliasing.

      • Aliasing is well understood by data acquisition, control, and signal processing engineers. The question posed by the OP was: does everyone else that manipulates sampled data understand that as well? From the comments here, that is not clear.

        I certainly would like to have more confidence in the original proposition: man is warming the planet, before I embark on the second phase of your comment (cost/benefit). Apparently, California decided today that the evidence is clear and passed cap and trade. Thank goodness I moved.

      • The question posed by the OP was: does everyone else that manipulates sampled data understand that as well? From the comments here, that is not clear.

        It is very clear that not everyone who manipulates sampled data understands aliasing. For example – Dr. Curry’s comments below the OP claim that averaging can introduce aliasing – a mathematical impossibility. This sort of mistake is precisely why peer-review is a Good Thing.

  35. Following some of the discussion here, I thought I’d offer my perspective about two of the participants. Richard Saumarez and David Young are individuals whose contributions to this and other recent threads I’ve found very valuable, because I felt that I had a great deal to learn personally from their expertise in control theory, aliasing, fluid dynamics, and problems in the numerical solutions to differential equations – all of which I understood at a superficial level and hoped to understand better.

    That is my primary conclusion about their participation. I have some secondary conclusions. The main one is that I perceive both of them to be unduly skeptical about the confidence we can have in current climate science assessments on such things as long term global temperature change as a consequence of anthropogenic greenhouse gas emissions. Although not a climate science professional myself, I think I may have more knowledge about climate science than they do, and if we sat together for several hours with references handy, I might convince them that some of their concerns, while legitimate, are excessive. Perhaps not, but I would at least alert them to evidence they may be unaware of.

    I would try to avoid arguing about the policy implications. That’s not because I don’t have opinions, but because I don’t have special qualifications to justify those opinions, and my time would be better spent learning from them about their areas of expertise and offering them some evidence on climate change.

    For David Young in particular, I’m concerned that his contributions here are in part wasted if he doesn’t also engage in dialog elsewhere with GCM modelers. David has offered several criticisms of how models handle a number of critical issues, implying the potential for very serious errors in model output. I have the sense that he is probably right about their potential seriousness, but I don’t know how much that translates into actual performance inaccuracies. The modelers can address that better than I – for example, in terms of how parametrizations can be matched against observations to limit errors. Andy Lacis, for example, has mentioned that the models must try to get the seasons right – summer must be warmer than winter, Fairbanks, Alaska, must be colder than Miami, winds that blow east in the world must blow east in the models, and so on. Since these are not model inputs but emergent properties, Andy suggests that this provides some confidence the models will simulate certain climate changes reasonably well, but how well, we don’t know. (I have also mentioned earlier that GCMs are not the only tool at our disposal for quantitative climate change assessments).

    Andy is only one modeler, however, and the numerics and fluid dynamics are not his area of focus. I would very much like to see more dialog involving more than a single individual and including those with specific interest in the issues David raises. He complains of a resistance to his points at RC, but RC is not the only place in the world for such a dialog, and this is something that should be pursued further.

    • regarding your “future work” section Fred, hear hear. JC – are you listening?

    • Fred, Thanks for your comments. In fact, I have contacted several climate scientists privately and given them some references. They have promised to read them with “great interest.” We’ll see what happens. I won’t name names, because I discovered on RC that public discussions about specific scientists can be counterproductive. Unfortunately, the intrepid Dr. Schmidt has not responded, perhaps because he is too busy “communicating” the latest political spin of RC. Please forgive my sarcasm!

      My perspective is colored by my recent discovery that the literature in a lot of fields is corrupted by the tendency to report only positive results, and incidentally to keep the funding rolling in. My brother says this is true of the medical literature and has numerous examples, some of which have resulted in billions of dollars wasted on worthless procedures. In CFD, over the past 5 years I have started verifying some of the literature and found that the actual situation was exactly the opposite of the impression one got from the literature; in short, all the respected people in the field had been using an assumption about modeling that was just wrong. I can’t go into this here because it’s far too technical. It will play out over the next few years. Influential people are already starting to pay attention. You know, I’m not particularly influential or all that brilliant as a scientist, but I do tend to argue effectively when I have the facts and data. Trust me on this, CFD as a field for fundamental research has been defunded and it’s all due to the overselling of the science itself.

      You must forgive me, but the level of rigor I see in the climate literature is actually lower than that found in CFD or medicine. I also see a level of blatant involvement of the scientists in politics that is unprecedented in any other field. Quite frankly, I have been angered by these things because the issue is so important. I view it as a moral and professional obligation for climate science to clean up its act. If I can contribute to that in some measure, I will be pleased.

    • Fred, I must say that unless you watch the Paul Williams presentation at the Isaac Newton Institute, I am quite concerned about YOUR being open to clearly documented issues. Williams is very well documented and devastating. He shows example after example, from both simple model problems and from the literature itself, of instances where model outputs are sensitive to numerical methods and time stepping. Watching this again angers me that people haven’t been more careful, and it makes me question whether you are being honest about the facts and data. The facts and data are pretty damning. My central concerns are based on convincing evidence, Williams’ results and Andy Lacis’ posts here.

      • David – I did watch it – you probably forgot that I mentioned doing that – and I found it interesting, but that hasn’t been my point. I’m certainly “open” to the issues you raise, but I’m not the person to address them, because trying to find the most accurate numerical solutions to differential equations is not something I do every day – you have to take that up with the modelers who focus on that aspect of GCM development.

        You have indeed convinced me that the issues deserve to be addressed. What you haven’t shown, as far as I know, is that the potential errors you mention have significantly eroded GCM performance – they may have, but maybe not if there are ways to keep the models within bounds despite the error potential. I also don’t know whether your issues are being ignored to the extent you imply. These are things I recommended you discuss with the modelers, along with your other concerns, and I still think it’s a good idea.

    • Forgot to mention that the stonewalling and denial machine at RC also angers me. It’s an ethical issue.

    • Richard Saumarez

      I am not sure why you think David Young and I are unduly sceptical. My thread on feedback was stimulated by the SB2011 and D2011 controversy, which I perceived to be based on trivial, poorly understood physical modelling and incorrect data analysis. Although I put the argument in rather simplistic terms, and got generally hammered for it, this is scientific criticism. I would point out that the SB2011 paper is a “sceptical” paper and, quite frankly, I am unimpressed by it. I do not see how this categorises me as an AGW believer or a denier.

      The problem of aliasing is absolutely fundamental. It is signal processing 101, yet it seems to have been ignored, or at least inadequately analysed. It is particularly important, although I didn’t discuss it, in the spatial reconstruction of a signal. Again, this is scientific criticism of the methodology.

      I am lukewarm as regards CO2 and global warming. I fully acknowledge the radiative effects of CO2, and I accept that there has been a rise in temperature since the beginning of the instrumental record. I am less certain about the concept of accelerated global warming.

      If we are going to make far reaching decisions on the basis of science, we have to make sure that the science is correct. Were we to have a discussion over some papers, my first instinct would be to criticise those papers and try to find holes in the logic. If this were found, I would question whether the conclusions were correct. If I couldn’t find a hole in the logic, I would conclude that the paper was correct. My instinct then would be to question how one would test the hypothesis raised by the paper. This, I believe, is known as the (pre-post-normal) scientific method. What you appear to be saying is that we suspend disbelief.

      One of my perceptions of climate science, as an outsider, is that it is very difficult to make an objective argument without being pilloried on social grounds. Any argument made in the scientific literature that does not conform to the view that catastrophic climate change is inevitable brings forward a barrage of emotionally loaded and completely inappropriate comments from some of the illuminati of the discipline. As a medic, having worked in a difficult area, I am well used to this, but I am unimpressed by much of the science I have seen and the topics that I have researched, because I think it is slipshod, not because it doesn’t conform to my particular view of the world.

      • Richard – My perception, which might be wrong, is based on my knowledge that most of the important climate science conclusions, including the range of climate sensitivity, are based on more than a single line of reasoning and evidence, which independently converge toward the same result. The recent thread on transient climate sensitivity, which does not require GCMs, was one example, but there are many.

        I also sense an inherent bias in your implication that mainstream climate science claims catastrophic climate change to be inevitable. To me, that signifies a fairly deep ignorance of what mainstream climate science actually reports in its literature, because it’s based on a caricature with no merit. I don’t think you can judge a scientific discipline without having a good sense of its content. That requires an assiduous attention to the weekly and monthly literature in a multitude of journals. If you are willing to engage in that, I think it will provide a sounder basis for conclusions you draw.

      • Richard Saumarez

        No it isn’t bias. Having seen the behaviour of some of the leading lights in the field, I simply wondered why they felt the need to behave in the manner that they did, if their science was as solid as they claimed.

        To take an example, you will doubtless remember the furore over Spencer’s paper, the reaction of Trenberth, and the “repudiation” by Dessler. I published a relatively simple criticism of the analysis of short term “feedbacks”, which in my view is completely wrong. Many people with the appropriate background agree with the basic idea of that criticism – wrong “mathematical” model and slipshod analysis. My question is: since I, with a PhD level science/engineering background, can mount a criticism of the study, why are Trenberth et al. incapable of doing so? It would carry far more weight than the “climate response team”, who simply engaged in ad hominem attacks.

        In the Sunday Times (UK) last Sunday, the chief scientific advisor to the Department of Food and Agriculture published a letter stating that the best estimate of climate science was that there would be a temperature rise of between 2 and 8.9 degrees C this century. Given the minimal temperature rise in the last ten years, this corresponds to 0.22 to 1.0 degrees per decade. This is from a senior scientific advisor to the UK government. Are we simply expected to believe this? Are you saying that there are no predictions from the mainstream scientific community that do not predict severe, unprecedented global warming?

        I do not claim to have a profound knowledge of climate science. My field is biomedical engineering and medicine. If I were asked for a critique of prophylactic ICD implantation in young patients, or of the implications of cardiac electrophysiology in testing potential pro-arrhythmia (conventional EP is useless), I could give a highly educated, nuanced and backstopped argument. You will notice that my posts have not commented on the dynamics of ENSO, the effects of agriculture on local precipitation, the implications of cosmic rays on clouds and temperature, etc., because I do not have the specialised knowledge to do so. I have confined my comments to areas, in which I have specialist knowledge, that have some impact on climate science.

        One thing that I should make clear is that I have a very high regard for much of the work being done in climate science. I regard the satellite remote sensing programs as being of the highest scientific and technical quality and, having spent some time studying them, I am lost in admiration. I can think of many other examples of superb science performed in climatology that is based on proper measurements. I have considerable issues with some of the theory that has been erected on the basis of sound work, and I see no reason why it should not be criticised.

        I have an open mind on the question of CO2 and global warming. The question of climate sensitivity is critical and I do not believe the results are particularly certain. Given the difficulties in performing critical experiments, I expect the estimates will remain uncertain.

        I am quite capable of reading scientific papers and criticising elements in them that I am competent to analyse. I would agree that one’s comprehension is limited by lack of experience in the field. However, expertise from other fields is often very useful. Are you suggesting that, given the “Hockey Stick” fiasco, the actions of McIntyre were wrong?

      • Richard – With all due respect, the fact that you continue to attribute to climate science predictions of inevitable catastrophe, that you confuse Trenberth’s criticism of Spencer/Braswell with Dessler’s criticism, and that you are unaware of the multiple lines of evidence supporting the current estimated range of climate sensitivity implies that you are biased, because it shows that you draw conclusions in a particular (unfavorable) direction in the absence of adequate knowledge. This has nothing to do with what you are “capable” of, but a lot to do with what you haven’t yet done the work to learn. Since I have already concluded that you are quite “capable”, I’m willing to predict that when you learn more about what is actually being done in this scientific discipline, your views will change substantially in a more favorable direction. I could be wrong, but why don’t you try it so we can find out?

      • Richard Saumarez

        I’m sorry but I think your arguments do not stand up. I suggest you make a rational argument. Appeals to authority are not acceptable in the broader scientific community.

  36. Nebuchadnezzar

    Thanks PE.

    I get the general idea – helicopters on TV, strobing monitors on TV, moire patterns, the life cycles of cicadas etc – but what I’m missing is the connection to climate science. In what circumstances is this actually a problem?

  37. Richard,
    When you generated your synthetic data set, at step 2 you added “…a random, amplitude modulation of 2.0 oC to simulate variability of peak summer and minimum winter temperatures.” Did you filter this so that its spectrum matched the actual spectrum of daily temperature variations? Or did you assume white noise?

    The real world has built in low pass filtering – due to the finite rate of conduction, convection, and latent heat transfer, and the non-zero heat capacity of air, dirt, oceans, ice, et cetera. This leads to the maximum daily temperature when the change is being sun driven (i.e., when there aren’t energy drivers of rain, variable cloud cover, or rapid cold/warm front changes) lagging maximum insolation. Have you checked the effect of these physical low pass filters on the potential problems of aliasing?

    Likewise, real heat waves and cold snaps usually don’t suddenly start on one day, but build up and decay over variable time frames – and the maximum rates are lower over the oceans than over land masses. Does your synthetic data take this into account?

    • Richard Saumarez

      Yes, the modulation accounted for the variability in the UK Met Office data over a 20 year period.

      The whole record was low pass filtered to avoid aliasing so hot and cold snaps rose and declined over a period of several days. It is basically a southern UK model. Other regions, particularly in the US show much greater variability, with temperature jumps of 10oC in one day.

  38. Late to the game, but this is an interesting post that dovetails well with some thoughts I had recently regarding the appropriate timeframe for developing climatologically important data series. Our current calendar year is not linked directly to what I would consider climatological dates, i.e., the equinoxes and solstices. Further, from a climatological standpoint, months are essentially meaningless. Certainly, it is much easier for calculations and presentations to use months and/or calendar years for averaging purposes. However, from a climatology standpoint, it seems the “climatological year” should start on one of the equinoxes and, for shorter term averaging, use a timeframe linked to the change in the overhead position of the sun, such as the time for the Sun’s overhead position to change by 1 or 2 degrees of latitude. Whether using such a method would have much, if any, effect on the overall estimated trend, I do not know, but it might make some of the internal quasi-cycles easier to detect and interpret, as well as address some of the aliasing concerns expressed above.

    This post also reminded me of a little analysis I conducted a couple of years back, trying to look at the impact of using (min+max)/2 versus hourly data on the value of the daily mean. I looked at hourly air temperature data from the USGS gaging station in Lambertville, NJ (no, this is not a USHCN site, but it was convenient for me to get ahold of). Over a five month period, I downloaded the hourly data and calculated the daily means both from the daily mins and maxes and using all 24 hours’ worth of data. What I found, for this particular location, was that there were potentially significant differences in the min-max average vs. the 24 hour average. The daily difference ranged from about -0.8C to +2.5C. Using these values to calculate monthly means, the min-max values ranged from 0.4C to 1.1C higher than the 24 hour values. Clearly, to me at least, it seems that using the min-max average may be a source of aliasing bias.
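    A toy numerical check of the (min+max)/2 point – a Python sketch with a synthetic, deliberately skewed diurnal cycle (illustrative only, not the Lambertville record): once the daily profile contains a harmonic, the mid-range value and the true 24-hour mean separate.

```python
import numpy as np

# One synthetic day of hourly temperatures: a fundamental diurnal cycle
# plus a 12-hour harmonic, which skews the profile (values are invented).
hours = np.arange(24)
temps = (10.0
         + 8.0 * np.sin(2 * np.pi * (hours - 6) / 24)    # diurnal cycle
         + 2.0 * np.sin(2 * np.pi * (hours - 3) / 12))   # 2nd harmonic

minmax_mean = (temps.min() + temps.max()) / 2  # traditional (Tmin+Tmax)/2
hourly_mean = temps.mean()                     # average of all 24 readings

# Both sinusoids average to zero over the day, so the true mean is 10.0,
# yet the mid-range estimate lands at 8.0 – a 2 degree bias.
print(minmax_mean, hourly_mean)
```

    The size of the bias depends entirely on the shape of the diurnal profile, which is why it would vary from station to station and month to month.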

  39. The beauty of straight, uniformly weighted averaging with full length decimation (“boxcar” filtering), however, is that it does not alias any components to dc – only zeros of the transfer function alias to dc. This is particularly important when the signal in question is numerically integrated, as it leads to small likelihood of a spurious long term trend. If you filter with a different weighting function, you would be well advised not to perform full length decimation.

    For a space of frequencies around dc, attenuation is severe enough that the region is substantially unaffected by aliasing generally. A rule of thumb is that the information is probably good about a decade below the Nyquist frequency. Above that, things start to get less and less reliable.

    Of course, a monthly average composed of daily samples can be tainted by even higher frequency aliasing for signal components above 0.5 days^-1. But, if the inputs in this range are highly variable, it is often reasonable just to consider the higher frequency stuff to be white noise. And, aliased white noise is… white noise.
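    That “zeros alias to dc” property is easy to verify numerically – a Python sketch, assuming a 30-day “month” of daily samples:

```python
import numpy as np

N = 30  # length of the uniform ("boxcar") average, in daily samples

# Transfer function of the N-point uniform average at frequency f
# (in cycles/day): H(f) = sin(pi*f*N) / (N*sin(pi*f)).
def boxcar_response(f):
    return np.sin(np.pi * f * N) / (N * np.sin(np.pi * f))

# Decimating the averaged series by N folds the frequencies k/N
# (k = 1, 2, ...) onto dc -- and those are exactly the zeros of H:
alias_freqs = np.arange(1, N // 2 + 1) / N
worst = np.max(np.abs(boxcar_response(alias_freqs)))

# Meanwhile a slow component well below Nyquist passes almost untouched:
low_resp = boxcar_response(0.001)  # a ~3-year cycle

print(worst, low_resp)  # worst is ~0; low_resp is ~1
```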

    • It follows that it is also generally OK to drive a simulation with boxcar filtered data if the bandwidth of the system being driven is at least an order of magnitude lower than the Nyquist frequency. When the higher frequency aliased portion is heavily attenuated, it does not significantly affect the outcome.

      You have to confirm these conditions for the system you are simulating, but, they are usually not difficult to satisfy.

      • “…they are usually not difficult to satisfy.”

        Well, usually not because the sampling regime has been specifically chosen so that it is not. But, for climate models in which the time constants are on the order of years, monthly averaged data should be OK.

    • Richard Saumarez

      A “boxcar” filter doesn’t alias a correctly sampled signal. It is decimation of the signal that causes aliasing. One cannot simply assume that the effects are negligible at low frequencies. The DC component is simply the mean of the signal over the period of the record. We are interested in signal dynamics.
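      A minimal Python demonstration of that point (frequencies invented so that the fold lands somewhere conspicuous):

```python
import numpy as np

n = np.arange(1800)                  # ~5 years of daily samples
f_true = 0.135                       # cycles/day: a ~7.4-day oscillation
x = np.sin(2 * np.pi * f_true * n)   # correctly sampled at 1 sample/day

# Naive "monthly" decimation: keep every 30th value, with no low-pass
# filtering first.  The new Nyquist frequency is 1/60 cycles/day.
y = x[::30]
k = np.arange(y.size)

# 0.135 c/d cannot survive: it folds to |0.135 - 4/30| ~ 0.00167 c/d,
# i.e. a spurious ~600-day "cycle" indistinguishable from a real one.
f_alias = abs(f_true - 4 / 30)
y_alias = np.sin(2 * np.pi * f_alias * (30 * k))
print(np.allclose(y, y_alias))       # the two series are identical
```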

  40. Richard Saumarez I have a 2 MB Excel Spreadsheet with the high and low temperatures for MPLS/St. Paul from 1820 (Fort Snelling Data).

    If you Email me at 3spot (at) I’ll send it to you. So that at least a human has to think here, the 3 is SPELLED OUT and no space before the ‘spot’.

    There is an overall shift upwards of about 1.5 degrees F from WWII. However, we can be assured of two things – One, the data gathering was AUTOMATED and much more likely to catch the real HIGH and/or real LOW during this time than before WWII.

    Two: The Urban Heat Island effect probably accounts for most of this change. (Comparison with the Hutchinson, MN station shows the 1 to 2 degree F forward bias quite clearly…and 70% of the time the wind is blowing west to east in this region, so Hutchinson is a good proxy to use for comparison.)

    I have your sort of EE background (BS ChemE, BS Metallurgy, MS Mech E; because I had to move to a distant state, the 5 other courses in EE to finish my EE degree were never completed. However, I took the PE in EE and passed first time. Nyquist criteria, Fourier, Laplace transforms, Z transforms, DSP, aliasing of signals, noise, statistical analysis of data, signal-to-noise ratios, etc. – ALL part of the background.)

    I’d LOVE to have you look at the MSP data and, by doing some “signal analysis” and some processing to estimate the errors caused by the type of sampling and the UHI, come up with a “probability” that there is or is not a “significant change”.

    Frankly, I think it is nil.



  41. “Than before WWII”…too fast typing. Sorry.

  42. Richard Saumarez I have a 2 MB Excel Spreadsheet with the high and low temperatures for MPLS/St. Paul from 1820 (Fort Snelling Data).

    This is what it looks like:

  43. Frequency aliasing of discrete-time sampled continuous signals is well-known analytically. It can be shown that for any frequency f in the baseband range 0 to Nyquist (r/2, where r is the sampling rate) there is an infinite set of frequencies above Nyquist, specified by nr – f (n = 1,2,3…), that can produce the identical time-series. Thus if there are such components in the continuous signal, they appear in the data series under the entirely predictable alias of f. In effect, the entire signal spectrum is folded in pleat-like fashion into the baseband range.

    The pernicious aspect of aliasing introduced during data acquisition is that it cannot be suppressed by digital filtering without suppressing the corresponding true frequencies in the baseband range. It’s necessary to minimize aliasing at the initial stage by choosing a high-enough sampling rate. This is seldom the case in temperature records, which typically have a strong diurnal component along with harmonics. With historical records, only rarely are daily averages based on more than 4 equi-spaced readings per day, which aliases all the harmonics above order 2. (Max and Min readings, of course, are not obtained at equi-spaced intervals and only provide the mid-range value, although some services resort to empirical formulae to estimate the average therefrom.) The fourth harmonic thus aliases into zero frequency, biasing the daily average. Experience with hourly readings from sensors that average over a minute shows that the fourth harmonic is by no means always negligible. That is the crucial aliasing that adulterates the lowest frequencies in climate records.
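    The fourth-harmonic case can be shown in a few lines of Python (an illustrative pure harmonic with an arbitrary phase, not real station data):

```python
import numpy as np

# One day read at 4 equi-spaced times (00:00, 06:00, 12:00, 18:00):
t4 = np.arange(4) / 4.0              # time in days

# A fourth-harmonic component: 4 cycles/day, arbitrary phase.
phase = 1.0                          # radians, chosen arbitrarily
fourth = np.cos(2 * np.pi * 4 * t4 + phase)

# At 4 samples/day the Nyquist frequency is 2 cycles/day, so this
# component aliases to zero frequency: every reading sees the same value,
print(fourth)                        # four identical values, cos(phase)

# and instead of averaging out over the day, it biases the daily mean:
daily_avg = fourth.mean()            # equals cos(1.0), not 0
print(daily_avg)
```

    The true average of this component over a full day is zero; the sampled “average” is whatever value the phase happens to pin it at.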

    Decimation of daily averages into monthly ones can introduce further aliasing, not only from the range between Nyquist and r = 1/month, but from farther sidelobes of the averaging filter as well. Mercifully, there is no aliasing into zero frequency, because all of its aliases correspond to zero amplitude response of simple averaging. Moreover, there is little spectral content between 1 and 2 months. To be sure, HADCRUT3, along with most other global temperature indices, is heavily afflicted by various problems of data coverage and integrity. Aliasing from decimation of daily averages into monthly ones is a relatively minor one by comparison. That is not to say that the decimation should not be improved, however.

    • P.E. | October 19, 2011 at 2:46 pm | linked to this video. Many here will accept that acceleration & deceleration alter perception in that context. Earth samples temporally nonstationary solar cycles quasi-discretely via summers of opposite poles & hemispheres. Are participants willing to recognize aliasing in the latter context [ ]? The spatial aggregation kernel in that context is asymmetric (distribution of continents) & nonlinear [T(K)^4, ocean-continent contrast, thermal wind] and thus subject to differential leverage (spatiotemporal version of Simpson’s Paradox). Rough sketch volunteered here: . Acceleration & deceleration of the solar drive-wheel is modulating the fractal dimension of northern hemisphere westerly flow (& thus diurnal venting). This multidecadal variation rides on top of global climate changes manifested as latitudinal jet shifts that I speculate relate to the integral of solar activity (one clue being 30S-90S SST, which falls in the band where fractal dimension is much closer to 1 due to the circumpolar Southern Ocean, which is relatively free of deflecting obstacles & zonally differentially-leveraging physical contrasts). Please take some time to think about this carefully.

  44. Richard Saumarez: A “boxcar” filter doesn’t alias a correctly sampled signal. It is decimation of the signal that causes aliasing.

    Monthly averages do two things, as shown: first a running mean, or “boxcar”, filter, then monthly decimation. The aliasing problem arises because a 2-month filter is required before decimation, not a 1-month filter.

    The running mean adds further distortions since not only does it let through some higher frequencies, but it also _inverts_ them:
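    The inversion can be read straight off the running mean’s transfer function, which is real-valued (the window is symmetric) and changes sign between its zeros. A quick Python check, assuming a 31-day window:

```python
import numpy as np

N = 31  # running-mean length in days

# Transfer function of an N-day running mean at frequency f (cycles/day).
# It is real-valued, so a negative value means a 180-degree phase flip.
def H(f):
    return np.sin(np.pi * f * N) / (N * np.sin(np.pi * f))

main_lobe = H(0.5 / N)   # inside the main lobe: passed with correct sign
sidelobe = H(1.5 / N)    # first sidelobe (between zeros at 1/N and 2/N)

print(main_lobe, sidelobe)  # sidelobe is negative: attenuated AND inverted
```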

    Then add the fact that it’s not even the same “average” which is being done with monthly periods being variously 28,29,30 or 31 days long, sprinkled around the year in an arbitrary fashion.

    It is telling that this sort of simplistic and basic data processing error is still the norm in climate “science” after three decades of intense effort and massive funding.

    (P.S. References in this article to HadCrut temperatures presumably refer rather to the CRUTem3 land temp timeseries. HadCrut is a composite of CRU land and Hadley SST datasets. There are no “station” data in SST.)