by Judith Curry
Why are weather forecasters succeeding when other predictors fail? It’s because long ago they came to accept the imperfections in their knowledge. – Nate Silver
Nate Silver has a forthcoming book entitled The Signal and the Noise: Why So Many Predictions Fail But Some Don’t. The book description from Amazon.com:
Nate Silver built an innovative system for predicting baseball performance, predicted the 2008 election within a hair’s breadth, and became a national sensation as a blogger—all by the time he was thirty. The New York Times now publishes FiveThirtyEight.com, where Silver is one of the nation’s most influential political forecasters.
Drawing on his own groundbreaking work, Silver examines the world of prediction, investigating how we can distinguish a true signal from a universe of noisy data. Most predictions fail, often at great cost to society, because most of us have a poor understanding of probability and uncertainty. Both experts and laypeople mistake more confident predictions for more accurate ones. But overconfidence is often the reason for failure. If our appreciation of uncertainty improves, our predictions can get better too. This is the “prediction paradox”: The more humility we have about our ability to make predictions, the more successful we can be in planning for the future.
The book is slated to be published Sept 28; why am I discussing it now?
The Sunday New York Times Magazine has an article written by Nate Silver, entitled The weatherman is not a moron, which draws on material from his forthcoming book. The article is superb; I strongly recommend reading it in full. It provides a very informative perspective on weather forecasting. I excerpt here some statements related to uncertainty:
But if prediction is the truest way to put our information to the test, we have not scored well. In November 2007, economists in the Survey of Professional Forecasters – examining some 45,000 economic-data series – foresaw less than a 1-in-500 chance of an economic meltdown as severe as the one that would begin one month later. Attempts to predict earthquakes have continued to envisage disasters that never happened and failed to prepare us for those, like the 2011 disaster in Japan, that did.
The one area in which our predictions are making extraordinary progress, however, is perhaps the most unlikely field. Jim Hoke, a director with 32 years’ experience at the National Weather Service, has heard all the jokes about weather forecasting, like Larry David’s jab on “Curb Your Enthusiasm” that weathermen merely forecast rain to keep everyone else off the golf course. And to be sure, these slick-haired and/or short-skirted local weather forecasters are sometimes wrong. A study of TV meteorologists in Kansas City found that when they said there was a 100 percent chance of rain, it failed to rain at all one-third of the time.
But watching the local news is not the best way to assess the growing accuracy of forecasting. It’s better to take the long view. In 1972, the service’s high-temperature forecast missed by an average of six degrees when made three days in advance. Now it’s down to three degrees.
Perhaps the most impressive gains have been in hurricane forecasting. Just 25 years ago, when the National Hurricane Center tried to predict where a hurricane would hit three days in advance of landfall, it missed by an average of 350 miles. Now the average miss is only about 100 miles.
Why are weather forecasters succeeding when other predictors fail? It’s because long ago they came to accept the imperfections in their knowledge.
Our views about predictability are inherently flawed.
Chaos theory does not imply that the behavior of the system is literally random. It just means that certain types of systems are very hard to predict. Perhaps because chaos theory has been a part of meteorological thinking for nearly four decades, professional weather forecasters have become comfortable treating uncertainty the way a stock trader or poker player might. When weather.gov says that there’s a 20 percent chance of rain in Central Park, it’s because the National Weather Service recognizes that our capacity to measure and predict the weather is accurate only up to a point.
In a time when forecasters of all types make overconfident proclamations about political, economic or natural events, uncertainty is a tough sell. It’s much easier to hawk overconfidence, no matter if it’s any good. A long-term study of political forecasts conducted by Philip Tetlock, a professor at the University of Pennsylvania, found that when political experts described an event as being absolutely certain, it failed to transpire an astonishing 25 percent of the time.
The Weather Service has struggled over the years with how much to let the public in on what it doesn’t exactly know. In April 1997, Grand Forks, N.D., was threatened by the flooding Red River, which bisects the city. Snowfall had been especially heavy in the Great Plains that winter, and the service, anticipating runoff as the snow melted, predicted that the Red would crest to 49 feet, close to the record. Because the levees in Grand Forks were built to handle a flood of 52 feet, a small miss in the forecast could prove catastrophic. The margin of error on the Weather Service’s forecast – based on how well its flood forecasts had done in the past – implied about a 35 percent chance of the levees’ being topped.
The waters, in fact, crested to 54 feet. It was well within the forecast’s margin of error, but enough to overcome the levees and spill more than two miles into the city. Cleanup costs ran into the billions of dollars, and more than 75 percent of the city’s homes were damaged or destroyed. Unlike a hurricane or an earthquake, the Grand Forks flood may have been preventable. The city’s flood walls could have been reinforced using sandbags. It might also have been possible to divert the overflow into depopulated areas. But the Weather Service had explicitly avoided communicating the uncertainty in its forecast to the public, emphasizing only the 49-foot prediction. The forecasters later told researchers that they were afraid the public might lose confidence in the forecast if they had conveyed any uncertainty.
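The Weather Service’s 35 percent figure illustrates how a point forecast plus a historical error distribution translates into an exceedance probability. A minimal sketch of that translation, assuming (purely for illustration) that past flood-forecast errors were roughly Gaussian with a spread of about 7.8 feet — the article does not state the actual error model:

```python
# Translate a point forecast plus the spread of historical forecast errors
# into the probability that the crest exceeds the levee height.
# The Gaussian error model and the 7.8-foot spread are assumptions for
# illustration; only the 49- and 52-foot figures come from the article.
from statistics import NormalDist

forecast_crest = 49.0   # feet, the Weather Service's point forecast
levee_height = 52.0     # feet, what the Grand Forks levees could handle
error_sd = 7.8          # feet, assumed spread of past flood-forecast errors

p_topped = 1 - NormalDist(mu=forecast_crest, sigma=error_sd).cdf(levee_height)
print(f"Chance the levees are topped: {p_topped:.0%}")  # → about 35%
```

The point of the exercise is that a 49-foot headline number and a roughly one-in-three chance of catastrophe are the same forecast, stated two different ways; only the second conveys the risk.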
Since then, the National Weather Service has come to recognize the importance of communicating the uncertainty in its forecasts as completely as possible. “Uncertainty is the fundamental component of weather prediction,” said Max Mayfield, an Air Force veteran who ran the National Hurricane Center when Katrina hit. “No forecast is complete without some description of that uncertainty.”
Unfortunately, this cautious message can be undercut by private-sector forecasters. Catering to the demands of viewers can mean intentionally running the risk of making forecasts less accurate. For years, when the Weather Channel said there was a 20 percent chance of rain, it actually rained only about 5 percent of the time. People don’t mind when a forecaster predicts rain and it turns out to be a nice day. But if it rains when it isn’t supposed to, they curse the weatherman for ruining their picnic. “If the forecast was objective, if it has zero bias in precipitation,” Bruce Rose, a former vice president for the Weather Channel, said, “we’d probably be in trouble.”
JC comments: The decades of weather forecasting experience provide a wealth of understanding for the process of increasing the accuracy of prediction. The day-to-day testing is invaluable for improving forecasts. Effective forecasts use both a model and expert judgment. And making a forecast is easy; characterizing the uncertainty and assessing confidence in the forecast is the hard part.
The article provides some interesting examples of how forecasts are framed in the context of perceived user biases and sensitivities. The Weather Channel’s rainfall forecast is an interesting example, where probabilities are deliberately biased high. If you carry an umbrella and it doesn’t rain, no big deal. But if it rains and you get caught without an umbrella, that hurts. So the rainfall forecasts have a precautionary bias, based upon user sensitivity. The Red River flood is an interesting (and painful) example. The Weather Service made a terrific forecast, within 10% of the actual magnitude of this extreme event, and close to the disaster threshold. The disaster happened in spite of a good forecast; the levee managers should have been concerned about any forecast that came close to the disaster threshold. The forecasters were worried that the public would not pay sufficient attention to the forecast, and as a result were not explicit about the uncertainty.
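Both the Kansas City study and the Weather Channel’s wet bias are findings about forecast calibration: for each stated probability, how often did the event actually occur? A minimal sketch of that check, using hypothetical (forecast probability, rained?) pairs rather than any real verification data:

```python
# Reliability check: group forecasts by stated probability and compare
# with the observed frequency of rain. A well-calibrated forecaster's
# 20% forecasts verify about 20% of the time; a "wet bias" shows up as
# observed frequencies sitting well below the stated probabilities.
# The forecast/outcome pairs below are hypothetical.
from collections import defaultdict

forecasts = [(0.2, False), (0.2, False), (0.2, True), (0.2, False),
             (0.5, True), (0.5, False),
             (1.0, True), (1.0, False), (1.0, True)]

counts = defaultdict(lambda: [0, 0])  # stated prob -> [rain events, forecasts]
for prob, rained in forecasts:
    counts[prob][1] += 1
    if rained:
        counts[prob][0] += 1

for prob in sorted(counts):
    hits, total = counts[prob]
    print(f"forecast {prob:.0%}: rained {hits}/{total} = {hits/total:.0%}")
```

On this toy data the 100 percent forecasts verify only two times in three, which is exactly the kind of miscalibration the Kansas City study reported.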
Since climate models have their heritage in weather prediction models, all these lessons learned from weather forecasting should trickle up to those making climate projections for the 21st century, right? Well, in the U.S. anyway, the weather forecasting and climate modeling communities are completely separate (this is not the case in the UK). And until very recently, there have been no attempts to evaluate the climate models against decadal-scale variability (which is where the weather forecasters would have started, had they been assigned to work on this problem). So, climate modelers and the IPCC provide projections for the 21st century using models that are untested in prediction mode.
OK, people want to know what might happen with the 21st century climate, so climate models can provide some scenarios of what might happen. The problem arises in how these projections are communicated to decision makers. It seems that both the rainfall and the flood forecast examples are relevant here. The user sensitivity/cost is much higher if the climate changes than if it doesn’t, so the focus is on scenarios that will produce change (a precautionary bias). The challenge of getting people to pay attention to the projections has arguably been met by downplaying uncertainty in the projections, analogous to the Red River flood forecast.
So, 40 years from now, will climate projections fall on the ‘failed’ or ‘successful’ side of the ledger? Time will tell, but in 2050 there may be no simple way to evaluate this. Some elements of the projections may turn out to be correct, even if many aspects are judged incorrect by some criteria. The bigger societal issue is whether these projections motivated better (or worse) decisions, in hindsight. And what counts as ‘better’ depends on values and politics. ‘No regrets’ policies would tilt this in the direction of ‘successful’ even if the climate projections do not verify well in the coming decades.