I compared actual high temperatures with the predicted highs from three national weather websites to determine which was most accurate. In general, forecasts made more than 4 days in advance were considerably worse than those made less than 4 days in advance. weather.com appeared to have the greatest accuracy on both the longer forecasts (within 6 degrees for a 9-day forecast) and the shorter forecasts (within ~3 degrees for a 1-day forecast), while The Weather Underground was least accurate in the short term (off by ~6 degrees just 1 day in advance). Accuracy averaged across forecast length and website was greatest for cities in a continental climate (all around 3 degrees), and worst in Anchorage (almost 10 degrees off on average) and San Francisco (off by ~8 degrees).
For some time I've been wondering just how accurate weather forecasts are. I'm especially curious about these extended forecasts by places like weather.com or Accuweather, which purport to predict the weather 10 and 15 days in advance. Really? The local news guys can hardly get it right 1 day in advance, and these online forecasters are shooting out 10-15 days? I'm skeptical.
How far in advance are forecasts accurate?
Sub 1: Do different websites have different accuracies?
Sub 2: Do different locations have different accuracies?
For this first swipe at answering those questions, I tackled the easier number to collect: High Temp. The three sites I looked at were weather.com, Accuweather, and the Weather Underground. I picked 8 cities completely non-randomly. 3 of the cities were ones I frequently visit and therefore specifically address Sub 2 (Kansas City, KS; Wichita, KS; Pittsburg, KS). Since KS is a continental climate I also picked some cities from coastal climates (Seattle, San Francisco, Anchorage). I threw in Atlanta randomly, and South Bend because I used to live there (and the local forecasters were awful).
I then recorded the predicted high temp from 9 days in advance until April 26th. I screwed up on Day 5 and didn't get a prediction, and I was moving on April 26th itself, so I didn't get a 'same-day' forecast. I also didn't realize that Weather Underground doesn't predict highs for Anchorage.
Weather Underground has a 5-day prediction for high temp; weather.com has a 10-day; and Accuweather has a 15-day (!) prediction. Holy cow. However, these are all actually 1 day less, because they include the same-day forecast, which to me makes these 4-day, 9-day, and 14-day predictions (that's how they are referred to in the following data).
Let's take a look at some of the raw data. This is the first time I've used Google Docs to present data, so I hope it looks ok.
Here we see the raw data for weather.com. I can't figure out how to change the y-axis (if anyone knows, leave a comment), but this doesn't actually look like a lot of change over time. The 9-day forecast is only marginally different from the 1-day forecast for several cities (notably the KS ones). This figure doesn't really address the questions easily, though, so let's take a look at the average accuracy of all the websites:
I don't have enough data yet to draw any definitive conclusions (that will take doing this multiple times), but for April 26th, weather.com seemed to be a lot more accurate, both early on and right in advance. I missed the 5-day forecast, but it looks like any forecast over 4 days in advance is pretty sketchy, while those under 4 days in advance are pretty good from Accuweather and weather.com. Weather Underground is up in the air.
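For anyone who wants to redo the comparison, here is a minimal sketch of the averaging step. It assumes "accuracy" means the absolute difference between the predicted and actual high, averaged over cities for each website and forecast length. The record layout, the site and city names, and every number in it are placeholders I made up for illustration, not the data collected here.

```python
from collections import defaultdict

# Hypothetical records: (website, city, days_ahead, predicted_high_f, actual_high_f).
# All values are placeholders for illustration, not the measurements from this post.
records = [
    ("site_a", "City X", 1, 70, 72),
    ("site_a", "City X", 9, 65, 72),
    ("site_b", "City X", 1, 75, 72),
    ("site_b", "City X", 4, 68, 72),
    ("site_a", "City Y", 1, 58, 55),
]

def mean_abs_error_by(records, key):
    """Average |predicted - actual| over records grouped by key(record)."""
    errors = defaultdict(list)
    for rec in records:
        _website, _city, _days_ahead, predicted, actual = rec
        errors[key(rec)].append(abs(predicted - actual))
    return {k: sum(v) / len(v) for k, v in errors.items()}

# Group by (website, days_ahead) so each site gets one number per forecast length.
by_site_and_lead = mean_abs_error_by(records, key=lambda r: (r[0], r[2]))
for (website, days_ahead), err in sorted(by_site_and_lead.items()):
    print(f"{website} {days_ahead}-day: off by {err:.1f} degrees")
```

One choice this leaves open is which "actual" value to score against: each site's own reported high, or a single common reading for the city. The sketch simply takes whatever is stored in the last field of each record.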
There are some differences in the reported 'actual' values for these different sites as well (presumably they are using different locations to measure the temp). For the most part those are minor differences, but for some cities Weather Underground had much higher reported actuals than the other two sites. Knowing why that might be could account for some of the poor predictive ability of Weather Underground.
Ok, what about the different cities? Where are we most likely to get accurate weather forecasts? The following values are averaged across date and website.
As you can see, predictions for the Kansas cities are a lot more accurate than anywhere else. I'm not sure if this is because of this particular date, or if this is a general trend. I would not have predicted this, because I was under the impression that continental climates are extremely variable. However, as I think back, the only really weird weather we've had in the past week occurred on the afternoon of April 26th (!), when we had a 20 degree (F) temperature drop within a 15 minute span (I was standing outside when this happened. Amazing!). So maybe the forecasters for KS got an easy stretch.
On the other hand, San Francisco and Anchorage are apparently hard to predict right now.
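The per-city averages have to cope with holes in the data (a missed day, or a website that doesn't publish highs for a city). A rough sketch of one way to handle that is to skip missing predictions rather than count them as zero. The nested-dictionary layout, the names, and all the numbers below are hypothetical.

```python
from statistics import mean

# Hypothetical layout: forecasts[city][website][days_ahead] = predicted high (F),
# with None wherever no prediction was recorded. All values are made up.
forecasts = {
    "City X": {
        "site_a": {1: 71, 5: None, 9: 66},
        "site_b": {1: 74, 4: 69},
    },
    "City Y": {
        "site_a": {1: 50, 9: 62},
        "site_b": {},  # this site publishes no highs for this city
    },
}
# Hypothetical observed highs (F) on the target date.
actual_high = {"City X": 72, "City Y": 55}

def city_mean_abs_error(forecasts, actual_high):
    """Mean absolute error per city over every (website, lead time) with data."""
    result = {}
    for city, by_site in forecasts.items():
        errors = [
            abs(predicted - actual_high[city])
            for by_lead in by_site.values()
            for predicted in by_lead.values()
            if predicted is not None  # skip missed recordings
        ]
        if errors:  # leave out cities with no usable forecasts at all
            result[city] = mean(errors)
    return result

city_errors = city_mean_abs_error(forecasts, actual_high)
for city, err in sorted(city_errors.items(), key=lambda kv: kv[1]):
    print(f"{city}: off by {err:.1f} degrees on average")
```

Skipping the gaps keeps the arithmetic honest, but a city covered by fewer websites is then averaged over fewer forecasts, so its number isn't strictly comparable to the others.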
Well, collecting this data and subsequently thinking about it have revealed a couple of insights, but mostly just a lot more questions. First of all, this isn't enough data to actually evaluate anything. I need to repeat this several times to figure out if the differences we saw here are real. Essentially, this is just a single data point (which is why I left off estimates of variation), and, at least for the KS cities, I have a hard time believing that all three aren't just being affected by the same weather system (they make a 170-mile-a-side triangle).
Secondly, I'm not sure daily high temp is the interesting thing to measure. Sure, that's the number that gets thrown around on the websites in big, bold type, but probably more relevant to planning activities is rainfall or sunshine. All of these sites do appear to make those predictions too, so I'll have to collect data on that next time.
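Once the exercise has been repeated for several target dates, an estimate of variation becomes possible. As a sketch, assuming one absolute error per target date for a given website and forecast length, the mean and sample standard deviation could be computed like this. The date labels and error values are invented purely to show the calculation.

```python
from statistics import mean, stdev

# Hypothetical 1-day-ahead absolute errors (degrees F) for one website,
# one entry per target date. Made-up numbers, only to show the calculation.
errors_by_target_date = {
    "date_1": 3.0,
    "date_2": 5.0,
    "date_3": 2.0,
    "date_4": 6.0,
}

def summarize(errors):
    """Return (mean, sample standard deviation, n); the spread needs n >= 2."""
    values = list(errors.values())
    spread = stdev(values) if len(values) >= 2 else None
    return mean(values), spread, len(values)

avg, spread, n = summarize(errors_by_target_date)
print(f"mean error {avg:.1f} F, sd {spread:.1f} F over {n} target dates")
```

With a single target date the function returns None for the spread, which is exactly the situation this first round of data is in.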
Finally, I really have no idea where these data are generated. Are these just aggregates of local forecasts or are they independently generated (or both)? There were variations in the accuracy of predictions, but overall the predictions tended to be fairly similar (variation in the recorded actual high temps contributed to overall variation in accuracy as well). Before drawing any definitive conclusions, I should probably incorporate that information into my evaluation.
If anyone has any thoughts or suggestions, I'd love to have them.