With the weatherman, I could keep track of when it rains and when it doesn't. If it rains significantly more or less than 80% of the time on the days they say there's an 80% chance of rain, I can say that their model is bad.
I can't think of a similar way to evaluate Silver's election forecasting model. The probabilities very clearly aren't independent, and his model changes significantly from cycle to cycle. Was his model good in 2012, when every state went to the candidate it favored? Was it bad when it gave Hillary a 71.4% chance of winning?
They don't only predict the top-line presidential result; they predict every race, in every state, for president, House, and Senate. Non-independence is accounted for in the model, so they have no qualms about you judging them by the calibration of their predictions, i.e. you want roughly 60% of their “60%” forecasts to be right, and 40% to be wrong. If all of the races they predict 60/40 go to the more likely candidate, they themselves consider this a failure: https://fivethirtyeight.com/features/how-fivethirtyeights-20...
> I can't think of a similar way to evaluate Silver's election forecasting model
You bucket every prediction by its stated probability, look at the outcomes, and then confirm whether the favoured outcomes in the 70–80% bucket actually occurred 70% to 80% of the time.
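That bucketing check can be sketched in a few lines. This is a generic illustration of calibration, not FiveThirtyEight's actual methodology; the function name and the simulated forecaster are my own invention for the example.

```python
import random

def calibration_table(predictions, n_bins=10):
    """Bucket (forecast_probability, outcome) pairs into deciles and
    report how often the predicted event actually happened in each.

    predictions: list of (p, outcome) pairs, where p is the stated
    probability of the event and outcome is 1 if it occurred, else 0.
    """
    bins = [[] for _ in range(n_bins)]
    for p, outcome in predictions:
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append(outcome)
    table = []
    for i, outcomes in enumerate(bins):
        if outcomes:  # skip empty buckets
            lo, hi = i / n_bins, (i + 1) / n_bins
            table.append((lo, hi, len(outcomes), sum(outcomes) / len(outcomes)))
    return table

# Simulate a well-calibrated forecaster: each event occurs with exactly
# its stated probability. The empirical rate per bucket should land
# near the bucket's range.
random.seed(0)
sims = [(p, 1 if random.random() < p else 0)
        for _ in range(5000)
        for p in (0.25, 0.6, 0.75)]
for lo, hi, n, freq in calibration_table(sims):
    print(f"[{lo:.1f}, {hi:.1f}): n={n:5d}  empirical rate={freq:.2f}")
```

A forecaster whose 70–80% bucket comes in near 100% is miscalibrated in exactly the way the FiveThirtyEight post above describes: their favourites win too often relative to the stated odds.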