Get Better Election Predictions by Combining Diverse Forecasts



Imagine you are among the legions of pundits and political commentators striving to predict the outcome of November’s presidential election. You’re not just interested in who will win – most citizens can predict that, it turns out. You want to forecast the candidates’ exact vote shares.

You would likely seek insight from the results of political opinion polling. Perhaps you would also look at forecasts from other methods such as betting markets, statistical models or judgments of fellow pundits. But there is a problem: Different methods can offer different forecasts, and it’s hard to determine which one will turn out to be most accurate.

It’s not a good idea to simply rely on methods that have proved reliable in the past. My research has found that forecasting models that were among the most accurate in one election tended to be among the least accurate in the next. But my research has also identified a way to make predictions much better.

Every election is different

One reason that past results aren’t good evaluations of prediction methods is that every election is held in a different context and has its own idiosyncrasies – such as the first woman major-party nominee running against the first reality-TV star nominee. These anomalies are particularly challenging for statistical models, which try to base their predictions on patterns in electoral history.

Another reason is that the conditions under which certain methods are expected to work can change over time. For example, response rates in traditional phone surveys have dropped below 10 percent in recent years. This makes it harder to believe that respondents form a random and representative sample of the population, and raises concerns about whether traditional polls can still meaningfully measure public opinion.

So how should you make your best prediction? Half a century of forecasting research offers clear advice: Combine forecasts from different methods that rely on different information. The combined forecast is usually more accurate than any single prediction, and often more accurate than even the most on-target individual forecast. Combining distinct predictions also reduces the risk of making large errors.

When forecasters can use more information in an objective way, their predictions get better. In individual forecasts, there is always some amount of bias that creeps in, because of what data were used or excluded, and the methods used to analyze them. But when various methods using different data are combined to make a forecast, those biases tend to cancel each other out.
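A quick simulation illustrates why those biases tend to cancel. The three "methods" below, their biases and their noise levels are invented purely for illustration; the point is only that averaging methods whose errors lean in different directions yields a smaller typical error than relying on any one of them:

```python
import random

random.seed(0)
TRUE_SHARE = 52.0  # the (unknown) true vote share, in percent

# Three hypothetical methods, each with its own systematic bias and noise.
def polls():    return TRUE_SHARE - 1.0 + random.gauss(0, 1.0)  # understates
def markets():  return TRUE_SHARE + 0.8 + random.gauss(0, 0.8)  # overstates
def experts():  return TRUE_SHARE + 0.3 + random.gauss(0, 1.2)  # overstates slightly

def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)

single_errors, combined_errors = [], []
for _ in range(10_000):
    p, m, e = polls(), markets(), experts()
    single_errors.append(p - TRUE_SHARE)                 # relying on polls alone
    combined_errors.append((p + m + e) / 3 - TRUE_SHARE)  # simple average

print(f"polls alone, mean absolute error: {mae(single_errors):.2f}")
print(f"combined,    mean absolute error: {mae(combined_errors):.2f}")
```

In this toy setup the opposite-signed biases largely offset each other, so the combined forecast's typical error comes out well below that of any single method.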

Picking what to include

You, the pundit, now know what to do. But many questions remain. For instance, where can you find different forecasts? And which ones should you trust and include in your combination? And how should you weight the different forecasts?

The good news is that you do not have to make these decisions yourself. In 2004, we developed the PollyVote, an evidence-based formula designed to forecast election outcomes by combining multiple predictions. In particular, the PollyVote system combines results from six different forecasting methods that use various kinds of information: polls, betting markets, expert judgment, citizen forecasts, index models and econometric models.

When combining these different results into the PollyVote forecast, we use a two-step procedure. First, we calculate a combined forecast for each of the six component methods. For example, the PollyVote currently averages results from eight different poll aggregators into one combined polling projection.

In the second step, we average all the combined component forecasts to calculate the final PollyVote prediction. This gives each component method equal weight, whether a component includes many forecasts, like polls and statistical models, or only a few, like the lone prediction market dealing with the national popular vote.
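The two-step procedure can be sketched in a few lines of Python. The numbers and the sets of forecasts per method below are hypothetical placeholders, not actual PollyVote data; the sketch only shows the averaging logic:

```python
# Hypothetical forecasts of a candidate's vote share (%), grouped by
# component method. All values are illustrative only.
forecasts_by_method = {
    "polls":              [52.1, 51.8, 52.4],   # e.g., several poll aggregators
    "betting_markets":    [51.5],               # a single prediction market
    "expert_judgment":    [52.0, 51.6],
    "citizen_forecasts":  [52.3],
    "index_models":       [51.9, 52.2],
    "econometric_models": [51.4, 52.0, 51.7],
}

def mean(values):
    return sum(values) / len(values)

# Step 1: average the forecasts within each component method.
component_forecasts = {m: mean(v) for m, v in forecasts_by_method.items()}

# Step 2: average the component forecasts with equal weights, so a method
# with many forecasts counts no more than one with a single forecast.
pollyvote = mean(list(component_forecasts.values()))
print(round(pollyvote, 2))
```

Because step 2 averages one number per method, adding more poll aggregators would sharpen the polling component without letting polls dominate the final prediction.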

The use of equal weights in combining forecasts is supported by a large body of evidence, including my own research, which shows that the simple average often provides more accurate forecasts than complex approaches to estimating “optimal” combining procedures.

If we knew that a particular method was likely to be most accurate, we could give it more weight when calculating the combined forecast. But again, because the accuracy of each forecast changes over time, it is difficult to know which is best at any given moment. So the safest approach – not to mention the simplest and easiest to understand – is to weight them all equally.

Past performance

Since we launched the PollyVote, it has provided accurate forecasts over the last three presidential elections. Across three points in the election cycle (Election Day eve, one month before and three months before), its average forecast error is less than one percentage point. As far as we know, this record is unsurpassed by any other forecasting formula.

About the Author:

Research Fellow at the Tow Center for Digital Journalism (Columbia School of Journalism) and at LMU Munich, and Professor for CRM at Macromedia University, Munich, Germany.

This article was originally published on The Conversation.