Cornell Professor Takes Nate Silver Analysis of U.S. Senate Midterms a Step Further using Monte Carlo Simulation

Superstar statistician Nate Silver has gained fame for his spot-on election predictions: he correctly predicted the winner in 49 of 50 states in the 2008 presidential election, and the winners of all 35 U.S. Senate races that same year. He followed that success with accurate predictions for all 50 states in the 2012 presidential election, and correctly projected 31 of 33 Senate races. He is currently editor-in-chief of FiveThirtyEight, a polling aggregation and data journalism website that makes predictions about political elections as well as sports.

Silver does not have the power of precognition, although to many it may seem that way. Instead, he uses refined statistical methods—including weighting various political polls—to make predictive models.

Recently, Silver and FiveThirtyEight put out a forecast for the 2014 Senate races that examines each race on a probabilistic basis. Silver’s analysis, which simply sums each side’s probabilities of winning its races, projects that the Democrats are slightly more likely to lose control of the chamber than to retain it.

Lawrence W. Robinson, Professor of Operations Management at Cornell University’s Johnson Graduate School of Management, has taken this prediction one step further by adding Monte Carlo simulation to the mix. Monte Carlo simulation is a computational method that considers ranges of possible values for multiple input variables in order to calculate many different possible outcomes. In the case of Senate races, a Monte Carlo simulation considers thousands of different win-loss combinations among all races to come up with a range of possible Senate compositions, and the probability that each will occur.

Robinson set out to determine, in his words, “the probability that the Democrats hold at least 50 seats in the new Senate.” Only 50 seats are needed because Joe Biden will, in his role as Vice President and therefore president of the Senate, break ties in the Democrats’ favor. “What we really want to know is, what chance will the Democrats have to retain control?”

To answer this question, Robinson used data from the FiveThirtyEight forecast, which gave probabilities of outcomes for every Senate match-up as follows:

Figure 1: Table excerpted from FiveThirtyEight

Running this data in @RISK, a Monte Carlo simulation tool from Ithaca, New York-based Palisade Corporation, Robinson created an initial model that assumes the races are independent of one another (or, in statistical terms, have 0% correlation). The resulting graph shows the probabilities (on the y-axis) of the Democrats holding different numbers of Senate seats. Summing the probabilities for every outcome of 50 or more seats (the right-most bars in the graph) shows that the Democrats have only a 41% chance of retaining control of the Senate.
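The independent-races calculation can be sketched in a few lines of Python. This is only an illustration of the general technique, not Robinson's @RISK model: the win probabilities, the number of safe seats, and the function names below are all hypothetical, since the FiveThirtyEight table is not reproduced here.

```python
import random

# Hypothetical Democratic win probabilities for the contested races
# (illustrative values only, NOT the FiveThirtyEight figures).
WIN_PROBS = [0.95, 0.80, 0.65, 0.55, 0.50, 0.45, 0.40, 0.30, 0.20, 0.10]
SAFE_SEATS = 45     # hypothetical count of seats that are not in play
SEATS_TO_HOLD = 50  # from the article: 50 seats suffice because of the tie-break


def simulate_independent(win_probs, safe_seats, trials=20_000, seed=0):
    """Return {seat_count: estimated probability}, treating races as independent."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        # Every race gets its own uniform draw, so outcomes are independent.
        seats = safe_seats + sum(rng.random() < p for p in win_probs)
        counts[seats] = counts.get(seats, 0) + 1
    return {s: c / trials for s, c in sorted(counts.items())}


def prob_at_least(distribution, threshold):
    """Sum the probabilities of every seat count at or above the threshold."""
    return sum(p for seats, p in distribution.items() if seats >= threshold)


dist = simulate_independent(WIN_PROBS, SAFE_SEATS)
p_hold = prob_at_least(dist, SEATS_TO_HOLD)
print(f"Estimated P(at least {SEATS_TO_HOLD} seats) = {p_hold:.3f}")
```

The dictionary `dist` plays the role of the graph described above (seat count against probability), and `prob_at_least` corresponds to adding up the bars at 50 seats and beyond.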

However, Silver has warned in previous articles that to assume races are uncorrelated is “dubious.” As Silver puts it, “In plain language: sometimes one party wins most or all of the competitive races. If we had conducted this exercise at this point in the 2006, 2008 or 2012 campaigns, that party would have been the Democrats. In 2010, it would have been the Republicans.”

Silver also assumes that Monte Carlo simulation requires variables to be uncorrelated, writing, “We’ve sometimes seen people take our race ratings and run Monte Carlo simulations based upon them, which assume that the outcome of each race is independent from the others.” In fact, it is entirely possible to include correlation in Monte Carlo analyses, which is how Professor Robinson next approached the problem.

Determining correlation coefficients (that is, the extent to which two values move together, expressed here as a percentage) between the races is very difficult. However, by examining a range of correlation values between the races, from zero correlation to total correlation, it is possible to calculate a more nuanced view of the true probability of the Democrats retaining control.

The “lower bound” value of the Democrats’ chances is the 41% determined with the zero correlation assumption. Robinson calculated the “upper bound” value by next assuming total, 100% correlation among the races. Notes Robinson, “This upper bound shows the Democrats have no more than a 50% chance of retaining control.”
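One way to picture the total-correlation case is to let a single random draw decide every race at once, so that a good night for one party is a good night everywhere. The sketch below takes that interpretation, which is an assumption and not necessarily how the @RISK model represents 100% correlation. The win probabilities and the safe-seat count are hypothetical, as are the function names.

```python
import random

# Hypothetical Democratic win probabilities (NOT the FiveThirtyEight figures).
WIN_PROBS = [0.95, 0.80, 0.65, 0.55, 0.50, 0.45, 0.40, 0.30, 0.20, 0.10]
SAFE_SEATS = 45     # hypothetical count of seats that are not in play
SEATS_TO_HOLD = 50  # from the article


def simulate_fully_correlated(win_probs, safe_seats, threshold,
                              trials=20_000, seed=0):
    """Estimate P(seats >= threshold) when one shared draw drives every race."""
    rng = random.Random(seed)
    holds = 0
    for _ in range(trials):
        u = rng.random()  # a single "national mood" draw for this trial
        seats = safe_seats + sum(u < p for p in win_probs)
        holds += seats >= threshold
    return holds / trials


def pivotal_probability(win_probs, safe_seats, threshold):
    """Exact answer under the same assumption: the k-th largest probability,
    where k is the number of contested wins still needed."""
    needed = threshold - safe_seats
    if needed <= 0:
        return 1.0
    if needed > len(win_probs):
        return 0.0
    return sorted(win_probs, reverse=True)[needed - 1]


simulated = simulate_fully_correlated(WIN_PROBS, SAFE_SEATS, SEATS_TO_HOLD)
exact = pivotal_probability(WIN_PROBS, SAFE_SEATS, SEATS_TO_HOLD)
print(f"simulated = {simulated:.3f}, exact = {exact:.3f}")
```

Under this assumption the simulation is not strictly needed: the party holds the chamber exactly when it wins its pivotal race, so the answer is simply that race's win probability. The simulated value should agree with it up to sampling noise.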

With the lower and upper bounds (41% and 50%) in place, Robinson went on to create a more sophisticated model that allows the coefficient of correlation between every pair of elections to vary between 0% and 100%, and found the probability that the Democrats will hold the Senate for each correlation coefficient value.
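A common way to introduce a tunable correlation into this kind of simulation is a one-factor model: each race's outcome depends partly on a shared national factor and partly on its own local noise. The sketch below uses that approach, which is a swapped-in technique chosen for illustration and not a description of how @RISK or Robinson's model handles correlation. Here `rho` is the correlation between the hidden normal scores of any two races, which is not the same as the correlation between the win-loss outcomes themselves. All input values and names are hypothetical.

```python
import math
import random
from statistics import NormalDist

# Hypothetical Democratic win probabilities (NOT the FiveThirtyEight figures).
WIN_PROBS = [0.95, 0.80, 0.65, 0.55, 0.50, 0.45, 0.40, 0.30, 0.20, 0.10]
SAFE_SEATS = 45     # hypothetical count of seats that are not in play
SEATS_TO_HOLD = 50  # from the article


def prob_hold(win_probs, safe_seats, threshold, rho, trials=20_000, seed=0):
    """Estimate P(seats >= threshold) with pairwise latent correlation rho."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    rng = random.Random(seed)
    # A race is won when its latent score falls below this cutoff, which
    # keeps each race's individual win probability equal to p.
    cutoffs = [NormalDist().inv_cdf(p) for p in win_probs]
    w_shared = math.sqrt(rho)       # weight on the shared national factor
    w_own = math.sqrt(1.0 - rho)    # weight on each race's own noise
    holds = 0
    for _ in range(trials):
        shared = rng.gauss(0.0, 1.0)
        wins = sum(
            w_shared * shared + w_own * rng.gauss(0.0, 1.0) < c
            for c in cutoffs
        )
        holds += safe_seats + wins >= threshold
    return holds / trials


results = {
    rho: prob_hold(WIN_PROBS, SAFE_SEATS, SEATS_TO_HOLD, rho)
    for rho in (0.0, 0.25, 0.5, 0.75, 1.0)
}
for rho, p in results.items():
    print(f"rho = {rho:.2f}: P(hold) = {p:.3f}")
```

Setting `rho` to 0 reproduces the independent case and setting it to 1 reproduces the single-shared-draw case, so sweeping `rho` traces out how the probability of holding the chamber moves between the two bounds for these made-up inputs.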

As Robinson says, “It would be very difficult to determine the correlations among all the different Senate races. However, if the coefficient of correlation is anywhere in the wide range between 20% and 85%, then the probability that the Democrats will retain control of the Senate (i.e., hold 50 or more seats) will be somewhere in the very narrow band of 45% ± 0.1%.”

This gives a far narrower estimate of the Democrats’ chances of retaining control than the 41% to 50% bounds alone.

Robinson continues, “It would be discouragingly cumbersome to run these simulations without @RISK, even in the simple cases of 0% and 100% correlation. Moreover, it would be essentially impossible to calculate the probabilities for intermediate levels of correlation. @RISK allowed me to show my MBA students how useful Monte Carlo simulation can be, in the relevant and timely application of the 2014 Senate races.”

UPDATE: Interested in playing political prognosticator?
Read the latest update, check out the models and run the @RISK simulation yourself.
“@RISK Helps Zero-In on US Senate Race Outcomes”
