As a discipline, Political Science, the analysis of the theory and practice of politics, has been around for quite a while. (Our own CEO here at Revolution Analytics, Norman Nie, has been a leading academic and author in the field for over 40 years.) But it's only in recent years that a deluge of data about politics has erupted: detailed demographic information about constituents; tracking polls taken on a daily basis from dozens of polling firms; campaign donations; information from just about every walk of life that can give insight into a voter's intentions or reactions to policy. Around election time in particular, new data is captured and published on a minute-by-minute basis. As a result, advanced statistical techniques that lend themselves to drawing nuances from disparate data streams are increasingly being used to forecast the results of elections.

Take one recent example: the British parliamentary elections. Professor Simon Hix and Nick Vivyan of the London School of Economics and Political Science used R to analyse polling data. Their Hix-Vivyan Prediction method pools data from numerous national polls to infer which party will win each constituency, and thereby predict the outcome of the election. R is an ideal system for this kind of analysis: not only does it provide the advanced statistical techniques to do the analysis and make the predictions, but because it's a scripting language they were able to re-run the analysis on a day-by-day basis as new polling data was released and present the results as beautiful graphics like these:

On the day before the election the Hix-Vivyan model predicted the Conservatives would win 293 seats, shy of the 326 required to avoid a hung parliament. (A hung parliament was indeed the result, with Conservatives at 306 seats, eventually forming a coalition with the Liberal Democrats.)
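To make the arithmetic concrete, here is a minimal sketch of the two simplest ingredients: pooling several national polls into one estimate, and checking a seat count against the majority threshold. This is not the Hix-Vivyan method (which works at the constituency level and was done in R); it is a plain sample-size-weighted average written in Python for illustration. The `Poll` class, the function names and the poll figures are all hypothetical. Only the 293, 306 and 326 seat numbers come from the post, and the 650-seat size of the House of Commons is assumed.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    sample_size: int   # number of respondents
    shares: dict       # party -> vote share in percent


def pool_polls(polls):
    """Sample-size-weighted average of each party's share across polls."""
    total_n = sum(p.sample_size for p in polls)
    parties = polls[0].shares.keys()
    return {
        party: sum(p.sample_size * p.shares[party] for p in polls) / total_n
        for party in parties
    }


def majority_threshold(total_seats):
    """Smallest seat count that is strictly more than half the chamber."""
    return total_seats // 2 + 1


def has_majority(seats, total_seats):
    """True if a party holding `seats` avoids a hung parliament on its own."""
    return seats >= majority_threshold(total_seats)


# Hypothetical polls, invented for illustration only.
polls = [
    Poll(1000, {"Con": 36.0, "Lab": 28.0, "LD": 27.0}),
    Poll(2000, {"Con": 33.0, "Lab": 28.0, "LD": 30.0}),
]
pooled = pool_polls(polls)

COMMONS_SEATS = 650  # assumed chamber size for the 2010 election
predicted_majority = has_majority(293, COMMONS_SEATS)  # model prediction
actual_majority = has_majority(306, COMMONS_SEATS)     # actual result
```

Weighting by sample size simply lets larger polls count for more. A real forecasting model would also have to translate national vote shares into constituency-level results, which is the hard part and is not attempted here.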

This is just one example of political scientists using advanced statistical techniques to predict election outcomes. Nate Silver at fivethirtyeight.com also tracked the UK election closely, and his in-depth analyses of the US House, Senate and Presidential elections are must-reads for any junkie of the US election system. (Incidentally, Nate has also recently branched out into ranking the World Cup Soccer teams using statistical techniques.) And Andrew Gelman regularly posts about political analysis (always with a Bayesian perspective, and often using R), for example on the recent primary elections in the US. And Boris Shor (from the University of Chicago) often publishes in-depth analysis of individual races in US elections at his blog (click here to download a case study on how he uses Revolution R Enterprise for the analysis). In fact, there's so much going on in statistical analysis of US elections that I think we'll have to come back to the topic in a follow-up post.

[**Update**: Corrected spelling of both Hix and Vivyan. Apologies to both.]

British politics and policy at LSE: One day to go: Hix-Vivyan Prediction up to 3 May

A couple of updates: I should also have mentioned two political scientists at NYU who have done some great analysis with R and whom we've featured on this blog before: Adam Bonica and Drew Conway. Also, Nate Silver of FiveThirtyEight.com has just signed with the New York Times (including contributions to the print edition), so we can expect to see these in-depth statistical analyses getting more mainstream media coverage.

Posted by: David Smith | June 03, 2010 at 09:13

Simon HIX

Posted by: Vincent | June 03, 2010 at 09:41

Thanks Vincent, I've corrected the post (I also misspelled Vivyan).

Posted by: David Smith | June 03, 2010 at 09:48

The oldest and best-known R political graphics on the Web were done by Charles Franklin, originally at his own blog (politicalarithmetik.blogspot.com) but in recent years at pollster.com (http://www.pollster.com/blogs/charles-franklin/). The pollster.com tracking graphics are now done in Flash for interactivity, but are identical in appearance to the R versions that circulated for many years.

The weekly Economist/YouGov poll (http://www.economist.com/blogs/democracyinamerica) also includes R graphics. (In fact, all of the tabulations posted on the Economist site are also produced by R, then output to LaTeX and converted to PDF.)

Posted by: doug rivers | June 04, 2010 at 06:16

Thanks for the info Doug, very interesting! I was particularly interested to learn that the Economist uses R for graphics and tables.

Posted by: David Smith | June 04, 2010 at 08:56