The polling firm Strategic Vision, LLC conducts regular public opinion polls on elections, public policy and other issues and releases the results to the media. Despite having the results of its public opinion polls published in outlets like the AP, Fox News, and even the New York Times, the firm is unlike most polling organizations in that it refuses to reveal even the most basic details of its methodology (sample size or dates of surveys, for example). For this it has recently been censured by the American Association for Public Opinion Research. Nonetheless, the firm claims that the polling data released is accurate.

Nate Silver of fivethirtyeight.com has tackled this claim by looking at the distribution of trailing digits of the rounded percentages in the published polls. (For example, a poll reporting that Barack Obama leads John McCain 48-43 contributes one 8 and one 3 to the data.) While a collection of 3000 political polls from various firms reveals a somewhat uniform distribution of trailing digits:

(However, a chi-squared test rejects the hypothesis of uniformity with a p-value of 0.0065.) The same chart based on trailing digits from over 100 polls (each with 15-20 questions) from Strategic Vision reveals a rather less uniform pattern:

Does this constitute evidence that some of the Strategic Vision polls were "tinkered with" by hand? Nate Silver thinks it might:

Certain statistical properties of the results reported by Strategic Vision, LLC suggest, perhaps strongly, the possibility of fraud, although they certainly do not prove it and further investigation will be required.
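To make the trailing-digit tally and the uniformity test concrete, here's a minimal Python sketch. It is not Nate Silver's code: the poll numbers are invented for illustration, and the function names (`trailing_digits`, `chi2_uniform_statistic`, `chi2_sf`) are my own. The p-value comes from the closed-form chi-squared tail probability, so nothing beyond the standard library is assumed.

```python
import math
from collections import Counter


def trailing_digits(polls):
    """Collect the last digit of every rounded percentage in every poll."""
    return [pct % 10 for poll in polls for pct in poll]


def chi2_uniform_statistic(digits):
    """Pearson chi-squared statistic against equal expected counts for 0-9."""
    counts = Counter(digits)
    expected = len(digits) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable (closed-form series)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    z = math.sqrt(x)
    term, total = z, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(z / math.sqrt(2)) + 2 * math.exp(-x / 2) / math.sqrt(2 * math.pi) * total


# Hypothetical results, invented for illustration (NOT real polling data).
polls = [(48, 43), (51, 44), (47, 45), (50, 41), (46, 46)]
digits = trailing_digits(polls)
stat = chi2_uniform_statistic(digits)
p_value = chi2_sf(stat, df=9)  # ten digit categories leave nine degrees of freedom
print(Counter(digits), round(stat, 2), round(p_value, 4))
```

Five toy polls are far too few for the chi-squared approximation to be trustworthy; the analyses discussed here rest on thousands of digits.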

There are some lingering questions though, many raised in the comments of this post. Perhaps some sort of selection bias is at work: polls are done on "close" topics, so a preference for trailing digits near zero (representing 50-50) might be expected. Perhaps a bad or unusual rounding algorithm is to blame. And should one expect the trailing digits to be uniform, anyway?
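The selection-bias question can be explored with a quick simulation. The sketch below is only a toy under a loud assumption: that reported percentages are normally distributed around some centre. Real poll results (with undecideds, multiple candidates and so on) need not behave this way, and `simulate_trailing_digits` is a name I made up for the illustration.

```python
import random
from collections import Counter


def simulate_trailing_digits(n_results, centre, spread, seed=0):
    """Count trailing digits of rounded percentages drawn from a normal model.

    The normal model is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_results):
        pct = round(rng.gauss(centre, spread))
        pct = min(max(pct, 0), 100)  # keep the value a valid percentage
        counts[pct % 10] += 1
    return [counts[d] for d in range(10)]


# Results clustered tightly around 50-50 versus results spread widely.
tight = simulate_trailing_digits(10_000, centre=50, spread=2)
wide = simulate_trailing_digits(10_000, centre=50, spread=15)
print("tight races:", tight)
print("wide spread:", wide)
```

Under this model, tightly clustered results pile up on digits near zero (0, 1, 9), while a wide spread washes the effect out and leaves the digits close to uniform. Whether real polls look more like the first case or the second is exactly the open question.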

Taking another angle of attack, Michael Weissman (a physics professor at the University of Illinois) has applied Fourier analysis to the data. It's a new technique to me, based on the concept that any pattern of the digits must be circular (in the sense that 9 is as close to 0 as 1 is) and that the transitions between adjacent digits should be smooth. His analysis, written in BASIC (!), calculates a p-value of 0.000019, rejecting the hypothesis that the pattern is generated by a combination of Fourier components.
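For readers who, like me, haven't seen Fourier analysis applied to digits before, here's a small Python sketch of the basic idea: treat the ten digit counts as points on a circle and decompose them into Fourier components. This is not Weissman's program, and it doesn't reproduce his null hypothesis or his p-value; the counts are invented and the function names are my own.

```python
import cmath
import math


def digit_fourier_components(counts):
    """Discrete Fourier transform of ten digit counts, treated as a circle."""
    n = len(counts)
    return [
        sum(c * cmath.exp(-2j * math.pi * k * d / n) for d, c in enumerate(counts))
        for k in range(n // 2 + 1)  # frequencies 0..5 suffice for real data
    ]


def amplitudes(counts):
    """Component magnitudes as a fraction of the total count."""
    total = sum(counts)
    return [abs(z) / total for z in digit_fourier_components(counts)]


# Hypothetical digit counts, invented for illustration (NOT the published data).
smooth_bump = [140, 125, 100, 75, 60, 55, 60, 75, 100, 125]  # gentle peak at 0
jagged = [130, 70, 130, 70, 130, 70, 130, 70, 130, 70]       # even/odd zigzag

for name, counts in [("smooth", smooth_bump), ("jagged", jagged)]:
    print(name, [round(a, 3) for a in amplitudes(counts)])
```

The gentle bump puts most of its non-constant weight in the lowest frequency (k = 1), while the zigzag lives entirely in the highest one (k = 5). That highest component alternates sign between every pair of adjacent digits, so a pattern built from it is anything but smooth.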

Is this compelling? It's interesting, certainly, and it does have the advantage of not assuming *a priori* that the distribution of trailing digits must be uniform. But I'm a little concerned that the fundamental premise of the analysis -- that the Fourier components represent "smooth" transitions between the digits -- could be undermined simply by a component with a high enough frequency. Maybe someone out there has the expertise I lack to interpret the estimated coefficients. Personally, I'd love to see a more traditional, rigorous statistical analysis of the data.

FiveThirtyEight: Seen Through Sharper Statistical Lens, Anomalies in Strategic Vision Polling Remain

Very interesting. Looks like they have been caught out!

I just wanted to say I love your blog. I am a stats newbie, although trying to learn more, and am also learning R. For one, the graphing capabilities are miles ahead of Excel.

Keep up the great work. I know how much hard work blogging is!

Posted by: Steve | October 06, 2009 at 16:52

I was curious how Benford's Law fits into this. The law states that numbers are not actually uniformly distributed, but instead the lower digits are more prevalent. I'm not exactly sure of the technical assumptions behind this so it's possible it doesn't apply at all in the polling fraud case.

Posted by: Keith | October 07, 2009 at 01:57

Keith, I don't think it applies. Benford's law covers the first significant digit, which in this case is always going to be within a very limited set of numbers.

I am a math/stat newbie so I am probably wrong ;)

Posted by: Steve | October 07, 2009 at 03:10

A flawed political poll in the US? Never. "Dewey Wins!"

Posted by: K | October 08, 2009 at 07:01

Hi friends,

I am new to R. I would like to learn R-PLUS. Does anyone know where I can get free training for R-PLUS?

Regards,

Peng.

Posted by: peng | January 27, 2010 at 02:19