
June 18, 2009



Another analysis, somewhat different, supports the fraud hypothesis: here.

second_digit <- floor(x * 10^-ceiling(log(x,10)-2) - 10*floor(x * 10^-ceiling(log(x,10)-1)))

Thanks, Corey. Now, for comparison, how would one do that in Excel or SAS? :)

Incidentally, I was wondering why Mebane analyzed the second digit for the Benford's Law analysis (rather than the first as Roukema did -- see also Gelman's comments on this paper). I found the answer (see comment by "tomi"):

"Another important issue concerns whether Benford's Law should be expected to apply to all the digits in reported vote counts. In particular, for precinct-level data there are good reasons to doubt that the first digits of vote counts will satisfy Benford's Law. Brady (2005) develops a version of this argument. The basic point is that often precincts are designed to include roughly the same number of voters. If a candidate has roughly the same level of support in all the precincts, which means the candidate's share of the votes is roughly the same in all the precincts, then the vote counts will have the same first digit in all of the precincts. Imagine a situation where all precincts contain about 1,000 voters each, and a candidate has the support of roughly fifty percent of the voters in every precinct. Then most of the precinct vote totals for the candidate will begin with the digits '4' or '5.' This result will hold no matter how mixed the processes may be that get the candidate to roughly fifty percent support in each precinct. For Benford's Law to be satisfied for the first digits of vote counts clearly depends on the occurrence of a fortuitous distribution of precinct sizes and in the alignment of precinct sizes with each candidate's support. It is difficult to see how there might be some connection to generally occurring political processes. So we may turn to the second significant digits of the vote counts, for which at least there is no similar knock down contrary argument." (From a 2006 paper by Mebane; the passage cites Brady (2005), so it cannot date from 1996.)
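The thought experiment in that passage is easy to try in R. The following is only a rough sketch: the precinct count, the seed, and the binomial model are assumptions made for illustration, not anything taken from Mebane's paper or from real election data.

```r
# Illustrative simulation only; every number here is an assumption
# borrowed from the quoted thought experiment, not real data.
set.seed(1)
n_precincts <- 5000
voters_per_precinct <- 1000   # "about 1,000 voters each"
support <- 0.5                # "roughly fifty percent"

# One simulated vote count per precinct
votes <- rbinom(n_precincts, size = voters_per_precinct, prob = support)

# Leading digit: divide by the largest power of ten not exceeding the count
first_digit <- votes %/% 10^floor(log10(votes))

# Share of precincts whose count starts with 4 or 5
share_4_or_5 <- mean(first_digit %in% c(4, 5))

# First-digit frequencies predicted by Benford's Law, for comparison
benford_first <- log10(1 + 1 / (1:9))
benford_4_or_5 <- sum(benford_first[4:5])

print(share_4_or_5)
print(round(benford_4_or_5, 3))
```

Under these assumptions essentially every simulated count starts with 4 or 5, whereas Benford's Law would put only about 17.6% of first digits there (log10(1.5)). That is the mismatch the passage describes, and it arises without any fraud in the simulated process.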

A more readable implementation might be:
second_digit <- as.numeric(substring(x,2,2))
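One caveat with the string approach: it relies on x being coerced to plain decimal text, and R may render large doubles in scientific notation (for example "1e+05"), in which case the second character is not a digit. Here is a minimal sketch of two variants that sidestep this, assuming x holds positive whole-number counts of at least 10. The helper names are hypothetical and not from the original comments.

```r
# Hypothetical helpers; both assume positive whole-number counts >= 10.

# String version: sprintf("%.0f", ...) forces fixed notation before
# the second character is extracted.
second_digit_chr <- function(x) {
  digits <- sprintf("%.0f", x)
  as.numeric(substring(digits, 2, 2))
}

# Arithmetic version: keep the two leading digits, then take the
# remainder modulo 10.
second_digit_arith <- function(x) {
  (x %/% 10^(floor(log10(x)) - 1)) %% 10
}

counts <- c(10, 47, 523, 1000, 1234, 100000)
print(second_digit_chr(counts))
print(second_digit_arith(counts))
```

Both helpers are vectorized, so they can be applied directly to a column of vote counts. The arithmetic version depends on log10() landing on the right side of each power of ten, so the string version is probably the safer default.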
