Random numbers are useful for all sorts of things: running lotteries, randomizing data, simulations, even playing games. R has an excellent random-number generator, which we've looked at before. Given only a single integer to start with (called a seed), it will generate an endless stream of random numbers. Well, almost: these are pseudo-random numbers. Although they look random, and satisfy objective tests for randomness, they are generated by a mathematical formula rather than a "true" random process. This isn't a problem for most applications: in fact, being able to regenerate the same stream of pseudo-random numbers just by using the same seed is often useful (say, to recreate a useful simulated data set). You can find out all the details about R's random-number generators (yes, plural!) in the documentation for .Random.seed.
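Here's a minimal sketch of that reproducibility, using only base R. The seed values 42 and 43 are arbitrary choices for illustration, nothing special:

```r
# Re-using a seed regenerates exactly the same pseudo-random stream.
set.seed(42)
first <- runif(5)

set.seed(42)
second <- runif(5)

identical(first, second)   # same seed, same numbers

# A different seed gives a different stream.
set.seed(43)
third <- runif(5)

identical(first, third)

# RNGkind() reports which of R's generators is currently in use.
RNGkind()
```

This is what lets you recreate a simulated data set later: record the seed alongside the results.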
But sometimes, pseudo-random just isn't good enough. If you have to generate lots of random streams (in parallel, say), it can be tough to generate enough seeds while ensuring independence of the streams. (Solutions exist, though.) And while RNG streams are long, they're not infinitely so: eventually, they will start to repeat. Security can be an issue, too. Using pseudo-random numbers for lotteries is risky: I recall a story -- possibly apocryphal -- of a casino Keno game where the jackpot was won on the third day after a clever punter noticed a pattern of repeated numbers in the first drawing of the day (the programmer had apparently failed to set the random seed at the right point in a loop). A recent discussion on r-help highlighted a similar security issue: a professor was considering using a random-number generator to generate questions for a student exam, until he learned that the random number stream (and therefore the answers) could be reverse-engineered in a matter of hours. (Then again, a student capable of that reverse-engineering task should be able to pass the exam with ease.)
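As one illustration of the parallel-streams problem, here's a sketch that assumes a current version of R with the bundled parallel package and its "L'Ecuyer-CMRG" generator. That's my choice of example, not necessarily the solution referred to above. Rather than hunting for many "independent" seeds, you start from one seed and jump ahead to widely separated streams. The helper `draw_from_stream()` and the seed 2009 are made up for this example:

```r
library(parallel)

# Switch to the generator that supports multiple streams.
# Note: this changes the RNG for the rest of the session.
RNGkind("L'Ecuyer-CMRG")
set.seed(2009)

# Each call to nextRNGStream() jumps far ahead in the sequence,
# giving the starting state of a separate, non-overlapping stream.
stream1 <- .Random.seed
stream2 <- nextRNGStream(stream1)
stream3 <- nextRNGStream(stream2)

# Hypothetical helper: install a stream's state, then draw n uniforms.
draw_from_stream <- function(seed, n) {
  assign(".Random.seed", seed, envir = globalenv())
  runif(n)
}

x1 <- draw_from_stream(stream1, 3)
x2 <- draw_from_stream(stream2, 3)
x3 <- draw_from_stream(stream3, 3)
```

In a real parallel job you would hand one stream state to each worker; `clusterSetRNGStream()` in the same package does that bookkeeping for a cluster.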
You can find sources of truly random numbers, though (and without having to toss coins yourself). Some newer computers have a random number chip that generates random data from amplified thermal noise. If you don't have one of those, Random.org is a website that supplies streams of random numbers, generated by radio receivers tuned to atmospheric noise. You can access these random numbers from R by using the random package from Dirk Eddelbuettel. This is as random as you can get: there's no way to predict one number from the last.
But for some people, even that isn't good enough. GamesByEmail.com used to use Random.org to generate the dice rolls for various games played via email, but some of the players complained that the rolls "weren't random enough". (Incidentally, I've seen similar complaints in the forums of every computer strategy game I've ever played where the results of battles are decided by random rolls, like Warcraft, Advance Wars, and the Civilization series. You'd be amazed at the lengths some players go to to "prove" that losing a 20-1 odds-on battle twice in a row is always "biased".) So the site owner fixed the problem by rolling his own true random number generator ... literally. The Dice-O-Matic Mark III rolls 800 dice -- real, physical dice -- down an irregular, spiral ramp to be loaded into a bucket elevator where a camera with image recognition software reads the pips to generate a stream of random dice rolls. You have to see this monster in action to believe it.
The machine is capable of generating and recognizing 1.3 million random dice rolls a day. Amazing.
Ironically, I wouldn't be surprised to learn this machine is less random than Random.org. It certainly can't be more random, and I'd like to know more about the die images that are rejected by the image recognition software. The description mentions that sixes can be hard to recognize because the pips can blur together in the digitized image: could valid six rolls be rejected as unrecognizable at a greater rate than other rolls, I wonder?
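To get a feel for how much that could matter, here's a quick simulation sketch. The rejection rates (5% for sixes, 1% for every other face) are invented purely for illustration; I have no idea what the machine's real rates are:

```r
set.seed(2009)   # arbitrary seed so the sketch is reproducible

n_rolls <- 1e6
rolls <- sample(1:6, n_rolls, replace = TRUE)   # perfectly fair dice

# Hypothetical image-recognition failure rates (assumed, not measured):
# sixes are rejected 5% of the time, other faces 1% of the time.
p_reject <- ifelse(rolls == 6, 0.05, 0.01)
accepted <- rolls[runif(n_rolls) >= p_reject]

# Share of each face among the rolls that survive recognition.
observed <- prop.table(table(accepted))
round(observed, 4)

# Share of sixes we'd expect under those assumed rates, versus a fair 1/6.
expected_six <- (0.95 / 6) / (0.95 / 6 + 5 * 0.99 / 6)
c(expected = expected_six, fair = 1 / 6)
```

Even with perfectly fair dice going in, uneven rejection leaves sixes under-represented in the output stream, so the question is worth asking.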
I believe the Keno story is true: It is mentioned on Wikipedia in the entry on the Montreal Casino http://en.wikipedia.org/wiki/Montreal_Casino and the description there matches one I heard from Luc Devroye this week.
Posted by: Duncan Murdoch | June 05, 2009 at 10:43
On the same kind of topics, is it possible to generate Halton draws and quasi-random numbers in R ?
Posted by: PAC | June 08, 2009 at 02:55
@PAC: RSiteSearch("halton") finds such functions in a variety of contributed packages.
Posted by: Ben Bolker | June 08, 2009 at 05:47