
August 22, 2013



Perhaps this is obvious, but if you need a big data set to play with, a solution is to generate the data set randomly. Specify a multivariate stochastic process and draw random samples for each variable. The size of your hard drive is the limit.
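As a rough illustration of that idea, here is a minimal sketch in Python (standard library only) that draws samples from a simple two-variable stochastic process, a pair of random walks with correlated increments, and streams them to a CSV file. The function names, the column names `t`, `x`, `y`, the correlation parameter `rho`, and the choice of process are all assumptions made for this example, not something the comment specifies; any other process could be swapped in.

```python
import csv
import math
import random


def generate_rows(n_rows, rho=0.6, seed=42):
    """Yield (t, x, y) rows from a two-variable correlated random walk.

    Hypothetical helper for illustration: rho is the correlation between
    the per-step increments of x and y.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    scale = math.sqrt(1.0 - rho * rho)
    for t in range(n_rows):
        # Two independent standard normal shocks.
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        # Mix the shocks so the increments of x and y have correlation rho.
        x += e1
        y += rho * e1 + scale * e2
        yield t, x, y


def write_dataset(path, n_rows, rho=0.6, seed=42):
    """Stream rows to a CSV file one at a time, so memory use stays flat."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "x", "y"])
        writer.writerows(generate_rows(n_rows, rho, seed))
```

Calling `write_dataset("walk.csv", 10_000_000)` would write ten million rows without holding them in memory, since the rows come from a generator; the file name here is only a placeholder. Fixing the seed makes the data set reproducible, which helps when comparing timings across runs.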

For economic data:


Great NYC datasets available at

The Lahman package is also pretty nice if you want a set of larger interlinked tables.

