Statistics and data mining often get bundled together, but (in my opinion) they're generally different practices with different goals. As a language designed for statistics, much of R's core functionality is focused on exploring and understanding data: model design, inference, and visualization. But when your goal is simply to get the best predictions from a big data set (without worrying too much about the model itself), much of R's statistical power can also be put to data mining purposes. A good overview can be found in Luis Torgo's book Data Mining with R, along with the functions in the associated package, DMwR.
Another good resource is Yanchang Zhao's website rdatamining.com, which collects resources related to data mining with R. In particular, check out his R Reference Card for Data Mining, a 3-page PDF index of the R functions and packages for association rules, classification, clustering, text mining, social network analysis, and more. Find it in the "Docs" section linked below.
RDataMining.com: Documents
Well, thanks for the book, dude! Really interesting. That reference card is very handy!
Posted by: J. Smith | June 07, 2011 at 07:27
Thought it wouldn't hurt to give it a shot. I was right.
Posted by: Danice | July 19, 2011 at 07:14
Hi
Have any of you worked through chapter 3 - Predicting Stock Markets?
It's a really good introduction to many useful R functions for predicting and testing. However, the final section leaves you a bit lost. Have any of you worked out how to obtain the predicted signal for today's (or most recent) data point? Would like to hear from you.
Regards,
Laurits
Posted by: Laurits | June 05, 2012 at 05:29