The best way to learn any software is to use it. But if you're new to Hadoop and want to try using it with R, the process of setting up your own Hadoop cluster can be daunting (to say the least). If learning is the goal, though, the key is that you don't need to install a full cluster. All you need is your own machine and the ability to install software from the shell command line.
RDataMining.com recently published the tutorial "Building an R Hadoop System" with step-by-step procedures for installing Hadoop, R, and RHadoop (including the rmr2 package) on a standard Mac system. (The same procedures will likely work on any Linux-based system as well, with minor tweaks.) Since the Hadoop system is configured in standalone mode on a single machine, you don't have to worry about any of the details of inter-node communication or of distributing software across the nodes of a multi-node cluster. The whole process takes about 30 minutes to set up, after which you can start on the MapReduce in R tutorial from the Revolution Analytics GitHub repository.
Get started with the six-step installation tutorial at the link below.
RDataMining.com: Building an R Hadoop System
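If you have not met the MapReduce model before, it may help to see its three phases (map, shuffle, reduce) written out for a word count, the customary first example. The sketch below is plain Python that runs in a single process. It is not the rmr2 API or anything from the tutorial, and the function names are made up for illustration. It only shows the shape of the computation that Hadoop would otherwise spread across nodes.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    """Chain the three phases; all names here are hypothetical."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["R meets Hadoop", "Hadoop on a single machine"]
    print(word_count(sample))
```

On a single machine the shuffle is just a dictionary in memory. On a real cluster that grouping step is where data moves between nodes, which is the part a standalone setup lets you ignore while you learn.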
I haven't really started to try this. But I'm curious: is there any advantage of applying MapReduce on a single node, except for teaching and learning purposes? Thanks!
Posted by: Hung | September 04, 2013 at 06:13
David, the YouTube link http://www.youtube.com/watch?v=hSrW0Iwghtw, posted at http://www.rdatamining.com/tutorials/rhadoop, says the video is private. Could you make it public?
Posted by: Arunkumar Shanmugasundaram | September 04, 2013 at 18:05
@Arunkumar, I don't know what video was intended to be linked from RDataMining.com, but check out the tutorial video "RHadoop: R Meets Hadoop" by RHadoop lead developer Antonio Piccolboni. It includes the wordcount example.
Posted by: David Smith | September 05, 2013 at 08:25
Will this work on a Windows computer?
Posted by: Doug Bergman | September 08, 2013 at 08:02
Why would anybody want Hadoop on a Mac? This is a troll post, right?
Posted by: Aaron | September 09, 2013 at 19:27