JD Long has just posted a great intro to Hadley Wickham's plyr package. One of the best ways to show off your R mojo is to perform iterative tasks without explicit use of the "for" loop, and for this purpose R has a plethora of functions to apply an operation to parts of a data object. For example, you can use the function apply to calculate the mean of each column of a matrix. There's also sapply, lapply, mapply ... the list goes on.
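To make that concrete, here is a minimal base-R sketch of the loop-free style. The matrix `m` and the list `scores` are made-up illustration data, not anything from JD Long's post.

```r
# Illustration data (hypothetical): a 3 x 2 matrix with named columns
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))

# The explicit "for" loop version, for comparison
col_means_loop <- numeric(ncol(m))
for (j in seq_len(ncol(m))) {
  col_means_loop[j] <- mean(m[, j])
}

# The same calculation with apply: margin 2 means "work over columns"
col_means <- apply(m, 2, mean)

# sapply and lapply work over the elements of a list (or vector)
scores <- list(x = c(1, 2, 3), y = c(10, 20))
means_vec  <- sapply(scores, mean)  # simplified to a named numeric vector
means_list <- lapply(scores, mean)  # always returns a list

# mapply walks over several arguments in parallel
sums <- mapply(function(u, v) u + v, 1:3, 4:6)
```

Each of these functions has a slightly different interface and return type, which is exactly the inconsistency plyr sets out to tidy up.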
The plyr package is an elegant rationalization of the various interfaces to these types of functions. JD Long provides a nice analogy: it's kind of like using the GROUP BY clause in SQL, where you specify some variables to slice the data by (zipcode and sex, for example) and an operation to calculate on each slice (like the mean age). Check out the link below for some examples of plyr in action.
Cerebral Mastication: A fast intro to plyr for R
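As a rough sketch of the GROUP BY analogy (not taken from JD Long's post), here is how the zipcode-and-sex example might look with plyr's `ddply`. The `people` data frame and its values are invented for illustration, and the code assumes the plyr package is installed.

```r
library(plyr)  # assumes plyr is installed: install.packages("plyr")

# Made-up illustration data: one row per person
people <- data.frame(
  zipcode = c("10001", "10001", "10001", "94105", "94105", "94105"),
  sex     = c("F", "M", "F", "F", "M", "M"),
  age     = c(30, 40, 50, 20, 60, 40)
)

# Slice by zipcode and sex, then compute the mean age within each slice.
# Roughly: SELECT zipcode, sex, AVG(age) FROM people GROUP BY zipcode, sex
mean_age <- ddply(people, .(zipcode, sex), summarise, mean_age = mean(age))

print(mean_age)
```

The result is a data frame with one row per zipcode/sex combination. The first two letters of `ddply` say that it takes a data frame in and gives a data frame back; plyr's other functions follow the same naming scheme for lists and arrays.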
And I'm hoping one day soon that plyr will be powered by foreach and the iterators package so tasks can easily spread across multiple cores or machines.
Posted by: Hadley | August 28, 2009 at 12:35
I wonder how this relates to using cast from the reshape package. I find it quite useful for doing "group by"-like operations in R. It has a bit of a learning curve though.
Posted by: Dean Eckles | September 02, 2009 at 22:10