I missed this when it was announced back on September 29, but R won a 2014 Bossie Award for best open-source big-data tools from InfoWorld (see entry number 5):
A specialized computer language for statistical analysis, R continues to evolve to meet new challenges. Since displacing Lisp-Stat in the early 2000s, R has been the de facto statistical processing language, with thousands of high-quality algorithms readily available from the Comprehensive R Archive Network (CRAN); a large, vibrant community; and a healthy ecosystem of supporting tools and IDEs. The 3.0 release of R removes the memory limitations that previously plagued the language: 64-bit builds can now allocate as much RAM as the host operating system will allow.
Traditionally, R has focused on solving problems that fit in local RAM, making use of multiple cores, but with the rise of big data, several options have emerged for processing large-scale data sets. These include packages that can be installed into a standard R environment as well as integrations with big data systems like Hadoop and Spark (for example, RHive and SparkR).
Check out the full list of winners at the link below. (Thanks to RG for the tip!)
InfoWorld: Bossie Awards 2014: The best open source big data tools
RCloud is also shown on the 4th slide.
Posted by: Gergely Daróczi | December 29, 2014 at 14:19
Does R have anything comparable to the SAS Data Step? RAM is not an issue for SAS data steps. Indexed datasets also help SAS deal with large datasets.
Posted by: Don Mayfield | December 29, 2014 at 18:24
@Don, there are many data-processing tools in R -- take a look at dplyr and data.table for example. Revolution R Enterprise also has a function akin to the Data Step for processing big data files in R.
Posted by: David Smith | January 08, 2015 at 09:50
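As a rough illustration of the in-memory data-processing tools mentioned above, here is a minimal sketch of a grouped aggregation done both with dplyr and with data.table. It assumes both packages are installed from CRAN; the toy data and column names are invented for illustration, not taken from any real dataset.

```r
# Grouped aggregation in R, roughly analogous to a SAS Data Step
# followed by a summary procedure. Assumes dplyr and data.table
# are installed from CRAN.
library(dplyr)
library(data.table)

# Toy data: sales amounts by region (invented for illustration)
sales <- data.frame(
  region = c("east", "east", "west", "west"),
  amount = c(100, 150, 200, 250)
)

# dplyr: pipe-based verbs, readable step-by-step style
sales %>%
  group_by(region) %>%
  summarise(total = sum(amount))

# data.table: concise syntax designed for large in-memory tables
dt <- as.data.table(sales)
dt[, .(total = sum(amount)), by = region]
```

Both approaches return one row per region with the summed amounts; data.table in particular is engineered for speed and low memory overhead on tables with millions of rows, which is why it often comes up in "big data in RAM" discussions.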