Here are three recent news articles that feature interviews with members of the Revolution Analytics team talking about the importance of the R language:
- In Forbes, CEO Dave Rich talks to Gil Press about the business landscape for Big Data. In the article, Dave says:
SAS and SPSS remind me of Cobol and Fortran circa 1995. The scientific and academic institutions are relying on R and over time, this will be like the switch from Cobol to Java. The kids today are learning R and it will have the run for the next forty years that SAS and SPSS had for the last forty.
- Sramana Mitra's Thought Leaders in Big Data series features an in-depth interview with Dave and Chief Strategy Officer Michele Chambers. Michele says:
There is a new generation of tools, methodologies, and resources that are being trained at universities today. That is all being done around [the programming language] R.
- I also recently talked to InformationWeek about the Big Data Disconnect: how some organizations are struggling to make sense of their data stores, and how the R language is a key part of the solution:
R has two primary advantages, according to Smith. "It's designed to work with data and build models with data," he said. "[Programmers] can go from a concept to a working model in a fraction of the time it takes with legacy systems." The second advantage is R's open source design. "You've got an entire community of statisticians and data scientists [who] are really pushing the envelope on data access, data platforms like Hadoop, data analysis techniques and also data visualization, which is an increasingly important part of the story."
The switch from COBOL to Java (and I've had unpleasant experience), at least in the Fortune X00 world (where most COBOL lives), isn't what many believe. In fact, most enterprise Java applications are nearly line-for-line transliterations of existing COBOL: what was called "lipstick on a pig" in the days from MS-DOS to Win 3.1. "New" Java applications are tied to the VSAM file images with which such developers are comfortable. There isn't much relational model in Fortune X00 RDBMS, COBOL or Java. And don't get me started on the silliness of EJBs.
The switch, to the extent that it does occur, from S* to R is rather different, in that all stat packs are tied to flatfile data images. Muenchen's "R for SAS and SPSS Users" defines the landscape. I'm not at all sure it's possible to transliterate the semantics of S* to R in the way that it's done from COBOL to Java.
Also, the interview with Mr. Rich quotes him: “When the 2008 recession hit,” he told me recently, “the question was how come we weren’t better prepared with all the money we’ve spent in the last decade on information systems? ...”
The reason is that the quants (and their Suit handlers) chose to ignore the underlying data which was in plain sight: the gross divergence of median income from mortgage values; there was simply no way that house prices could rise without a foundation of rising median income, except through corruption in the mortgagers. (A precious few among the pundit class, including humble self, made the point.) The quants simply didn't wish to include such data in their models. Whether this was due to poor training in quants or mortgaging or simple corruption, I don't know. I've yet to read a mea culpa.
Posted by: Robert Young | February 23, 2013 at 08:22