If you're already familiar with R but are struggling with out-of-memory errors or slow performance when analyzing large data sets, you might want to check out this new EdX course, Analyzing Big Data with Microsoft R Server, presented by my colleague Seth Mottaghinejad. In the course, you'll learn how to build models using the RevoScaleR package and deploy those models to production environments like Spark and SQL Server. The course is self-paced, with videos, tutorials and tests, and is free to audit.
(By the way, if you don't already know R, you might want to check out the courses Introduction to R for Data Science and Programming in R for Data Science first.)
The RevoScaleR package isn't available on CRAN: it's included with Microsoft R Server and Microsoft R Client. Microsoft R Client is free to download and use, and it provides an installation of R with the RevoScaleR package built in and loaded when you start a session. An R IDE is also recommended: you can use R Tools for Visual Studio or RStudio.
The course is open now, and you can get started on EdX at the link below.
EdX: Analyzing Big Data with Microsoft R Server