by John-Mark Agosta, data scientist manager at Microsoft
Let’s say you’ve built a model in R that is larger than you can conveniently run locally, and you want to take advantage of Azure’s resources simply to run it on a larger machine. This blog explains how to provision and run an Azure virtual machine (VM) for this, using the mrsdeploy library that comes installed with Microsoft’s R Server. We will work specifically with the Unbuntu Linux version of the VM, so I you’ll need to be familiar with working with superuser privileges at the command line in Linux, and of course, familiar with R.
The fundamental architecture consists of your local machine as the client for which you create a server machine in the Cloud. You’ll set up a service on the remote machine — the one in the cloud. Once you do this, you needn’t interact directly with the remote machine; instead you issue commands to it and see the results returned at the client. This is one approach; there are many many ways this can be done in Azure, depending on your choice of language, reliance on coding, capabilities of the service, and complexity and scale of the task. A data scientist typically works first interactively to explore data on an individual machine, then puts the model thus built into production at scale, in this example, in the Cloud. The purpose of this posting is to clarify the deployment process, or as it is called, in a mouthful, operationalization. In short, using a VM running the mrsdeploy library in R Server lets you operationalize your code with little effort, at modest expense.
Alternatively, instead of setting up a service with
R server, one unadvisedly could just provision a an bare virtual machine, and login into it as one would any remote machine with the manual encumbrance of having to work with multiple machines, load application software, and move data and code back and forth. But that’s what we avoid. The point of the Cloud is making large data and compute as much as possible like working on your local computer.
Deploying Microsoft R Server (MRS) on an Azure VM
Azure Marketplace offers a Linux VM (Ubuntu version 16.04) preconfigured with R Server 2016. Additionally the Linux VM with R Server comes with mrsdeploy, a new R package for establishing a remote session in a console application and for publishing and managing a web service written in R. In order to use the R Server’s deployment and operationalization features, one needs to configure R Server for operationalization after installation, to act as a deployment server and host analytic web services.
Alternately there are other Azure platforms for operationalization using R Server in the Marketplace, with other operating systems and platforms including HDInsight, Microsoft’s Hadoop offering. Or, equivalently one could use the Data Science VM available in the Marketplace, since it has a copy of R Server installed. Configuration of these platforms is similar to the example covered in this posting.
- Microsoft R Server 2016 (version 9.0.1) for Linux (CentOS version 7.2)
- R Server Enterprise
- R Server for HDInsight
Provisioning an R Server VM, as reference in the documentation, takes a few steps that are detailed here, which consist of configuring the VM and setting up the server account to authorize remote access. To set up the server you’ll use the system account you set up as a user of the Linux machine. The server account is used for client interaction with the R Server, and should not be confused with the Linux system account. This is a major difference with the Windows version of the R Server VM that uses Active Directory services for authentication.
Provisioning a machine from the Marketplace
You will want to do the install of a Unbuntu Marketplace VM with R server preinstalled. The best way to find it on
portal.azure.com is to search for “r server”: