by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft)
In data science, developing a model with optimal performance often requires exploratory experiments over different sets of hyper-parameters. Preliminary analyses on smaller data can be performed on a single machine, while full experiments that sweep multiple parameter sets over large-scale data can be run on a cluster for additional computing power. This scenario calls for scalable computation resources that are easy to manage. This blog post shares a walk-through using the Azure Interface tool that
- operates and manages Azure cloud instances directly from R using the AzureSMR package, and
- executes scalable analytical jobs on deployed instances using customized Microsoft R Server computation contexts, which are easily specified in R using an "interface object".
The overall architecture of the Interface Tool is described in the graphic below. The Master script manages the deployment and operation of Azure instances and sets the specifications for how analytics are executed. The Worker script contains the actual analytics, which are submitted to the Azure instance and run according to the configured context.
A predictive maintenance example is presented to illustrate the efficacy of the interface tool. Predictive maintenance is widely employed within the airline and manufacturing industries to reduce operational costs. One of the problems in a predictive maintenance scenario is to diagnose, by analyzing historical sensor measurements, whether equipment is operating in a healthy state. Given a data set of sensor measurements, a well-trained model can classify a machine as “low risk” or “high risk” of failure. To obtain an optimal model for accurate prediction, the hyper-parameters should be swept across their valid ranges to search for the best-performing values. Two critical hyper-parameters in health status recognition need experimental analysis for optimal prediction: