by Joseph Rickert
A strong case can be made that base R graphics, supplemented with either the lattice library or ggplot2 for plotting by subgroups, provides everything a statistician might need for both exploratory data analysis and for developing clear, crisp graphics for communicating results. However, it is abundantly clear that web based graphics, driven to a large extent by JavaScript enhanced web design, are opening up new vistas for data visualization. The ability to interact with graphs, view them from different points of view, and establish real-time relationships between different plots and other graphical elements provides opportunities to extract new insights from data. To be fair, many of these capabilities have existed in R for quite some time, some from the very beginning. For example, the identify() function in the graphics package lets you mouse over a point on a plot and click to determine the associated value, and what could be easier than the plot3d() function in the rgl package, which uses OpenGL technology to let you grab a 3D scatter plot with your mouse and rotate it any which way? Run this code to see how it works.
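A minimal sketch of that kind of rotatable plot, assuming the rgl package is installed. The choice of mtcars columns, the color mapping and the names cyl_colors and point_col are illustrative and are not taken from the original post.

```r
# Map each car's cylinder count to a color (illustrative palette).
cyl_colors <- c("4" = "red", "6" = "darkorange", "8" = "blue")
point_col <- unname(cyl_colors[as.character(mtcars$cyl)])

# Open the OpenGL window only in an interactive session with rgl available.
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(
    x = mtcars$disp, y = mtcars$wt, z = mtcars$mpg,
    xlab = "disp", ylab = "wt", zlab = "mpg",
    type = "s",      # draw points as spheres
    size = 1.5,
    col = point_col
  )
}
```

Dragging with the mouse inside the rgl window rotates the point cloud.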
Developers are continuing to build out the infrastructure of web based graphics, and now it is possible to select environments that offer a rich set of features all in one place.
Until recently, however, making use of web based graphics directly from R required a basic knowledge of web based development and some JavaScript programming skills. If you have these skills, or want to acquire them, have a look at the V8 package which provides an R interface to Google's open source JavaScript engine, but if JavaScript programming is not going to be your thing then htmlwidgets is the way to go.
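For readers curious about the V8 route, here is a minimal sketch of running JavaScript from R, assuming the V8 package is installed. The variable names hp and total are illustrative; the example only passes a vector to JavaScript, sums it there and reads the result back.

```r
library(V8)

ct <- v8()                      # create a fresh JavaScript context
ct$assign("hp", mtcars$hp)      # copy an R numeric vector into JavaScript as an array

# Sum the array on the JavaScript side.
ct$eval("var total = hp.reduce(function(a, b) { return a + b; }, 0);")

total <- ct$get("total")        # bring the result back into R
```

Data moves between the two languages as JSON, which is why plain vectors, lists and data frames are the easiest things to pass back and forth.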
An R user can load an htmlwidgets library and generate a web based plot by calling a function that looks like any other R plotting function. For example, after installing and loading the threejs package, which wraps the three.js JavaScript library, a few lines of code will produce an interactive 3D scatter plot that can be displayed in a webpage, a markdown document or in the RStudio plot window. The following code generates a more contemporary version of the rotating 3D scatterplot.
library(stringr)
library(htmltools)
install.packages("devtools", repos = "http://cran.rstudio.com/")
devtools::install_github("bwlewis/rthreejs")
library(threejs)

data(mtcars)                          # load the mtcars data set
data <- mtcars[order(mtcars$cyl), ]   # sort the data set by cylinders for plotting
head(data)
uv <- tabulate(mtcars$cyl)            # uv[k] is the number of observations with k cylinders
col <- c(rep("red", uv[4]), rep("yellow", uv[6]), rep("blue", uv[8]))  # set the colors, one per cylinder group
row.names(mtcars)                     # see what models of cars are in the data set
scatterplot3js(data[, c(3, 6, 1)],    # columns disp, wt and mpg
               labels = row.names(data),  # mousing over a point will show what model car it is
               size = data$hp / 100,      # the size of a point maps to horsepower
               flip.y = TRUE,
               color = col,               # point color indicates number of cylinders
               renderer = "canvas")
# Note: labels and size are taken from the sorted data frame so that they line up with the plotted rows and colors.
This kind of visualization packs a lot of information into a relatively small space. Not only does the ability to rotate the plot produce a satisfying three-dimensional rendering, but using color, size and mouse movement to convey information provides three additional dimensions.
As exciting as this kind of visualization is, however, I don’t mean to imply that it is somehow going to make static graphics obsolete. Rob Kabacoff's 2012 post using the scatterplot3d package provides an example of a 3D scatterplot of the mtcars data that has a timeless, elegant look and clearly displays the data without distraction.
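For comparison, a static plot in that spirit takes only a few lines. This is a minimal sketch assuming the scatterplot3d package is installed; the argument choices and the names cyl_palette and s3d are illustrative and are not claimed to reproduce the post mentioned above.

```r
library(scatterplot3d)

# Illustrative palette keyed by cylinder count.
cyl_palette <- c("4" = "red", "6" = "darkorange", "8" = "blue")

s3d <- scatterplot3d(
  x = mtcars$wt, y = mtcars$disp, z = mtcars$mpg,
  color = unname(cyl_palette[as.character(mtcars$cyl)]),
  pch = 16,        # solid points
  type = "h",      # vertical lines from each point down to the base plane
  xlab = "wt", ylab = "disp", zlab = "mpg",
  main = "mtcars: static 3D scatterplot"
)
```

The vertical drop lines do some of the work that rotation does in the interactive version, since they anchor each point's position on the base plane.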
Nevertheless, I am betting on htmlwidgets moving forward to be the next big thing. Not only are they easy to use, but the developers have created a framework for developing new widgets that hides most of the details of JavaScript bindings and the like. Currently, there are only a few ready to use widgets listed at the htmlwidgets.org showcase, so we will have to see if the R community embraces this technology.
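To give a rough idea of what that framework asks of a widget author on the R side, here is a minimal sketch built on htmlwidgets::createWidget(). The widget name hello_widget, its message field and the package name hellowidget are all hypothetical; a real widget also needs a matching JavaScript binding file shipped inside that package before it can render.

```r
library(htmlwidgets)

# Hypothetical constructor for a widget called "hello_widget".
hello_widget <- function(message, width = NULL, height = NULL) {
  x <- list(message = message)   # payload that is serialized to JSON for the JavaScript side
  createWidget(
    name = "hello_widget",       # must match the name used by the JavaScript binding
    x = x,
    width = width,
    height = height,
    package = "hellowidget"      # hypothetical package that would hold the binding
  )
}

w <- hello_widget("Hello from R")
class(w)
```

The R function only packages data and sizing information; drawing is left to the JavaScript binding, which is the part the framework keeps out of the R user's way.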
In the meantime, for inspiration, have a look at Bryan Lewis' presentation at the recent NY R conference and the examples of widgets listed on his last slide.
I started using some of the HTML widgets and packages about a month back. Users love the output these widgets produce. I think the developers have done a good job.
What's needed is a stable release of the different packages that work together with a specified version of R that can be used in production. For example, in the Hadoop ecosystem, "approved" release levels of the different components that will work together are listed. You know that if you have your different components installed at the specified level, your workflow is solid. I understand that development and communications are not tightly coordinated in the R community.
I'm on R 3.1.1 and have experienced a few error messages that "this package is not available for your installed version of R." I've had to hack at the command line to get some packages installed. It's taken some time.
It's still kind of a wild west, evolving fast, and programmers should expect to have to take the time to hack and figure out how to get dependencies installed, etc.
Once there is some stability, more knowledge sharing via blog posts to make it easier to install and develop with the widgets, and a kind of "ok, it's safe enough for production," the use of these widgets is going to explode.
A big thank you and kudos to the developers!
Posted by: Phillip Burger | May 15, 2015 at 09:57