by Joseph Rickert
One of the greatest strengths of the R language is surely its base graphics capabilities. Grid graphics, lattice, ggplot2, bigvis and the many R packages that interface with JavaScript D3 graphics have added astounding capabilities, well beyond what can be achieved with base graphics alone. Nevertheless, the quick, one-line base graphics plots (like plot()) are a tremendous aid to data exploration, and are responsible for a good bit of the "flow" of an R session. Consider the left panel in the plot below, the one-line scatter plot of the variables disp and mpg from the mtcars data set, made with just the default settings: plot(disp,mpg). One could argue that this quick plot tells you nearly everything you need to know about the data set.
Even for beginners, it's pretty easy to produce a basic plot and learn something about a data set. The trouble starts, however, when you want to show the plot to someone else. And that is not the documentation's "fault". The documentation is all there. Typing help(package="graphics") at the command line will put almost everything you need to know about base graphics within your reach. However, I think that until a person gets used to thinking of R as a system of interacting functions, it takes a bit of ingenuity for a beginner to figure out how to accomplish a basic "functional" task like producing a plot that you can show around. So, in the spirit of trying to make things a little easier, I have collected a few resources that might be helpful in mastering base graphics.
First, here is the code to create the right panel of the figure above. I don't pretend that this is really any improvement in either aesthetics or clarity, but it does make a nice abbreviated "cheat sheet" for some of the more common things one might want to do with a base plot.
# Some handy plotting parameters
attach(mtcars)
par(mfrow = c(1,2))                        # Put 2 plots on the same device
plot(disp,mpg)
plot(disp,mpg,
     main = "MPG vs. Displacement",        # Add a title
     type = "p",                           # Plot points (the default type)
     col = "grey",                         # Change the color of the points
     pch = 16,                             # Change the plotting symbol, see help(points)
     cex = 1,                              # Change size of plotting symbol
     xlab = "Displacement (cu. in)",       # Add a label on the x-axis
     ylab = "Miles per Gallon",            # Add a label on the y-axis
     bty = "n",                            # Remove the box around the plot
     #asp = 1,                             # Change the y/x aspect ratio, see help(plot)
     font.axis = 1,                        # Axis font: 1 = plain, 2 = bold, 3 = italic, 4 = bold italic
     col.axis = "black",                   # Set the color of the axis
     xlim = c(85,500),                     # Set limits on x axis
     ylim = c(10,35),                      # Set limits on y axis
     las = 1)                              # Make axis tick labels always horizontal
abline(lm(mpg ~ disp),                     # Add regression line y ~ x
       col = "red",                        # Regression line color
       lty = 2,                            # Use dashed line
       lwd = 2)                            # Set thickness of the line
lines(lowess(mpg ~ disp),                  # Add lowess line y ~ x
      col = "dark blue",                   # Set color of lowess line
      lwd = 2)                             # Set thickness of the lowess line
leg.txt <- c("red = lm", "blue = lowess")  # Text for legend
legend(list(x = 180,y = 35),               # Set location of the legend
       legend = leg.txt,                   # Specify text
       col = c("red","dark blue"),         # Set colors for legend
       lty = c(2,1),                       # Set type of lines in legend
       merge = TRUE)                       # Merge points and lines
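For readers who would rather not attach() a data frame, the same two-panel layout can be sketched with the formula interface and a data argument. This is only a minimal variation on the code above, not a replacement for it; the names old_par and fit are made up for this illustration. Saving the value returned by par() makes it easy to restore the previous layout afterwards.

```r
# A sketch of the two-panel plot without attach(); old_par and fit are illustrative names
old_par <- par(mfrow = c(1, 2))          # par() returns the previous settings invisibly

plot(mpg ~ disp, data = mtcars)          # default scatter plot via the formula interface

plot(mpg ~ disp, data = mtcars,
     main = "MPG vs. Displacement",
     pch = 16, col = "grey", bty = "n", las = 1,
     xlab = "Displacement (cu. in)",
     ylab = "Miles per Gallon")
fit <- lm(mpg ~ disp, data = mtcars)     # linear fit, kept so it can be inspected later
abline(fit, col = "red", lty = 2, lwd = 2)
lines(lowess(mtcars$disp, mtcars$mpg),   # lowess(x, y) smooth
      col = "dark blue", lwd = 2)

par(old_par)                             # restore the single-panel layout
```

Because nothing is attached here, the columns have to be reached through data = mtcars or mtcars$disp, which keeps the workspace free of masked variable names.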
The Quick-R page on Graphical Parameters is very helpful, and the R Programming/Graphics page on wikibooks.org is very nicely done. It might be easier to memorize the code they provide to plot out the basic plotting symbols than to remember the pch values themselves.
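As an illustration of the idea, here is a minimal sketch that draws the standard plotting symbols (pch values 0 through 25) in a labelled grid. It is not the code from either of the pages mentioned above; the grid layout and the variable names pch_values, grid_x and grid_y are choices made for this example.

```r
# Draw the 26 standard plotting symbols (pch = 0 to 25) in a labelled grid
pch_values <- 0:25
grid_x <- pch_values %% 7            # column position of each symbol (0 to 6)
grid_y <- 3 - pch_values %/% 7       # row position, top row first (3 down to 0)

plot(grid_x, grid_y,
     pch = pch_values,               # one symbol per point
     cex = 2,                        # make the symbols large enough to compare
     col = "black",
     bg = "grey",                    # fill color, only used by pch 21 to 25
     xlim = c(-0.5, 6.5), ylim = c(-0.5, 3.5),
     axes = FALSE, xlab = "", ylab = "",
     main = "Plotting symbols: pch = 0 to 25")
text(grid_x, grid_y, labels = pch_values,
     pos = 1, offset = 1)            # print each pch value below its symbol
```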
Maindonald and Braun, who include concise but informative sections on both base graphics and lattice (trellis) graphics in their book Data Analysis and Graphics Using R, stress the importance of paying attention to aspect ratio. They offer a simple example of a plot that hides the pattern in the data until the aspect ratio is set to a reasonable value. Try running this code with the default value of the aspect ratio:
plot((1:50)*0.92,sin((1:50)*0.92))
and then again with the aspect ratio set to something around 2 or 3.
plot((1:50)*0.92,sin((1:50)*0.92),asp=3)
Note that the code for all of the plots in Maindonald and Braun is available on this website.
Once you get the hang of things, it is also relatively straightforward to do some fairly sophisticated things with basic R plots. This post by David Smith from a couple of years ago highlights the Vistat cheat sheet for mathematical annotation, and a 2012 post by Winston Chang shows how to use your favorite fonts in R charts.
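To give a flavor of mathematical annotation, here is a small sketch using R's built-in plotmath facility (see help(plotmath)). It is not taken from the posts mentioned above; the choice of the standard normal density and the name density_label are assumptions made for this example.

```r
# Annotate a plot of the standard normal density with plotmath expressions
x <- seq(-3, 3, length.out = 200)
y <- dnorm(x)

# The formula for the density, written as a plotmath expression
density_label <- expression(phi(x) == frac(1, sqrt(2 * pi)) * e^{-x^2 / 2})

plot(x, y, type = "l", las = 1,
     xlab = expression(x),
     ylab = expression(phi(x)),
     main = expression(paste("Standard normal density, ",
                             mu == 0, ", ", sigma == 1)))
text(-3, 0.35, density_label, adj = 0)   # left-align the formula in the empty upper-left corner
```

Inside expression(), names such as phi, mu and sigma are drawn as Greek letters, and functions such as frac() and sqrt() are typeset rather than evaluated.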
I found trying to change the background color of a base plot to be a vexing exercise, but this code, based on some advice from Marc Schwartz, will plot the unadorned scatter plot above with a grey background and a light grey plot area.
# Set the background color to grey
par(bg = "grey")
# Set up the axes and plot region without drawing the points
plot(disp, mpg, type = "n")
# Now fill the plot region with light grey
rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col = "light grey")
# Now plot the points on the existing window
points(disp, mpg)
A recent post by Derek Ogle points to a draft chapter on Plotting Fundamentals for base graphics from his upcoming book: Introductory Fisheries Analyses with R. The chapter is very well done and it looks like the book will be a real contribution. The example he provides on constructing a scatter plot with different symbols by group is very useful.
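The general technique is easy to sketch with the mtcars data used earlier. The code below is not Ogle's example; it is a minimal illustration in which the grouping variable (number of cylinders) and the particular symbols and colors in group_pch and group_col are arbitrary choices.

```r
# Scatter plot with a different symbol and color for each group
cyl_groups <- factor(mtcars$cyl)              # groups: 4, 6 or 8 cylinders
group_pch <- c(16, 17, 15)                    # circle, triangle, square (arbitrary choice)
group_col <- c("black", "red", "dark blue")   # one color per group (arbitrary choice)

# Indexing by the integer codes of the factor picks one symbol and color per car
plot(mtcars$disp, mtcars$mpg,
     pch = group_pch[as.integer(cyl_groups)],
     col = group_col[as.integer(cyl_groups)],
     las = 1,
     xlab = "Displacement (cu. in)",
     ylab = "Miles per Gallon",
     main = "MPG vs. Displacement by Cylinders")
legend("topright",
       legend = paste(levels(cyl_groups), "cylinders"),
       pch = group_pch, col = group_col, bty = "n")
```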
For some perceptive and timeless tips on "making your R graphics look their best" please have a look at this early post by David Smith.
For some good reading: Paul Murrell's classic book, R Graphics, is a lucid and comprehensive account of base, trellis and grid graphics. Paul is a member of the R core group and has written many graphics-related packages. Note that most of the plots from Paul's book are available on his website.
Finally, the definitive source of information on base graphics is the R documentation. The graphics task view gives an overview of what packages are available and how they fit together. Chapter 12, Graphical Procedures, of An Introduction to R is a concise but very approachable guide.
Au contraire - having tried all the graphing packages, I use base R as my first choice for all serious work, because I retain control over everything and can create all manner of effects with a little trickery here and there. And if you want more, save it as PDF and edit it in Inkscape / Illustrator. I find ggplot2 and friends to be the quick 'n' dirty options.
Posted by: Robert Grant | January 16, 2015 at 14:22