Apparently, there's some kind of football game going on here in the US this weekend. Strangely though, the ball isn't round. The playing field isn't even oval. No, this is American Football.
Posted by David Smith at 10:46 in sports, statistics | Permalink | Comments (0) | TrackBack (0)
png(file="mygraphic.png",width=400,height=350)
plot(x=rnorm(10),y=rnorm(10),main="example")
dev.off()
png(file="animals72.png",width=400,height=350,res=72)
plot(Animals, log="xy", type="n", main="Animal brain/body size")
text(Animals, lab=row.names(Animals))
dev.off()
R is assuming the graph area is about 5.56 inches across (400 pixels at the default 72 pixels per inch), so the default text size is large relative to the graph itself. You can correct this with the res= argument to png, which specifies the number of pixels per inch. The smaller this number, the larger the plot area in inches, and the smaller the text relative to the graph itself. Let's see what happens when you drop this down to 45 pixels per inch:
png(file="animals45.png",width=400,height=350,res=45)
plot(Animals, log="xy", type="n", main="Animal brain/body size")
text(Animals, lab=row.names(Animals))
dev.off()
Note the title is smaller, and the text labels are smaller too, making for a less-crowded plot. I like to choose a resolution that gives me an X dimension in the 8-10 inches range (here 400/45 = 8.89 inches).
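The arithmetic behind that choice is easy to check in R itself. The following is a minimal sketch; the helper name png_width_inches is made up for illustration and is not part of R.

```r
# Hypothetical helper: the physical width (in inches) that R assumes for a
# PNG device that is `width_px` pixels wide at `res` pixels per inch.
png_width_inches <- function(width_px, res) width_px / res

png_width_inches(400, 72)  # about 5.56 inches at the default resolution
png_width_inches(400, 45)  # about 8.89 inches at res=45

# Values of res= that put a 400-pixel-wide graph in the 8-10 inch range.
res_range <- 400 / c(10, 8)
res_range  # 40 50
```

Any res= between 40 and 50 therefore keeps a 400-pixel graph inside the 8-10 inch target.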
png(file="notitle.png",width=400, height=350)
par(mar=c(5,3,2,2)+0.1)
hist(rnorm(100),ylab=NULL,main=NULL)
dev.off()
In this version, the text is much easier to read and the lines appear smoother.
If you don't have anti-aliasing on your system (and can't recompile R to enable it), you can use the poor-man's anti-aliasing trick: generate the graph in double the resolution, and display it at half the size. The browser will handle the anti-aliasing, at the expense of additional bandwidth for your graphic.
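As a rough sketch of that trick, the wrapper below opens a PNG device with twice the pixels. The name png_2x is hypothetical, and doubling res= along with width= and height= is an assumption on my part: it keeps the text and lines the same size relative to the graph as in the single-size version.

```r
# Hypothetical wrapper: open a PNG device with double the pixel dimensions.
# Doubling res= as well keeps text and line sizes in proportion.
png_2x <- function(file, width, height, res = 72, ...) {
  png(file = file, width = 2 * width, height = 2 * height, res = 2 * res, ...)
}

out <- file.path(tempdir(), "cars2x.png")
png_2x(out, width = 400, height = 350, res = 45)  # 800 x 700 pixels, 90 per inch
plot(cars, main = "Speed and stopping distance")
dev.off()

# In the web page, display the image at half its pixel size, for example:
#   <img src="cars2x.png" width="400" height="350">
```

The file is roughly four times the pixel count of the single-size version, which is the bandwidth cost mentioned above.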
Of course, the most important tip for making your graph look good is: make a good-looking graph! Graphical display of quantitative data is in some ways more art than science, but as a general rule it takes time and effort to make a truly effective display that lets your data tell the story it needs to tell. Fortunately, R provides you with all the tools you need to pull out all the details, make the right comparisons, and make the results pleasing to the eye. Don't be satisfied with the "stock" graphs from the top-level functions like plot or hist. Make liberal use of the annotation functions like text and lines, and experiment with choices of color, layout, and size.
There are many good resources for learning about making good graphical displays, but my favorite is Tufte's classic: The Visual Display of Quantitative Information. Not only is it chock-full of wonderful examples and sensible guidelines for displaying data, it makes a beautiful coffee-table book to show your non-statistician friends that Statistics is about more than just numbers.
If you want to download the scripts that generated the graphs in this article, you can get them here:
Download graphexamples.R (1.4K)
Posted by David Smith at 16:27 in advanced tips, graphics, R | Permalink | Comments (15) | TrackBack (0)
One of the most distinctive and powerful aspects of R is its ability to create statistical graphics beyond the limited palette found in off-the-shelf graphing tools like Excel. Especially for novices of data presentation, it can be difficult to grasp how much more meaning can be extracted from data when you have the tools to combine science and art creatively to create unique visualizations. (As an aside, I've been pleased to see that this is an idea that has been coming into the mainstream recently: the New York Times, for example, has in recent years had some truly outstanding displays of data, both static and interactive. There was a fascinating article about the people behind those graphics in New York magazine last week.)
Posted by David Smith at 09:59 in advanced tips, graphics, R | Permalink | Comments (3) | TrackBack (0)
REvolution R, the high-performance distribution of R from REvolution Computing, is now available for download for Windows and MacOS X systems from the REvolution Computing website. (The software has actually been available for a little while, but has only been formally announced in a press release today.)
Posted by David Smith at 14:51 in announcements, Revolution | Permalink | Comments (0) | TrackBack (0)
Andrew Abela: Choosing a good chart.
Posted by David Smith at 14:50 in graphics | Permalink | Comments (0) | TrackBack (0)
The Bay Area UseR Group will be meeting in San Francisco on Wednesday, February 18 at 7:30PM. The featured event will be a panel discussion: "pRediction: A quick survey of prediction methods in R". The panel members will include:
Posted by David Smith at 13:31 in events | Permalink | Comments (0) | TrackBack (0)
R is fast becoming a powerful tool for high-performance computing: the art of making computational problems that take a long time to process run faster through the use of multiprocessor computers or computer clusters.
Posted by David Smith at 12:50 in advanced tips, high-performance computing, R | Permalink | Comments (0) | TrackBack (0)
In what has become an ongoing series of R tutorials, here's another Introduction to R document, by James Monogan. If you're familiar with interactive programming (but not R), but don't have a lot of time, this might be the introduction for you: it takes you through the basics of R at an efficient clip. In its 26 pages you'll learn how to:
Posted by David Smith at 08:13 in beginner tips, R | Permalink | Comments (0) | TrackBack (0)
Michael Friendly asks an interesting question on the r-help list: how can you generate a title where the words are in different colors, like this:
Hair color and Eye color
(Michael suggests a title like this might serve as an implicit legend for the points plotted in the graph below the title.)
The title function allows you to change the color of the text using the col argument, but that color is applied to the entire text string -- there's no obvious way to set the color of individual words.
Or is there? Barry Rowlingson offers an elegant solution that uses the "overhead transparency" principle of R graphics: you can overlay additional graphical elements one atop another, to build up your graph layer by layer. So you could add the title Hair color in red on the left, and Eye color in blue on the right, and put a black "and" in the middle. The trick is in the positioning -- it could take a lot of trial and error to get the x position of each element correct. But if you plot the same text three times in three different colors, but leave some words blank (so they won't overlay previously plotted elements) you don't have to worry about positioning at all. The phantom notation allows you to do that, as shown in Barry's solution:
plot(rnorm(20),rnorm(20),col=rep(c("red","blue"),c(10,10)))
title(expression("Hair color" *
phantom(" and Eye color")),col.main="red")
title(expression(phantom("Hair color and ") *
"Eye color"),col.main="blue")
title(expression(phantom("Hair color ") *
"and " * phantom("Eye color")),col.main="black")
The phantom notation means "leave room for this, but don't draw it" -- see help(plotmath) for other examples. Barry also provides a function multiTitle to create multicolor titles in a single command:
multiTitle(color="red","Hair color", color="black",
" and ",color="blue","Eye color")
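Barry's own multiTitle is in the download linked below. To illustrate the idea only, here is a hypothetical sketch of how the phantom layers could be generated programmatically; the names title_layers and layered_title are mine, and the approach (building the expressions with bquote) is an assumption, not necessarily how multiTitle works.

```r
# Hypothetical sketch (not Barry's multiTitle): build one plotmath
# expression per word, with every other word wrapped in phantom().
title_layers <- function(words) {
  lapply(seq_along(words), function(i) {
    parts <- lapply(seq_along(words), function(j) {
      if (j == i) words[[j]] else bquote(phantom(.(words[[j]])))
    })
    # Join the parts with *, plotmath's juxtaposition operator.
    Reduce(function(a, b) bquote(.(a) * .(b)), parts)
  })
}

# Draw each layer in its own color; the phantoms keep the words aligned.
layered_title <- function(words, colors) {
  layers <- title_layers(words)
  for (i in seq_along(layers)) {
    title(main = layers[[i]], col.main = colors[i])
  }
  invisible(layers)
}

plot(rnorm(20), rnorm(20), col = rep(c("red", "blue"), c(10, 10)))
layered_title(c("Hair color", " and ", "Eye color"),
              c("red", "black", "blue"))
```

Each call to title draws the full-width string, so no x positions need to be computed by hand.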
Another solution (suggested by Duncan Murdoch) is to use the strwidth function to calculate the widths of words and use this information to set the x position of individual words, as demonstrated in his technicolorTitle function. However, as this solution is implemented using the mtext function, the results can be slightly different from what title usually produces.
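The width-based approach can be sketched along the following lines. This is not Duncan's technicolorTitle (that is in the download below); the names word_starts and width_title are hypothetical, and the defaults (line 2, cex 1.2, bold font) are assumptions chosen to roughly resemble a standard title on a single plot with linear axes.

```r
# Hypothetical helper: left edge of each word so that the words, laid end
# to end, are centered on `center`.
word_starts <- function(widths, center) {
  left <- center - sum(widths) / 2
  left + cumsum(c(0, widths[-length(widths)]))
}

# Sketch of a strwidth-based title (not Duncan's technicolorTitle).
width_title <- function(words, colors, line = 2, cex = 1.2, font = 2) {
  # Width of each word in user (x-axis) coordinates.
  widths <- strwidth(words, units = "user", cex = cex, font = font)
  # Center the words over the middle of the plot region.
  x <- word_starts(widths, center = mean(par("usr")[1:2]))
  mtext(words, side = 3, line = line, at = x, adj = 0,
        col = colors, cex = cex, font = font)
}

plot(rnorm(20), rnorm(20), col = rep(c("red", "blue"), c(10, 10)))
width_title(c("Hair color", " and ", "Eye color"),
            c("red", "black", "blue"))
```

Because the words are placed with mtext rather than title, their vertical position and size may not match a regular title exactly.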
You can download the code to create the graph above, and for the multiTitle and technicolorTitle functions here: Download colortitles.R (2.3K)
Posted by David Smith at 14:36 in advanced tips, R | Permalink | Comments (6) | TrackBack (0)
For newcomers to R who have at least a basic background in the principles of statistical analysis, John Maindonald has contributed an introductory guide to R: Using R for Data Analysis and Graphics. It uses a series of data sets and example R code to take the beginning user through launching R (on a Windows system; installing R is not covered), executing simple commands at the command-line, understanding objects and R function calls, graphics, and some simple statistical modeling techniques. Later chapters do touch on some more advanced modeling methods and how to program your own functions, but these sections can safely be ignored by the beginning R user.
Posted by David Smith at 16:01 in beginner tips, R | Permalink | Comments (1) | TrackBack (0)