Last month we showcased the JSM Data Expo, where the winning entry was a visualization of airline delays represented as a color-coded calendar. That graphic was created in SAS, but now, thanks to reader Paul Bleicher, we can show you how to create the same graphic in R.
Paul Bleicher, MD, PhD, is Chief Medical Officer at Humedica, a next-generation clinical informatics company that provides novel business intelligence solutions to the health care and life science industries. Paul is leading a team that uses R extensively for a wide variety of predictive analytics and data visualization applications with medical record data. Paul has been kind enough to share his R code, which takes a sequence of numeric values indexed by date and represents them as a calendar, with each day filled with a color representing its value. It's easier to explain by example: let's download Microsoft's stock price from 2006 to date from Yahoo, and plot it using Paul's calendarHeat function:
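# calendarHeat() must be defined first, e.g. by source()-ing the calendarHeat.R file linked at the end of this post
stock <- "MSFT"
start.date <- "2006-01-12"
end.date <- format(Sys.Date())  # today's date as a "YYYY-MM-DD" string
# Build the Yahoo CSV query. The b/c and e/f fields are the start and end day/year;
# the ticker and the a= and d= month fields are restored here by assumption from that pattern.
quote <- paste("http://ichart.finance.yahoo.com/table.csv?s=", stock,
               "&a=", substr(start.date, 6, 7),
               "&b=", substr(start.date, 9, 10),
               "&c=", substr(start.date, 1, 4),
               "&d=", substr(end.date, 6, 7),
               "&e=", substr(end.date, 9, 10),
               "&f=", substr(end.date, 1, 4),
               sep="")
# Read the daily quotes, keeping the Date column as character strings
stock.data <- read.csv(quote, as.is=TRUE)
calendarHeat(stock.data$Date, stock.data$Adj.Close, varname="MSFT Adjusted Close")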
Pretty cool (click the calendar to see the details better). We used financial data here because it's easier to access than the airline data, but it's actually a pretty interesting way of looking at a financial time series. Weekend and holiday effects are a bit more obvious, and it's a bit like being able to see the daily, weekly, monthly and yearly closes all at once (by scanning your eye over the calendar in different directions).
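To make the "scan in different directions" idea concrete, here is a minimal sketch of how dates can be mapped to calendar-grid coordinates, with one row per weekday and one column per week of the year. The helper name calendar_cells is hypothetical and is not part of calendarHeat.R; Paul's function may lay out its grid differently.

```r
# Hypothetical helper (not from calendarHeat.R): compute the calendar cell
# for each date, so values can be drawn week by week and weekday by weekday.
calendar_cells <- function(dates, date.form = "%Y-%m-%d") {
  d <- as.Date(dates, format = date.form)
  data.frame(
    date    = d,
    year    = as.integer(format(d, "%Y")),
    month   = as.integer(format(d, "%m")),
    week    = as.integer(format(d, "%U")),  # week of year, weeks start on Sunday
    weekday = as.integer(format(d, "%w"))   # 0 = Sunday ... 6 = Saturday
  )
}

cells <- calendar_cells(c("2006-01-12", "2006-01-13", "2006-01-16"))
print(cells)
```

Reading down a column of such a grid compares the days within one week, while reading across a row compares the same weekday from week to week.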
The calendarHeat function takes two primary arguments. The first, dates, is a vector of dates. You can provide these in POSIXlt format or as character strings, as in the example above (where they were converted using the default "%Y-%m-%d" value of the optional date.form argument). The second is a vector of numeric values (here, stock closing prices), which by default are converted into a scale of 99 shades from red to green. You can control the number of shades with the ncolors option, and the options colors="r2b" and colors="w2b" give you the alternative color schemes red-to-blue and white-to-blue, respectively.
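As a rough illustration of what "converted into a scale of 99 shades" can mean, the sketch below bins values into equal-width intervals and looks each bin up in a red-to-green palette. This assumes a linear scale and a simple two-color ramp; the function value_to_shade and the sample prices are made up for the example, and calendarHeat itself may build its palette differently.

```r
library(grDevices)  # colorRampPalette(); attached by default in most R sessions

# Illustrative only (not the calendarHeat implementation): map numeric values
# to one of ncolors shades on a linear scale from the first to the last ramp color.
value_to_shade <- function(values, ncolors = 99, ramp = c("red", "green")) {
  palette <- colorRampPalette(ramp)(ncolors)
  # cut() splits the range of values into ncolors equal-width bins and
  # returns the bin index of each value (NA stays NA)
  bin <- cut(values, breaks = ncolors, labels = FALSE, include.lowest = TRUE)
  palette[bin]
}

closes <- c(21.5, 24.0, 27.3, 30.1, 35.2)  # made-up closing prices
shades <- value_to_shade(closes)
print(shades)
```

The smallest value lands in the first (red) bin and the largest in the last (green) bin; passing a smaller ncolors gives coarser steps, and a different ramp such as c("red", "blue") gives a different scheme.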
The source code for the calendarHeat function is attached in the link below, licensed as open-source under GPL v2. Share and enjoy. And if someone downloads the airline data and reproduces this chart, let us know!
Update Dec 3: With thanks to Jon Egil Strand in the comments, the calendarHeat.R source file has been updated to fix an off-by-one error.
Paul Bleicher: calendarHeat.R (source code for calendarHeat function)