Last month we showcased the JSM Data Expo, where the winning entry was a visualization of airline delays represented as a color-coded calendar. That graphic was created in SAS, but now thanks to reader Paul Bleicher, we can show you how to create the same graphic in R.
Paul Bleicher, MD PhD, is Chief Medical Officer at Humedica, a next-generation clinical informatics company that provides novel business intelligence solutions to the health care and life science industries. Paul is leading a team that is using R extensively for a wide variety of predictive analytics and data visualization applications with medical record data.
Paul has been kind enough to share his R code, which takes a sequence of numeric values indexed by date and represents them as a calendar with the days filled with colors representing the values. It's easier to explain by example: let's download Microsoft's stock price from 2006 to date from Yahoo, and plot it using Paul's calendarHeat function:
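# Download daily quotes for one ticker from Yahoo Finance as CSV
stock <- "MSFT"
start.date <- "2006-01-12"
end.date <- Sys.Date()
# Yahoo's a= and d= month parameters are zero-based (00 = January)
month0 <- function(d) sprintf("%02d", as.integer(substr(d, 6, 7)) - 1)
quote <- paste("http://ichart.finance.yahoo.com/table.csv?s=",
stock,
"&a=", month0(start.date),
"&b=", substr(start.date, 9, 10),
"&c=", substr(start.date, 1, 4),
"&d=", month0(end.date),
"&e=", substr(end.date, 9, 10),
"&f=", substr(end.date, 1, 4),
"&g=d&ignore=.csv", sep="")
stock.data <- read.csv(quote, as.is=TRUE)
# calendarHeat() must already be defined, e.g. by sourcing calendarHeat.R
calendarHeat(stock.data$Date, stock.data$Adj.Close, varname="MSFT Adjusted Close")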
Pretty cool (click the calendar to see the details better). We used financial data here because it's easier to access than the airline data, but it's actually a pretty interesting way of looking at a financial time series. Weekend and holiday effects are a bit more obvious, and it's a bit like being able to see the daily, weekly, monthly and yearly closes all at once (by scanning your eye over the calendar in different directions).
The calendarHeat function takes two primary arguments. The first, dates, is a vector of dates, which you can provide in POSIXlt format or as character strings, as in the example above (where they were converted using the default "%Y-%m-%d" value of the optional date.form argument). The second is a vector of numeric values (here, stock closing prices), which by default are converted into a scale of 99 shades from red to green. You can control the number of shades with the ncolors option, and the options colors="r2b" and colors="w2b" give you the alternative color schemes red-to-blue and white-to-blue, respectively.
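To make the idea of converting values into a scale of shades concrete, here is a minimal, self-contained sketch of one way to bin numbers into 99 red-to-green colors. It is not taken from calendarHeat.R and may differ from what the function does internally; the sample values and the names palette.r2g, bin and shade are made up for illustration.

```r
# Illustrative sketch only: not the code used inside calendarHeat.
# Hypothetical sample values standing in for closing prices.
values  <- c(21.5, 24.0, 26.3, 30.1, 28.7)
ncolors <- 99

# Build a red-to-green palette with ncolors entries
palette.r2g <- colorRampPalette(c("red", "green"))(ncolors)

# Assign each value to one of ncolors equal-width bins over its range
bin <- cut(values, breaks = ncolors, labels = FALSE)

# Look up the shade for each value
shade <- palette.r2g[bin]
print(data.frame(values, bin, shade))
```

With this scheme the smallest value lands in the first (red) bin and the largest in the last (green) bin, so the color scale always spans the range of the data being plotted.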
The source code for the calendarHeat function is attached in the link below, licensed as open-source under GPL v2. Share and enjoy. And if someone downloads the airline data and reproduces this chart, let us know!
Update Dec 3: With thanks to Jon Egil Strand in the comments, the calendarHeat.R source file has been updated to fix an off-by-one error.
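The fix suggested in the comments replaces a merge with a direct lookup of each date in the full calendar sequence. The following is a small, self-contained sketch of that match-based fill, using three made-up observations rather than the stock data; the variable names follow the snippet in the comment, and everything else is an assumption for illustration.

```r
# Hypothetical example data: three observations with a gap on 2009-11-03
dates  <- c("2009-11-02", "2009-11-04", "2009-11-05")
values <- c(28.6, 28.1, 28.5)

dates    <- as.Date(dates)
min.date <- min(dates)
max.date <- max(dates)

# One row per calendar day, initially with no value
caldat <- data.frame(date.seq = seq(min.date, max.date, by = "days"),
                     value = NA)

# Put each value on the row whose date matches exactly
caldat$value[match(dates, caldat$date.seq)] <- values
print(caldat)
```

Days with no observation keep NA, and each value stays on its own date because the lookup is by exact match.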
Paul Bleicher: calendarHeat.R (source code for calendarHeat function)
Nice, thanks! It's amazing how giving the R community is. Thanks for the blog.
Posted by: Jay | November 03, 2009 at 08:49
Here's a quick rendition in ggplot2:
Posted by: Hadley | November 03, 2009 at 10:21
But maybe a better display of the same data is :
Posted by: Hadley | November 03, 2009 at 10:53
Your package allows such concise definitions of graphics, Hadley. This really demonstrates the power of your paradigm for ggplot yet again.
Posted by: Jay | November 03, 2009 at 12:01
Neat! I just used it to chart my iPhone app sales over time:
http://blog.planetaryscale.com/2009/11/04/iphone-app-sales-heatmap/
Posted by: Andrew Wooster | November 04, 2009 at 00:48
Would be interesting to see the % daily change (instead of closing price) in this format.
Posted by: Sean | November 08, 2009 at 12:30
A bug in the merge used in the original source code shifts the data by one day. Use this instead:
# Merge moves data by one day, avoid
caldat <- data.frame(date.seq = seq(min.date, max.date, by="days"), value = NA)
dates <- as.Date(dates)
caldat$value[match(dates, caldat$date.seq)] <- values
Posted by: Jon Egil Strand | November 28, 2009 at 14:31
Very nice. Even better - hook this into a web application, make the heat map a clickable image map taking you to data for that date.
Posted by: Neil | January 05, 2010 at 00:21
Thanks for showcasing this great tool. I outlined it in a blog on machine learning trading methods, and made a small modification to evaluate % daily changes instead of price values.
Posted by: intelligent trading | February 22, 2010 at 17:13
Thanks for posting and sharing this tool. I'm going to use it in meteorology. Thank you.
Posted by: rafael | August 30, 2011 at 15:53
Nice. Would you consider adding this to the R Graph Gallery?
Posted by: Romain François | November 11, 2011 at 03:09
Very appealing graphic. An optional argument to set the color range, such as ylim=c(nn,nn), would allow several calendar heat maps to be produced with the same color scale. This modification appears to be challenging.
One sloppy way to establish a y range would be to input a dummy data value at an obviously false date.
Posted by: Giles L Crane | December 05, 2011 at 07:43
When testing this I get the following error (RStudio 0.94.110 running R 2.14.0):
calendarHeat(stock.data$Date, stock.data$Adj.Close, varname="MSFT Adjusted Close")
Error in compute.layout(x$layout, cond.max.levels, skip = x$skip) :
Inadmissible value of layout.
Any ideas how to fix this? Seems it is the first call to print which gives this error.
Posted by: Robert Feldt | January 16, 2012 at 22:20
Robert,
If you're using this with the "2006-01-12" start date that's used in the example, then you will exceed the max levels allowed by the print call.
Try again with a more recent start date and you should be fine...
Posted by: Chris O'Brien | January 23, 2012 at 09:47
Is there a quick way to change the calendar heat map to a Monday-first, Sunday-last layout? Thanks.
Posted by: Radim Sevcik | March 29, 2012 at 07:20
Can someone help with building tooltips into this heat map using library(sendplot), so that when I click the cell for a date I can see the High/Low of the stock on that date?
Example code for this would be very welcome.
Posted by: Vijayan Padmanabhan | April 11, 2012 at 02:55
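Vijayan, you should be able to modify the code in the tutorial vignette to do what you need.
library(sendplot)  # provides imagesend()
# Standardize each mtcars column: subtract the mean, divide by the standard deviation
carsX <- as.matrix(mtcars)
carsX <- sweep(carsX, 2, colMeans(carsX, na.rm = TRUE))
sx <- apply(carsX, 2, sd, na.rm = TRUE)
carsX <- sweep(carsX, 2, sx, "/")
# Grid coordinates for image(): columns on x, rows on y
x <- 1:dim(carsX)[2] ; y <- 1:dim(carsX)[1] ; z <- t(carsX)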
plot.call = "image(x=x,y=y, z=z,
axes = FALSE, xlab = '', ylab = '');
axis(1,1:dim(carsX)[2],
labels=colnames(carsX),
las = 2, line = -0.5, tick = 0,cex.axis =1);
axis(4,1:dim(carsX)[1],
labels=rownames(carsX),
las = 2, line = -0.5, tick = 0,cex.axis =.8)"
mai.mat = matrix(c(1,.2,.2,1.5), ncol=4)
mai.prc = FALSE
xy.labels=list(value=round(carsX,3))
x.labels=data.frame(
label=colnames(carsX),
description=c("Miles/(US) gallon","Number of cylinders",
"Displacement (cu.in.)",
"Gross horsepower",
"Rear axle ratio",
"Weight (lb/1000)",
"1/4 mile time",
"V/S",
"Transmission (0 = automatic, 1 = manual)",
"Number of forward gears",
"Number of carburetors")
)
imagesend(plot.call=plot.call,
y.pos=y,x.pos=x,
mai.mat=mai.mat, mai.prc=mai.prc,
xy.type="image.midpoints",
x.labels=x.labels,
xy.labels = xy.labels,
image.size="800x600",
fname.root="exPlotImage",
font.size=18)
Posted by: The Dude | September 27, 2012 at 13:27
A simple fix to deal with the "Inadmissible value of layout" error is to replace:
layout = c(1, nyr%%7)
with
layout = c(1, nyr)
in the source. To accommodate the extra levels, you could decrease the spacing between levels slightly:
between = list(x=0, y=c(0.5, 0.5))
Alternatively, using:
layout = c(1, 7, nyr/7)
yields a multi-page plot, but would need to be tweaked to allow trellis.panelArgs() to function correctly.
Posted by: John B | March 20, 2013 at 16:10
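For readers wondering why the original layout expression fails with the example's 2006 start date, here is a brief sketch of the arithmetic. It assumes that nyr is the number of distinct calendar years in the data, which is a guess from the variable name and not confirmed by the source file; the names years, layout.original and layout.fixed are hypothetical.

```r
# Assumption: nyr counts the distinct calendar years being plotted.
# Data running from 2006 into 2012 spans seven calendar years.
years <- 2006:2012
nyr   <- length(years)

# Original expression: the remainder is 0 when nyr is a multiple of 7,
# so the layout would ask for zero rows
layout.original <- c(1, nyr %% 7)

# The replacement suggested above: one column, one row per year
layout.fixed <- c(1, nyr)

print(layout.original)
print(layout.fixed)
```

Under that assumption, any range covering exactly 7 (or 14, 21, ...) calendar years triggers the error, which would explain why a more recent start date avoids it.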