If you need to present two time series spanning the same period, but on wildly different scales, it's tempting to use a time series chart with two separate vertical axes, one for each series, like this one from the Reserve Bank of New Zealand:
Charts like this typically have one or more crossover points, and each crossing gives the viewer the sense that one series is now "ahead" of the other. One problem is that crossover points in dual-axis time series charts are entirely arbitrary. Changing either the left-hand or right-hand scale (and replotting the data accordingly) will change where the crossover points appear. And if (as is often the case) the scales are automatically chosen to allow each series to use the full vertical space available, just changing the time range of the data plotted will also change the location of the crossover points.
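To see how arbitrary the crossover is, here is a minimal sketch in plain Python (no plotting library). The data are made up, and the helper names `to_plot_height` and `crossover_indices` are hypothetical, chosen only for this illustration. Each series is mapped to a 0-to-1 vertical position using its own axis limits, which is what a dual-axis chart does, and then the right-hand axis limits are changed while the data stay the same.

```python
def to_plot_height(values, axis_min, axis_max):
    """Map data values to a 0-1 vertical position on an axis with the given limits."""
    return [(v - axis_min) / (axis_max - axis_min) for v in values]


def crossover_indices(left_pos, right_pos):
    """Return each index i where the two lines swap order between i-1 and i."""
    diffs = [l - r for l, r in zip(left_pos, right_pos)]
    return [i for i in range(1, len(diffs)) if diffs[i - 1] * diffs[i] < 0]


# Made-up data: one rising series (left axis), one falling series (right axis).
left = [10, 12, 14, 16, 18, 20]
right = [500, 480, 460, 440, 420, 400]

left_pos = to_plot_height(left, 10, 20)

# Same data, two different choices of right-hand axis limits.
crossover_a = crossover_indices(left_pos, to_plot_height(right, 400, 500))
crossover_b = crossover_indices(left_pos, to_plot_height(right, 0, 500))

print(crossover_a)  # [3]
print(crossover_b)  # [5]
```

Nothing about the data changed between the two calls. Only the right-hand axis limits did, and the apparent crossover moved from the fourth observation to the last one.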
In an excellent blog post, statistician Peter Ellis points out five problems with dual-axis time series charts:
- The designer has to make choices about scales and this can have a big impact on the viewer
- In particular, “cross-over points” where one series crosses another are results of the design choices, not intrinsic to the data, and viewers (particularly unsophisticated viewers) will not appreciate this and think there is more significance in cross over than is actually the case
- They make it easier to lazily associate correlation with causation, not taking into account autocorrelation and other time-series issues
- Because of the issues above, in malicious hands they make it possible to deliberately mislead
- They often look cluttered and aesthetically unpleasing
A simple alternative is to rescale both time series, for example by defining both series to have a nominal value at a specific time, say both starting at 100 on January 1, 2016. This is a useful way to compare the growth in two series since the beginning of the year, and means that both can be represented using the same single scale. (If you're using the ggplot2 package in R to plot time series, you can use the stat_index function from Peter's ggseas package to scale time series in this way.) The problem, though, is that you lose the interpretability of the chart, having now lost the true scales for both time series.
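The arithmetic behind this kind of indexing is simple enough to sketch in a few lines of plain Python. This is only an illustration of the idea, not the code behind stat_index; the function name `index_series`, its parameters, and the data are all made up for the example.

```python
def index_series(values, ref=0, basis=100.0):
    """Rescale a series so that the observation at position `ref` equals `basis`."""
    ref_value = values[ref]
    if ref_value == 0:
        raise ValueError("cannot index a series to a reference value of zero")
    return [basis * (v / ref_value) for v in values]


# Made-up series on wildly different scales.
series_a = [2.0, 2.5, 3.0]
series_b = [400.0, 300.0, 500.0]

indexed_a = index_series(series_a)
indexed_b = index_series(series_b)

print(indexed_a)  # [100.0, 125.0, 150.0]
print(indexed_b)  # [100.0, 75.0, 125.0]
```

Both series now start at 100 and can share one axis, but the axis shows "percent of the starting value" rather than the original units.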
All that being said, Peter suggests that there are times when a dual-axis chart can be appropriate, for example when the two axes are conceptually similar (as above, when both are linear monetary scales), and you use a consistent process to set the scales of the vertical axes. Other considerations include color-coding the axes for interpretability, and choosing colors that don't favor one series over the other. Implementing these best practices, Peter has created the dualplot() function for R, which chooses the axes according to a cross-over point you specify. This is equivalent to rescaling the series to have the same value at that specified point, but keeps the real-value axes for interpretability. Here's the above chart, rendered with dualplot() with a crossover point at January 2014:
For more great discussion of the pros and cons of dual-axis time series charts, and the R code for the dualplot() function, follow the link to Peter's blog post below.
Peter's stats stuff: Dual axes time series plots may be ok sometimes after all (via Harlan Harris)
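As a rough illustration of the crossover-point idea described above (and not the actual dualplot() code, which is in Peter's post), here is one way to pick right-hand axis limits so that two series meet at a chosen observation. It is a plain Python sketch that assumes positive-valued series; the function names and data are hypothetical.

```python
def matched_right_limits(left, right, ref, left_limits):
    """Scale the left-axis limits by the ratio of the two series at position `ref`.

    Assumes left[ref] is non-zero and both series are positive at `ref`.
    """
    ratio = right[ref] / left[ref]
    lo, hi = left_limits
    return (lo * ratio, hi * ratio)


def plot_height(value, limits):
    """Vertical position (0-1) of a value on an axis with the given limits."""
    lo, hi = limits
    return (value - lo) / (hi - lo)


# Made-up data; the lines should meet at the third observation (index 2).
left = [2.0, 2.5, 3.0, 4.0]
right = [400.0, 300.0, 600.0, 500.0]
left_limits = (0.0, 5.0)

right_limits = matched_right_limits(left, right, ref=2, left_limits=left_limits)

print(right_limits)                         # (0.0, 1000.0)
print(plot_height(left[2], left_limits))    # 0.6
print(plot_height(right[2], right_limits))  # 0.6
```

Because the right-hand limits are just the left-hand limits multiplied by a constant, this is the same as dividing the second series by that constant and plotting both on one scale, except that each axis still shows its own real units.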
Thanks, David, for the article. I’m also in favour of a selective approach to using dual axes in data analysis. In my case I had a dataset of two metrics with similar units but different value ranges, and I tried to show this in my blog post: http://datanrg.blogspot.ca/2016/05/analyzing-david-and-goliath-datasets-on.html
Posted by: Datanrg.blogspot.com | August 19, 2016 at 12:07
Nice post. I would like to have two x axes and one y. Do you know of a solution?
Milan
Posted by: Milan Cisty | August 20, 2016 at 00:55
Nice post, thanks for this! A commenter on my original post picked up an embarrassing error (although not material to the question at point) with the y axis label on my "fixed" chart. It should be "USD purchased with one NZD", not the other way around... I've fixed in the original, would you mind picking up a fresh copy so I don't have to wince if this is reproduced in future? http://ellisp.github.io/img/0051-dualgood.svg
Posted by: Peter Ellis | August 20, 2016 at 22:18
Wait. You show five excellent reasons not to use dual axis plots and then proceed using exactly those?
That seems ironic. Am I missing something?
My take away is: multipanel plots are usually the best solution.
"will often be my preferred approach to this type of data".
Also, adding a third series is easy...
Posted by: Berry | August 21, 2016 at 09:57
@Berry, check Peter's original post (last link in the above) for his excellent explanations of why dual-axis plots are usually bad, and the situations where they are useful. I usually prefer panel plots as well, but as Peter explains there are some downsides to those as well.
Posted by: David Smith | August 22, 2016 at 08:23
@Peter, you're welcome, and I've updated the chart above. Cheers!
Posted by: David Smith | August 22, 2016 at 08:25