by Daniel Hanson, QA Data Scientist, Revolution Analytics
Introduction and Data Setup
Last time, we included a couple of examples of plotting a single xts time series using the plot(.) function (i.e., the version of that function included in the xts package). Today, we’ll look at some quick and easy methods for overlaying multiple xts time series in a single graph. As this is not explicitly covered in the examples provided with xts and base R, this discussion may save you a bit of time.
To start, let’s look at five sets of cumulative returns for the following ETFs:
SPY: SPDR S&P 500 ETF Trust
QQQ: PowerShares NASDAQ QQQ Trust
GDX: Market Vectors Gold Miners ETF
DBO: PowerShares DB Oil Fund (ETF)
VWO: Vanguard FTSE Emerging Markets ETF
We first obtain the data using quantmod, going back to January 2007:
library(quantmod)
tckrs <- c("SPY", "QQQ", "GDX", "DBO", "VWO")
getSymbols(tckrs, from = "2007-01-01")
Then, extract just the closing prices from each set:
SPY.Close <- SPY[,4]
QQQ.Close <- QQQ[,4]
GDX.Close <- GDX[,4]
DBO.Close <- DBO[,4]
VWO.Close <- VWO[,4]
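As a small aside, picking the close by column position works because quantmod returns the columns in a fixed order, but selecting by column name may be a bit more self-documenting. The sketch below uses a made-up toy object (the TOY ticker and its values are assumptions for illustration, not downloaded data) to show that the two approaches pick out the same column:

```r
library(xts)

# Toy OHLC-style object using the "<ticker>.<field>" column naming
# that quantmod produces; the numbers are invented for illustration.
dates <- as.Date("2007-01-03") + 0:2
toy <- xts(cbind(TOY.Open  = c(10.0, 11.0, 12.0),
                 TOY.High  = c(11.0, 12.0, 13.0),
                 TOY.Low   = c( 9.0, 10.0, 11.0),
                 TOY.Close = c(10.5, 11.5, 12.5)),
           order.by = dates)

by_position <- toy[, 4]            # as done in the post
by_name     <- toy[, "TOY.Close"]  # same column, selected by name
```

quantmod also provides a Cl(.) helper for pulling out the closing column, which should give the same result on the downloaded objects.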
What we want is the set of cumulative returns for each, in the sense of the cumulative value of $1 over time. To do this, it is simply a case of dividing each daily price in the series by the price on the first day of the series. As SPY.Close[1], for example, is itself an xts object, we need to coerce it to numeric in order to carry out the division:
SPY1 <- as.numeric(SPY.Close[1])
QQQ1 <- as.numeric(QQQ.Close[1])
GDX1 <- as.numeric(GDX.Close[1])
DBO1 <- as.numeric(DBO.Close[1])
VWO1 <- as.numeric(VWO.Close[1])
Then, it’s a case of dividing each series by the price on the first day, just as one would divide an R vector by a scalar. For convenience of notation, we’ll just save these results back into the original ETF ticker names and overwrite the original objects:
SPY <- SPY.Close/SPY1
QQQ <- QQQ.Close/QQQ1
GDX <- GDX.Close/GDX1
DBO <- DBO.Close/DBO1
VWO <- VWO.Close/VWO1
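The two steps above (coerce the first price to numeric, then divide) can also be wrapped in a small helper and applied to a list of series. This is only a sketch on invented toy prices; cum_value and the AAA/BBB names are hypothetical and not part of xts or quantmod:

```r
library(xts)

# Hypothetical helper: the value over time of $1 invested on the
# first day of the series.
cum_value <- function(prices) prices / as.numeric(prices[1])

# Toy price series (made-up numbers) standing in for the ETF closes
dates <- as.Date("2007-01-03") + 0:3
toy_prices <- list(
  AAA = xts(c(50, 55, 45, 60), order.by = dates),
  BBB = xts(c(20, 19, 21, 30), order.by = dates)
)

# Apply the same normalization to every series in the list
toy_cum <- lapply(toy_prices, cum_value)
```

Each element of toy_cum is still an xts object, so it can be merged and plotted exactly as in the rest of this post.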
We then merge all of these xts time series into a single xts object (à la a matrix):
basket <- cbind(SPY, QQQ, GDX, DBO, VWO)
Note that is.xts(basket) returns TRUE. We can also have a look at the data and its structure:
> head(basket)
           SPY.Close QQQ.Close GDX.Close DBO.Close VWO.Close
2007-01-03 1.0000000  1.000000 1.0000000        NA 1.0000000
2007-01-04 1.0021221  1.018964 0.9815249        NA 0.9890886
2007-01-05 0.9941289  1.014107 0.9682540 1.0000000 0.9614891
2007-01-08 0.9987267  1.014801 0.9705959 1.0024722 0.9720154
2007-01-09 0.9978779  1.019889 0.9640906 0.9929955 0.9487805
2007-01-10 1.0012025  1.031915 0.9526412 0.9517923 0.9460847
> tail(basket)
           SPY.Close QQQ.Close GDX.Close DBO.Close VWO.Close
2014-01-10  1.302539        NA 0.5727296  1.082406 0.5118100
2014-01-13  1.285209  1.989130 0.5893833  1.068809 0.5053915
2014-01-14  1.299215  2.027058 0.5750716  1.074166 0.5110398
2014-01-15  1.306218  2.043710 0.5826177  1.092707 0.5109114
2014-01-16  1.304520  2.043941 0.5886027  1.089411 0.5080873
2014-01-17  1.299003  2.032377 0.6070778  1.090647 0.5062901
Note that we have a few NA values here. This will not be of any significant consequence for demonstrating plotting functions, however.
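For readers wondering where the NA values come from: when xts series with different date indexes are combined with cbind(.), the result contains the union of all dates, and any series with no observation on a given date gets an NA there. The sketch below reproduces this on two tiny made-up series (the names a, b and toy are assumptions for illustration) and shows a couple of ways one might inspect or handle the gaps if they did matter:

```r
library(xts)

# Two toy series whose dates only partly overlap (made-up values)
d1 <- as.Date(c("2007-01-03", "2007-01-04", "2007-01-05"))
d2 <- as.Date(c("2007-01-04", "2007-01-05", "2007-01-08"))
a <- xts(c(1.00, 1.01, 0.99), order.by = d1)
b <- xts(c(1.00, 1.02, 1.03), order.by = d2)

# cbind on xts objects aligns by date and fills missing cells with NA
toy <- cbind(a, b)
colnames(toy) <- c("A", "B")

# Count the NA values per column
na_counts <- colSums(is.na(coredata(toy)))

# Option 1: keep only the dates on which every series has a value
complete_rows <- na.omit(toy)

# Option 2: carry the last observation forward (leading NAs remain)
filled <- na.locf(toy, na.rm = FALSE)
```

Which option is appropriate, if any, depends on the analysis; for the plots in this post we simply leave the NA values in place.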
We will now look at how we can plot all five series, overlaid on a single graph. In particular, we will look at the plot(.) functions in both the zoo and xts packages.
Using plot(.) in the zoo package
The xts package is an extension of the zoo package, so coercing our xts object basket to a zoo object is a simple task:
zoo.basket <- as.zoo(basket)
Looking at head(zoo.basket) and tail(zoo.basket), we get output that looks the same as what we got for the original xts basket object, as shown above; the date-to-data mapping is preserved. The plot(.) function provided in zoo is very simple to use: we can pass in the whole zoo.basket object, and the function will overlay the time series and scale the vertical axis for us with the help of a single parameter setting, namely the screens parameter.
Let’s now look at the code and the resulting plot in the following example, and then explain what’s going on:
# Set a color scheme:
tsRainbow <- rainbow(ncol(zoo.basket))
# Plot the overlaid series
plot(x = zoo.basket, xlab = "Time", ylab = "Cumulative Return",
     main = "Cumulative Returns", col = tsRainbow, screens = 1)
# Set a legend in the upper left hand corner to match color to return series
legend(x = "topleft", legend = c("SPY", "QQQ", "GDX", "DBO", "VWO"),
       lty = 1, col = tsRainbow)
We started by setting a color scheme, using the rainbow(.) command that is included in the base R installation. It is convenient as R will take in an arbitrary positive integer value and select a sequence of distinct colors up to the number specified. This is a nice feature for the impatient or lazy among us (yes, guilty as charged) who don’t want to be bothered with picking out colors and just want to see the result right away.
Next, in the plot(.) command, we assign to x our “matrix” of time series in the zoo.basket object, and then set labels for the horizontal and vertical axes (xlab, ylab), a title for the graph (main), and the colors (col). Last, but crucial, is the parameter setting screens = 1, which tells the plot command to overlay each series in a single graph.
Finally, we include the legend(.) command to place a color legend at the upper left hand corner of the graph. The position (x) may be chosen from the list of keywords "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right" and "center"; in our case, we chose "topleft". The legend parameter is simply the list of ticker names. The lty parameter refers to “line type”, and by setting it to 1, the lines in the legend are shown as solid lines, and as in the plot(.) function, the same color scheme is assigned to the parameter col.
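Rather than typing the ticker names by hand, the legend labels can also be derived from the column names, which keeps the labels and colors in step if the basket changes. The following is a minimal sketch on simulated toy data (the series are random numbers generated for illustration, not actual ETF returns):

```r
library(zoo)

# Simulated cumulative-return paths for three hypothetical columns
set.seed(1)
dates <- as.Date("2007-01-03") + 0:99
daily <- matrix(rnorm(300, mean = 0, sd = 0.01), ncol = 3)
paths <- apply(daily, 2, function(r) cumprod(1 + r))
colnames(paths) <- c("SPY.Close", "QQQ.Close", "GDX.Close")
toy.basket <- zoo(paths, order.by = dates)

# Strip the ".Close" suffix to recover the ticker names
labels <- sub("\\.Close$", "", colnames(toy.basket))
cols <- rainbow(ncol(toy.basket))

# Same plot and legend calls as above, driven by the derived vectors
plot(x = toy.basket, xlab = "Time", ylab = "Cumulative Return",
     main = "Cumulative Returns (toy data)", col = cols, screens = 1)
legend(x = "topleft", legend = labels, lty = 1, col = cols)
```

Because both labels and cols are built from the columns of the object being plotted, they stay in the same order as the plotted series.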
Back to the color scheme, we may at some point need to show our results to a manager or a client, so in that case, we probably will want to choose colors that are easier on the eye. In this case, one can just store the colors into a vector, and then use it as an input parameter. For example, set
myColors <- c("red", "darkgreen", "goldenrod", "darkblue", "darkviolet")
Then, just replace col = tsRainbow with col = myColors in the plot and legend commands:
plot(x = zoo.basket, xlab = "Time", ylab = "Cumulative Return",
     main = "Cumulative Returns", col = myColors, screens = 1)
legend(x = "topleft", legend = c("SPY", "QQQ", "GDX", "DBO", "VWO"),
       lty = 1, col = myColors)
We then get a plot that looks like this:
Using plot(.) in the xts package
While the plot(.) function in zoo gave us a quick and convenient way of plotting multiple time series, it didn’t give us much control over the scale used along the horizontal axis. Using plot(.) in xts remedies this; however, it involves doing more work. In particular, we can no longer input the entire “matrix” object; we must add each series separately in order to layer the plots. We also need to specify the scale along the vertical axis, as in the xts case, the function will not do this on the fly as it did for us in the zoo case.
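Since the vertical scale has to be supplied by hand in this case, one option is to compute it from the data rather than reading it off an earlier plot. The sketch below does this on a small made-up xts object; the 5% padding and the names data_range, pad and y_limits are arbitrary choices for illustration:

```r
library(xts)

# Toy two-column xts object with a couple of NA values (made-up numbers)
dates <- as.Date("2007-01-03") + 0:4
toy <- xts(cbind(A = c(1.00, 1.10,   NA, 1.30, 1.25),
                 B = c(1.00, 0.90, 0.80,   NA, 0.60)),
           order.by = dates)

# Smallest and largest value across all columns, ignoring NAs
data_range <- range(coredata(toy), na.rm = TRUE)

# Add a little headroom so no line touches the edge of the plot
pad <- 0.05 * diff(data_range)
y_limits <- c(data_range[1] - pad, data_range[2] + pad)
```

The resulting vector can then be passed as ylim = y_limits in the plot(.) call.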
We will use individual columns from our original xts object, basket. Because basket is an xts object rather than a zoo object like zoo.basket, R dispatches to the xts version of the plot(.) function rather than the zoo version (à la an overloaded function in traditional object-oriented programming). Let’s again look at an example and the resulting plot, and then discuss how it works:
plot(x = basket[,"SPY.Close"], xlab = "Time", ylab = "Cumulative Return",
     main = "Cumulative Returns", ylim = c(0.0, 2.5), major.ticks = "years",
     minor.ticks = FALSE, col = "red")
lines(x = basket[,"QQQ.Close"], col = "darkgreen")
lines(x = basket[,"GDX.Close"], col = "goldenrod")
lines(x = basket[,"DBO.Close"], col = "darkblue")
lines(x = basket[,"VWO.Close"], col = "darkviolet")
legend(x = "topleft", legend = c("SPY", "QQQ", "GDX", "DBO", "VWO"),
       lty = 1, col = myColors)
As mentioned, we need to add each time series separately in this case in order to get the desired overlays. If one were to try x = basket in the plot function, the graph would only display the first series (SPY), and a warning message would be returned to the R session. So, we first use the SPY series as input to the plot(.) function, and then add the remaining series with the lines(.) command. The color for each series is also included at each step (the same colors in our myColors vector).
As for the remaining arguments in the plot command, we use the same axis and title settings in xlab, ylab, and main. We set the scale of the vertical axis with the ylim parameter; noting from our previous example that VWO hovered near zero at the low end, and that DBO reached almost as high as 2.5, we set this range from 0.0 to 2.5. Two new arguments here are the major.ticks and minor.ticks settings. The major.ticks argument represents the periods in which we wish to chop up the horizontal axis; it is chosen from the set
{"years", "months", "weeks", "days", "hours", "minutes", "seconds"}
In the example above, we chose "years". The minor.ticks parameter can take values of TRUE/FALSE, and as we don’t need this for the graph, we choose FALSE. The same legend command that we used in the zoo case can be used here as well (using myColors to indicate the color of each time series plot). Just to compare, let’s change the major.ticks parameter to "months" in the previous example. The result is as follows:
Wrap-up
A new package, called xtsExtra, includes a new plot(.) function that provides added functionality, including a legend generator. However, while it is available on R-Forge, it has not yet made it into the official CRAN repository. More sophisticated time series plotting capability can also be found in the quantmod and ggplot2 packages, and we will look at the ggplot2 case in an upcoming post. However, for plotting xts objects quickly and with minimal fuss, the plot(.) function in the zoo package fills the bill, and with a little more effort, we can refine the scale along the horizontal axis using the xts version of plot(.). R help files for each of these can be found by selecting plot.zoo and plot.xts respectively in help searches.