by Daniel Hanson
QA Data Scientist, Revolution Analytics
Some Applications of the xts Time Series Package
In our previous discussion, we looked at accessing financial data using the quantmod and Quandl R packages. As noted there, the data series returned by quantmod comes in the form of an xts time series object, and Quandl provides a parameter that sets the return type to xts. Because the xts package is loaded along with quantmod, it is not necessary to load it separately once quantmod has been loaded.
In this article, we will look at some of the useful features in xts, by way of data retrieved from quantmod. We will also need to load one more package: moments. The moments package computes skewness and kurtosis of data, as these calculations are somewhat surprisingly not included in base R. With that said, we run the following library commands so that we’ll be ready to go:
library(quantmod)
library(xts)
library(moments) # to get skew & kurtosis
Now, let’s download the last 10 years of daily prices for the SPDR S&P 500 ETF (SPY).
getSymbols("SPY", src="google", from = "2004-01-01")
Remark: the from parameter in getSymbols is not described in the help file for the getSymbols function; a reader of our previous post was kind enough to provide this information.
As before, the return object from getSymbols is assigned the name of the ticker symbol, in this case SPY. Let’s look at this in more detail. First, we can verify that it is an xts object:
is.xts(SPY) # returns TRUE
Next, we can have a quick look at the data using head(SPY) and tail(SPY), just as we would for an R dataframe. These return, respectively:
           SPY.Open SPY.High SPY.Low SPY.Close SPY.Volume
2004-01-02   111.85   112.19  110.04    111.23   34487200
2004-01-05   111.61   112.52  111.59    112.44   27160100
2004-01-06   112.25   112.73  112.00    112.55   19282500
2004-01-07   112.43   113.06  111.89    112.93   28340200
2004-01-08   112.90   113.48  112.77    113.38   34295500
2004-01-09   113.35   113.50  112.27    112.39   41431900

           SPY.Open SPY.High SPY.Low SPY.Close SPY.Volume
2013-12-26   183.34   183.96  183.32    183.86   63365227
2013-12-27   184.10   184.18  183.66    183.84   61813841
2013-12-30   183.87   184.02  183.58    183.82   56857458
2013-12-31   184.07   184.69  183.93    184.69   86247638
2014-01-02   183.98   184.07  182.48    182.92  119636836
2014-01-03   183.21   183.60  182.63    182.88   81390502
We will, in fact, see that xts objects can typically be treated just like dataframes in a number of other cases. To wit, if we just wanted the closing prices for the series, we can extract them in the usual way:
SPY.Close <- SPY[, "SPY.Close"]
Then, note that this subset is also an xts object:
is.xts(SPY.Close) # returns TRUE
We will use this series of closing prices later when we look at plotting.
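The column extraction above can be tried on a small hand-built object as well. The following is only a sketch: the object toy and its values are made up for illustration (they are not SPY data), and it assumes nothing beyond the xts package being installed.

```r
library(xts)

# Hypothetical toy data: five days of made-up prices
dates <- as.Date("2004-01-05") + 0:4
toy <- xts(cbind(Open  = c(10, 11, 12, 13, 14),
                 Close = c(10.5, 11.5, 12.5, 13.5, 14.5)),
           order.by = dates)

toy.Close <- toy[, "Close"]   # column subset, as with SPY[, "SPY.Close"]

is.xts(toy.Close)             # the subset is still an xts object
index(toy.Close)              # the dates travel with the data
is.matrix(coredata(toy))      # the underlying data is stored as a matrix
```

Unlike a single column pulled from a dataframe, the result keeps its date index, which is what makes the date-based subsetting shown next possible.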
By now, you may be asking: if xts objects can be treated like dataframes, what’s the big deal? Well, the main difference is that, because an xts object is indexed by date, we have convenient date-based subsetting at our disposal. For example, suppose we wanted to take the subset of prices from January 2006 through December 2007. This is easily done by entering the command
x1 <- SPY['2006-01/2007-12'] # store the output in x1, etc
Note that the index string is of the form 'from/to', with each date given in YYYY-MM format. The output is stored in a new xts object called, say, x1. In a similar fashion, if we wanted all the prices from the beginning of the set through, say, the end of July 2005, we would enter:
x2 <- SPY['/2005-07']
We can also store all the prices from a particular year, say 2010, or a particular month of the year, say December 2010, as follows (respectively):
x3 <- SPY['2010']
x4 <- SPY['2010-12']
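The range strings above can also be tried without downloading anything. The sketch below builds a hypothetical daily series of made-up values (one per calendar day, not real prices) and applies the same four forms of subsetting, using different years so that the toy series stays small.

```r
library(xts)

# Hypothetical series: one made-up value per calendar day, 2005 through 2007
days <- seq(as.Date("2005-01-01"), as.Date("2007-12-31"), by = "day")
toy <- xts(seq_along(days), order.by = days)

r1 <- toy["2006-01/2007-12"]  # January 2006 through December 2007
r2 <- toy["/2005-07"]         # start of the series through July 2005
r3 <- toy["2006"]             # all of 2006
r4 <- toy["2006-12"]          # December 2006 only

range(index(r1))              # first and last dates in the subset
range(index(r2))
nrow(r3)                      # 365 daily rows, since 2006 is not a leap year
nrow(r4)                      # 31 daily rows
```

A month given without a day is expanded to cover the whole month, which is why the 'to' side of a range reaches the last day of that month.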
Next, suppose we wish to extract monthly or quarterly data for 2010. There are a couple of ways to do this. First, one can use the commands:
x5 <- to.period(SPY['2010'], 'months')
x6 <- to.period(SPY['2010'], 'quarters')
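These aggregate the daily data into one row per month and per quarter, respectively. Each row holds the opening price on the first trading day of the period, the highest high and lowest low within it, the closing price on the last trading day, and the total volume, and is indexed by the last trading date in the data for that period, as shown here: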
           SPY["2010"].Open SPY["2010"].High SPY["2010"].Low SPY["2010"].Close SPY["2010"].Volume
2010-01-29           112.37           115.14          107.22            107.39         3494623433
2010-02-26           108.15           111.58          104.58            110.74         4147289073
2010-03-31           111.20           118.17          111.17            117.00         3899883233
2010-04-30           118.25           122.12          117.60            118.81         3849880548
2010-05-27           119.38           120.68          104.38            110.76         7116214265
(etc…)

           SPY["2010"].Open SPY["2010"].High SPY["2010"].Low SPY["2010"].Close SPY["2010"].Volume
2010-03-31           112.37           118.17          104.58            117.00        11541795739
2010-06-30           118.25           122.12          102.88            103.22        16672606329
2010-09-30           103.15           115.79          101.13            114.13        12867300420
2010-12-31           114.99           126.20          113.18            125.75        10264947894
Alternatively, we can use the following commands to get the same data:
x7 <- to.monthly(SPY['2010'])
x8 <- to.quarterly(SPY['2010'])
The only difference is that the left column shows the period in MMM YYYY or YYYY QQ format, rather than the actual end-of-period dates:
         SPY["2010"].Open SPY["2010"].High SPY["2010"].Low SPY["2010"].Close SPY["2010"].Volume
Jan 2010           112.37           115.14          107.22            107.39         3494623433
Feb 2010           108.15           111.58          104.58            110.74         4147289073
Mar 2010           111.20           118.17          111.17            117.00         3899883233
Apr 2010           118.25           122.12          117.60            118.81         3849880548
May 2010           119.38           120.68          104.38            110.76         7116214265
(etc…)

        SPY["2010"].Open SPY["2010"].High SPY["2010"].Low SPY["2010"].Close SPY["2010"].Volume
2010 Q1           112.37           118.17          104.58            117.00        11541795739
2010 Q2           118.25           122.12          102.88            103.22        16672606329
2010 Q3           103.15           115.79          101.13            114.13        12867300420
2010 Q4           114.99           126.20          113.18            125.75        10264947894
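To see how each monthly or quarterly row is put together from the daily rows, one can run the conversion on data simple enough to check by eye. The sketch below is an illustration only: toy is a hypothetical object holding made-up daily bars for January through March 2010, with column names chosen to look like open/high/low/close/volume data.

```r
library(xts)

# Hypothetical daily bars for Jan-Mar 2010 (made-up values, not SPY data)
days  <- seq(as.Date("2010-01-01"), as.Date("2010-03-31"), by = "day")
close <- as.numeric(seq_along(days))     # closes are simply 1, 2, 3, ...
toy <- xts(cbind(Open   = close - 0.5,
                 High   = close + 1,
                 Low    = close - 1,
                 Close  = close,
                 Volume = rep(100, length(days))),
           order.by = days)

monthly <- to.period(toy, "months")      # one aggregated row per month
monthly
```

With these inputs, the January row should show the first open (0.5), the highest high (32), the lowest low (0), the last close (31), and the total volume over 31 days (3100), mirroring the pattern in the SPY output above.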
To close things out, let’s go back to the column of closing prices that we extracted above. As noted, the object SPY.Close is also an xts object. Taking a look at the top of this data using
head(SPY.Close)
we get:
SPY.Close
2004-01-02 111.23
2004-01-05 112.44
2004-01-06 112.55
2004-01-07 112.93
2004-01-08 113.38
2004-01-09 112.39
We can also use the subsetting features such as:
SPY.Close['2006-01/2007-12']
SPY.Close['/2005-07']
SPY.Close['2010']
SPY.Close['2010-12']
as we did for the full SPY set. Extracting monthly or quarterly data from SPY.Close with the to.period(.), to.monthly(.) or to.quarterly(.) commands, however, does not return a single column of period-end closing prices, as one might expect. By default these functions build open, high, low, and close columns for each period from whatever series they are given, so a one-column input comes back as four columns. This is governed by the OHLC argument of to.period(.), which defaults to TRUE; setting OHLC = FALSE returns only the last observation of each period.
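One possible way to get period-end values from a single-column series is sketched below. It relies on the OHLC argument of to.period(.) and on the endpoints(.) function, both from xts; the series toy.Close and its values are made up for illustration and stand in for SPY.Close.

```r
library(xts)

# Hypothetical single-column series of made-up "closing prices"
days <- seq(as.Date("2010-01-01"), as.Date("2010-03-31"), by = "day")
toy.Close <- xts(as.numeric(seq_along(days)), order.by = days)

bars   <- to.period(toy.Close, "months")                 # default: OHLC = TRUE
closes <- to.period(toy.Close, "months", OHLC = FALSE)   # period-end rows only
same   <- toy.Close[endpoints(toy.Close, on = "months")] # the same rows, by hand

ncol(bars)      # four columns: open, high, low, close
ncol(closes)    # one column
closes
```

Indexing by endpoints(.) picks out the last row of each period directly, and it should carry over to the real SPY.Close series unchanged.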
Getting back to the task at hand, we can plot the SPY.Close series in a fashion similar to plotting a vector of data using the plot(.) command. We can use the same parameters to indicate the x and y axis labels, a title, and the color of the graph. In addition, we can make our plot look nicer by using parameters specific to xts objects. For example, let’s look at the following plot(.) command:
plot(SPY.Close, main = "Closing Daily Prices for SP 500 Index ETF (SPY)",
col = "red",xlab = "Date", ylab = "Price", major.ticks='years',
minor.ticks=FALSE)
The parameters main, col, xlab, and ylab are the same as those used in the base R plot(.) function. The parameters major.ticks and minor.ticks are specific to xts: the former displays years along the x-axis, while setting the latter to FALSE avoids a gray bar along the x-axis.
Our graph then looks like this:
We can also calculate daily log returns
SPY.ret <- diff(log(SPY.Close), lag = 1)
SPY.ret <- SPY.ret[-1] # Remove resulting NA in the 1st position
and then plot these using the same command, but with the return data:
plot(SPY.ret, main = "Daily Log Returns for SP 500 Index ETF (SPY)",
col = "red", xlab = "Date", ylab = "Return", major.ticks='years',
minor.ticks=FALSE)
Interestingly, we can see the spike in the volatility of returns during the financial crisis of 2008-2009.
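For readers who want to check the log-return calculation by hand, here is a minimal sketch using four made-up prices in a hypothetical object named prices. It applies the same diff(log(.)) step and compares the result with log price ratios computed directly.

```r
library(xts)

# Hypothetical prices (made-up values) on four consecutive days
days   <- as.Date("2010-01-04") + 0:3
prices <- xts(c(100, 110, 99, 99), order.by = days)

ret <- diff(log(prices), lag = 1)   # r_t = log(P_t) - log(P_{t-1})
ret <- ret[-1]                      # drop the leading NA

# Same numbers computed directly from price ratios
by.hand <- log(c(110, 99, 99) / c(100, 110, 99))
all.equal(as.numeric(ret), by.hand)

# Log returns add up over time: their sum is the log of the total price ratio
all.equal(sum(ret), log(99 / 100))
```

The additivity shown in the last line is one common reason for working with log returns rather than simple percentage changes.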
Finally, we can calculate summary statistics of the time series of returns, namely the mean return, volatility (standard deviation), skewness, and kurtosis:
statNames <- c("mean", "std dev", "skewness", "kurtosis")
SPY.stats <- c(mean(SPY.ret), sd(SPY.ret), skewness(SPY.ret), kurtosis(SPY.ret))
names(SPY.stats) <- statNames
SPY.stats
which gives us:
mean std dev skewness kurtosis
0.0001952538 0.0127824204 -0.0805232066 17.3271959064
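As a reading aid for these numbers, the sketch below writes out moment-based definitions of skewness and kurtosis in base R. The helper names skew.by.hand and kurt.by.hand are hypothetical, and the assumption (labeled as such) is that they follow the same definitions the moments package uses, in which kurtosis is not reduced by 3, so that a normal distribution scores about 3 rather than 0.

```r
# Hypothetical helpers (not part of any package): moment-based definitions
skew.by.hand <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
kurt.by.hand <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2    # not "excess" kurtosis: no 3 is subtracted
}

x <- c(1, 2, 3, 4, 5)        # made-up symmetric sample
skew.by.hand(x)              # 0, because the sample is symmetric
kurt.by.hand(x)              # 1.7
```

On that scale, a kurtosis of roughly 17.3 for the SPY returns points to far heavier tails than a normal distribution would have, while the skewness near zero suggests only slight asymmetry.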
The above hopefully provides a useful introduction to the xts package for use with financial market data. The plot examples given are admittedly rudimentary, and other packages, including quantmod, provide more sophisticated features that result in a more palatable presentation, as well as more useful information, such as plots of overlaid time series. We will look at more advanced graphics in an upcoming article.
This is a very nice post to highlight the great xts package by Jeff Ryan and Josh Ulrich. Above, Joseph mentioned that xts objects behave much like data.frames in terms of subsetting etc. It is important to know that the data in an xts object is actually stored in a matrix, not a data.frame. The xts object does not support data in a data.frame and any attempt to do so will result in the data.frame being converted into a matrix. So if you have mixed data types in a data.frame x.df (e.g. character data and numeric data) and you create an xts object using the xts() constructor function with x.df as the data then the resulting xts object will have its data as a character matrix (which you can confirm using class(coredata(my.xts))).
Posted by: Eric Zivot | January 09, 2014 at 14:26
Thanks for the informative post. Using
options(digits=6)
makes the final output more useful.
Posted by: Beliavsky | January 10, 2014 at 06:10
If I make a script out of the code above, to get data for QQQ instead of SPY, I need to replace "SPY" with "QQQ" in many places, instead of one, which would be better. This occurs because, as the post states, "the return object from getSymbols is assigned the name of the ticker symbol, in this case SPY." How can the code be made more generic so that it works for any ticker symbol?
Posted by: Beliavsky | January 10, 2014 at 07:59
Good tutorial, but there is something very weird from this outlet.
In the last post you wrote "One of the limitations of data available from Yahoo and Google, as may be noticed above, is that it only dates back to January of 2007,"
and based on that instructed users to use futures data instead, which is "common in these situations" and which they can get at Quandl,
yet in this tutorial you use "getSymbols("SPY", src="google", from = "2004-01-01")".
Lately there is so much hustle-peddling of Quandl in the data community, when for all its merits it is really nothing special, and you lower yourself in the eyes of people who are actually very familiar with these matters.
Posted by: rbr | January 11, 2014 at 11:27