June 17, 2011


It would be nice to see the time series of the two components; we might be able to relate them to some observables.

I might have said this here before, and I will say it again: when doing "Big Data" kind of stuff, do show timing comparisons; otherwise it's meaningless. PCA in general is not that interesting. Also, 9 million rows by a handful of columns don't make big data. Pick some gene data (a billion rows by thousands of columns) and then see how good this is compared to some other tools (e.g. SAS, SPSS, R, etc.). Then this PCA would be interesting.

I suspect you want to be looking at log prices, as otherwise your errors are going to be dominated by recent prices.

I would even say you have to look at returns, not prices. The nonstationarity in the stock prices (or log prices) will make the correlation coefficient meaningless. After you obtain the principal components of the returns, you can obtain the principal components of the stock prices by transforming the returns back to prices.
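For anyone who wants to try the returns-based approach suggested in these comments, here is a minimal sketch in Python with NumPy. It is not the method used in the original post. The function name `pca_on_log_returns`, the synthetic price data, and the choice to read "transforming returns back to prices" as a cumulative sum of the component scores (which gives a log-price-like index with the mean return removed) are all assumptions made for illustration.

```python
import numpy as np


def pca_on_log_returns(prices, n_components=2):
    """PCA on log returns of a (T, N) price matrix (hypothetical helper).

    Returns loadings (N, k), return-space scores (T-1, k), the fraction of
    variance explained by each component, and cumulated scores (T-1, k).
    """
    prices = np.asarray(prices, dtype=float)

    # Log returns: differences of log prices along the time axis.
    log_returns = np.diff(np.log(prices), axis=0)

    # Center each column so the SVD corresponds to PCA on the covariance.
    centered = log_returns - log_returns.mean(axis=0)

    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components].T
    scores = centered @ loadings
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]

    # Assumption: cumulate the component returns to get a level series,
    # analogous to a log-price index for each component.
    levels = np.cumsum(scores, axis=0)
    return loadings, scores, explained, levels


# Synthetic example data (not real stock prices): one common factor plus noise.
rng = np.random.default_rng(0)
T, N = 500, 5
market = rng.normal(0.0, 0.01, size=(T, 1))
noise = rng.normal(0.0, 0.005, size=(T, N))
prices = 100.0 * np.exp(np.cumsum(market + noise, axis=0))

loadings, scores, explained, levels = pca_on_log_returns(prices)
```

The columns of `scores` are the component time series in return space, and the columns of `levels` are the series one would plot against observables. Working from log returns also addresses the log-price point above, since each return is scale-free.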

The comments to this entry are closed.
