Principal Component Analysis, Part I


Principal Component Analysis (PCA) is a way of summarizing data. Take financial services, for example: quite a few sector indices cover it (Bank, Pvt. Bank, Public Bank, Financial Services, etc.), and they all overlap. So the question is: in what proportion should one invest in these individual indices to get the best exposure to financial services? PCA is one way to answer this question. To get a better understanding of what it is, see: stats.stackexchange.

NASDAQ OMX India TR Indices

To start this series on PCA, we will first look at the USD-denominated Total Return indices published by NASDAQ-OMX. Choosing these indices helps us avoid a lot of data pre-processing steps. First, they are Total Return, so they incorporate dividends, etc. Second, they are US dollar denominated, so we don’t have to worry about being long USDINR while looking at tech stocks. And third, they start from 2001, which goes much further back than the TR indices published by the NSE.

We use the following sector indices:
NASDAQ India Basic Matls TR Index (NQIN1000T),
NASDAQ India Cnsmr Goods TR Index (NQIN3000T),
NASDAQ India Financials TR Index (NQIN8000T),
NASDAQ India Health Care TR Index (NQIN4000T),
NASDAQ India Inds TR Index (NQIN2000T),
NASDAQ India Tech TR Index (NQIN9000T),
and the NASDAQ India TR Index (NQINT), which we use to further divide time periods by whether it is above or below its 50-, 100- and 200-day simple moving average (SMA).
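
The above/below-SMA split can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the code behind this post: the function names `trailing_sma` and `regime_labels` are hypothetical, the price series is made up, and treating a close exactly equal to the SMA as "below" is an assumption.

```python
import numpy as np

def trailing_sma(levels, window):
    """Trailing simple moving average; NaN until a full window is available."""
    levels = np.asarray(levels, dtype=float)
    sma = np.full(levels.shape, np.nan)
    if len(levels) >= window:
        # Differences of a cumulative sum give the rolling sums in one pass.
        csum = np.cumsum(np.insert(levels, 0, 0.0))
        sma[window - 1:] = (csum[window:] - csum[:-window]) / window
    return sma

def regime_labels(levels, window):
    """Label each day 'above', 'below' or 'n/a' relative to its trailing SMA."""
    levels = np.asarray(levels, dtype=float)
    sma = trailing_sma(levels, window)
    # Assumption: a level exactly equal to the SMA counts as 'below'.
    labels = np.where(levels > sma, "above", "below")
    labels[np.isnan(sma)] = "n/a"  # not enough history yet
    return labels

# Made-up index levels, with a 3-day window purely for illustration.
demo_levels = [1, 2, 3, 4, 5, 4, 3, 2, 1]
print(regime_labels(demo_levels, 3).tolist())
```

With real data one would pass the NQINT closing levels and a window of 50, 100 or 200, then use the labels to pick which days of sector returns go into each regime's PCA.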

The question we are trying to answer is: are the factor loadings stable? If they are not, how do they change over time and across different market regimes? To answer this, we set up a sliding window of 5-year daily returns that is incremented by one year at a time. That gives us 11 datasets, starting from 2002-2007 through to 2013-2017. We run PCA on the daily returns of the sector indices listed above. We then plot the loadings of the first principal component.
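
The sliding-window procedure can be sketched as below. This is a hedged illustration in Python with NumPy rather than the code used for the charts. Several things here are assumptions: PCA is run on the covariance matrix of raw returns (the post does not say whether returns were standardized), windows are whole calendar years with inclusive end years, the sign of the loadings is fixed so that they sum to a positive number, and the returns are synthetic. The names `first_pc_loadings` and `rolling_loadings` are hypothetical.

```python
import numpy as np

def first_pc_loadings(returns):
    """Loadings of the first principal component and its share of variance.

    returns: 2-D array, rows = days, columns = sector indices.
    """
    cov = np.cov(np.asarray(returns, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    pc1 = eigvecs[:, -1]                    # eigenvector of the largest one
    if pc1.sum() < 0:                       # eigenvector sign is arbitrary
        pc1 = -pc1
    return pc1, eigvals[-1] / eigvals.sum()

def rolling_loadings(returns, years, window_years=5, step_years=1):
    """First-PC loadings per window, keyed by (first_year, last_year)."""
    returns = np.asarray(returns, dtype=float)
    years = np.asarray(years)
    out = {}
    start = int(years.min())
    while start + window_years - 1 <= int(years.max()):
        end = start + window_years - 1
        mask = (years >= start) & (years <= end)
        out[(start, end)] = first_pc_loadings(returns[mask])[0]
        start += step_years
    return out

# Synthetic data only: six "sectors" driven by one common factor plus noise.
rng = np.random.default_rng(0)
years = np.repeat(np.arange(2002, 2009), 250)   # about 250 trading days a year
market = rng.normal(0.0, 0.01, size=len(years))
betas = np.array([1.2, 0.8, 1.1, 0.6, 1.0, 0.9])
returns = market[:, None] * betas + rng.normal(0.0, 0.005, size=(len(years), 6))

loadings_by_window = rolling_loadings(returns, years)
for window, loadings in loadings_by_window.items():
    print(window, np.round(loadings, 2))
```

Each printed row is one window's loadings vector. Plotting those rows side by side, one line per sector, gives the kind of chart discussed next.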

[Chart: NASDAQOMX India Sector Index PCA]

A few things stand out:

  1. The first principal component is dominated by Basic Materials, Financials and Industrials.
  2. The relative importance of Tech has dropped over time.
  3. Financials dominate the below-SMA200 market regime, which suggests that when the market is below its 200-day SMA, it is mostly because of financials.

What we had hoped to find was some sort of stability in the loadings, either across the entire dataset or within specific SMA regimes. We could then have constructed a “good times” and a “bad times” portfolio and switched between them based on the SMA. But that does not look possible with these indices.

Code and more charts are on GitHub.
