Author: shyam

Principal Component Analysis, Part I

Introduction

Principal Component Analysis (PCA) is a way of summarizing data. For example, quite a few sector indices cover financial services: Bank, Pvt. Bank, Public Bank, Financial Services, etc. These indices overlap, so the question is: in what proportion should one invest in the individual indices to get the best exposure to financial services? PCA is one way to answer this question. To get a better understanding of what it is, see: stats.stackexchange.

NASDAQ OMX India TR Indices

To start this series on PCA, we will first look at the USD-denominated Total Return indices published by NASDAQ-OMX. Choosing these indices helps us avoid a lot of data pre-processing steps. First, they are Total Return, so they incorporate dividends, etc. Second, they are US dollar denominated, so we don’t have to worry about being long USDINR while looking at tech stocks. And third, they start from 2001, which goes much further back than the TR indices published by the NSE.

We use the following sector indices:
NASDAQ India Basic Matls TR Index (NQIN1000T),
NASDAQ India Cnsmr Goods TR Index (NQIN3000T),
NASDAQ India Financials TR Index (NQIN8000T),
NASDAQ India Health Care TR Index (NQIN4000T),
NASDAQ India Inds TR Index (NQIN2000T),
NASDAQ India Tech TR Index (NQIN9000T),
and the NASDAQ India TR Index (NQINT), which we use to further divide the time periods into regimes where it is above or below its 50-, 100- and 200-day simple moving average (SMA).
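The above/below-SMA split can be sketched roughly as follows. This is a minimal illustration, not the code behind the post: the function name sma_regime_masks is hypothetical, the index levels are made up, and it assumes a day is classified using that same day's close against its trailing SMA.

```python
import numpy as np

def sma_regime_masks(levels, window):
    """Return (above, below) boolean masks marking the days on which the
    index level is above / at-or-below its trailing `window`-day SMA.
    Days without a full window of history are False in both masks."""
    levels = np.asarray(levels, dtype=float)
    above = np.zeros(levels.shape, dtype=bool)
    below = np.zeros(levels.shape, dtype=bool)
    if len(levels) < window:
        return above, below
    # Entry k of `sma` is the mean of levels[k : k + window],
    # i.e. the trailing average as of day k + window - 1.
    sma = np.convolve(levels, np.ones(window) / window, mode="valid")
    tail = levels[window - 1:]
    above[window - 1:] = tail > sma
    below[window - 1:] = tail <= sma
    return above, below

# Made-up index levels, purely for illustration (not NQINT data).
levels = [100, 101, 102, 103, 104, 103, 100, 97, 95, 99]
above, below = sma_regime_masks(levels, window=3)
```

The two masks can then be used to select the rows of a sector-return matrix that fall in each regime. In practice the window would be 50, 100 or 200 days.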

The question we are trying to answer is: are the factor loadings stable? If they are not, how do they change over time and across different market regimes? To answer this, we set up a sliding window of 5-year daily returns that is incremented by one year at a time. That gives us 11 datasets, starting from 2002-2007 through to 2013-2017. We run PCA on the daily returns of the sector indices listed above. We then plot the loadings of the first principal component.
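A rolling-window PCA of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the code used for the charts: the function names are hypothetical, the returns are synthetic, PCA is run on the covariance matrix of demeaned returns (the post does not say whether returns were scaled), and a window is taken to cover five calendar years.

```python
import numpy as np

def first_pc_loadings(returns):
    """First principal component of a (days x assets) return matrix.
    Returns (loadings, share of total variance explained)."""
    x = np.asarray(returns, dtype=float)
    x = x - x.mean(axis=0)                  # demean each asset's returns
    cov = np.cov(x, rowvar=False)           # assets x assets covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    pc1 = eigvecs[:, -1]                    # eigenvector of the largest eigenvalue
    # An eigenvector's sign is arbitrary; flip it so the loadings sum to a
    # non-negative number, which keeps windows comparable with each other.
    if pc1.sum() < 0:
        pc1 = -pc1
    return pc1, eigvals[-1] / eigvals.sum()

def rolling_first_pc(returns, years, window_years=5, step_years=1):
    """Run first_pc_loadings on each window of `window_years` calendar years."""
    returns = np.asarray(returns, dtype=float)
    years = np.asarray(years)
    first, last = int(years.min()), int(years.max())
    out = {}
    for start in range(first, last - window_years + 2, step_years):
        end = start + window_years - 1
        mask = (years >= start) & (years <= end)
        out[(start, end)] = first_pc_loadings(returns[mask])
    return out

# Synthetic data: one common "market" factor plus noise for six assets.
rng = np.random.default_rng(0)
n_days, n_assets = 8 * 250, 6
market = rng.normal(0.0, 0.01, size=(n_days, 1))
betas = np.array([[1.2, 0.8, 1.3, 0.6, 1.1, 0.9]])  # assumed exposures
synthetic = market * betas + rng.normal(0.0, 0.005, size=(n_days, n_assets))
years = np.repeat(np.arange(2002, 2010), 250)

windows = rolling_first_pc(synthetic, years)
for (start, end), (loadings, share) in windows.items():
    print(start, end, np.round(loadings, 2), round(float(share), 2))
```

Comparing the loading vectors across windows, or across the above-SMA and below-SMA subsets of each window, is one way to judge whether they are stable.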

NASDAQOMX India Sector Index PCA

A few things stand out:

  1. The first principal component is dominated by Basic Materials, Financials and Industrials.
  2. The relative importance of IT has dropped.
  3. Financials dominate the below-SMA200 market regime, implying that most of the time, the market is below its 200-day SMA because of financials.

What we had hoped to find was some sort of stability in the loadings either in the entire dataset or in specific SMA regimes. We could have then constructed a “good times” and “bad times” portfolio and switched between them based on SMA. But it looks like it is not possible with these indices.

Code and more charts are on GitHub.

The Inflation Drag on Returns

What do you think the annualized inflation-adjusted NIFTY 50 return was from 1991 through 2016? Hint: Gross returns were ~13%.
Gross vs. Real NIFTY 50 returns
It was 5%.

The Midcap index was created much later. So to keep things on an even keel, if you run both NIFTY 50 and NIFTY MIDCAP 100 between 2002 and 2016, it turns out that their real returns were about 7% and 13% respectively.
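The link between gross and real returns can be sketched as below. This is a minimal illustration, assuming real returns are obtained by compounding out an average annual inflation rate. The post does not state its exact method, and the function names are hypothetical.

```python
def annualized_return(start_value, end_value, years):
    """Compound annual growth rate between two index levels."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

def real_return(nominal, inflation):
    """Inflation-adjusted return: (1 + nominal) / (1 + inflation) - 1."""
    return (1.0 + nominal) / (1.0 + inflation) - 1.0

def implied_inflation(nominal, real):
    """Average inflation rate consistent with a nominal and a real return."""
    return (1.0 + nominal) / (1.0 + real) - 1.0

# Using the rounded figures quoted above: ~13% gross and ~5% real.
inflation = implied_inflation(0.13, 0.05)
print(f"implied average inflation: {inflation:.1%}")
print(f"real return check: {real_return(0.13, inflation):.1%}")
```

Under that assumption, the rounded 13% gross and 5% real figures correspond to average inflation of roughly 7.6% a year. Note that subtracting inflation from the gross return is only an approximation of the compounded adjustment.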

So,

  1. The asset class that you pick should jibe with your time horizon. No point investing in NIFTY 50 (or large-caps, for that matter) if you have a 10+ year time horizon.
  2. The market demanded high gross returns because of high inflation. If the RBI’s commitment of a 4-6% inflation band gets fully priced in, expect gross returns to come down in the future.
  3. Neither market returns nor inflation is under your control. However, your lifestyle inflation is all on you.

Code and additional charts are on GitHub.

StockViz Tools

Compare and Plot multiple time-series datasets

Ever wondered how the dollar-adjusted NIFTY MIDCAP 100 index performed vs. US Midcaps? Wonder no more! You can now quickly run a comparison using our new Compare tool.

compare time-series

We have made a large number of datasets available:

NASDAQOMX: The NASDAQ OMX TR USD Dataset
MF: A pruned set of Indian mutual funds
FRED: The St. Louis FRED dataset
THEME: StockViz Themes
IN_NSE: Indices available on the NSE
IN_BSE: Indices available on the BSE
ETF_US: US listed ETFs
EQ_US: US listed Stocks
EQ_NSE: Stocks listed on the NSE
EQ_BSE: Stocks listed on the BSE
MCX: Commodity futures listed on MCX
NCDEX: Commodity futures listed on NCDEX

The “THEME” comparison tool also gives you the impact of STT and brokerage costs on returns.

So, now you can compare pretty much anything with everything.

When it comes to plotting a single time-series, we have made intraday prices available for stocks listed on the NSE. You can chart these 10 days at a time:
chart time-series

The daily chart of NSE stocks also marks important corporate actions:
TCS daily time-series candle-stick with corporate actions

Keeping it free

The biggest challenge we faced was managing data storage costs. Earlier, we tried to upload all the data to the cloud, but we quickly ran into performance bottlenecks and cost overruns. So we went back to the drawing board and designed a hybrid solution where the data and the analytics reside within our firewalls and only results are displayed to the user. This allows us to keep these tools free and open to everybody.

I want more!

If you want to see something here that would be of interest to other investors/traders, please WhatsApp us (+918026650232) with your ideas!

Book Review: The Attention Merchants

In The Attention Merchants: The Epic Scramble to Get Inside Our Heads (Amazon), author Tim Wu walks us through how the advertising industry evolved from its patent medicine roots to the current mess of privacy-invading ad exchanges.

Advertising is primarily about gaining a person’s attention. The medium through which this is done has evolved from posters and billboards to newspapers and magazines to TV and the internet. The media is the aggregator of attention – its principal goal is to draw people in and then sell their attention to the highest bidder. But how do you measure attention? How do you know if someone watched your ad? Or, for that matter, should watch your ad? What are the boundaries of one’s privacy?

What struck me was that as the stakes kept getting higher, so did the degree of invasion into our head-space. We are now at a point where the vast majority of websites you visit are sending your data to third-party sources, usually without your permission or knowledge.

For every two eyes looking at a screen there are probably ten or more looking back at them.

And it is no longer just your “digital” footprint.

Did you know that there are now billboards that can track you by your phone’s wifi?

All WiFi-capable devices broadcast a unique ID – a Media Access Control (MAC) address – when they’re looking for networks (and so long as WiFi is enabled, they’re always looking for networks). So if you walk around carrying a mobile phone with WiFi turned on, you’re broadcasting your own, unique radio beacon, and it’s easy to track your movements.

Scary stuff.

Recommendation: Worth a read.