Category: Investing Insight

Investing insight to make you a better investor.

Euclid vs. Hamming Distance

Our previous post explored the differences between CAPM Beta and Hamming distance. Think of Beta as a linear regression between two time-series and Hamming distance as the number of days when the direction of returns differed. The usefulness of the Euclidean distance for non-reverting time-series is somewhere between the two.

Extending the previous example using HDFC and keeping everything else the same, here’s what the Euclidean distance measure looks like.

The higher the distance, the farther apart the two curves and the worse the index hedge. Here’s the equity curve that can help map returns to distance.
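For readers who want to play with the idea, here is a minimal sketch of one way to measure the distance between two equity curves. The post does not spell out how the curves are normalized, so this assumes both are cumulative-return curves built from simple daily returns; the function names and the sample numbers are made up for illustration.

```python
import math


def cumulative_curve(returns):
    """Compound simple daily returns into a curve (assumption: base level of 1.0)."""
    curve, level = [], 1.0
    for r in returns:
        level *= 1.0 + r
        curve.append(level)
    return curve


def euclidean_distance(a, b):
    """Square root of the summed squared gaps between two equal-length curves."""
    if len(a) != len(b):
        raise ValueError("curves must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Illustrative (made-up) daily returns for a stock and an index.
stock = [0.010, -0.020, 0.015, 0.005]
index = [0.008, -0.010, 0.010, 0.002]

distance = euclidean_distance(cumulative_curve(stock), cumulative_curve(index))
print(round(distance, 4))
```

Run over a rolling window, a number like this grows as the stock's curve drifts away from the index's, which is the sense in which a higher distance means a worse hedge.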

Reproducing the Beta and Hamming Distance charts:

From a linear portfolio point-of-view, which of these series is more “predictable?” Is it possible to specify bands beyond which things “break?” And does using shorter look-backs help?

Beta vs. Hamming

The beta of a portfolio is often used to hedge it against the market. We did a brief intro in our post: A Gentle Introduction to Hedging. And previously, we discussed how the Hamming distance can unearth relationships by simplifying the data that we have. Here, we bring the two concepts together.


CAPM Beta is a glorified linear regression between two return streams. It is useful in the context of linear-payoff portfolios.

For example, a typical long-only fund can use its portfolio beta to measure sensitivity to the market and to hedge against it. A single-stock portfolio with only HDFC in it will exhibit varying beta with respect to different markets.

If you are after a linear payoff (long stocks or futures outright), beta can be a useful metric to track.
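To make the "glorified linear regression" point concrete, here is a minimal sketch that computes beta as the slope of a regression of asset returns on market returns, i.e. covariance divided by market variance. The function name and the return series are illustrative assumptions, not the data behind the charts.

```python
def beta(asset_returns, market_returns):
    """Regression slope of asset returns on market returns: cov(a, m) / var(m)."""
    if len(asset_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    n = len(market_returns)
    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    if var == 0:
        raise ValueError("market returns have zero variance")
    return cov / var


# Made-up returns: the asset moves exactly twice as much as the market.
market = [0.01, -0.02, 0.03, 0.00]
asset = [2 * m for m in market]
print(round(beta(asset, market), 6))
```

Swapping in a different market series (say, NIFTY versus BANK-NIFTY) for the same stock is all it takes to see beta change with the chosen benchmark.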


Betas are useless if you are trying to hedge or analyze a portfolio with convex payoffs. Like, say, an options portfolio. Here, you care more about up/down days over an index. This is where Hamming distances are useful.

A Hamming distance of 70 over a 250-day return stream means that flipping the direction of just 70 of the 250 days (28% of the sample, a little over a quarter) would make the two up/down series identical.
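Here is a minimal sketch of that calculation: reduce each return stream to up/down flags and count the days on which the flags disagree. Treating a flat day as "down" is an assumption made here for simplicity, and the return series are made up.

```python
def directions(returns):
    """1 for an up day, 0 otherwise (assumption: flat days count as down)."""
    return [1 if r > 0 else 0 for r in returns]


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Made-up daily returns for a stock and an index.
stock = [0.012, -0.004, 0.007, -0.010, 0.003]
index = [0.008, 0.002, 0.005, -0.006, -0.001]

d = hamming_distance(directions(stock), directions(index))
print(d, d / len(stock))  # days that disagree, and as a share of the sample
```

Dividing by the sample length gives the share of days that disagree, which is how a distance of 70 over 250 days becomes 28%.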

In our HDFC-only single-stock portfolio example above, we see that its betas against NIFTY and BANK-NIFTY are vastly different whereas its Hamming distances closely track each other. This behavior can be used to construct trades that go beyond being long-only equity.

Hamming Distance

Previously, we discussed how removing information from data can be useful. And our discussion on using Euclidean Distance for Pattern Matching showed how you can use a rolling window to identify matching segments within a time-series. What if we mix the two ideas together?

If you transform a time-series of returns to 0-1, then you can use the Hamming distance, which measures the minimum number of substitutions required to change one string into the other (Wikipedia), as a measure of similarity.

For example, take the most recent 20-day VIX time-series and “match” it with a rolling window of historical 20-day VIX segments and sort it by its Hamming Distance.

Here, on the second row, we see that by just flipping two bits, the 20-day sequence ending on 2020-05-18 matches with the 20-day sequence ending 2021-11-16.
If you are looking for a rough up/down days match, then this is a blisteringly fast way to compute it.
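A rough sketch of the rolling match, under a few assumptions: the input is a list of daily changes, each window is packed into a single integer so that the distance is just the number of set bits in an XOR, and historical windows that overlap the most recent one are skipped. The function names and the toy series are hypothetical; this is not the code behind the table above.

```python
def pack_directions(changes):
    """Pack up/down moves into one integer: bit 1 for an up move, 0 otherwise."""
    bits = 0
    for c in changes:
        bits = (bits << 1) | (1 if c > 0 else 0)
    return bits


def rank_matches(changes, window):
    """Rank historical windows by Hamming distance to the most recent window.

    Returns (distance, index of the segment's last observation) tuples, closest first.
    """
    if len(changes) < 2 * window:
        raise ValueError("need at least two non-overlapping windows")
    target = pack_directions(changes[-window:])
    results = []
    # Assumption: skip segments that overlap the most recent window.
    for end in range(window, len(changes) - window + 1):
        candidate = pack_directions(changes[end - window:end])
        distance = bin(candidate ^ target).count("1")  # popcount of the XOR
        results.append((distance, end - 1))
    results.sort()
    return results


# Toy series of daily changes; only the signs matter.
changes = [1, -1, 1, 1, -1, 1, -1, 1, 1]
print(rank_matches(changes, window=3)[:3])
```

The speed comes from the XOR: once a 20-day window is a 20-bit integer, comparing two windows is a single bitwise operation plus a bit count.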

Direction vs. Magnitude

Sometimes, it is useful to remove information from the data that you have.

Let’s say you have a time-series of returns: +0.001, +0.001, +0.001, +0.1, -0.01, -0.01. What if you removed the magnitude information and kept only the direction? You end up with: UP, UP, UP, UP, DN, DN. Now, you can analyze this transformed dataset using a whole bunch of algorithms designed to work on binary sequences.

Run Length Encoding (rle) is one such algo. We used it while looking for streaks (Part I, Part II). We dismissed the backtest as a data-mining artefact, which it might very well be. However, if you believe that a time-series can exhibit both trend and mean-reversion, then looking at it through this lens can be useful.

Knowing the “average” length of streaks can also help in position sizing in a trend-following system and regime classification.
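As a small illustration, here is run-length encoding applied to the six returns from the example above, using `itertools.groupby` from the Python standard library. The function names are made up, and treating a flat day as DN is an assumption.

```python
from itertools import groupby


def to_directions(returns):
    """Keep only direction: "UP" for a positive return, "DN" otherwise."""
    return ["UP" if r > 0 else "DN" for r in returns]


def run_length_encode(labels):
    """Collapse consecutive repeats into (label, streak length) pairs."""
    return [(label, len(list(group))) for label, group in groupby(labels)]


returns = [0.001, 0.001, 0.001, 0.1, -0.01, -0.01]
streaks = run_length_encode(to_directions(returns))
print(streaks)  # [('UP', 4), ('DN', 2)]

# Average streak length across all runs.
average_length = sum(length for _, length in streaks) / len(streaks)
print(average_length)  # 3.0
```

Splitting the average by label (UP streaks versus DN streaks) is a natural next step if up and down runs behave differently.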

Stay tuned.

SPDR Sector ETF Average-Momentum Rotation

We’ve been having a bit of fun with the S&P Sector “Spider” ETFs: Intro, Momentum, Anti-Momentum. We saw how strategies that backtested well with pre-2011 data failed later. In this post, we see if buying all ETFs with a positive return over n months helps us beat the S&P 500 index.

Rules of Rotation


  1. Calculate rolling returns over n months. Where n = 1, 3, 6, 12.
  2. For the (n+1)th month, go long the ETFs that had positive returns in Step 1.
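The two rules above can be sketched as follows. This is only an illustration: the input is assumed to be month-end prices per ticker (oldest first), the prices are made up, and details the rules leave open (position weights, what to hold when no ETF qualifies) are not decided here.

```python
def trailing_return(prices, n):
    """Simple return over the last n months of a month-end price list."""
    return prices[-1] / prices[-1 - n] - 1.0


def select_longs(price_history, n):
    """Tickers whose trailing n-month return is positive (Step 1 feeding Step 2).

    price_history: dict of ticker -> month-end prices, oldest first (assumed layout).
    Tickers with fewer than n + 1 prices are skipped.
    """
    return sorted(
        ticker
        for ticker, prices in price_history.items()
        if len(prices) > n and trailing_return(prices, n) > 0
    )


# Made-up month-end prices for three sector ETFs.
history = {
    "XLK": [100, 105, 110, 120],
    "XLE": [100, 95, 90, 85],
    "XLF": [100, 90, 95, 99],
}

print(select_longs(history, n=3))  # ['XLK']
print(select_longs(history, n=1))  # ['XLF', 'XLK']
```

Note how XLF qualifies on the 1-month look-back but not on the 3-month one, which is why the choice of n matters so much in the results that follow.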

Like before, we split the dataset into Before 2010 and After 2011.

Pick your Fighter

The Before 2010 dataset shows rotation by 6- and 12-month look-back periods to be better than buying-and-holding the S&P 500.

The SPY Rope-a-Dope

MOM6 and MOM12 were too close to call in the training set. If you had “course-corrected” after the first couple of years of under-performance of MOM12 and switched to MOM6, you would’ve out-performed. On the other hand, staying the course would’ve meant losing out to the mighty S&P 500.

Once again, by simply holding onto the ropes, a passive buy-and-hold S&P 500 investor would’ve come out miles ahead of someone who tried to time sectors systematically.

What did we learn?

We tested a few basic allocation strategies that investors typically use to approach the “rotation” problem. Some of them worked well in the training set but their performance failed to carry over. Besides, if you add transaction costs and taxes, we are not sure if it was worth the effort given the post-2011 market regime.

Maybe there are more sophisticated qualitative/fundamental ways to approach this problem that work. However, most media articles about “sector rotation” are written with perfect hindsight and it is near impossible to do it with simple strategies that are accessible to the average investor.