In the book Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are (Amazon), author Seth Stephens-Davidowitz provides a peek into how “big data” can help us understand the world around us, provided we know how to ask the right questions. It also shows that, depressingly, people are consistently lying to themselves.
The author goes on to show how some of the most successful companies in recent memory were built by capitalizing on the difference between what people say they are and what they really are.
The book also touches on a common problem faced by quants who are trying to use big data for building trading models: the curse of dimensionality. We discussed this in our GEM and SMA series of articles (GEM, SMA): there is no single “best” look-back period for calculating momentum or moving averages. There are trade-offs involved and there is always risk. These issues tend to snowball when using large, multi-dimensional data sets, to the point where it is hard to discern signal from noise.
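To make the signal-versus-noise point concrete, here is a minimal sketch (ours, not from the book) of what happens when you screen more and more candidate variables against the same target. Everything in it is an assumption for illustration: the "returns" and the "features" are pure random noise, the 60-observation sample size is arbitrary, and the function names `pearson` and `best_spurious_correlation` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def best_spurious_correlation(n_features, n_obs=60, seed=0):
    """Largest |correlation| found between a noise target and noise features.

    Both the target and every feature are independent Gaussian noise
    (an assumption for illustration), so any correlation found is spurious.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(feature, target)))
    return best


if __name__ == "__main__":
    # The more candidate features we screen, the better the "best" one looks,
    # even though none of them has any real relationship with the target.
    for k in (1, 10, 100, 1000):
        print(k, round(best_spurious_correlation(k), 3))
```

With a fixed seed, each larger run screens a superset of the smaller run's features, so the best in-sample correlation can only go up as dimensions are added. None of it is signal.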
The principal takeaway from this book is that big data is useful for answering questions typically raised in the social sciences and public policy, but it is a poor fit where the underlying data is heteroscedastic or the system itself is complex-adaptive.
Recommendation: must read!