In the book Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are (Amazon), author Seth Stephens-Davidowitz provides a peek into how “big data” can help us understand the world around us, provided we know how to ask the right questions. It also shows that, depressingly, people are consistently lying to themselves.
The book also touches on a common problem faced by quants who are trying to use big data to build trading models: the curse of dimensionality. We discussed this in our GEM and SMA series of articles (GEM, SMA): there is no single “best” look-back period for calculating momentum or moving averages. There are trade-offs involved and there is always risk. These issues tend to snowball with large, multi-dimensional datasets, to the point where it is hard to discern signal from noise.
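To make the look-back problem concrete, here is a minimal sketch in Python. It is our own illustration, not something taken from the book or from the GEM and SMA articles. It assumes a toy rule (hold the asset for the next period whenever the price is above its simple moving average) and feeds it simulated returns that are pure noise. The function names, the candidate look-backs, and the seed are all hypothetical choices made for this example.

```python
import random


def noise_returns(n, seed):
    """Simulated per-period returns with zero mean: pure noise, no signal."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n)]


def sma_strategy_return(returns, lookback, start):
    """Sum of returns earned by holding whenever price > SMA(lookback).

    `start` is the first period that is traded, so that every look-back
    is scored over the same stretch of data (start must be >= lookback).
    """
    # prices[t] is the price after the first t returns have been applied.
    prices = [1.0]
    for r in returns:
        prices.append(prices[-1] * (1.0 + r))

    total = 0.0
    for t in range(start, len(returns)):
        # The window ends at the current price, so there is no look-ahead.
        sma = sum(prices[t - lookback + 1 : t + 1]) / lookback
        if prices[t] > sma:
            total += returns[t]  # return over the following period
    return total


def best_lookback(returns, candidates, start):
    """Pick the look-back with the highest in-sample strategy return."""
    return max(candidates, key=lambda lb: sma_strategy_return(returns, lb, start))


# Hypothetical setup: 2000 periods of noise, split in half.
candidates = list(range(5, 205, 5))
start = max(candidates)
returns = noise_returns(2000, seed=42)
in_sample, out_sample = returns[:1000], returns[1000:]

chosen = best_lookback(in_sample, candidates, start)
print("chosen look-back:", chosen)
print("in-sample return:", sma_strategy_return(in_sample, chosen, start))
print("out-of-sample return:", sma_strategy_return(out_sample, chosen, start))
```

The in-sample figure for the chosen look-back is the maximum over 40 candidates by construction, so it is biased upward even though the data contains no signal at all. The out-of-sample figure carries no such selection bias. Adding more parameters or more data dimensions increases the number of candidates being compared, which is one way the problem snowballs.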
The principal takeaway from this book is that big data is useful for answering questions typically raised in the social sciences and public policy, but it is a poor fit where the underlying data is heteroscedastic or the system itself is complex-adaptive.
Recommendation: must read!