
Big Data’s Big Blind-spots


Yesterday, we discussed how theoretical models can be used to draw biased conclusions by using faulty assumptions. If the models then get picked up without an understanding of those assumptions, it leads to expensive mistakes. But are empirical models free from such bias, especially if the data-set is big enough? Absolutely not.

In an article titled “Big data: are we making a big mistake?” in the FT, author Tim Harford points out that by merely finding statistical patterns in the data, data scientists are focusing too much on correlation and giving short shrift to causation.

But a theory-free analysis of mere correlations is inevitably fragile. If you have no idea what is behind a correlation, you have no idea what might cause that correlation to break down.

All the problems of “small” data persist in “big” data; they are just harder to find. When it comes to data, size isn’t everything: you still need to deal with sampling error and sampling bias.

For example, it is in principle possible to record and analyse every message on Twitter and use it to draw conclusions about the public mood. But while we can look at all the tweets, Twitter users are not representative of the population as a whole. According to the Pew Research Internet Project, in 2013, US-based Twitter users were disproportionately young, urban or suburban, and black.
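
To make the sample-bias point concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the population, the 30% share of young people, the approval rates and the platform sign-up rates are assumptions, not Pew's numbers. It compares a small random sample against a much larger sample drawn from a skewed user base.

```python
import random


def biased_vs_random(seed=0, population_size=200_000):
    """Compare a small random sample with a large but skewed one.

    All rates below are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Hypothetical population: 30% young, and the young hold some
    # opinion at a higher rate (70%) than everyone else (40%).
    population = []
    for _ in range(population_size):
        young = rng.random() < 0.30
        approve_rate = 0.70 if young else 0.40
        population.append((young, rng.random() < approve_rate))
    true_rate = sum(approves for _, approves in population) / population_size

    # A small but genuinely random sample of 1,000 people.
    small = rng.sample(population, 1000)
    small_estimate = sum(approves for _, approves in small) / len(small)

    # A huge "platform" sample in which the young are four times
    # as likely to show up as everyone else (60% vs 15%).
    big = [p for p in population if rng.random() < (0.60 if p[0] else 0.15)]
    big_estimate = sum(approves for _, approves in big) / len(big)

    return true_rate, small_estimate, big_estimate, len(big)


if __name__ == "__main__":
    true_rate, small_est, big_est, big_n = biased_vs_random()
    print(f"true rate:                {true_rate:.3f}")
    print(f"random sample (n=1000):   {small_est:.3f}")
    print(f"skewed sample (n={big_n}): {big_est:.3f}")
```

Under these assumed numbers the skewed sample is far larger than the random one, yet its estimate should land noticeably further from the true rate. More data does not make up for who is missing from it.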

Worse still, as the data set grows, it becomes harder to tell whether a pattern is statistically significant or could have emerged purely by chance.
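
One way to read this point is as a multiple-comparisons problem: the more variables you can test against each other, the more likely it is that something lines up by accident. The sketch below is an assumption-laden toy, not anything from the FT article. It generates pure noise, correlates a random target against a growing pile of equally random series, and reports the strongest correlation it finds. The function names and the 30-point series length are hypothetical.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(n_series, n_points=30, seed=1):
    """Largest |correlation| between a random target and n_series random series.

    Every series is independent noise, so any correlation found is spurious.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_series):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    for n in (10, 100, 2000):
        print(n, round(strongest_chance_correlation(n), 3))
```

Because the seed is fixed, each larger run includes the smaller run's series, so the best correlation can only go up as more series are tested. None of them means anything, which is why a pattern pulled out of a very wide data set needs a tougher test than one that was hypothesised in advance.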

The whole article is worth reading, so plan to spend some time on it: Big data: are we making a big mistake?

Models don’t lie, incorrect assumptions do


An engineer, a physicist and an economist are stranded on a deserted island with nothing to eat. A crate containing many cans of soup washes ashore and the three ponder how to open the cans.
Engineer: Let’s climb that tree and drop the cans on the rocks.
Physicist: Let’s heat each can over our campfire until the increase in internal pressure causes it to open.
Economist: Let’s assume we have a can opener.

In a recent paper titled Chameleons: The Misuse of Theoretical Models in Finance and Economics, Paul Pfleiderer of Stanford University lays out how some models, built on assumptions with dubious connections to the real world, end up being used to inform policy and other decision-making. He terms these models “Chameleons.”

Notice how similar these two abstracts are:

To establish that high bank leverage is the natural (distortion-free) result of intermediation focused on liquid-claim production, the model rules out agency problems, deposit insurance, taxes, and all other distortionary factors. By positing these idealized conditions, the model obviously ignores some important determinants of bank capital structure in the real world. However, in contrast to the MM framework – and generalizations that include only leverage-related distortions – it allows a meaningful role for banks as producers of liquidity and shows clearly that, if one extends the MM model to take that role into account, it is optimal for banks to have high leverage.

– “Why High Leverage is Optimal for Banks” by Harry DeAngelo and René Stulz.

To establish that high intake of alcohol is the natural (distortion free) result of human liquid-drink consumption, the model rules out liver disease, DUIs, health benefits, spousal abuse, job loss and all other distortionary factors. By positing these idealized conditions, the model obviously ignores some important determinants of human alcohol consumption in the real world. However, in contrast to the alcohol neutral framework – and generalizations that include only overconsumption-related distortions – it allows a meaningful role for humans as producers of that pleasant “buzz” one gets by consuming alcohol, and shows clearly that if one extends the alcohol neutral model to take that role into account, it is optimal for humans to be drinking all of their waking hours.

– “Why High Alcohol Consumption is Optimal for Humans” by Bacardi and Mondavi 😉

These abstracts illustrate that one can generally develop a theoretical model to produce any result within a wide range.

Read the whole thing at your leisure: Chameleons: The Misuse of Theoretical Models in Finance and Economics

Weekly Recap: Bubbles are a Good Thing

[Chart: world markets, 2014-03-21 to 2014-03-28]

The Nifty ended the week up a whopping 3.12% (4.75% in USD terms).
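
The gap between the rupee and dollar figures comes from compounding the index return with the currency move. A small sketch of that arithmetic follows. The function names are made up for illustration, and it assumes the USD figure is simply the local return compounded with the change in the rupee's dollar value over the same week (the post does not say how it was computed).

```python
def usd_return(local_return, fx_change):
    """Return in USD terms, given the local-currency return and the
    fractional change in the local currency's value against the dollar."""
    return (1 + local_return) * (1 + fx_change) - 1


def implied_fx_change(local_return, usd_ret):
    """Currency move implied by a local-currency return and a USD return."""
    return (1 + usd_ret) / (1 + local_return) - 1


if __name__ == "__main__":
    nifty_inr = 0.0312  # weekly Nifty return in rupee terms, from the post
    nifty_usd = 0.0475  # weekly Nifty return in USD terms, from the post
    fx = implied_fx_change(nifty_inr, nifty_usd)
    print(f"implied rupee move vs USD: {fx:.2%}")
```

Under that assumption, the two reported figures imply the rupee gained roughly 1.6% against the dollar over the week.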

Here’s how the rest of the world markets fared:

Major
DAX(DEU) +2.61%
CAC(FRA) +1.75%
UKX(GBR) +0.89%
NKY(JPN) +3.32%
SPX(USA) -0.66%
MINTs
JCI(IDN) +1.45%
INMEX(MEX) +0.71%
NGSEINDX(NGA) +1.43%
XU030(TUR) +7.98%
BRICS
IBOV(BRA) +5.14%
SHCOMP(CHN) -0.29%
NIFTY(IND) +3.12%
INDEXCF(RUS) +2.81%
TOP40(ZAF) +3.06%
Commodities(CME)
Gold Futures $1293.80 -3.16%
Silver Futures $19.77 -2.53%
Platinum Futures $1404.70 -2.15%
Copper Futures $3.06 +2.17%
Brent Crude Oil Last Day Financial Futures $108.07 +1.08%
E-mini Natural Gas Futures $4.49 +3.99%

Nifty Heatmap

[Chart: CNX Nifty heatmap, 2014-03-21 to 2014-03-28]

Index Performance

[Chart: index performance, 2014-03-21 to 2014-03-28]

Top winners and losers

IDFC +15.02%
CONCOR +16.35%
PNB +16.85%
ZEEL -5.78%
DRREDDY -5.50%
GLENMARK -3.51%
Dr Reddy and Glenmark led the drop in pharma stocks. Something to do with the biotech massacre in the US?

ETFs

PSUBNKBEES +7.78%
BANKBEES +3.34%
NIFTYBEES +3.04%
JUNIORBEES +2.45%
GOLDBEES -2.77%
PSU banks rallied on the back of the RBI’s decision to delay adoption of Basel III norms.

Investment Theme Performance

High-beta stocks flew on increased risk-taking. A Modi win had better work out…

Sector Performance

[Chart: sector performance, 2014-03-21 to 2014-03-28]

Yield Curve

[Chart: yield curve, 2014-03-21 to 2014-03-28]

Thought for the Weekend

Robert Shiller on bubbles:

First of all, it’s a free world and people can do what they want. I’m not proposing we put the straightjacket on these things. The other thing is that human nature needs stimulation and people have to have some sense of opportunity and excitement. I think profits are an important motivator. In the long run, it’s hard to say that bubbles are really bad.

Source: Robert Shiller’s Nobel Knowledge