While developing a model, historical data alone may not be sufficient to test its robustness. One way to generate additional test data is to re-sample the historical data. This “re-arrangement” of past time series can then be fed to the model to see how it behaves.
The problem with sampling historical market data is that the sample may not sufficiently account for fat tails. Typically, a uniform sample is taken, and because extreme observations are rare, a uniform draw can easily under-represent the tails. This leads to models that work on average but blow up on occasion, which is something you’d like to avoid.
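One way to overcome this problem is through stratified sampling. You chop the data into intervals and use their frequencies to probability-weight the sample, so that every interval, including the ones out in the tails, is represented in proportion to how often it occurred. This preserves the original distribution in the sample.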
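The binning and proportional weighting described above can be sketched roughly as follows. This is a minimal Python illustration, not the R code the post links to. The function name `stratified_sample`, the equal-width bins, the `n_bins` default and the largest-remainder rounding are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(returns, n_samples, n_bins=10, seed=None):
    """Draw n_samples values, giving each interval a share proportional
    to its frequency in `returns` (hypothetical helper, for illustration)."""
    rng = random.Random(seed)
    lo, hi = min(returns), max(returns)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant series

    # Chop the data into equal-width intervals (an assumed binning scheme).
    bins = defaultdict(list)
    for r in returns:
        idx = min(int((r - lo) / width), n_bins - 1)  # max value goes in last bin
        bins[idx].append(r)

    # Proportional allocation: each bin's quota follows its frequency.
    total = len(returns)
    quotas = {b: n_samples * len(v) / total for b, v in bins.items()}
    counts = {b: int(q) for b, q in quotas.items()}

    # Largest-remainder rounding so the counts add up to n_samples.
    shortfall = n_samples - sum(counts.values())
    by_remainder = sorted(quotas, key=lambda b: quotas[b] - counts[b], reverse=True)
    for b in by_remainder[:shortfall]:
        counts[b] += 1

    # Sample with replacement inside each bin, then shuffle the result.
    sample = []
    for b, k in counts.items():
        sample.extend(rng.choices(bins[b], k=k))
    rng.shuffle(sample)
    return sample


# Usage on made-up data: 90 quiet days and 10 extreme ones.
toy_returns = [0.0] * 90 + [10.0] * 10
print(sorted(stratified_sample(toy_returns, 10, n_bins=2, seed=42)))
```

In this sketch, a bin whose proportional share rounds down to zero receives no draws. If the tails must always appear, one possible variation is to force at least one draw from every non-empty bin, at the cost of slightly distorting the proportions.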
Notice the skew and the tails in the “STRAT” densities for both the NIFTY and MIDCAP indices. This distribution is far more likely to result in a robust model than one obtained from uniform sampling alone.
You can check out the R-code here.