Wednesday, July 25, 2012

Update on FRAME backtesting and data bias

Yesterday I sampled a new set of 100 trades from random dates between 2001 and 2011 and saw my expectancy plummet. The possibility of data bias has always lingered in my mind, and yesterday's backtest confirmed its existence.

Grabbing a large sample isn't enough. For a sample to be as representative of the population as possible, the selection criteria have to be strictly neutral.

The 100 samples in my previous backtest all came from early to mid-2001. I had always intended to collect 1,000 sample trades in total, with the other 900 coming from other years. However, I envisioned grabbing those 900 in "chunks", the same way I grabbed the first 100.

The problem with grabbing data in a "chunk" is that each sample influences the next one, because consecutive trades share the same market conditions. You want this influence to be minimal to avoid bias. If I pull a chunk of 100 samples from a strongly trending period, those 100 samples will give a giant green tick to a trend-following system. But that chunk isn't representative of the entire population, which will most likely be a mix of trending and ranging periods.

Lesson learned.

I now use Random.org to randomly choose dates and times for sampling.
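
For anyone who would rather script the date picking, here is a minimal sketch of the same idea. It is not what I actually ran: it swaps Random.org for Python's built-in pseudo-random generator, and the function name `sample_bar_times`, the seed, and the weekday-only filter are all assumptions made for illustration.

```python
import random
from datetime import datetime, timedelta

BAR = timedelta(minutes=15)  # one bar on the 15M chart


def sample_bar_times(n, start, end, seed=None):
    """Draw n distinct 15-minute bar timestamps uniformly from [start, end).

    Hypothetical helper: weekends are skipped as a rough stand-in for
    forex market hours; holidays and session opens are ignored.
    """
    rng = random.Random(seed)
    total_bars = int((end - start) / BAR)
    picked = set()
    while len(picked) < n:
        t = start + rng.randrange(total_bars) * BAR
        if t.weekday() < 5:  # Monday to Friday only
            picked.add(t)
    return sorted(picked)


# 100 sampling points spread over 2001 through 2011
samples = sample_bar_times(100, datetime(2001, 1, 1), datetime(2012, 1, 1), seed=42)
```

Because every bar in the range is equally likely to be picked, no single trending or ranging stretch can dominate the sample the way a chunk does.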

As for the backtest results from random dates, the optimal R:R looks to be 7:1, providing an expectancy of 29%. That expectancy is not too bad, but the reward-to-risk is too high for my liking. My win rate is 17%, so expect a lot of drawdown between wins. You'll have to trade small for this system to be profitable. But trading the 15-minute (15M) chart provides so many opportunities. Realistically, you can trade 20 times per month on the EURUSD alone.
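
To make the trade-off concrete, here is a rough sketch of the textbook arithmetic behind expectancy, break-even win rate, and losing streaks. The function names are hypothetical, every loss is assumed to cost exactly 1R and every win to pay the full reward, and trades are assumed independent. This post does not spell out how the 29% figure was computed; with the rounded inputs above (17% win rate, 7:1), this simple formula gives about 0.36R per trade, so treat the code as an illustration of the arithmetic rather than a reproduction of the backtest.

```python
def expectancy_in_r(win_rate, reward_to_risk):
    """Average result per trade in units of risk (R).

    Assumes each win pays reward_to_risk * R and each loss costs 1R.
    """
    return win_rate * reward_to_risk - (1.0 - win_rate)


def breakeven_win_rate(reward_to_risk):
    """Win rate at which expectancy is exactly zero."""
    return 1.0 / (1.0 + reward_to_risk)


def losing_streak_probability(win_rate, length):
    """Chance that the next `length` trades all lose, assuming independence."""
    return (1.0 - win_rate) ** length


naive_expectancy = expectancy_in_r(0.17, 7)          # textbook value for the rounded inputs
needed_win_rate = breakeven_win_rate(7)              # 12.5% at 7:1
ten_losses = losing_streak_probability(0.17, 10)     # roughly a 16% chance

print(f"expectancy: {naive_expectancy:.2f}R")
print(f"break-even win rate: {needed_win_rate:.1%}")
print(f"P(10 straight losses): {ten_losses:.1%}")
```

A win rate only a few points above break-even, plus a real chance of ten losses in a row, is why the position size has to stay small.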

However, this isn't something I want to continue backtesting, for now anyway.
