Statistical Overfitting and Backtest Performance

"Risk-Based and Factor Investing", Quantitative Finance Elsevier, 2015 (Forthcoming).

10 pages. Posted: 9 Oct 2014. Last revised: 5 Jul 2015.

David H. Bailey

Lawrence Berkeley National Laboratory; University of California, Davis

Stephanie Ger

Northwestern University - Department of Engineering Sciences and Applied Mathematics

Marcos López de Prado

Cornell University - Operations Research & Industrial Engineering; Abu Dhabi Investment Authority; True Positive Technologies

Alexander Sim

University of California, Berkeley - Lawrence Berkeley National Laboratory (Berkeley Lab)

Kesheng Wu

University of California, Berkeley - Lawrence Berkeley National Laboratory (Berkeley Lab)

Date Written: October 7, 2014

Abstract

In the field of mathematical finance, a “backtest” is the use of historical market data to assess the performance of a proposed trading strategy. It is a relatively simple matter for a present-day computer system to explore thousands, millions or even billions of variations of a proposed strategy, and pick the best-performing variant as the “optimal” strategy “in sample” (i.e., on the input dataset). Unfortunately, such an “optimal” strategy often performs very poorly “out of sample” (i.e., on another dataset), because the parameters of the investment strategy have been overfit to the in-sample data, a situation known as “backtest overfitting”.

While the mathematics of backtest overfitting has been examined in several recent theoretical studies, here we pursue a more tangible analysis of this problem, in the form of an online simulator tool. Given an input random walk time series, the tool develops an “optimal” variant of a simple strategy by exhaustively exploring all integer parameter values among a handful of parameters. That “optimal” strategy is overfit, since by definition a random walk is unpredictable. The tool then tests the resulting “optimal” strategy on a second random walk time series. In most runs using our online tool, the “optimal” strategy derived from the first time series performs poorly on the second time series, demonstrating how hard it is not to overfit a backtest. We offer this online tool to facilitate further research in this area.
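
The procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' online tool: the toy trading rule (enter on a fixed day of each 22-day "month", hold for a fixed number of days, long or short), the parameter grid, the 252-day annualization factor, and the seeds are all assumptions chosen for illustration. The sketch fits the rule by exhaustive search on one Gaussian random walk, then scores the same parameters on a second, independent walk.

```python
import itertools
import random
import statistics

MONTH = 22  # assumed number of trading days per "month" (hypothetical)


def random_walk(n_days, seed):
    """Cumulative sum of i.i.d. Gaussian steps: unpredictable by construction."""
    rng = random.Random(seed)
    prices, level = [], 100.0
    for _ in range(n_days):
        level += rng.gauss(0.0, 1.0)
        prices.append(level)
    return prices


def daily_pnl(prices, entry_day, holding, side):
    """Daily P&L of the toy rule: enter on entry_day of each month and
    hold for `holding` days (wrapping into the next month), long or short."""
    pnl = []
    for t in range(1, len(prices)):
        # The position for the move from t-1 to t is decided on day t-1.
        in_position = (t - 1 - entry_day) % MONTH < holding
        step = prices[t] - prices[t - 1]
        pnl.append(side * step if in_position else 0.0)
    return pnl


def sharpe(pnl):
    """Annualized Sharpe ratio of daily P&L (252 days/year is an assumption)."""
    sd = statistics.pstdev(pnl)
    return 0.0 if sd == 0 else statistics.mean(pnl) / sd * 252 ** 0.5


def best_in_sample(prices):
    """Exhaustively search all integer parameter combinations in the grid."""
    grid = itertools.product(range(MONTH), range(1, 11), (-1, 1))
    return max(grid, key=lambda p: sharpe(daily_pnl(prices, *p)))


in_sample = random_walk(1000, seed=1)
out_sample = random_walk(1000, seed=2)

params = best_in_sample(in_sample)
is_sr = sharpe(daily_pnl(in_sample, *params))
oos_sr = sharpe(daily_pnl(out_sample, *params))

print("best (entry_day, holding, side):", params)
print("in-sample Sharpe:     %.2f" % is_sr)
print("out-of-sample Sharpe: %.2f" % oos_sr)
```

Because the grid contains both long and short versions of every rule, the best in-sample Sharpe ratio is never negative, even though the data contain no signal. Re-running with different seeds gives a rough sense of how often that in-sample figure fails to carry over to the second series.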

Keywords: backtest, historical simulation, backtest over-fitting, investment strategy, optimization, Sharpe ratio, performance degradation

JEL Classification: G0, G1, G2, G15, G24, E44

Suggested Citation

Bailey, David H. and Ger, Stephanie and López de Prado, Marcos and Sim, Alexander and Wu, Kesheng, Statistical Overfitting and Backtest Performance (October 7, 2014). "Risk-Based and Factor Investing", Quantitative Finance Elsevier, 2015 (Forthcoming). Available at SSRN: https://ssrn.com/abstract=2507040 or http://dx.doi.org/10.2139/ssrn.2507040

David H. Bailey

Lawrence Berkeley National Laboratory

1 Cyclotron Road
Berkeley, CA 94720
United States

HOME PAGE: http://www.davidhbailey.com

University of California, Davis

One Shields Avenue
Apt 153
Davis, CA 95616
United States

Stephanie Ger

Northwestern University - Department of Engineering Sciences and Applied Mathematics

Northwestern University
2145 Sheridan Road, Room M426
Evanston, IL 60208-3125
United States

Marcos López de Prado (Contact Author)

Abu Dhabi Investment Authority

211 Corniche Road
Abu Dhabi, Abu Dhabi PO Box 3600
United Arab Emirates

HOME PAGE: http://www.adia.ae

Cornell University - Operations Research & Industrial Engineering

237 Rhodes Hall
Ithaca, NY 14853
United States

HOME PAGE: http://www.orie.cornell.edu

True Positive Technologies

NY
United States

HOME PAGE: http://www.truepositive.com

Alexander Sim

University of California, Berkeley - Lawrence Berkeley National Laboratory (Berkeley Lab)

1 Cyclotron Road
Berkeley, CA 94720
United States

Kesheng Wu

University of California, Berkeley - Lawrence Berkeley National Laboratory (Berkeley Lab)

1 Cyclotron Road
Berkeley, CA 94720
United States
