Cascading Logistic Regression Onto Gradient Boosted Decision Trees to Predict Stock Market Changes Using Technical Analysis

27 Pages. Posted: 25 Jul 2018. Last revised: 8 Aug 2018

Feng Zhou

Guangdong University of Finance and Economics

Zhang Qun

Guangdong University of Foreign Studies

Didier Sornette

Risks-X, Southern University of Science and Technology (SUSTech); Swiss Finance Institute; ETH Zürich - Department of Management, Technology, and Economics (D-MTEC); Tokyo Institute of Technology

Liu Jiang

University of Surrey - Surrey Business School

Date Written: July 24, 2018

Abstract

In the data mining and machine learning fields, forecasting the direction of price change can generally be formulated as a supervised classification problem. This paper attempts to predict the direction of daily changes of the Nasdaq Composite Index (NCI) and of the Standard & Poor's 500 Composite Stock Price Index (S&P 500) over the period from January 3, 2012 to December 23, 2016, and of the Shanghai Stock Exchange Composite Index (SSEC) from January 4, 2010 to December 31, 2014. Due to the complexity of stock index data, we carefully combine raw price data and eleven technical indicators with a cascaded learning technique to improve the performance of the classification. The proposed learning architecture, LR2GBDT, is obtained by cascading the logistic regression (LR) model onto the gradient boosted decision trees (GBDT) model. Under the same test conditions, the experimental results show that the LR2GBDT model performs better than the baseline LR and GBDT models for these stock indices, according to the performance metrics Hit ratio, Precision, Recall and F-measure. Furthermore, we use these models to develop simple trading strategies and assess their performance in terms of their Average Annual Return, Maximum Drawdown, Sharpe Ratio and Average Annualized Return/Maximum Drawdown. When transaction costs and buy-sell thresholds are taken into account, the best trading strategy derived from the LR2GBDT model still reaches the highest Sharpe Ratio and clearly beats the buy-and-hold strategy. The performances are found to be both statistically and economically significant.
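The evaluation measures named in the abstract can be illustrated with a minimal sketch using only the Python standard library. It is not the authors' implementation. The label convention (1 for an up day, 0 for a down day), the long-or-flat trading rule, the fixed proportional cost per position change, the 252 trading days per year and the zero risk-free rate are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def classification_metrics(y_true, y_pred):
    """Hit ratio, precision, recall and F-measure for labels 1 (up) / 0 (down)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "hit_ratio": (tp + tn) / len(pairs),  # share of correct direction calls
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / denom if denom else 0.0,
    }


def strategy_returns(signals, index_returns, cost=0.0):
    """Assumed rule: hold the index when the signal is 1, stay in cash when 0.

    `cost` is an assumed proportional charge paid whenever the position changes.
    """
    out, position = [], 0
    for signal, ret in zip(signals, index_returns):
        trade_cost = cost if signal != position else 0.0
        out.append(signal * ret - trade_cost)
        position = signal
    return out


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for ret in daily_returns:
        equity *= 1.0 + ret
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def sharpe_ratio(daily_returns, periods_per_year=252, risk_free_daily=0.0):
    """Annualized Sharpe ratio; returns 0.0 when the returns have no dispersion."""
    excess = [r - risk_free_daily for r in daily_returns]
    sd = stdev(excess)
    return mean(excess) / sd * math.sqrt(periods_per_year) if sd > 0 else 0.0


if __name__ == "__main__":
    # Toy inputs, not data from the paper.
    y_true = [1, 0, 1, 1, 0, 1]
    y_pred = [1, 0, 0, 1, 1, 1]
    print(classification_metrics(y_true, y_pred))
    rets = strategy_returns(y_pred, [0.01, -0.02, 0.03, 0.01, -0.01, 0.02], cost=0.001)
    print(max_drawdown(rets), sharpe_ratio(rets))
```

Here the predicted labels would come from a classifier such as LR, GBDT or the cascaded LR2GBDT model. The paper's own definitions of the annualized measures and its buy-sell thresholds may differ from these assumptions.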

Keywords: Ensemble learning; gradient boosted decision trees; logistic regression; price prediction; transaction costs; technical analysis

JEL Classification: C45, C53, C60, G17

Suggested Citation

Zhou, Feng and Qun, Zhang and Sornette, Didier and Jiang, Liu, Cascading Logistic Regression Onto Gradient Boosted Decision Trees to Predict Stock Market Changes Using Technical Analysis (July 24, 2018). Swiss Finance Institute Research Paper No. 18-50, Available at SSRN: https://ssrn.com/abstract=3218941 or http://dx.doi.org/10.2139/ssrn.3218941

Feng Zhou

Guangdong University of Finance and Economics

21 Luntou Road
Guangzhou, Guangdong 510320
China

Zhang Qun

Guangdong University of Foreign Studies

Collaborative Innovation Center for Silk Road
Guangzhou, Guangdong
China

Didier Sornette (Contact Author)

Risks-X, Southern University of Science and Technology (SUSTech)

1088 Xueyuan Avenue
Shenzhen, Guangdong 518055
China

Swiss Finance Institute

c/o University of Geneva
40, Bd du Pont-d'Arve
CH-1211 Geneva 4
Switzerland

ETH Zürich - Department of Management, Technology, and Economics (D-MTEC)

Scheuchzerstrasse 7
Zurich, ZURICH CH-8092
Switzerland
41446328917 (Phone)
41446321914 (Fax)

HOME PAGE: http://www.er.ethz.ch/

Tokyo Institute of Technology

2-12-1 O-okayama, Meguro-ku
Tokyo 152-8550, 52-8552
Japan

Liu Jiang

University of Surrey - Surrey Business School

Guildford, Surrey GU2 8DN
United Kingdom
