An Introduction to Machine Learning for Panel Data

93 Pages. Posted: 11 Dec 2020. Last revised: 28 Jan 2021

James Ming Chen

Michigan State University - College of Law

Date Written: October 23, 2020

Abstract

Machine learning has dramatically expanded the range of tools for evaluating economic panel data. This paper applies a variety of machine-learning methods to the Boston housing dataset, an iconic proving ground for machine learning. Though machine learning often lacks the overt interpretability of linear regression, methods based on decision trees score the relative importance of dataset features. In addition to addressing the theoretical tradeoff between bias and variance, this paper discusses practices rarely followed in traditional economics: the splitting of data into training, validation, and test sets; the scaling of data; and the preference for retaining all data. The choice between traditional and machine-learning methods hinges on practical rather than mathematical considerations. In settings emphasizing interpretative clarity through the scale and sign of regression coefficients, machine learning may best play an ancillary role. Wherever predictive accuracy is paramount, however, or where heteroskedasticity or high dimensionality might impair the clarity of linear methods, machine learning can deliver superior results.
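Two of the practices named in the abstract, splitting data into training, validation, and test sets and scaling the data, can be sketched in a few lines. The sketch below is illustrative only and is not taken from the paper: the 60/20/20 proportions, the synthetic two-column data, and the helper names `split_indices`, `fit_scaler`, and `transform` are all assumptions. It uses only the Python standard library and estimates the scaling statistics on the training rows alone, so no information from the validation or test rows leaks into the fit.

```python
import random
import statistics


def split_indices(n, train=0.6, val=0.2, seed=0):
    """Shuffle row indices and cut them into train/validation/test parts.

    The 60/20/20 default is an assumed convention, not the paper's choice.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def fit_scaler(rows):
    """Return per-column means and standard deviations of the given rows."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows using previously fitted means and deviations."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Synthetic stand-in data: two features on very different scales.
rng = random.Random(42)
X = [[rng.gauss(5, 2), rng.uniform(0, 1000)] for _ in range(100)]

train_idx, val_idx, test_idx = split_indices(len(X))
X_train = [X[i] for i in train_idx]
X_val = [X[i] for i in val_idx]
X_test = [X[i] for i in test_idx]

# Fit the scaler on the training rows only, then reuse it everywhere.
means, stds = fit_scaler(X_train)
X_train_s = transform(X_train, means, stds)
X_val_s = transform(X_val, means, stds)
X_test_s = transform(X_test, means, stds)
```

After scaling, each training column has mean zero and unit standard deviation by construction, while the validation and test columns are only approximately standardized because they reuse the training statistics.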

Keywords: Machine learning, bias-variance tradeoff, decision trees, random forests, extra trees, XGBoost, learning ensembles, boosting, support vector machines, neural networks

JEL Classification: C18, C23, C33, C45, R31

Suggested Citation

Chen, James Ming, An Introduction to Machine Learning for Panel Data (October 23, 2020). International Advances in Economic Research, Vol. 27, 2021, Available at SSRN: https://ssrn.com/abstract=3717879 or http://dx.doi.org/10.2139/ssrn.3717879
