Mining Big Data Using Parsimonious Factor and Shrinkage Methods
50 Pages
Posted: 17 Jul 2013
Date Written: July 15, 2013
Abstract
A number of recent studies in the economics literature have focused on the usefulness of factor models in the context of prediction using "big data". In this paper, our overarching question is whether such "big data" are useful for modelling low-frequency macroeconomic variables such as unemployment, inflation and GDP. In particular, we analyze the predictive benefits associated with the use of dimension-reducing independent component analysis (ICA) and sparse principal component analysis (SPCA), coupled with a variety of other factor estimation methods as well as data shrinkage methods, including bagging, boosting, and the elastic net, among others. We do so by carrying out a forecasting "horse-race" involving the estimation of 28 different baseline model types, each constructed using a variety of specification approaches, estimation approaches, and benchmark econometric models, and all used in the prediction of 11 key macroeconomic variables relevant for monetary policy assessment. In many instances, we find that several of our benchmark specifications, including autoregressive (AR) models, AR models with exogenous variables, and (Bayesian) model averaging, do not dominate more complicated nonlinear methods, and that using a combination of factor and other shrinkage methods often yields superior predictions. For example, simple averaging methods are mean square forecast error (MSFE) "best" in only 9 of 33 key cases considered. This is rather surprising new evidence that model averaging methods do not necessarily yield MSFE-best predictions. However, in order to "beat" model averaging methods, including arithmetic mean and Bayesian averaging approaches, we have introduced into our "horse-race" numerous complex new models that combine complicated factor estimation methods with interesting new forms of shrinkage. For example, SPCA yields MSFE-best prediction models in many cases, particularly when coupled with shrinkage.
This result provides strong new evidence of the usefulness of sophisticated factor-based forecasting and, therefore, of the use of "big data" in macroeconometric forecasting.
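To make the MSFE "horse-race" idea concrete, the following is a minimal sketch, not the paper's procedure. It assumes simulated data, a single factor estimated by ordinary principal components (a stand-in for the ICA and SPCA estimators named above), no shrinkage step, and a recursive (expanding-window) one-step-ahead scheme. The function names `first_factor`, `ols_forecast`, and `horse_race` are hypothetical.

```python
import numpy as np

def first_factor(X):
    """Estimate one factor as the first principal component of a standardized panel."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, 0] * s[0]

def ols_forecast(R, y, r_new):
    """Fit y on regressors R (plus an intercept) by OLS and predict at r_new."""
    A = np.column_stack([np.ones(len(R)), R])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0] + np.atleast_1d(r_new) @ beta[1:]

def horse_race(X, y, start):
    """Recursive one-step-ahead MSFE: AR(1) benchmark versus AR(1) plus one factor."""
    err_ar, err_factor = [], []
    for t in range(start, len(y) - 1):
        f = first_factor(X[: t + 1])          # factor re-estimated on data through t
        y_lag, y_next = y[:t], y[1 : t + 1]   # regression pairs available at time t
        ar_hat = ols_forecast(y_lag[:, None], y_next, y[t])
        R = np.column_stack([y_lag, f[:t]])
        factor_hat = ols_forecast(R, y_next, np.array([y[t], f[t]]))
        err_ar.append((y[t + 1] - ar_hat) ** 2)
        err_factor.append((y[t + 1] - factor_hat) ** 2)
    return np.mean(err_ar), np.mean(err_factor)

# Simulated "big data" panel (an assumption for illustration only):
# N = 50 predictors driven by one latent AR(1) factor that leads the target.
rng = np.random.default_rng(0)
T, N = 200, 50
f_true = np.zeros(T)
for t in range(1, T):
    f_true[t] = 0.5 * f_true[t - 1] + rng.standard_normal()
X = np.outer(f_true, rng.standard_normal(N)) + rng.standard_normal((T, N))
y = np.zeros(T)
y[1:] = 0.8 * f_true[:-1] + 0.3 * rng.standard_normal(T - 1)

msfe_ar, msfe_factor = horse_race(X, y, start=100)
print(f"MSFE AR(1): {msfe_ar:.3f}   MSFE AR(1)+factor: {msfe_factor:.3f}")
```

In this simulated setting the factor carries most of the predictable variation by construction, so the comparison says nothing about real macroeconomic data. It only illustrates how competing specifications can be ranked by out-of-sample MSFE.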
Keywords: prediction, independent component analysis, sparse principal component analysis, bagging, boosting, Bayesian model averaging, ridge regression, least angle regression, elastic net and non-negative garotte
JEL Classification: C32, C53, G17