Knowing Factors or Factor Loadings, or Neither? Evaluating Estimators of Large Covariance Matrices with Noisy and Asynchronous Data

61 Pages. Posted: 23 Feb 2017. Last revised: 1 Nov 2017.


Chaoxing Dai

University of Chicago - Booth School of Business

Kun Lu

Princeton University

Dacheng Xiu

University of Chicago - Booth School of Business

Date Written: October 29, 2017

Abstract

We investigate estimators of large covariance (and precision) matrices based on factor models, using high-frequency data that are asynchronous and potentially contaminated by market microstructure noise. Our estimation strategies rely on the pre-averaging method with refresh time to address the microstructure problems, and combine three specifications of factor models with a variety of thresholding methods to combat the curse of dimensionality. To estimate a factor model, we use time-series regression (TSR) to recover loadings if the factors are known, cross-sectional regression (CSR) to recover factors if the loadings are known, or principal component analysis (PCA) if neither factors nor loadings are assumed known. We compare the convergence rates in these scenarios under joint in-fill and increasing-dimensionality asymptotics. To evaluate the empirical trade-off between robustness to model misspecification and statistical efficiency across all 30 combinations of estimation strategies, we run a horse race on out-of-sample portfolio allocation with the constituents of the Dow Jones 30, S&P 100, and S&P 500 indices, respectively. We find that the pre-averaging-based strategy using TSR or PCA with location thresholding dominates, especially over the subsampling-based alternatives.
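As a rough illustration of the low-rank plus sparse idea behind the PCA case (neither factors nor loadings known), the sketch below builds a covariance estimate from the leading principal components and a thresholded residual. It is not the authors' estimator: it assumes already synchronized, noise-free returns (so it skips pre-averaging and refresh time), and it swaps in plain hard thresholding on entry magnitude in place of the thresholding methods the paper compares. The function name `pca_low_rank_plus_sparse`, its parameters, and the simulated data are all hypothetical.

```python
import numpy as np


def pca_low_rank_plus_sparse(returns, n_factors, threshold):
    """Illustrative low-rank plus sparse covariance estimate.

    returns: (n_obs, n_assets) array of synchronized returns (an assumption;
    the paper works with noisy, asynchronous high-frequency data).
    """
    # Sample covariance of demeaned returns (divides by n_obs).
    centered = returns - returns.mean(axis=0)
    sample_cov = centered.T @ centered / returns.shape[0]

    # Leading eigenpairs give the low-rank (factor) component.
    eigvals, eigvecs = np.linalg.eigh(sample_cov)  # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:n_factors]
    low_rank = (eigvecs[:, top] * eigvals[top]) @ eigvecs[:, top].T

    # Residual covariance, symmetrized to remove rounding asymmetry.
    residual = sample_cov - low_rank
    residual = (residual + residual.T) / 2.0

    # Hard-threshold off-diagonal entries; always keep the diagonal.
    sparse = np.where(np.abs(residual) >= threshold, residual, 0.0)
    np.fill_diagonal(sparse, np.diag(residual))

    return low_rank + sparse, low_rank, sparse


# Simulated three-factor returns, purely for demonstration.
rng = np.random.default_rng(0)
n_obs, n_assets, n_factors = 500, 20, 3
loadings = rng.normal(size=(n_assets, n_factors))
factors = rng.normal(size=(n_obs, n_factors))
noise = 0.5 * rng.normal(size=(n_obs, n_assets))
returns = factors @ loadings.T + noise

cov_hat, low_rank, sparse = pca_low_rank_plus_sparse(
    returns, n_factors, threshold=0.05
)
```

Because the diagonal of the residual is never thresholded, the estimated variances match the sample variances exactly; only the off-diagonal residual entries are regularized. The threshold value here is arbitrary and would need to be tuned in practice.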

Keywords: high-dimensional data, high-frequency data, factor model, pre-averaging estimator, portfolio allocation, low-rank plus sparse covariance matrix, Barra covariance matrix estimator

JEL Classification: C13, C14, C55, C58, G01

Suggested Citation

Dai, Chaoxing and Lu, Kun and Xiu, Dacheng, Knowing Factors or Factor Loadings, or Neither? Evaluating Estimators of Large Covariance Matrices with Noisy and Asynchronous Data (October 29, 2017). Chicago Booth Research Paper No. 17-02, Available at SSRN: https://ssrn.com/abstract=2920693 or http://dx.doi.org/10.2139/ssrn.2920693

Chaoxing Dai

University of Chicago - Booth School of Business

5807 S. Woodlawn Avenue
Chicago, IL 60637
United States

Kun Lu

Princeton University

22 Chambers Street
Princeton, NJ 08544-0708
United States

Dacheng Xiu (Contact Author)

University of Chicago - Booth School of Business

5807 S. Woodlawn Avenue
Chicago, IL 60637
United States

