Sparse Canonical Correlation Analysis from a Predictive Point of View

26 Pages Posted: 22 Jan 2014

Ines Wilms

Maastricht University

Christophe Croux

KU Leuven - Faculty of Business and Economics (FEB)

Date Written: 2013

Abstract

Canonical correlation analysis (CCA) describes the associations between two sets of variables by maximizing the correlation between linear combinations of the variables in each data set. However, in high-dimensional settings where the number of variables exceeds the sample size, or when the variables are highly correlated, traditional CCA is no longer appropriate. This paper proposes a method for sparse CCA. Sparse estimation produces linear combinations of only a subset of variables from each data set, thereby increasing the interpretability of the canonical variates. We consider the CCA problem from a predictive point of view and recast it into a multivariate regression framework. By combining a multivariate alternating regression approach with a lasso penalty, we induce sparsity in the canonical vectors. We compare the performance of the proposed method with that of other sparse CCA techniques in different simulation settings and illustrate its usefulness on a genomic data set.
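
The idea of alternating lasso regressions can be illustrated with a minimal sketch. It is not the authors' algorithm: it estimates only the first pair of canonical vectors, uses a plain coordinate-descent lasso, and takes an arbitrary starting value. All function names (`lasso_cd`, `sparse_cca_first_pair`, and so on), the penalty values, and the simulated data are assumptions made for illustration.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (b_j removed)
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b

def center(M):
    """Subtract each column mean so that the variates have mean zero."""
    n = len(M)
    means = [sum(row[j] for row in M) / n for j in range(len(M[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in M]

def project(M, w):
    """Linear combination M w, one value per observation."""
    return [sum(row[j] * w[j] for j in range(len(w))) for row in M]

def unit_scale(M, w):
    """Rescale w so that M w has unit mean square; zero vectors are left as-is."""
    s = project(M, w)
    norm = (sum(val * val for val in s) / len(s)) ** 0.5
    return [wj / norm for wj in w] if norm > 0 else w

def sparse_cca_first_pair(X, Y, lam_x, lam_y, n_outer=10):
    """Alternate two lasso regressions: X on the Y-variate, then Y on the X-variate."""
    v = unit_scale(Y, [1.0] * len(Y[0]))  # arbitrary starting value (assumption)
    u = [0.0] * len(X[0])
    for _ in range(n_outer):
        u = unit_scale(X, lasso_cd(X, project(Y, v), lam_x))
        v = unit_scale(Y, lasso_cd(Y, project(X, u), lam_y))
    return u, v

# Simulated example: only the first column of each data set carries the shared signal.
random.seed(0)
n = 200
z = [random.gauss(0, 1) for _ in range(n)]
X = center([[z[i] + 0.3 * random.gauss(0, 1)] + [random.gauss(0, 1) for _ in range(3)]
            for i in range(n)])
Y = center([[z[i] + 0.3 * random.gauss(0, 1)] + [random.gauss(0, 1) for _ in range(2)]
            for i in range(n)])
u, v = sparse_cca_first_pair(X, Y, lam_x=0.3, lam_y=0.3)
print("u =", u)
print("v =", v)
```

In this toy setting, the lasso penalty should shrink most or all of the coefficients on the pure-noise columns to zero, so the estimated variates depend mainly on the first variable of each set. In practice the penalty values would be chosen by a data-driven criterion rather than fixed by hand.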

Keywords: Canonical correlation analysis, Genomic data, Lasso, Multivariate regression, Sparsity

Suggested Citation

Wilms, Ines and Croux, Christophe, Sparse Canonical Correlation Analysis from a Predictive Point of View (2013). Available at SSRN: https://ssrn.com/abstract=2381968 or http://dx.doi.org/10.2139/ssrn.2381968

Ines Wilms (Contact Author)

Maastricht University

P.O. Box 616
Maastricht, Limburg 6200MD
Netherlands

Christophe Croux

KU Leuven - Faculty of Business and Economics (FEB)

Naamsestraat 69
Leuven, B-3000
Belgium