Using Double-Lasso Regression for Principled Variable Selection
70 Pages
Posted: 18 Feb 2016
Date Written: 2016
Abstract
The decision of whether to control for covariates, and how to select which covariates to include, is ubiquitous in psychological research. Failing to control for valid covariates can yield biased parameter estimates in correlational analyses or in imperfectly randomized experiments and contributes to underpowered analyses even in effectively randomized experiments. We introduce double-lasso regression as a principled method for variable selection. The double-lasso method is calibrated to not over-select potentially spurious covariates, and simulations demonstrate that using this method reduces error and increases statistical power. This method can be used to identify which covariates have sufficient empirical support for inclusion in analyses of correlations, moderation, mediation, and experimental interventions, as well as to test for the effectiveness of randomization. We illustrate both the method’s usefulness and how to implement it in practice by applying it to four analyses from the prior literature, using both correlational and experimental data.
Keywords: research methods, covariate, regression, variable selection, confound, omitted variable bias
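Illustrative sketch: the abstract does not spell out the procedure, so the following is only a minimal sketch of double-lasso selection as it is commonly described, not the authors' implementation. It assumes three steps: a lasso of the outcome on the candidate covariates, a lasso of the focal variable on the same covariates, and an ordinary least squares fit of the outcome on the focal variable plus the union of the covariates selected in either step. The helper names (lasso_cd, double_lasso), the simulated data, and the fixed penalty lam=0.1 are all assumptions made for illustration; in particular, the calibrated penalty the abstract refers to is not reproduced here.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent.
    Minimizes (1/2n)*||y - X b||^2 + lam*||b||_1; expects centered inputs."""
    n, p = X.shape
    b = np.zeros(p)
    r = y.astype(float).copy()            # current residual y - X b
    col_sq = (X ** 2).sum(axis=0) / n     # per-column scale
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]           # drop column j's contribution
            rho = X[:, j] @ r / n         # covariance with partial residual
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]           # restore with updated coefficient
    return b

def standardize(a):
    return (a - a.mean(axis=0)) / a.std(axis=0)

def double_lasso(y, d, X, lam):
    """Return (estimated coefficient on d, indices of selected covariates)."""
    Xs = standardize(X)
    sel_y = np.flatnonzero(lasso_cd(Xs, standardize(y), lam))  # step 1: outcome
    sel_d = np.flatnonzero(lasso_cd(Xs, standardize(d), lam))  # step 2: focal variable
    keep = np.union1d(sel_y, sel_d).astype(int)                # union of both selections
    design = np.column_stack([np.ones(len(y)), d, X[:, keep]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)          # step 3: OLS
    return coef[1], keep

# Simulated data (hypothetical): covariate 0 confounds d and y,
# covariate 1 predicts y only, the remaining 18 are pure noise.
rng = np.random.default_rng(0)
n, p = 500, 20
X = rng.normal(size=(n, p))
d = 0.8 * X[:, 0] + rng.normal(size=n)
y = 0.5 * d + 1.0 * X[:, 0] + 0.7 * X[:, 1] + rng.normal(size=n)

effect, keep = double_lasso(y, d, X, lam=0.1)
print("selected covariates:", keep)
print("estimated effect of d:", round(float(effect), 3))
```

In this simulation the true coefficient on d is 0.5. Selecting covariates from both the outcome and the focal-variable regressions is what guards against dropping a confounder that predicts the focal variable strongly but the outcome only weakly.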