Prediction

The new version of Stata (release 16) includes LASSO regression. This is excellent, because LASSO is one of the better methods for developing prediction and classification models. However, like stepwise regression, it is unfit for producing parameter estimates adjusted for confounding bias. Such adjustment must be based on considerations of cause-effect relations (i.e., confounders must be included in, and mediators and colliders excluded from, the statistical model used for the estimation). This information cannot be derived from the data alone.
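To make the point concrete, here is a small simulation sketch in Python rather than Stata. Everything in it is an assumption made for illustration: the data-generating model, the true effect of 1.0, the penalty `alpha=0.05`, and the hand-written helper `lasso_cd`, which is a bare-bones coordinate-descent lasso and not Stata's implementation. The variable `c` is a collider (caused by both exposure and outcome), so it should be left out of the model, yet it predicts the outcome very well.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical causal structure: x -> y (true effect 1.0), and x -> c <- y.
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)
c = x + y + rng.normal(size=n)          # collider: must NOT be adjusted for

# Centre everything so that no intercept is needed.
x, y, c = (v - v.mean() for v in (x, y, c))


def lasso_cd(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xb||^2 + alpha * ||b||_1 by coordinate descent."""
    n_obs, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n_obs
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's current contribution added back.
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r / n_obs
            # Soft-thresholding update for coordinate j.
            b[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
    return b


# Correct causal model: y on x only.
beta_unadjusted = (x @ y) / (x @ x)

# Models that include the collider.
X = np.column_stack([x, c])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_lasso = lasso_cd(X, y, alpha=0.05)

print("y ~ x         : effect of x =", round(beta_unadjusted, 3))
print("y ~ x + c OLS : coefficients =", np.round(beta_ols, 3))
print("y ~ x + c lasso: coefficients =", np.round(beta_lasso, 3))
```

In this particular setup the population regression of y on x and c has coefficient 0 for x and 0.5 for c, so a method that selects variables for prediction has every reason to keep the collider and shrink the exposure towards zero. Nothing in the data marks `c` as a collider; that knowledge has to come from the analyst.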

I fear that we will soon see publications in which LASSO regression is used for the wrong purpose.

--

Addition July 7, 2019

Quote from Stata:

"Lasso is intended for prediction and selects covariates that are jointly correlated with the variables that belong in the best-approximating model. Said differently, lasso estimates the variables that belong in the model. Like all estimation, this is subject to error. However you put it, the inference methods are robust to these errors if the true variables are among the potential control variables that you specify."

The condition "if the true variables are among the potential control variables that you specify" is crucial. The last sentence should be read with the emphasis on "you specify". Do not expect the LASSO method to make that specification for you.

