Large data sets with many variables pose particular challenges when constructing analytic models. Lasso-related methods are a useful tool for such data, although one that remains unfamiliar to most epidemiologists.
We illustrate the application of lasso methods in an analysis of the impact of prescribed drugs on the risk of a road traffic crash, using a large French nationwide database (PLoS Med 2010;7:e1000366). In the original case-control study, the authors analyzed each exposure separately. We use the lasso method, which can simultaneously perform estimation and variable selection in a single model. We compare point estimates and confidence intervals using (1) a separate logistic regression model for each drug with a Bonferroni correction and (2) lasso shrinkage logistic regression analysis.
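The contrast between the two strategies can be sketched on simulated data. The example below is a minimal illustration, not the authors' analysis: the data, the sample size, the effect sizes, the penalty value `lam`, and all function names are assumptions made for the sketch. It uses unadjusted 2x2-table odds ratios with Bonferroni-adjusted Wald intervals for the one-drug-at-a-time strategy, and a simple proximal gradient solver for the L1-penalized logistic model. It omits the bias correction and the confidence intervals for the lasso estimates mentioned in the results.

```python
import math
import random
from statistics import NormalDist


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_logistic(X, y, lam, n_iter=300):
    """Minimise mean logistic loss + lam * sum(|beta_j|) by proximal
    gradient descent. The intercept is left unpenalised."""
    n, p = len(X), len(X[0])
    # Assumption: exposures are coded 0/1, so 4 / (1 + p) is a safe step size.
    step = 4.0 / (1 + p)
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        grad0 = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        intercept -= step * grad0
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return intercept, beta


def crude_log_or(x, y, alpha):
    """Unadjusted log odds ratio with a Wald interval at level alpha.
    Assumes no cell of the 2x2 table is empty."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)  # exposed cases
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)  # exposed controls
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)  # unexposed cases
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)  # unexposed controls
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return log_or, log_or - z * se, log_or + z * se


# Hypothetical simulated data: 5 drugs, only the first affects crash risk.
random.seed(0)
n, p = 500, 5
true_beta = [2.0, 0.0, 0.0, 0.0, 0.0]
X = [[1 if random.random() < 0.3 else 0 for _ in range(p)] for _ in range(n)]
y = [1 if random.random() < sigmoid(-1.0 + sum(b * x for b, x in zip(true_beta, row)))
     else 0 for row in X]

# Strategy 1: one model per drug, Bonferroni-adjusted significance level.
alpha_bonf = 0.05 / p
for j in range(p):
    est, lo, hi = crude_log_or([row[j] for row in X], y, alpha_bonf)
    print(f"drug {j}: log OR = {est:.2f}, adjusted CI = ({lo:.2f}, {hi:.2f})")

# Strategy 2: all drugs in one L1-penalised model (lam chosen arbitrarily here).
intercept_hat, beta_hat = lasso_logistic(X, y, lam=0.02)
print("lasso coefficients:", [round(b, 2) for b in beta_hat])
```

In this sketch the penalty sets some coefficients exactly to zero, which is how the lasso combines estimation and variable selection in one model. In practice the penalty would be tuned from the data rather than fixed in advance.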
Shrinkage regression had little effect on (bias-corrected) point estimates, but led to less conservative results, notably for drugs with moderate levels of exposure. Carbamates, carboxamide derivative and fatty acid derivative antiepileptics, drugs used in opioid dependence, and mineral supplements of potassium showed stronger associations.
Lasso is a relevant method for the analysis of databases with a large number of exposures and can be recommended as an alternative to conventional strategies.
From the (a) Université Bordeaux, ISPED, Centre INSERM U897-Epidemiologie-Biostatistique, Bordeaux, France; (b) INSERM, ISPED, Centre INSERM U897-Epidemiologie-Biostatistique, Bordeaux, France; (c) Agrocampus Ouest, Rennes, France; and (d) CNRS UMR 6599, Heudiasyc, University of Technology of Compiègne, Compiègne, France.
Submitted 5 April 2011; accepted 15 March 2012; posted 3 July 2012.
The CESIR-A project was funded by the French Health Products Agency (AFSSAPS), the French National Research Agency (ANR, DDA 0766CO204), the French Medical Research Foundation (Equipe FRM), the French Direction Générale de la Santé (DES), and the French National Institute for Medical Research (Equipe INSERM Avenir).
The author reported no financial interests related to this research.
Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com). This content is not peer-reviewed or copy-edited; it is the sole responsibility of the author.
Correspondence: Marta Avalos, ISPED, Université Bordeaux Segalen, F-33076 Bordeaux, France. E-mail: email@example.com.