Studies with Many Covariates and Few Outcomes: Selecting Covariates and Implementing Propensity-Score–Based Confounding Adjustments

Patorno, Elisabetta (a); Glynn, Robert J. (a); Hernández-Díaz, Sonia (b); Liu, Jun (a); Schneeweiss, Sebastian (a)

doi: 10.1097/EDE.0000000000000069

Background: Propensity scores are useful for confounding adjustment in the commonly observed setting of many potential confounders, frequent exposure, and rare events. However, with few exposed outcomes to inform covariate selection and many candidate confounders, optimal approaches to construct and implement propensity-score–based confounding adjustment remain unclear.

Methods: In a cohort study of the effect of anticonvulsant drugs on cardiovascular risk among adult patients from the HealthCore Integrated Research Database, we compared how well different covariate-selection strategies for propensity-score estimation (expert knowledge only, expert knowledge informed by empirical covariate selection via the high-dimensional propensity-score algorithm, and high-dimensional propensity-score empirical specification only) and different propensity-score–based adjustment methods (propensity-score matching and propensity-score decile stratification) controlled confounding. This article focuses on the first 90 days of follow-up because any treatment effect identified in this temporal window almost certainly originates from residual confounding rather than pharmacologic action.
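
The logic of propensity-score decile stratification under a true null effect can be illustrated on simulated data. The following is a minimal sketch and not the authors' analysis: it assumes a single simulated confounder, treats the true exposure probability as the propensity score instead of estimating it, and swaps the study's hazard ratio for a Mantel-Haenszel risk ratio so that the example needs only the Python standard library. All function names and parameter values are hypothetical.

```python
import bisect
import math
import random
from statistics import quantiles


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, seed=0):
    """Simulate one confounder, a binary exposure, and a rare binary outcome.

    The exposure has NO effect on the outcome, so any crude association
    is pure confounding. Coefficients are arbitrary illustrative values.
    """
    rng = random.Random(seed)
    exposed, outcome, ps = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)            # confounder
        p = expit(-1.0 + x)                # true propensity score (assumed known)
        exposed.append(rng.random() < p)
        outcome.append(rng.random() < expit(-4.0 + x))
        ps.append(p)
    return exposed, outcome, ps


def risk_ratio(exposed, outcome):
    """Crude (unadjusted) risk ratio, exposed versus unexposed."""
    a = sum(1 for e, y in zip(exposed, outcome) if e and y)
    c = sum(1 for e, y in zip(exposed, outcome) if not e and y)
    n1 = sum(exposed)
    n0 = len(exposed) - n1
    return (a / n1) / (c / n0)


def decile_strata(scores):
    """Assign each score to a decile (0-9) of the pooled score distribution."""
    cuts = quantiles(scores, n=10)         # nine cut points
    return [bisect.bisect_left(cuts, s) for s in scores]


def mantel_haenszel_rr(exposed, outcome, strata):
    """Mantel-Haenszel risk ratio pooled across strata."""
    cells = {}
    for e, y, s in zip(exposed, outcome, strata):
        cell = cells.setdefault(s, [0, 0, 0, 0])   # [a, n1, c, n0]
        if e:
            cell[0] += y
            cell[1] += 1
        else:
            cell[2] += y
            cell[3] += 1
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in cells.values())
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in cells.values())
    return num / den


exposed, outcome, ps = simulate(50_000, seed=1)
crude = risk_ratio(exposed, outcome)
adjusted = mantel_haenszel_rr(exposed, outcome, decile_strata(ps))
print(f"crude RR = {crude:.2f}, decile-stratified RR = {adjusted:.2f}")
```

Because the simulated exposure has no effect, the stratified estimate should move toward 1 relative to the crude estimate; any remaining departure reflects residual confounding within deciles and sampling variability, which mirrors the reasoning behind the 90-day window.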

Results: We identified 166,031 new users and 564 ischemic cardiovascular events. Among those, 12,580 patients initiated anticonvulsants that strongly induce cytochrome P450 enzymes and experienced 68 events. The unadjusted hazard ratio was 1.72 (95% confidence interval = 1.34–2.22). Adjustment for investigator-identified covariates led to 41% to 59% reductions in the hazard ratio; adjustment for both investigator-identified and high-dimensional propensity-score empirically identified covariates led to larger reductions (54% to 72%). A selection strategy based on high-dimensional propensity-score empirical specification alone produced less-attenuated and more-volatile hazard ratio estimates. This volatility seemed to be slightly attenuated in a trimmed propensity-score–stratified analysis.

Conclusions: The high-dimensional propensity-score algorithm complements expert knowledge for confounding adjustment, but in settings with few exposed outcomes, its performance without investigator-specified covariates is less clear and may be associated with an increased likelihood of bias. In our example, investigator specification of variables combined with high-dimensional propensity-score empirical selection and the use of trimmed propensity-score–stratified analysis seem to improve effect estimation. Plotting the relation of effect estimates to the increasing number of empirical covariates is a useful diagnostic.

Supplemental Digital Content is available in the text.

From the (a) Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA; and (b) Department of Epidemiology, Harvard School of Public Health, Boston, MA.

Supported by the Pharmacoepidemiology Program at Harvard School of Public Health, which is funded by Pfizer and Asisa (to E.P.). Robert J. Glynn has received grants to his institution from AstraZeneca and Novartis for the design, interim monitoring, and analysis of clinical trials. Sonia Hernández-Díaz participates in The North American Antiepileptic Drugs (AED) Pregnancy Registry, which receives grants from multiple pharmaceutical companies, and has consulted for Novartis and GSK Biologics pregnancy registries for medications not the subject of this analysis. The Pharmacoepidemiology Program at Harvard School of Public Health receives funds for training grants from Pfizer. Jun Liu has no conflicts to report.

Dr. Schneeweiss is Principal Investigator of the Harvard-Brigham Drug Safety and Risk Management Research Center funded by FDA. He is a consultant to WHISCON LLC and Booz & Co., and his research is partially funded by investigator-initiated grants to the Brigham and Women’s Hospital from Pfizer, Novartis, and Boehringer-Ingelheim unrelated to the topic of this study. He is a consultant to and shareholder of Aetion, Inc., a start-up software company that seeks to provide a database backbone and intuitive user interface for database analysis of real-world data assets.

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article. This content is not peer-reviewed or copy-edited; it is the sole responsibility of the author.

Editors’ note: A commentary on this article appears on page 279.

Correspondence: Elisabetta Patorno, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, 1620 Tremont Street (Suite 3030), Boston, MA 02120. E-mail:

© 2014 by Lippincott Williams & Wilkins, Inc