The Changing Face of Epidemiology
Nonexperimental Comparative Effectiveness Research Using Linked Healthcare Databases
Stürmer, Til; Jonsson Funk, Michele; Poole, Charles; Brookhart, M. Alan
From the Pharmacoepidemiology Program, Department of Epidemiology, UNC Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC.
Supported by the National Institute on Aging (R01 AG023178 and K25 AG27400); AHRQ (K02 HS17950); the UNC-GSK Center of Excellence in Pharmacoepidemiology and Public Health, an innovative academia-industry collaboration; and unrestricted research grants from the pharmaceutical industry (eg, Merck, Sanofi-Aventis, Amgen). UNC houses an AHRQ DEcIDE Center.
Editors' note: Related articles appear on pages 290, 292, 295, and 302.
Correspondence: Til Stürmer, McGavran-Greenberg, CB 7435, Chapel Hill, NC 27599-7435. E-mail: email@example.com or firstname.lastname@example.org.
Comparative effectiveness research has gained a great deal of attention over the past years through the new federal coordinating council,1 a recent Institute of Medicine (IOM) report,2 and the American Recovery & Reinvestment Act stimulus funding.3 Comparative effectiveness research has a broad scope as defined by the IOM, addressing “... the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition or to improve the delivery of care. The purpose of [comparative effectiveness research] is to assist consumers, clinicians, purchasers, and policymakers to make informed decisions that will improve health care at both the individual and population levels.”2
So what's new? As pharmacoepidemiologists, we could point out that we have been generating evidence on health-relevant drug benefits and harms at the population level for over 25 years (the International Society for Pharmacoepidemiology just had its 26th International Conference). And we could point out that we have moved from using untreated persons as a comparison group when assessing drug effects toward using persons treated with a realistic clinical alternative as a comparator group before the need for comparative effectiveness research became widely recognized (eg, Brookhart et al4). While comparative effectiveness research is much broader than pharmacoepidemiology, what is really new for us is the implicit acknowledgment of the value of nonexperimental evidence for benefits. Until recently,5 the U.S. Food and Drug Administration (FDA) has largely dismissed nonexperimental evidence on drug benefits (eg, Temple6), mainly because of fear about intractable confounding by indication.7,8 The FDA also insists on comparing treated with untreated persons (including those using a placebo) and does not accept the idea of a comparator drug for proof of efficacy.9 It is interesting to see that other U.S. government agencies, including the Agency for Healthcare Research and Quality through their network of DEcIDE (Developing Evidence to Inform Decisions about Effectiveness) Centers,10 see this differently. It is also encouraging to see that some divisions of the FDA (eg, the Division of Epidemiology in the Office of Surveillance and Biometrics, Center for Devices and Radiological Health) are pioneers in recognizing the value of nonexperimental research.5
When we focus on the nonexperimental evaluation of the use and the beneficial and harmful effects of drugs in the population (ie, pharmacoepidemiology), we have to acknowledge some specific aspects of drugs that we need to take into account. Drugs are the mainstay of contemporary medicine. They are used for the primary, secondary, and tertiary prevention of disease outcomes. A multibillion-dollar industry constantly screens chemical compounds for physiologic effects. Drugs are marketed only after experimental proof of efficacy (benefits) in humans. They are therefore likely to affect disease outcomes.
Most prescription drugs are dispensed at pharmacies, which then submit claims to payors such as Medicare, Medicaid, or private insurance companies. Likewise, physician visits, hospital stays, laboratory test results, procedures, injections, and other encounters with the healthcare system generate paper trails (or the electronic equivalent thereof). By linking these data across various sources (insurance claims, electronic health records, clinical records, billing data, laboratory results, and vital statistics), an integrated picture of the patient's health and healthcare emerges. Researchers can obtain permission to access these data (stripped of key identifiers), with appropriate confidentiality safeguards.11
These databases have unique advantages for epidemiologic research. Most are population based and therefore less prone to the healthy (given the target) selection that is virtually unavoidable when recruiting and consenting participants for randomized trials or cohort studies.12 Linked healthcare databases include continuous service dates rather than the interval assessments (eg, every 2 years) that are common in epidemiologic cohort studies and many large trials. Continuous assessment of exposure and outcomes allows us to be specific about timing. This is an important consideration, given that having the exposure before the outcome is arguably the only sine qua non condition for causality.13 Linked healthcare databases contain information on almost all drugs prescribed or dispensed in an outpatient setting. They include codes for outpatient and inpatient diagnoses and procedures that the patient has received.11 They also have major downsides. Without pretending to be exhaustive (which is neither within the scope of this commentary nor necessary for our argument), these include lack of data on important confounders, lack of data on drugs administered during hospitalization or purchased over the counter, lack of mortality data (in some databases), lack of data on the sensitivity and specificity of various algorithms to define outcomes, and lack of data on events not covered by the corresponding insurance plan.11
Let us now step back from linked healthcare databases and highlight several important threats to the validity of nonexperimental research on drug effects in general. In addition to other sources of confounding, there is the potential for confounding by indication. Sicker patients are almost always more likely to be treated and are often more likely to have bad outcomes. If the severity of disease is unknown or measured with error, residual confounding will make the drug look bad. Recent work has focused on another kind of unmeasured confounding, confounding by frailty. Frailty is a difficult-to-measure condition preceding death that is not linked to a specific pathology but rather an overall poor state of health. Frailty is probably easily recognized by trained physicians. Unlike study populations in many other nonexperimental and experimental settings, study populations assembled from large linked healthcare databases include very frail patients. Frailty may reduce the likelihood of a particular treatment if physicians focus on a patient's main medical problem and do not initiate useful therapies for secondary conditions.14 The practitioner may determine that a therapy offers little expected benefit in the presence of competing risks.15 Because frailty is hard to measure and a very strong risk factor for poor outcomes (especially mortality), it will lead to unmeasured and residual confounding. When comparing the treated with the untreated, however, frailty will often tend to make the drug look good. Frailty is a plausible explanation for paradoxical treatment-outcome associations observed in the elderly.16-18
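The direction of this bias can be illustrated with a small worked calculation. What follows is a minimal sketch under purely hypothetical values (20% of patients frail, frail patients treated less often, frail patients at much higher risk of death, and a drug with no effect at all); none of these numbers comes from the studies cited above.

```python
def crude_risk_ratio(p_frail, p_treat_frail, p_treat_robust, risk_frail, risk_robust):
    """Expected crude risk ratio (treated vs. untreated) for a drug with NO effect.

    Risk of death depends only on frailty, never on treatment, so any
    departure of the crude risk ratio from 1 is confounding by frailty.
    """
    # Joint probabilities of treatment status and frailty status
    treated_frail = p_frail * p_treat_frail
    treated_robust = (1 - p_frail) * p_treat_robust
    untreated_frail = p_frail * (1 - p_treat_frail)
    untreated_robust = (1 - p_frail) * (1 - p_treat_robust)

    # Risk in each treatment group is a mixture of the two frailty strata
    risk_treated = (treated_frail * risk_frail + treated_robust * risk_robust) / (
        treated_frail + treated_robust
    )
    risk_untreated = (untreated_frail * risk_frail + untreated_robust * risk_robust) / (
        untreated_frail + untreated_robust
    )
    return risk_treated / risk_untreated


# Hypothetical inputs: 20% frail; 10% of frail vs. 50% of robust patients treated;
# risk of death 40% if frail vs. 5% if robust.
rr = crude_risk_ratio(0.2, 0.1, 0.5, 0.4, 0.05)
print(round(rr, 2))  # 0.42
```

Within each frailty stratum the risk ratio is 1 by construction, so the apparent protective effect in the crude comparison comes entirely from the unmeasured difference in frailty between treated and untreated patients.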
Besides confounding by unmeasured confounders and residual confounding, selection bias over time on treatment is a major problem when assessing longer-term effects of drugs. Patients who persist with treatments over prolonged periods tend to be healthier. Conversely, patients who stop treatments tend to be sicker. This leads to increasingly healthy users with increasing duration of treatment.19,20 Similar to confounding by frailty, selection bias from healthy users tends to be most pronounced for all-cause mortality.21 Those not adherent to placebo in randomized trials have been shown to have twice the mortality rates of those adherent to placebo.22 There is also the potential for immortal-time bias.23
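How users become increasingly healthy with increasing duration of treatment can be sketched with simple arithmetic. The example below assumes, hypothetically, that 20% of initiators are frail and that frail patients stop treatment at a constant monthly probability of 10% compared with 3% for robust patients; the function name and all values are illustrative assumptions, not estimates from the cited work.

```python
def frail_fraction_among_persistent(months, p_frail=0.2, stop_frail=0.10, stop_robust=0.03):
    """Expected share of frail patients among those still on treatment.

    All defaults are hypothetical: 20% frail at initiation and a constant
    monthly probability of stopping of 10% (frail) vs. 3% (robust).
    """
    still_on_frail = p_frail * (1 - stop_frail) ** months
    still_on_robust = (1 - p_frail) * (1 - stop_robust) ** months
    return still_on_frail / (still_on_frail + still_on_robust)


durations = (0, 6, 12, 24)
fractions = [frail_fraction_among_persistent(m) for m in durations]
for months, share in zip(durations, fractions):
    print(f"{months:2d} months on treatment: {share:.1%} frail")
```

Under these assumptions the share of frail patients among persistent users falls steadily with time on treatment, so a comparison of long-term users with nonusers compares progressively healthier patients with everyone else.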
None of these threats to validity is specific to pharmacoepidemiology or linked healthcare databases. For example, selection bias is a major issue in occupational epidemiology (eg, healthy worker bias24). Furthermore, it is only the potential for major confounding that is specific to nonexperimental study designs. The lack of an inherent sampling structure compared with ad hoc studies may, however, increase the risk for flawed designs.
Methodologists in pharmacoepidemiology have made substantial progress in recent years addressing the above-mentioned threats to validity. Recent developments include the new-user design,25 which allows us to focus on treatment-initiation decision processes and hypothetical interventions. The new-user design allows us to implement propensity scores26 and instrumental variables,27 and to assess positivity28,29 and treatment contrary to prediction.18 All these can be used to restrict study populations to those patients who have an indication for the initiation of all treatments compared (ie, with some equipoise in the decision between all treatments compared) and then balance the risk for the disease outcome between these cohorts of patients treated differently. The new-user design also allows us to address stopping, switching, and augmenting drug use after baseline separate from the confounding at baseline. We are thus able to apply various methods to deal with changing treatments over time, ranging from immediate censoring to ignoring (first treatment carried forward or intention-to-treat analysis). If we are able to predict persistence, we can apply marginal structural models.30-32 Finally, comparing patients who start one drug with patients who start another drug for the same indication reduces the potential for (and thus magnitude of) most of the biases outlined above, including confounding, selection bias, and immortal-time bias. The “comparative” in comparative effectiveness research can therefore help us to avoid major biases when making nonexperimental treatment comparisons.
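As an illustration of how a new-user cohort with an active comparator might be assembled from dispensing claims, consider the following minimal sketch. The record layout, the function name `new_users`, the 180-day washout window, and the example patients are all assumptions made for illustration; the design papers cited above do not prescribe a specific implementation.

```python
from datetime import date, timedelta

WASHOUT_DAYS = 180  # assumed washout window; not specified in the text


def new_users(dispensings, enrollment_start, study_drugs, washout_days=WASHOUT_DAYS):
    """Return {patient_id: (index_date, drug)} for new users of any study drug.

    A patient qualifies if the first observed dispensing of any study drug
    occurs after at least `washout_days` of enrollment, so that the whole
    washout window is observed and free of all drugs being compared.
    """
    first = {}
    # Walk through dispensings in date order and keep the earliest study drug
    for pid, drug, when in sorted(dispensings, key=lambda r: r[2]):
        if drug in study_drugs and pid not in first:
            first[pid] = (when, drug)

    cohort = {}
    for pid, (index_date, drug) in first.items():
        start = enrollment_start.get(pid)
        if start is not None and index_date - start >= timedelta(days=washout_days):
            cohort[pid] = (index_date, drug)
    return cohort


# Hypothetical records: (patient_id, drug_class, dispensing_date)
dispensings = [
    ("p1", "ARB", date(2009, 9, 1)),
    ("p2", "ACEI", date(2009, 7, 1)),
    ("p3", "ARB", date(2009, 10, 1)),
    ("p3", "ACEI", date(2009, 8, 1)),
]
enrollment_start = {
    "p1": date(2009, 1, 1),
    "p2": date(2009, 6, 1),  # only 30 days of history before first dispensing
    "p3": date(2009, 1, 1),
}

cohort = new_users(dispensings, enrollment_start, {"ARB", "ACEI"})
for pid in sorted(cohort):
    print(pid, cohort[pid][1], cohort[pid][0].isoformat())
```

In this sketch, patient p2 is excluded because the washout window is not fully observed, and patient p3 enters as an initiator of the first drug received. Follow-up, censoring at stopping or switching, and covariate assessment in the washout window would all be anchored on the returned index date.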
Based on these recent developments, we propose that study design has a larger influence on validity of pharmacoepidemiologic studies than whether we use linked healthcare databases or data from ad hoc studies. For instance, a reanalysis of the data from the Nurses' Health Study on the effects of estrogen and progestin therapy on coronary heart disease in postmenopausal women based on a new-user design and dealing with selection bias after initiation33 showed results compatible with those from a large randomized trial. Threats to validity in pharmacoepidemiology are by no means specific to linked healthcare databases.
We need timely and trustworthy answers about both drug benefits and harms in the population. Such answers are essential to safeguard public health, and they can rarely be obtained by new data collection, be it within randomized controlled trials, large simple trials, or cohort studies. Current examples include insulin glargine and angiotensin receptor blockers, which have been suggested to increase risk for malignancies.34,35 Fortunately, in both cases we have obvious comparator drugs that allow us to limit the potential for bias, but we nonetheless need to be careful to avoid flaws in the study design.36
Are we brave or foolhardy? The answer may depend on what we want to do. We can be brave and study treatments with similar indications (eg, insulin glargine vs. long- or intermediate-acting human insulin, angiotensin receptor blockers vs. angiotensin-converting enzyme inhibitors), unintended effects, and short-term effects. We would probably be foolhardy to study intended long-term effects without a comparator, eg, statins versus no statin. Note that this list is not exhaustive. There will always be gray zones, and it may very well be easier for academics to live with these than for industry and regulatory agencies to do so.37
Large linked healthcare databases offer major advantages for pharmacoepidemiologic research.38 While certainly not ideal, the population is often closer to ideal than that of ad hoc studies because it is unselected (eg, Medicare, General Practice Research Database, Scandinavia39). And, while again not ideal, the information on drug exposure is almost ideal for prescription drugs in the outpatient setting, ie, for most of the drugs used. We can study clinically relevant outcomes, and given the large size of linked healthcare databases, we can find timely answers to multiple questions without waiting for new data to be collected. Among the many downsides is the lack of information on important covariates. But we have design options to limit and reduce confounding by unmeasured confounders and selection bias. Large linked healthcare databases allow us to answer some important questions that could otherwise not be answered. The future will allow us to link electronic medical records, cohort studies, and claims data. The ideal database will remain elusive, however.
We need to use state-of-the-art nonexperimental methodology applied to large linked healthcare databases to answer appropriate pharmacoepidemiological and comparative effectiveness research questions. We also need more head-to-head, simple, large randomized trials that compare the effect of relevant treatment alternatives on clinically relevant outcomes (eg, based on linked healthcare databases) in unselected populations.
ABOUT THE AUTHORS
The authors are the core faculty of the Pharmacoepidemiology Program within the Department of Epidemiology at the UNC Gillings School of Global Public Health. The core faculty comprises epidemiologists with backgrounds in medicine (Til Stürmer), psychology (Michele Jonsson Funk), health administration (Charles Poole), and biostatistics (Alan Brookhart) who share common interest in the development and assessment of innovative research methods, specifically for the nonexperimental evaluation of drug benefits and harms using large linked healthcare databases. The program's Web site is:
REFERENCES
2. Institute of Medicine. Initial National Priorities for Comparative Effectiveness Research. Washington, DC: National Academies Press; 2009.
4. Brookhart MA, Wang PS, Solomon DH, Schneeweiss S. Evaluating short-term drug effects using a physician-specific prescribing preference as an instrumental variable. Epidemiology. 2006;17:268–275.
6. Temple R. Problems in the use of large data sets to assess effectiveness. Int J Technol Assess Health Care. 1990;6:211–219.
7. Miettinen OS. The need for randomization in the study of intended drug effects. Stat Med. 1983;2:267–271.
8. Yusuf S, Collins R, Peto R. Why do we need some large, simple randomized trials? Stat Med. 1984;3:409–420.
11. Strom BL. Pharmacoepidemiology. 4th ed. Chichester, UK: John Wiley & Sons Ltd; 2005.
12. Sesso HD, Gaziano JM, VanDenburgh M, Hennekens CH, Glynn RJ, Buring JE. Comparison of baseline characteristics and mortality experience of participants and nonparticipants in a randomized clinical trial: the Physicians' Health Study. Control Clin Trials. 2002;23:686–702.
13. Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008.
14. Redelmeier DA, Tan SH, Booth GL. The treatment of unrelated disorders in patients with chronic medical diseases. N Engl J Med. 1998;338:1516–1520.
15. Welch HG, Albertsen PC, Nease RF, Bubolz TA, Wasson JH. Estimating treatment benefits for the elderly: the effect of competing risks. Ann Intern Med. 1996;124:577–584.
16. Glynn RJ, Knight EL, Levin R, Avorn J. Paradoxical relations of drug treatment with mortality in older persons. Epidemiology. 2001;12:682–689.
17. Jackson LA, Jackson ML, Nelson JC, Neuzil KM, Weiss NS. Evidence of bias in estimates of influenza vaccine effectiveness in seniors. Int J Epidemiol. 2006;35:337–344.
18. Stürmer T, Rothman KJ, Avorn J, Glynn RJ. Treatment effects in the presence of unmeasured confounding: dealing with observations in the tails of the propensity score distribution: a simulation study. Am J Epidemiol. 2010;172:843–854.
19. Brookhart MA, Patrick AR, Dormuth C, Avorn J, Shrank W, Cadarette SM, Solomon DH. Adherence to lipid-lowering therapy and the use of preventive health services: an investigation of the healthy user effect. Am J Epidemiol. 2007;166:348–354.
20. Brookhart MA, Stürmer T, Glynn RJ, Rassen J, Schneeweiss S. Confounding control in healthcare database research: challenges and potential approaches. Med Care. 2010;48(6 suppl):S114–S120.
21. Andersen M, Brookhart MA, Glynn RJ, Stovring H, Stürmer T. Practical issues in measuring cessation and re-initiation of drug use in databases [abstract]. Pharmacoepidemiol Drug Saf. 2008;17(suppl 1):S27.
22. Simpson SH, Eurich DT, Majumdar SR, et al. A meta-analysis of the association between adherence to drug therapy and mortality. BMJ. 2006;333:15.
23. Suissa S. Immortal time bias in observational studies of drug effects. Pharmacoepidemiol Drug Saf. 2007;16:241–249.
24. Pearce N, Checkoway H, Kriebel D. Bias in occupational epidemiology studies. Occup Environ Med. 2007;64:562–568.
25. Ray WA. Evaluating medication effects outside of clinical trials: new-user designs. Am J Epidemiol. 2003;158:915–920.
26. Glynn RJ, Schneeweiss S, Stürmer T. Indications for propensity scores and review of their use in pharmacoepidemiology. Basic Clin Pharmacol Toxicol. 2006;98:253–259.
27. Brookhart MA, Wang PS, Solomon DH, Schneeweiss S. Evaluating short-term drug effects using a physician-specific prescribing preference as an instrumental variable. Epidemiology. 2006;17:268–275.
28. Messer LC, Oakes JM, Mason S. Effects of socioeconomic and racial residential segregation on preterm birth: a cautionary tale of structural confounding. Am J Epidemiol. 2010;171:664–673.
29. Westreich D, Cole SR. Invited commentary: positivity in practice [commentary]. Am J Epidemiol. 2010;171:674–677.
30. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11:550–560.
31. Hernán MA, Brumback B, Robins JM. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology. 2000;11:561–570.
32. Cole SR, Hernán MA. Constructing inverse probability weights for marginal structural models. Am J Epidemiol. 2008;168:656–664.
33. Hernán MA, Alonso A, Logan R, et al. Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease. Epidemiology. 2008;19:766–779.
34. Hemkens LG, Grouven U, Bender R, et al. Risk of malignancies in patients with diabetes treated with human insulin or insulin analogues: a cohort study. Diabetologia. 2009;52:1732–1744.
35. Sipahi I, Debanne SM, Rowland DY, Simon DI, Fang JC. Angiotensin-receptor blockade and risk of cancer: meta-analysis of randomised controlled trials. Lancet Oncol. 2010;11:627–636.
36. Pocock SJ, Smeeth L. Insulin glargine and malignancy: an unwarranted alarm [letter]. Lancet. 2009;374:511–513.
37. Vandenbroucke JP. Observational research, randomized trials, and two views of medical science. PLoS Med. 2008;5:e67.
38. ISPE. Guidelines for good pharmacoepidemiology practices (GPP). Pharmacoepidemiol Drug Saf. 2008;17:200–208.
39. Frank L. Epidemiology: When an entire country is a cohort. Science. 2000;287:2398–2399.
Editors' note: This series addresses topics that affect epidemiologists across a range of specialties. Commentaries are first invited as talks at symposia organized by the Editors. This paper was originally presented at the 2010 Society for Epidemiologic Research Annual Meeting in Seattle, WA.
© 2011 Lippincott Williams & Wilkins, Inc.