Is the age of maturity for automated databases (predicted for decades1–5) finally arriving? Database use has increased dramatically, particularly for the study of medications and other therapeutic interventions. This trend is fueled by three developments: growing recognition that randomized controlled trials, while essential, cannot answer all of the important questions about the efficacy and safety of therapeutics; ever-improving informatics technology; and advances in epidemiologic methods. Indeed, the recent debut of the multimillion-dollar US FDA Sentinel Initiative testifies to the high expectations for automated databases.
A session at the 2010 meeting of the Society for Epidemiologic Research sounded an important note of caution. It is undisputed that automated databases have the potential to enable very large retrospective studies of both relative safety and efficacy that would be infeasible with traditional methods. However, the symposium presenters6–8 cautioned that this potential must be balanced against the nontrivial limitations of these data systems. The underlying premise of database studies is deceptively simple—use computerized information for very large populations to rapidly define cohorts and compare end point occurrence in exposed versus unexposed groups. But the reality of these studies is far more complex, and there are many pitfalls that await the unwary (Table).
Adequate data quality is the foundation for any study. Its absence can have disastrous effects, as these minatory examples remind us:
- A study mistakenly concluded that community hospitals performed excessive coronary artery bypass grafts, based on a database in which the patient zip code field sometimes contained the hospitals' zip code instead.9
- A report concluded that suicides were approximately doubled following natural disasters—a finding later determined to be due to data linkage errors.10
- Silicone breast implants were reported to have a large apparent protective effect against breast cancer; this finding was later suggested to be due at least in part to the researchers' unfamiliarity with the diverse databases used in the study.11
Several trends have set the stage for such data errors. Large numbers of databases are now available from diverse sources. The ultimate consumers of the data may be far removed from the data originators, and may have very limited capacity to check data elements against the sources. Data-privacy concerns may compound the difficulties of data validation. Quality problems may be further increased by database linkages relying upon incomplete information. These problems may be amplified for studies performed with multiple databases—a common strategy to increase study power.
Standardized validation/certification processes could improve data quality. While such certification would be far more complex than a simplistic “Good Housekeeping Seal of Approval” model, a standardized set of procedures could provide epidemiologists with greater confidence in the quality of the data used and, importantly, alert them to known limitations. Alternatively, epidemiologists might give greater preference to databases for which the data originators had strong, well-documented, quality-control procedures.
A fundamental step for observational studies is defining the study exposure of interest, with an importance analogous to that of randomization for clinical trials. Databases may lack the information necessary for exposure definition. The commentary by Weiss8 provides an instructive example: whether or not a potential study subject had the exposure of interest—a screening colonoscopy—could not be determined from the data available. Data are never ideal, and there is always some information missing. Still, when such a fundamental factor as exposure cannot be measured, it is generally better to not use the database for the study, even though other aspects, such as the large population or end point data, may be attractive.
The study-exposure definition must be consistent with the hypothesized mechanism relating it to the disease. For acute effects, such as proarrhythmic effects12 or psychomotor impairment,13 the risk may be present only for patients currently using the drug. For chronic effects, such as cancers related to hormone replacement therapy, a certain minimal threshold of duration or cumulative dose may be necessary. Inappropriate exposure definitions—such as simplistic ever/never categories—can lead to serious misclassification.13
Data consistency edits can also help to reduce exposure misclassification. For example, a filled prescription for which the data list “1 tablet of drug dispensed” and “a 30-day supply” can generate, for analysis purposes, 1 day of current drug use and 29 days of “indeterminate” use, avoiding the potential misclassification inherent in designating the 30 days as either all drug use days or all nonuser days.
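The consistency edit described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular database: the `Fill` record, its field names, and the default of one tablet per day are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict


@dataclass
class Fill:
    """One filled prescription (hypothetical record layout)."""
    fill_date: date
    quantity: int      # tablets dispensed, as recorded
    days_supply: int   # days of supply, as recorded


def classify_days(fill: Fill, doses_per_day: int = 1) -> Dict[date, str]:
    """Label each day in the recorded supply window.

    Days that the dispensed quantity can actually cover are 'current';
    the remaining days of the recorded supply are 'indeterminate'
    rather than being forced into user or nonuser status.
    doses_per_day is an assumed dosing regimen, not a database field.
    """
    if doses_per_day <= 0:
        raise ValueError("doses_per_day must be positive")
    covered = min(fill.days_supply, fill.quantity // doses_per_day)
    labels = {}
    for offset in range(fill.days_supply):
        day = fill.fill_date + timedelta(days=offset)
        labels[day] = "current" if offset < covered else "indeterminate"
    return labels


# The inconsistent record from the text: 1 tablet, 30-day supply.
labels = classify_days(Fill(date(2010, 1, 1), quantity=1, days_supply=30))
n_current = sum(1 for v in labels.values() if v == "current")
n_indeterminate = sum(1 for v in labels.values() if v == "indeterminate")
```

For the record in the text, `n_current` is 1 and `n_indeterminate` is 29. A record with consistent quantity and days of supply yields only "current" days, so the edit affects only the problematic fills.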
The key challenge is identifying the exposed and unexposed study groups, which generally involves sampling of both persons and person-time. Several problems may arise at this stage. As Weiss8 notes, some studies have failed to ensure that all members of the cohort are at risk for the study outcome: eg, endometrial cancer studies that may have included women without a uterus. Stürmer7 outlines other problems, including bias related to prevalent medication users (survivors of an initial period of potential high risk14) and immortal person-time bias (when a person-day is sampled or has its exposure status defined based on a future occurrence15).
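One way to avoid immortal person-time, sketched here under simplifying assumptions, is to classify each person-day using only information available on that day: follow-up before the first filled prescription counts as unexposed, even for persons who later become users. The function name and arguments are hypothetical, and the rejection of persons whose first fill precedes cohort entry is an assumption that loosely mirrors a new-user design.

```python
from datetime import date
from typing import Optional, Tuple


def split_person_days(
    entry: date, exit_: date, first_fill: Optional[date]
) -> Tuple[int, int]:
    """Return (unexposed_days, exposed_days) for one cohort member.

    Exposure status on a given day depends only on fills on or before
    that day, never on a future fill. Follow-up runs from entry up to,
    but not including, exit_.
    """
    if exit_ < entry:
        raise ValueError("exit_ precedes entry")
    if first_fill is not None and first_fill < entry:
        # Assumption for this sketch: prevalent users are not handled here.
        raise ValueError("first fill precedes cohort entry (prevalent user)")
    if first_fill is None or first_fill >= exit_:
        return (exit_ - entry).days, 0
    return (first_fill - entry).days, (exit_ - first_fill).days


# A person entering on 1 Jan 2010 who first fills on 1 Apr 2010 contributes
# 90 unexposed days and 275 exposed days through 1 Jan 2011.
unexposed, exposed = split_person_days(
    date(2010, 1, 1), date(2011, 1, 1), date(2010, 4, 1)
)
```

Assigning all 365 days to the exposed group instead would credit the drug with 90 days during which, by definition, the person could not have had the outcome and still been classified as a user.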
End point definitions must have adequate positive predictive value and sensitivity. Some end points will be difficult to study exclusively in automated databases. An example is congenital anomalies, where medical record review may be required to reduce misclassification.16 Because database studies generally rely on medical-care encounters for end point identification, the many conditions not reliably diagnosed or treated (eg, hypertension, hypothyroidism, depression) usually require prospective methods for ascertainment.
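For reference, both measures can be computed from a validation sample in which database-identified cases are compared with a reference standard such as medical record review. The sketch below uses the standard definitions; the function name and the counts are hypothetical and do not come from any study cited here.

```python
from typing import Dict


def validation_metrics(true_pos: int, false_pos: int, false_neg: int) -> Dict[str, float]:
    """Positive predictive value and sensitivity of a database case definition.

    true_pos:  database cases confirmed by the reference standard
    false_pos: database cases not confirmed
    false_neg: reference-standard cases the database definition missed
    """
    if true_pos + false_pos == 0 or true_pos + false_neg == 0:
        raise ValueError("need at least one database case and one true case")
    return {
        "ppv": true_pos / (true_pos + false_pos),
        "sensitivity": true_pos / (true_pos + false_neg),
    }


# Hypothetical validation sample: 80 confirmed, 20 unconfirmed, 40 missed.
metrics = validation_metrics(true_pos=80, false_pos=20, false_neg=40)
```

With these illustrative counts the positive predictive value is 0.8 and the sensitivity is two thirds. Note that estimating sensitivity requires finding the cases the database missed, which usually demands a data source beyond the database itself.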
Adequate management of confounding—a major challenge for all epidemiologic studies—may be particularly difficult for database studies. Because many of these databases consist of medical-care encounters, information may be missing or incomplete for important factors such as smoking or weight. If the study exposure is related to medical care, difficult-to-measure factors may be important confounders. Indeed, Weiss8 notes the likely role of the healthy-drug-user effect (alternatively, the frail nonvaccine-user) as a potential confounder in studies of influenza vaccine and mortality, and Stürmer7 describes the healthy-drug-adherer effect, noting that potential confounding is likely to increase with duration of medication use. Given investigator interest in relative efficacy studies6,7 and database potential for very large studies, investigators may seek to detect relatively small or moderate effects, which will increase the importance of confounding.
For many database studies, design options and analytic techniques can reduce confounding. As Stürmer7 notes, if the database is sufficiently large, both the exposed and the nonexposed groups may be carefully restricted and perhaps matched to reduce the potential for confounding by difficult-to-measure factors. Propensity-score methodology provides, at least in principle, the possibility of considering hundreds if not thousands of potential confounders.17 Both approaches may be helpful in controlling for otherwise troublesome factors such as the healthy-drug-user effect.
Although the potential of automated databases for epidemiologic studies is widely recognized, there are many pitfalls, ranging from the very obvious to the very subtle. Thus, the alluring notion that databases enable rapid, perhaps even automatic, exposure-disease studies is, in my mind, overly simplistic. These studies are almost always complicated and the challenges generally are closely tied to the specific exposure-disease relationship under study. They thus require substantial investment of both thought and time. Although the investigator may be spared many of the logistic preoccupations of traditional methods (eg, collecting study data via interview), there are numerous other challenges (Table).
How can we improve the state of the art? One important step is greater appreciation of the potential pitfalls (see the companion commentaries by Dreyer,6 Stürmer,7 and Weiss,8 the accompanying editorial, and the Table). Dreyer6 notes the potential value of standardized guidelines for observational studies, similar to the widely used Consolidated Standards of Reporting Trials (CONSORT) guidelines for clinical trials. A greater awareness of the intricacy and complexity of studies that use databases, coupled with methodologic advances that allow us to make more complete use of the detailed information recorded within these systems, may allow us to move closer to realizing the long-anticipated promise of automated database studies.
ABOUT THE AUTHOR
WAYNE RAY is a Professor of Preventive Medicine and Director, Division of Pharmacoepidemiology at Vanderbilt University School of Medicine. His particular interest is use of automated databases for pharmacoepidemiologic studies.
1. Federspiel CF, Ray WA, Schaffner W. Medicaid records as a valid data source: the Tennessee experience. Med Care
2. Jick H, Watkins RN, Hunter JR, et al. Replacement estrogens and endometrial cancer. N Engl J Med
3. Ray WA, Griffin MR. Use of Medicaid data for pharmacoepidemiology. Am J Epidemiol
4. Ray WA. Population-based studies of adverse drug effects. N Engl J Med
5. Strom BL, Carson JL. Automated data bases used for pharmacoepidemiology research. Clin Pharmacol Ther
6. Dreyer NA. Making observational studies count: shaping the future of comparative effectiveness research. Epidemiology
7. Stürmer T, Funk MJ, Poole C, Brookhart MA. Nonexperimental comparative effectiveness research using linked healthcare databases. Epidemiology
8. Weiss NS. The new world of data linkages in clinical epidemiology: are we being brave or foolhardy? Epidemiology
9. Cherry JK, Carmichael DB, Shean FC, Ritt DJ. Inaccurate data in “Solving the medical care dilemma.” N Engl J Med
10. Krug E, Kresnow M, Peddicord J, et al. Retraction. N Engl J Med
11. Bryant H, Brasher P. Breast implants and breast cancer–reanalysis of a linkage study. N Engl J Med
12. Ray WA, Chung CP, Murray KT, Hall K, Stein CM. Atypical antipsychotic drugs and the risk of sudden cardiac death. N Engl J Med
13. Ray WA, Thapa PB, Gideon P. Misclassification of benzodiazepine exposure by use of a single baseline measurement and its effects upon studies of injuries. Pharmacoepidemiol Drug Saf
14. Ray WA. Evaluating medication effects outside of clinical trials: new-user designs. Am J Epidemiol
15. Suissa S. Immortal time bias in pharmacoepidemiology. Am J Epidemiol
16. Cooper WO, Hernandez-Diaz S, Gideon P, et al. Positive predictive value of computerized records for major congenital malformations. Pharmacoepidemiol Drug Saf
17. Schneeweiss S, Rassen JA, Glynn RJ, Avorn J, Mogun H, Brookhart MA. High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology
18. Hernán MA. With great data comes great responsibility: publishing comparative effectiveness research in Epidemiology. Epidemiology
Editors' note: This series addresses topics of interest to epidemiologists across a range of specialties. Commentaries start as invited talks at symposia organized by the Editors. This paper was presented at the 2010 Society for Epidemiologic Research Annual Meeting in Seattle, WA.