Epidemiologic studies of the effects of drugs are essential. Increasingly, these studies are based on large administrative data files containing longitudinal drug exposure and clinical outcome data for entire populations. These data allow for rapid, retrospective evaluation of the intended and adverse effects of prescription medications as they are used in routine practice. Health care claims data, however, do not contain measurements of many important risk factors that physicians use to make prescribing decisions. Confounding by unmeasured indication is a threat to the validity of most pharmacoepidemiologic studies but is likely to be particularly severe for studies based on administrative data.1
Instrumental variable (IV) methods present a possible solution to the problem of residual confounding, provided suitable instruments can be identified. For an IV to reduce confounding in the initial treatment assignment, it should influence the prescribing decision but have no direct effect on the health outcome under study. We have proposed an instrument related to physician prescribing preference for use in database studies of short-term drug effects.2 In this issue of Epidemiology, Hernán and Robins3 give a comprehensive overview of IV methods and provide a detailed commentary on our analysis. We would like to review briefly the central assumptions of our study, discuss an additional concern raised by Hernán and Robins, and elaborate on further aspects of this analysis and the IV method in the context of empirical pharmacoepidemiologic research with secondary data.
In our study, we assessed the risk of gastrointestinal (GI) toxicity associated with selective versus nonselective nonsteroidal anti-inflammatory drug (NSAID) exposure using a physician’s NSAID preference (as measured by the last NSAID prescription written by a physician) as an instrument. When treatment effects are homogeneous, the internal validity of our approach relies principally on 3 assumptions: (1) physicians differ in their NSAID preference; (2) a physician’s NSAID prescribing preference as reflected in the last NSAID prescription he has written is unrelated to unmeasured risk factors in his current patient; and (3) a physician’s NSAID prescribing preference as reflected in the last prescription written has no direct effect on the outcome in the current patient (known as the exclusion restriction). As stated,2 these assumptions are not verifiable with data, but their plausibility can be evaluated in a variety of ways.
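To make the design concrete, the following minimal simulation (hypothetical data and effect sizes, not the study's actual estimates; numpy only) contrasts a naive treatment comparison with the usual IV (Wald) estimator, using a binary physician-preference instrument:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical simulation, not the original study's data:
# U = unmeasured GI risk factor that also drives prescribing
# Z = physician-preference instrument (1 = prefers selective NSAID)
# X = treatment received (1 = selective NSAID)
# Y = GI toxicity indicator
U = rng.binomial(1, 0.3, n)
Z = rng.binomial(1, 0.5, n)
# Preference raises the chance of a selective NSAID; high-risk
# patients are also channeled toward the selective agent.
pX = 0.2 + 0.4 * Z + 0.2 * U
X = rng.binomial(1, pX)
# Assumed protective effect of selective NSAIDs: -0.05 on the risk scale.
pY = 0.10 - 0.05 * X + 0.10 * U
Y = rng.binomial(1, pY)

naive = Y[X == 1].mean() - Y[X == 0].mean()    # confounded by U
wald = (Y[Z == 1].mean() - Y[Z == 0].mean()) / \
       (X[Z == 1].mean() - X[Z == 0].mean())   # IV (Wald) estimate
print(f"naive risk difference: {naive:+.3f}")
print(f"IV (Wald) estimate:    {wald:+.3f}")
```

Because the unmeasured risk factor U drives both prescribing and GI toxicity in this sketch, the naive risk difference is biased against the selective agent, whereas the Wald ratio recovers the simulated effect.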
Hernán and Robins3 raise the important point that if the effect of treatment is heterogeneous (eg, if selective NSAIDs are protective in some patients and hazardous in others), then an additional assumption is required to justify the use of the usual IV estimator. One such assumption states that there should be no additive effect modification by the instrument among the treated patients (and similarly among the untreated patients). Although the possibility of effect modification by the instrument seems remote, Hernán and Robins make the interesting observation that this assumption can be violated if the instrument interacts with important unmeasured GI risk factors (eg, tobacco use, alcohol consumption, obesity) to determine treatment. For example, this could happen if physicians who prefer nonselective NSAIDs rely strongly on a patient’s smoking status to make treatment decisions, whereas physicians who prefer selective NSAIDs decide on treatment with little regard for smoking status. Such interactions could also violate the “monotonicity” assumption of Imbens and Angrist,4,5 an alternative assumption that can justify the use of the usual IV estimator when treatment effect heterogeneity is suspected.
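The consequence that Hernán and Robins describe can be illustrated with a small simulation (all numbers hypothetical, chosen only to exaggerate the phenomenon): when the instrument interacts with an unmeasured factor in determining treatment, and the treatment effect varies with that same factor, the usual IV estimand can diverge from the average treatment effect.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300_000

# U = unmeasured smoking status (hypothetical)
U = rng.binomial(1, 0.5, n)
# Z = 1: physician prefers selective NSAIDs and ignores smoking;
# Z = 0: physician relies heavily on smoking status to treat.
Z = rng.binomial(1, 0.5, n)
pX = np.where(Z == 1, 0.6, np.where(U == 1, 0.5, 0.1))
X = rng.binomial(1, pX)

# Treatment effect is heterogeneous in U (risk-difference scale).
tau = np.where(U == 1, -0.15, -0.02)
pY = 0.10 + tau * X + 0.05 * U
Y = rng.binomial(1, pY)

ate = 0.5 * (-0.15) + 0.5 * (-0.02)   # true average effect = -0.085
wald = (Y[Z == 1].mean() - Y[Z == 0].mean()) / \
       (X[Z == 1].mean() - X[Z == 0].mean())
print(f"true ATE:       {ate:+.3f}")
print(f"IV (Wald) est.: {wald:+.3f}")   # less negative than the ATE
```

Here the first-stage effect of the instrument is large among nonsmokers and small among smokers, so the Wald estimate is weighted toward the stratum with the weaker treatment effect and understates the average benefit.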
The assumption that the instrument does not interact with unmeasured GI risk factors to determine treatment is not empirically testable. However, its plausibility can be partly evaluated by exploring whether the instrument interacts with measured GI risk factors to determine treatment. In our study, there were no such interactions, leading us to suspect that strong interactions with unmeasured GI risk factors are unlikely. Although we do not think that treatment effect heterogeneity is a first-order source of bias in our study of NSAIDs, the issue is subtle and deserves further research.
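A sketch of this kind of diagnostic, assuming a binary instrument, a binary treatment, and a simple measured indicator (all variables hypothetical), compares the first-stage effect of the instrument across strata of the measured risk factor:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical check: does the instrument's effect on treatment
# differ across levels of a *measured* GI risk factor C?
C = rng.binomial(1, 0.4, n)          # e.g., a recorded ulcer history
Z = rng.binomial(1, 0.5, n)          # physician-preference instrument
# Treatment model simulated with no Z*C interaction (the assumption probed).
pX = 0.2 + 0.4 * Z + 0.1 * C
X = rng.binomial(1, pX)

def z_effect(mask):
    """First-stage risk difference E[X|Z=1] - E[X|Z=0] within a stratum."""
    return X[mask & (Z == 1)].mean() - X[mask & (Z == 0)].mean()

interaction = z_effect(C == 1) - z_effect(C == 0)
print(f"Z->X effect, C=0: {z_effect(C == 0):.3f}")
print(f"Z->X effect, C=1: {z_effect(C == 1):.3f}")
print(f"interaction contrast: {interaction:+.3f}")
```

An interaction contrast near zero across all measured risk factors is consistent with, although it cannot prove, the absence of interactions with unmeasured factors.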
Our intention-to-treat approach has intrinsic limitations independent of the IV issues. Because many patients who initiate NSAIDs discontinue them by 60 days, the effect of treatment assignment might differ from the effect of uninterrupted exposure. Furthermore, discontinuation may be more likely for nonselective NSAID users, leading to a bias against selective NSAIDs. We reduced this bias by focusing on short-term effects. Alternative “as-treated” longitudinal approaches suffer from confounding by informative treatment changes that occur during follow-up (for example, unrecorded GI symptoms that lead to the discontinuation of nonselective NSAIDs). The approach we adopted estimates the short-term effect of a treatment assignment that occurred in a routine care setting. However, if, in addition to the claims data, detailed longitudinal clinical data were available that captured the important predictors of treatment discontinuation, then the structural nested model approach discussed by Hernán and Robins would allow estimation of the effect of uninterrupted exposure. This approach could use an instrument and baseline covariates to control confounding in the initial treatment assignment, and time-varying covariates to control for informative treatment changes during follow-up.
We agree with Hernán and Robins that IV methods are not a panacea for the problems of epidemiology: they do not allow us to avoid strong, unverifiable assumptions in our studies. Indeed, the major obstacle to adopting an IV approach in any particular study is the initial requirement of a defensible instrument, a variable that must satisfy highly restrictive and unverifiable assumptions. The scarcity of IV studies in epidemiology suggests that such variables are exceedingly hard to find for most exposures. In pharmacoepidemiology, however, instruments might be more readily found because prescribing decisions can be strongly influenced by physician or facility preference and exogenous institutional factors. For example, instruments could arise from regional or practice-level variations in drug use, changes in clinical guidelines, secular trends in drug use, and drug insurance or policy changes.6 Like ours, these instruments will not be perfect. But in the imperfect world of pharmacoepidemiologic research with secondary data, the crucial issue is whether the use of such instruments might improve upon statistical approaches that rely on the strong and usually unrealistic assumption that all confounders are measured.
We thank Til Stürmer, Kenneth J. Rothman, and Josh Angrist for helpful comments.
1. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol
2. Brookhart MA, Wang PS, Solomon DH, Schneeweiss S. Evaluating short-term drug effects using a physician-specific prescribing preference as an instrumental variable. Epidemiology
3. Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology
4. Imbens GW, Angrist JD. Identification and estimation of local average treatment effects. Econometrica
5. Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables. J Am Stat Assoc
6. Schneeweiss S, Maclure M, Soumerai SB, Walker AM, Glynn RJ. Quasi-experimental longitudinal designs to evaluate drug benefit policy changes with low policy compliance. J Clin Epidemiol