From the Biostatistics Section and Center for Statistical Sciences, Program in Public Health, Brown University, Providence, Rhode Island.
Supported by NIH grants R01-HL-79457 and R01-DA-021729.
Correspondence: Joseph W. Hogan, Center for Statistical Sciences, Box G-S121-7, Brown University, Providence RI 02912. E-mail: email@example.com
Principled, sensible, and durable inference from complex observational data in epidemiology requires the careful formulation, implementation, and interpretation of models that are parameterized explicitly in terms of causal effects. The paper by Bembom et al1 in this issue of Epidemiology is an outstanding example; aside from being a thorough and rigorous application of causal modeling, it addresses directly the shortcomings of standard regression approaches to answer causal questions.
The data analyzed by Bembom et al have characteristics that are frequently encountered in longitudinal epidemiologic studies: the primary exposure cannot be easily studied using a randomized trial; confounders are time-varying; and standard regression models cannot be used to estimate the causal parameter of interest. The paper is remarkable not only for its careful description of the effects of interest and the use of cutting-edge models, but for its in-depth examination of key assumptions and side-by-side comparison with more familiar models.
Statistical models that purport to draw causal inferences are rightly viewed with skepticism. For many epidemiologists (and not a few statisticians), causal models remain blurry visions, their structure obscured by a blizzard of notation and unfamiliar language. I offer 5 suggestions for bringing these models into sharper focus and promoting their more widespread use.
Modify the Language
The term “counterfactual” is well understood and routinely used among causal inference methodologists, but does nothing to make the average reader feel confident about the findings from a causal model. It is clearer, and in my view more accurate, to say that a causal model is based on potential outcomes, one for each level of exposure. The effects of interest should be referred to as causal effects (not counterfactual effects), because in terms of the model, that is what they are. (How well they can be estimated from data is another matter; see below.) The potential outcomes formulation has a long history, dating at least to the early part of the last century,2 and is the basis of a wide-ranging class of models for longitudinal and time-to-event data.3
In the current paper, the levels of exposure are binary and indexed by a, where a = 0 indicates no leisure-time physical activity and a = 1 indicates engagement in such activity. The potential outcomes for each individual are y0 and y1. One is observed (the outcome under the actual exposure) and the other is not (it is counterfactual). The causal effect is some contrast of these variables or their distributions; eg, the average causal effect is E(y1 − y0).
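The potential outcomes notation can be made concrete with a small simulation. The sketch below is illustrative only: the risks (0.30 without activity, 0.20 with activity), the sample size, and all function names are assumptions chosen for the example, not values from Bembom et al. It generates both y0 and y1 for every individual, reveals only the one matching the exposure received, and compares the average causal effect E(y1 − y0) with the observed difference in means when exposure is randomized.

```python
import random


def simulate_potential_outcomes(n=10_000, seed=0):
    """Simulate hypothetical binary potential outcomes for n individuals.

    Each record is (y0, y1, a, y): the outcome without activity, the outcome
    with activity, the exposure received, and the single observed outcome.
    The risks 0.30 and 0.20 are assumed values for illustration.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        y0 = 1 if rng.random() < 0.30 else 0  # outcome if a = 0
        y1 = 1 if rng.random() < 0.20 else 0  # outcome if a = 1
        a = 1 if rng.random() < 0.5 else 0    # exposure assigned at random
        y = y1 if a == 1 else y0              # the other outcome stays unobserved
        people.append((y0, y1, a, y))
    return people


def average_causal_effect(people):
    """E(y1 - y0), computable here only because both outcomes were simulated."""
    return sum(y1 - y0 for y0, y1, _, _ in people) / len(people)


def difference_in_means(people):
    """Contrast of observed outcomes between exposed and unexposed groups."""
    exposed = [y for _, _, a, y in people if a == 1]
    unexposed = [y for _, _, a, y in people if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


if __name__ == "__main__":
    people = simulate_potential_outcomes()
    print("average causal effect:", round(average_causal_effect(people), 3))
    print("observed difference:  ", round(difference_in_means(people), 3))
```

Because exposure is assigned at random in this sketch, the observed difference is close to the average causal effect. With nonrandom exposure the two can diverge, which is the problem the weighting methods discussed later are meant to address.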
Use a Picture
There are a number of approaches that can be used to estimate a causal effect. Most marginal structural models can be represented equivalently as graphical models (Pearl4 provides an accessible account). Graphical models have a formal syntax, appeal to intuition, make assumptions clear, and illustrate what is being estimated. A graphical representation of the marginal structural model in Bembom et al would be a useful addition.
Emphasize the Connection Between Inverse Weighting and Randomized Exposures
Methods of inference that rely on inverse weighting, as in Bembom et al, view the nonrandom allocation to exposure as a selection bias. Methods for analyzing weighted sample surveys are used to adjust for the nonrandom selection. Consider a survey where a certain under-represented subgroup is oversampled (say individuals over 90). A weighted sample, in which each respondent is weighted by the inverse of his or her selection probability (so that those over 90 are down-weighted), is representative of the population as a whole. In weighted estimation of causal models, the probability of being selected to exposure or nonexposure is modeled as a function of confounding variables, and the sample is re-weighted using inverse probabilities. If the measured confounders fully explain differential selection to exposure versus nonexposure, the weighted sample can be viewed as being drawn from a population where exposure is randomly allocated.5
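A minimal sketch of inverse probability weighting at a single time point may help fix ideas. Everything here is hypothetical: a single binary confounder l (say, good baseline health) that raises the chance of exposure and lowers the risk of the outcome, with a true average causal effect of −0.10 built into the simulation. It is not the longitudinal marginal structural model fit by Bembom et al, only the basic re-weighting idea.

```python
import random


def simulate(n=20_000, seed=1):
    """Simulate (l, a, y) with confounding by a binary covariate l.

    Assumed for illustration: P(l=1) = 0.5, P(a=1 | l) is 0.8 or 0.2,
    and P(y=1 | a, l) = 0.4 - 0.1*a - 0.2*l, so the true effect is -0.10.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < (0.8 if l == 1 else 0.2) else 0
        y = 1 if rng.random() < 0.4 - 0.1 * a - 0.2 * l else 0
        rows.append((l, a, y))
    return rows


def naive_effect(rows):
    """Unweighted difference in outcome means (confounded by l)."""
    exposed = [y for _, a, y in rows if a == 1]
    unexposed = [y for _, a, y in rows if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


def propensity_by_stratum(rows):
    """Estimate P(a=1 | l) as the exposed fraction within each level of l."""
    ps = {}
    for level in (0, 1):
        stratum = [a for l, a, _ in rows if l == level]
        ps[level] = sum(stratum) / len(stratum)
    return ps


def ipw_effect(rows):
    """Difference in weighted outcome means, weights = 1 / P(exposure received | l)."""
    ps = propensity_by_stratum(rows)
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for l, a, y in rows:
        p_received = ps[l] if a == 1 else 1.0 - ps[l]
        w = 1.0 / p_received
        num[a] += w * y
        den[a] += w
    return num[1] / den[1] - num[0] / den[0]


if __name__ == "__main__":
    rows = simulate()
    print("naive:", round(naive_effect(rows), 3))
    print("IPW:  ", round(ipw_effect(rows), 3))
```

In this setup the unweighted contrast overstates the benefit of exposure because healthier individuals are more likely to be exposed, while the weighted contrast recovers a value near the simulated effect. That recovery depends entirely on l being the only confounder, which is the untestable assumption taken up in the next section.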
Report Sensitivity to Untestable Assumptions
It is inescapable that causal models cannot be fully identified from observational data; that is, the parameters of interest cannot be estimated without making untestable assumptions. In fitting a marginal structural model—or when using any estimation method based on propensity scores—the key untestable assumption is that all confounders are measured. If there is an unmeasured source of confounding, inferences about exposure effect may be altered by taking it into account. Robins6 has devised an intuitive and understandable way to capture sensitivity to unmeasured confounding. Another approach is to place bounds on the estimates, rather than reporting a single value.
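As one illustration of the bounding idea, the sketch below computes worst-case bounds on the average causal effect for a binary exposure and binary outcome, using no assumption about confounding at all. This is only one well-known way to construct bounds and is an assumption on my part about what such an analysis might look like; it is not Robins's sensitivity analysis method, and the counts in the example are invented. The logic is that each unobserved potential outcome is set first to 0 and then to 1 for everyone.

```python
def no_assumption_bounds(rows):
    """Worst-case bounds on E(y1 - y0) from rows of (a, y), both binary.

    E(y1) lies between P(y=1, a=1) and P(y=1, a=1) + P(a=0), because the
    unobserved y1 among the unexposed can be anywhere from all 0 to all 1.
    The same argument applies to E(y0) with the roles of a reversed.
    """
    n = len(rows)
    p_y1_a1 = sum(1 for a, y in rows if a == 1 and y == 1) / n
    p_y1_a0 = sum(1 for a, y in rows if a == 0 and y == 1) / n
    p_a1 = sum(1 for a, _ in rows if a == 1) / n
    p_a0 = 1.0 - p_a1
    lower = p_y1_a1 - (p_y1_a0 + p_a1)   # smallest E(y1) minus largest E(y0)
    upper = (p_y1_a1 + p_a0) - p_y1_a0   # largest E(y1) minus smallest E(y0)
    return lower, upper


if __name__ == "__main__":
    # Invented counts: 40 exposed (8 events), 60 unexposed (18 events).
    rows = [(1, 1)] * 8 + [(1, 0)] * 32 + [(0, 1)] * 18 + [(0, 0)] * 42
    print(no_assumption_bounds(rows))
```

Bounds of this kind always have width 1 for a binary outcome, so they never exclude a null effect on their own. Their value is as a reference point: narrower intervals require additional assumptions, and a sensitivity analysis makes explicit how much narrowing each assumption buys.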
Some researchers might be concerned that a sensitivity analysis would show a range of effects so wide as to render the conclusions useless. I would argue the opposite: it shows the degree to which model-based inference is informed by the data, and the robustness of the inference to departures from untestable assumptions. Sensitivity analysis allows the reader to see the range of possible bias, rather than speculate about it. For a study in which most of the relevant confounders are available, the potential biases may actually turn out to be small, lending even more credence to the final conclusions. (See the paper by Ko et al7 for an example involving the effect of combination antiviral therapy among women with HIV.)
Do Not Try This at Home
The use of causal models for complex observational data is not trivial and requires input from a scientist with relevant expertise. There is no SAS routine (nor should there be) for the models fit by Bembom et al. The required input includes formulation of the model (most important), fitting the model (usually requiring development of code), thorough evaluation of potential biases attributable to modeling assumptions (eg, no unmeasured confounding), correct computation of standard errors (here via the bootstrap method), and finally expository interpretation of the model findings.
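For readers unfamiliar with the bootstrap, the following is a generic sketch of a nonparametric bootstrap standard error, applied to a simple risk difference with invented counts. The function names and data are hypothetical, and this is not the procedure used by Bembom et al, whose estimator is considerably more involved.

```python
import random
import statistics


def risk_difference(rows):
    """Difference in outcome proportions, exposed minus unexposed."""
    exposed = [y for a, y in rows if a == 1]
    unexposed = [y for a, y in rows if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


def bootstrap_se(rows, estimator, n_boot=500, seed=0):
    """Standard deviation of the estimator over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = []
    for _ in range(n_boot):
        resample = [rows[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))  # re-run the full estimator each time
    return statistics.stdev(estimates)


if __name__ == "__main__":
    # Invented counts: 200 exposed (40 events), 200 unexposed (60 events).
    rows = [(1, 1)] * 40 + [(1, 0)] * 160 + [(0, 1)] * 60 + [(0, 0)] * 140
    print("estimate:", round(risk_difference(rows), 3))
    print("bootstrap SE:", round(bootstrap_se(rows, risk_difference), 3))
```

When the estimator involves estimated weights, the weight model should be refit within every resample, so that the standard error reflects the uncertainty in the weights as well as in the outcome contrast. Passing the whole estimation procedure as `estimator` is one way to make that hard to forget.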
Research based on observational data is entering an era in which the scientific agenda will be driven by the availability of enormous databases, comprising records on hundreds of thousands of individuals, constructed from medical records systems and from the combining of large cohorts. Implicitly or explicitly, studies that use these databases will be investigating causal hypotheses. Exemplary applications like the one reported by Bembom et al are sure to go a long way toward helping epidemiologists become more comfortable with using causal models and with enlisting collaborators who are skilled in applying them.
ABOUT THE AUTHOR
JOSEPH HOGAN is Professor of Biostatistics at Brown University. He conducts research on methods for missing data, longitudinal data and causal inference, with emphasis on applications in HIV/AIDS and behavioral medicine. He is co-author, with Michael Daniels, of the recent book “Missing Data in Longitudinal Studies” (Chapman & Hall/CRC Press, 2008).
1. Bembom O, van der Laan M, Haight T, Tager I, et al. Leisure-time physical activity and all-cause mortality in an elderly cohort. Epidemiology
2. Rubin DB. Comment: Neyman (1923) and causal inference in experiments and observational studies. Stat Sci
3. van der Laan MJ, Robins JM. Unified Methods for Censored Longitudinal Data and Causality. New York: Springer Verlag; 2003.
4. Pearl J. Causal inference in the health sciences: a conceptual introduction. Health Serv Outcomes Res Methodol
5. Hernán MA, Robins JM. Estimating causal effects from epidemiological data. J Epidemiol Community Health
6. Robins JM. Association, causation and marginal structural models. Synthese
7. Ko H, Hogan JW, Mayer KH. Estimating causal treatment effects from longitudinal HIV natural history studies. Biometrics