The Consistency Statement in Causal Inference: A Definition or an Assumption?

Causation: Commentary

Cole, Stephen R.; Frangakis, Constantine E.

Epidemiology 20(1):p 3-5, January 2009. | DOI: 10.1097/EDE.0b013e31818ef366

Three assumptions sufficient to identify the average causal effect are consistency, positivity, and exchangeability (ie, “no unmeasured confounders and no informative censoring,” or “ignorability of the treatment assignment and measurement of the outcome”). The exchangeability assumptions are well-known territory for epidemiologists and biostatisticians. Briefly, these 2 exchangeability assumptions require that exposed and unexposed subjects, and censored and uncensored subjects, respectively, have equal distributions of potential outcomes. Indeed, the so-called fundamental problem of causal inference1 is directly linked to the first exchangeability assumption.

In contrast, the consistency and positivity assumptions are less well known. The positivity assumption states that there is a nonzero (ie, positive) probability of receiving every level of exposure for every combination of values of exposure and confounders that occur among individuals in the population.2,3 It remains unclear why the consistency and positivity assumptions are less well known. Optimistically, perhaps these assumptions have less impact on estimation of the average causal effect. Pessimistically, these assumptions are less well known because, in observational studies, a departure from either of them produces little alarming evidence unless one explicitly looks for it. Here we will focus on the preliminary issue of clarifying the consistency assumption.

The consistency assumption is often stated such that an individual's potential outcome under her observed exposure history is precisely her observed outcome.4 Methods for causal inference require that the exposure is defined unambiguously. Specifically, one needs to be able to explain how a certain level of exposure could be hypothetically assigned to a person exposed to a different level. This requirement is known as consistency. Consistency is guaranteed by design in experiments, because application of the exposure to any individual is under the control of the investigator. Consistency is plausible in observational studies of medical treatments, because one can imagine how to manipulate hypothetically an individual's treatment status. However, consistency is problematic in observational studies with exposures for which manipulation is difficult to conceive. Consistency is especially difficult when the exposure is a biologic feature, such as body weight, insulin resistance, or CD4 cell count.5,6 For example, there are many competing ways to assign (hypothetically) a body mass index of 25 kg/m2 to an individual, and each of them may have a different causal effect on the outcome.

To state consistency formally, let us first define individual j's potential outcome Yj(x) under exposure x as the outcome that would have been observed if individual j had received exposure x. The variable Yj(x) is known as a potential outcome because it describes subject j's outcome value that would have been observed under a potential exposure value that the subject may or may not have actually experienced. For a more detailed definition of potential outcomes, please see references.7,8 Many authors,7,9–12 but not all,13 find the use of potential outcomes central to the definition of causation and causal effects. Indeed, potential outcomes in the form of “causal response types” have a long history of assisting epidemiologists in the clarification of concepts such as confounding,14 effect measure modification,15 and mediation.16

Consistency is often stated as

$$Y_j = Y_j(x) \quad \text{if } x = X_j \tag{1}$$

where x denotes the argument of the potential outcome function, and (1) is read “the observed outcome for individual j is the potential outcome, as a function of intervention, when the intervention is set to the observed exposure.” Those who are willing to accept potential outcomes at least as a useful framework may see statement (1) as an axiom, an identity, or even a tautology. Indeed, discussions of inferences from applications of methods directly using potential outcomes rarely account for possible departures from the consistency assumption.11,12,17 Some of these same applications routinely discuss departures from the exchangeability assumptions stated previously, or even use analyses quantifying the sensitivity of inferences to the exchangeability assumptions.
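
To see where statement (1) enters an analysis, it may help to sketch a standard identification argument for the mean potential outcome under exposure x. The following is only an outline under stated assumptions, not a derivation taken from this commentary: the symbol C (a discrete set of measured confounders) is introduced here for illustration, subscripts j are dropped because the expectations are over the population, and censoring is ignored.

```latex
\begin{align*}
E[Y(x)] &= \sum_{c} E[Y(x) \mid C = c] \, P(C = c)
          && \text{(law of total expectation)} \\
        &= \sum_{c} E[Y(x) \mid X = x, C = c] \, P(C = c)
          && \text{(exchangeability given } C \text{; positivity so that the conditioning event has nonzero probability)} \\
        &= \sum_{c} E[Y \mid X = x, C = c] \, P(C = c)
          && \text{(consistency, statement (1))}
\end{align*}
```

Only the last step replaces a potential outcome with an observed one, and it does so solely on the strength of statement (1). If that statement cannot fail, the step is a definition; if it can fail, the step deserves the same scrutiny as the exchangeability step before it.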

The point of the present commentary is that the statement of an assumption should explicitly show what is being assumed. More precisely, if statement (1) is to be an “assumption” (as it is often referred to) and not a definition, it must be possible to be incorrect given the concepts that have been already accepted before stating (1). The problem is that precisely in much of the work that takes (1) to be an assumption, there is no stated concept based on which this assumption could be incorrect, thus making it a definition or axiom. So, what is (1) supposed to “assume” if it is an assumption and not a definition? To clarify, we may go back to what epidemiologists intend to mean by (1) and then formulate that meaning in a precise mathematical way. The new formulation then allows investigations of departures from what is assumed.

We proffer that by (1) most epidemiologists mean that “there were a number of ways in which treatment Xj could have been assigned, but that all those ways would have resulted in the same observed outcome.” To reflect this meaning, we suggest the following revision of the consistency assumption, in the spirit of Rubin's “no versions of treatments”18 and akin to the formulation suggested by van der Laan, Haight, and Tager.19 Specifically, let us first expand the definition of individual j's potential outcome to Yj(x,k), the outcome that would have been observed if individual j had received exposure x by means (route, condition, etc.) k. Then consistency is defined as

$$Y_j = Y_j(x, k) \quad \text{if } x = X_j, \text{ no matter the value of } k \tag{2}$$

where it is now evident that we are assuming the means of exposure k are irrelevant. After assumption (2) is made, one may discard the reference to k and again state the potential outcome as Yj(x). An informal definition of consistency is “have I defined exposure to include the causally relevant features?” Explicitly including k in (2) clarifies the consistency assumption, and removes the appearance of a tautology. To accept inferences derived using the consistency assumption, we must allow that either (a) all k routes allowed by the (implicit or explicit) exposure assignment mechanism are equivalent, or (b) for a particular individual, any exposure would necessarily be from the route that was observed for that individual. Like the exchangeability assumptions mentioned previously, the consistency assumption will rarely (if ever) be exactly true. However, suspicion that this consistency assumption is grossly violated will lead to discussion about possible components of k that can be investigated in current research, or that, if currently unmeasured, may be measured and investigated in future research.

Some may wish to clarify the exposure x by moving components of k into the definition of x. Such efforts at more precisely defining exposure are laudable; however, there will always be residual components of k no matter how well we define exposure x. For example, x may be use of aspirin or not. Then k includes the number of doses/day, milligrams/dose, and whether the aspirin was buffered or not. Such variation can be included in the “technical errors” discussed by Rubin.7 So, x could be revised to be less vague, perhaps as the once daily intake of 40 mg buffered aspirin or not. Then the number of components in k is reduced, but k is not the empty set. For instance, was the aspirin taken in the morning, and with or without food? The choice of which components of k are needed in the specification of exposure depends in part on the outcome. For example, whether or not aspirin is buffered may be an important issue when studying unintended gastrointestinal effects of aspirin intake in those with Helicobacter pylori infection. In contrast, whether aspirin is buffered or not may be relatively unimportant when studying risks of myocardial infarction. As another example, in studying the effects of thrombolytic drugs on recovery after stroke, the timing of administration is a crucial component of k that is wisely included in the definition of exposure.20 A third example with alcohol as the exposure is given by Robins and Greenland.21 Many examples of departures from consistency can be viewed from the perspectives of study design or exposure measurement error. For instance, the causally relevant window for postmenopausal hormone therapy exposure to provide protection against cardiovascular disease22 may be viewed as a consistency issue. Indeed, formalizing the assumption of consistency may help elevate discussions about exposure specifications. In the absence of a formal foundation, such discussions may devolve into what seem to be subjective preferences.

The exercise of reducing the number of components in k by better specifying the exposure x is crucial and central to the designation of well-defined exposures, and hence well-defined potential outcomes. However, such exercises, if taken to an extreme, will result in an exposure so narrowly defined as to preclude generalizability. Therefore, one must balance the specificity of the exposure against the generality of the inference.
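
The dependence of consistency on the outcome can be made concrete with a small sketch. The Python below is an illustration only: the potential-outcome tables are invented for a single hypothetical individual, and the helper names `consistency_holds` and `collapse` are ours rather than anything defined in the literature cited here. It checks whether Y(x,k) is constant over the means k for each exposure x, which is what the refined consistency statement assumes, and drops k only when that check passes.

```python
def consistency_holds(potential_outcomes):
    """Return True if, for every exposure x, Y(x, k) takes one value across all means k."""
    values_by_exposure = {}
    for (x, k), y in potential_outcomes.items():
        values_by_exposure.setdefault(x, set()).add(y)
    return all(len(values) == 1 for values in values_by_exposure.values())


def collapse(potential_outcomes):
    """Drop k and return Y(x); refuses when the outcome depends on k."""
    if not consistency_holds(potential_outcomes):
        raise ValueError("Y(x) is not well defined: the outcome depends on k")
    return {x: y for (x, k), y in potential_outcomes.items()}


# Hypothetical individual; keys are (x, k) with x = 1 for aspirin, 0 for none.
# Outcome: myocardial infarction (1 = event). Buffering is ASSUMED irrelevant here.
mi = {(1, "buffered"): 0, (1, "unbuffered"): 0, (0, "none"): 1}
# Outcome: gastrointestinal bleed (1 = event). Buffering is ASSUMED to matter here.
gi_bleed = {(1, "buffered"): 0, (1, "unbuffered"): 1, (0, "none"): 0}

print(consistency_holds(mi))        # True
print(consistency_holds(gi_bleed))  # False
print(collapse(mi))                 # {1: 0, 0: 1}
```

In practice only one potential outcome per individual is ever observed, so a check like this cannot be run on real data. The sketch only shows what is being assumed: the same exposure definition can satisfy consistency for one outcome and violate it for another.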

A simple ad-hoc sensitivity analysis for departures from consistency may be achieved by conducting a series of analyses, each with a differing specification of exposure. Each specification alters the components of k. When results are consistent across specifications, this is evidence in favor of these components of k being irrelevant and consistency holding for these measured features of exposure. Of course, such sensitivity analyses do not address unmeasured components of exposure; further methods are needed. Such ad-hoc sensitivity analyses are not uncommon in the epidemiologic literature but should be standard.
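
A minimal sketch of such an ad-hoc sensitivity analysis follows. Everything in it is hypothetical: the counts are made up for illustration, the names (`make_records`, `SPECIFICATIONS`, `sensitivity_table`) are ours, and the crude risk difference ignores confounding and censoring, which a real analysis would need to handle. The point is only the pattern of repeating one analysis under several specifications of exposure and comparing the results.

```python
def make_records(n, exposure, events):
    """Build n records for one exposure group; the first events[o] records have outcome o = 1."""
    return [
        {"exposure": exposure, **{o: int(i < count) for o, count in events.items()}}
        for i in range(n)
    ]


# Made-up cohort: 100 people per group, with invented event counts.
records = (
    make_records(100, "none", {"gi_bleed": 10, "mi": 20})
    + make_records(100, "buffered", {"gi_bleed": 10, "mi": 10})
    + make_records(100, "unbuffered", {"gi_bleed": 30, "mi": 10})
)

# Each specification moves a different component of k into the definition of x.
SPECIFICATIONS = {
    "any aspirin": {"buffered", "unbuffered"},
    "buffered only": {"buffered"},
    "unbuffered only": {"unbuffered"},
}


def risk(rows, outcome):
    """Proportion of rows with the outcome."""
    return sum(r[outcome] for r in rows) / len(rows)


def risk_difference(rows, exposed_levels, outcome, reference="none"):
    """Crude risk difference: exposed (as specified) minus the reference group."""
    exposed = [r for r in rows if r["exposure"] in exposed_levels]
    unexposed = [r for r in rows if r["exposure"] == reference]
    return risk(exposed, outcome) - risk(unexposed, outcome)


def sensitivity_table(rows, outcome):
    """Risk difference for one outcome under every exposure specification."""
    return {
        name: round(risk_difference(rows, levels, outcome), 3)
        for name, levels in SPECIFICATIONS.items()
    }


print(sensitivity_table(records, "mi"))
# {'any aspirin': -0.1, 'buffered only': -0.1, 'unbuffered only': -0.1}
print(sensitivity_table(records, "gi_bleed"))
# {'any aspirin': 0.1, 'buffered only': 0.0, 'unbuffered only': 0.2}
```

In this invented example the estimates for myocardial infarction agree across specifications, which is the kind of evidence that buffering is an irrelevant component of k for that outcome. The estimates for gastrointestinal bleeding disagree, suggesting that "any aspirin" is too coarse an exposure definition for that outcome.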

Here we have focused on k describing features of the exposure x itself or the method of exposure intake. Beyond the scope of this commentary, k could also represent the paths by which exposure x affects the outcome y. Ignoring such post-exposure k induces y to be a random variable for participant j beyond the randomness induced by sampling observed participants from a well-defined population or assuming the observed participants are a random sample from an underlying population. We believe that the “annoying” presence of k in our refined definition of consistency is necessary and hopefully sufficient to cause investigators to explore possible departures from the consistency assumption.


ACKNOWLEDGMENTS

We thank the members of the Johns Hopkins University Bloomberg School of Public Health's causal inference working group, James Robins, and Donald Rubin for helpful discussions, and the reviewers for thoughtful suggestions.


REFERENCES

1. Holland PW. Statistics and causal inference. JASA. 1986;81:945–970.
2. Hernán MA, Robins JM. Estimating causal effects from epidemiological data. J Epidemiol Community Health. 2006;60:578–586.
3. Cole SR, Hernán MA. Constructing inverse probability weights for marginal structural models. Am J Epidemiol. 2008;168:656–664.
4. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11:550–560.
5. Hernán MA. Invited commentary: hypothetical interventions to define causal effects–afterthought or prerequisite? Am J Epidemiol. 2005;162:618–620; discussion 21–22.
6. Hernán MA, Taubman SL. Does obesity shorten life? The importance of well-defined interventions to answer causal questions. Int J Obes. 2008;32(suppl 3):S8–S14.
7. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974;66:688–701.
8. Hernán MA. A definition of causal effect for epidemiological research. J Epidemiol Community Health. 2004;58:265–271.
9. Robins J. A graphical approach to the identification and estimation of causal parameters in mortality studies with sustained exposure periods. J Chronic Dis. 1987;40:139S–161S.
10. Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29.
11. Hernán MA, Brumback B, Robins JM. Marginal structural models to estimate the joint causal effect of non-randomized treatments. JASA. 2001;96:440–448.
12. Cole SR, Hernán MA, Anastos K, et al. Determining the effect of highly active antiretroviral therapy on changes in human immunodeficiency virus type 1 RNA viral load using a marginal structural left-censored mean model. Am J Epidemiol. 2007;166:219–227.
13. Dawid AP. Causal inference without counterfactuals. JASA. 2000;95:407–424.
14. Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15:413–418.
15. Greenland S, Poole C. Invariants and noninvariants in the concept of interdependent effects. Scand J Work Environ Health. 1988;14:125–129.
16. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3:143–155.
17. Frangakis CE, Brookmeyer R, Varadhan R, et al. Methodology for evaluating a partially controlled longitudinal treatment using principal stratification, with application to a needle exchange program. JASA. 2004;99:239–249.
18. Rubin DB. Statistics and causal inference: comment: which ifs have causal answers. JASA. 1986;81:961–962.
19. van der Laan M, Haight T, Tager IB. Response to “Hypothetical interventions to define causal effects.” Am J Epidemiol. 2005;162:621–622.
20. Steg PG, Bonnefoy E, Chabaud S, et al. Impact of time to treatment on mortality after prehospital fibrinolysis or primary angioplasty: data from the CAPTIM randomized clinical trial. Circulation. 2003;108:2851–2856.
21. Robins JM, Greenland S. Comment on: causal inference without counterfactuals. JASA. 2000;95:431–435.
22. Prentice RL, Langer R, Stefanick ML, et al. Combined postmenopausal hormone therapy and cardiovascular disease: toward resolving the discrepancy between observational studies and the Women's Health Initiative clinical trial. Am J Epidemiol. 2005;162:404–414.
© 2009 Lippincott Williams & Wilkins, Inc.