# Bounding Causal Effects Under Uncontrolled Confounding Using Counterfactuals

Common sensitivity analysis methods for unmeasured confounders provide a corrected point estimate of causal effect for each specified set of unknown parameter values. This article reviews alternative methods for generating deterministic nonparametric bounds on the magnitude of the causal effect using linear programming methods and potential outcomes models. The bounds are generated using only the observed table. We then demonstrate how these bound widths may be reduced through assumptions regarding the potential outcomes under various exposure regimens. We illustrate this linear programming approach using data from the Cooperative Cardiovascular Project. These bounds on causal effect under uncontrolled confounding complement standard sensitivity analyses by providing a range within which the causal effect must lie given the validity of the assumptions.

From the *Department of Epidemiology, University of North Carolina School of Public Health, Chapel Hill, NC; and the †Department of Otolaryngology, University at Buffalo, Buffalo NY.

Submitted 25 November 2003; final version accepted 1 March 2005.

Funded by contract R01-HD-39746 from the National Institute of Child Health and Human Development.

Correspondence: Richard F. MacLehose, Department of Epidemiology (CB#7435), University of North Carolina School of Public Health, McGavran-Greenberg Hall, Pittsboro Road, Chapel Hill, NC 27599-7435. E-mail: maclehose@unc.edu.

Unmeasured confounding is a central concern in observational studies.^{1} The extent to which the crude association between an exposure and disease is due to the effect of the exposure, rather than to an unmeasured confounder, has been the subject of heated debates.^{2} Deterministic sensitivity analyses offer quantitative approaches for estimating the bias due to an unmeasured confounder.

The first efforts to quantify the magnitude of confounding that could explain an association date back at least as far as Cornfield et al.^{3} These authors demonstrated that an unmeasured confounder would need to have an extremely strong association with smoking to be responsible for the entire observed association of smoking with lung cancer. Subsequently, a number of other authors have expanded this approach by calculating the value of an effect measure under a range of assumptions about the unmeasured confounder.^{4–10} These methods generally require information on the prevalence of the unmeasured confounder, its association with the exposure, and its effect on the outcome. Once these parameters are specified, the “externally adjusted” estimate of the effect of the exposure on the outcome can be calculated. Alternatively, specification of a subset of these parameters allows the adjusted exposure effect estimate to be bounded, and the “externally adjusted” estimates are generally computed for a wide range of hypothetical confounder distributions.

This sensitivity-analysis approach provides a corrected point estimate or bound for each specification of the unknown parameters. Parameters are specified and corrected point estimates are calculated, from which the investigator will usually identify a range of values that appear plausible, generating probabilistic bounds for the causal effect. Subject matter knowledge is required to assess plausible ranges for these 3 parameters that identify the causal effect.

The purpose of this article is to review methods for generating deterministic nonparametric bounds on the magnitude of the causal effect using potential outcomes models and linear programming. In this approach, only the observed contingency table is used to bound the causal effect. The width of these bounds may be reduced by incorporating subject matter knowledge through assumptions regarding the potential outcomes under various exposure regimens. Although applications have been restricted to binary observed variables (which we also do in this article), the method can be extended to categorical exposures, confounders, and outcomes.

## The Method of Deriving Bounds Through Linear Programming

To illustrate the linear programming approach to bounding causal effects, consider the elementary situation in which an exposure (E = 1) may cause a disease (D = 1). An unmeasured confounder, U, affects both exposure and disease status, as can be seen from the directed acyclic graph (DAG) in Figure 1. The domain of possible values for U is unrestricted. In the interest of simplifying the presentation, we assume throughout this article that there is no misclassification, no sampling variability, and no misspecification of the DAG. The directed arcs in Figure 1 can have any functional form.^{11,12}

The DAG can also be described in terms of the potential outcomes derived from latent counterfactual responses (eg, what would have happened to an exposed individual had the exposure not occurred).^{13} For the DAG in Figure 1, there are 4 possible potential response types: 1) a person who gets the disease regardless of the exposure (doomed), 2) a person who gets the disease only if exposed (causative), 3) a person who gets the disease only if not exposed (preventive), and 4) a person who does not get the disease regardless of exposure (immune).^{14,15} Within each exposure stratum, e *=* 0,1, let the proportion of each type, *t* = 1…4, be denoted *pet* (eg, the number of unexposed doomed patients divided by the total number of unexposed patients is *p01*). The disease outcomes for the 4 types are expressed in Table 1.

Each potential response type in the study population contributes to 1 of 4 observed proportions in the realized E-D contingency table. Exposed people (*e =* 1) who are observed to have the disease may have gotten it because they are either doomed (*t = 1*) or causative (*t* = 2). Unexposed people (*e =* 0) are observed to have the disease because they are either doomed (*t* = 1) or preventive response types (*t* = 3). Similarly, exposed people (*e =* 1) are observed not to get disease because they are preventive (*t* = 3) or immune types (*t* = 4), and unexposed people (*e =* 0) are observed not to get the disease because they are either causative (*t* = 2) or immune types (*t* = 4). Each observed proportion in the table, conditioned on exposure, is the sum of the proportions of 2 latent potential response types in one or the other exposure strata. Because of this fact, potential response-type proportions themselves are not identified from the observed table. For example, observing that P(D = 1|E = 1) = 0.2 does not reveal what values p_{11} or p_{12} must take. However, the observed conditional probabilities may be expressed as a set of linear equations that place constraints on potential response-type proportions:

Exposure effects must be defined with respect to a specific inferential target population and a specific contrast measure.^{16} For the purposes of this article, we focus on the causal effect of the exposure among the unexposed (ie, with the unexposed as the target population) and measure the effect by the risk difference (RD). Other target populations (eg, exposed, total) can be accommodated in a similar fashion. This causal RD among the unexposed population (RD_{C|E=0}) is the difference between the proportion of unexposed people who would have gotten the disease if they had (counter to the fact) been exposed and the proportion of these same unexposed people who in fact did get the disease. Expressed symbolically, this gives the expression:

Where “|SET{E = e′},E = e” indicates conditioning on the E = e subpopulation had they been forced by external intervention to have their exposure level SET to e′. Because our target is the unexposed subpopulation Pr(D = 1|SET{E = 0},E = 0) = Pr(D = 1|E = 0) and so is factual, whereas Pr(D = 1|SET{E = 1},E = 0) is counterfactual.

The causal risk difference in the unexposed, RD_{C|E=0}, is therefore not identified by the observed data and must instead be estimated using a substitute population to infer the counterfactual experience of the unexposed had they been exposed.^{16} Using the exposed column from Table 1 as the substitute population gives a formula for the associational risk difference between exposure and disease:

RD_{A|E=0} will equal RD_{C|E=0} in the absence of confounding (assuming absence of other errors), which implies (*p01* + *p03*) = (*p11 + p13*).^{15,17} This equality is not demonstrable on the basis of the observed data, so one must rely on background knowledge to judge whether an estimate is confounded.^{18}

Sensitivity analysis methods require parametric specification of the unmeasured confounder to assess its expected impact on the observed exposure–disease association.^{19} In contrast, a linear programming approach allows the analyst to use the observed contingency table to generate bounds on possible values for RD_{C|E=0} by relying on the observed data and constraints such as those imposed by equation 1. Values outside of these bounds are impossible given the observed data and specified constraints.^{20,21}

Linear programming is a mathematical procedure widely used in operations research and econometrics to find the global minimum or maximum of a linear function of a set of variables (the “objective” function) subject to a series of linear equality or inequality constraints on these variables. Graphically, the linear programming approach constructs an n-dimensional polygon of feasible solutions out of the constraints, with the number of dimensions determined by the number of variables and equality constraints in the problem. A classic efficient algorithm for the solution (the simplex method) moves from vertex to adjacent vertex in a systematic fashion until a maximum or minimum of the objective is reached.

### Example

As an example to illustrate the use of linear programming in determining bounds, we use data from the Cooperative Cardiovascular Project, which is described in detail elsewhere.^{22–24} Briefly, all Medicare claims that listed acute myocardial infarction (MI) as the principal discharge diagnosis in 1995 were identified from Medicare claims files. Medical records for patients were abstracted, providing a comprehensive account of disease status, treatment, and patient demographics. There were 222,612 admissions for acute MI in the 45 states available for this analysis (data from Alabama, Connecticut, Iowa, Minnesota, and Wisconsin were unavailable to us). Excluded from the analysis were patients under the age of 65, without a confirmed acute MI, with an in-hospital MI (rather than occurring before hospitalization), or who were transferred from another hospital. After exclusions, 135,759 patients remained in the analysis. To illustrate this bounding method, we examine the effect of the use of beta-blockers on 30-day mortality. All analyses are performed in Lingo version 7.0.^{25}

#### Example 1: Bounds When Only Exposure and Disease Are Measured

We want to find the minimum and maximum possible values of RD_{C|E=0} (minRD_{C|E=0} and maxRD_{C|E=0}, respectively) under constraints on potential response type proportions given in equation 1. Additionally, we constrain the set of possible solutions by requiring each potential response type proportion to be nonnegative, ie, *pet≥* 0. It is apparent from equation 2 that RD_{C|E=0} depends only on potential response type proportions among the unexposed, *p02* and *p03*. To find the minimum and maximum RD_{C|E=0}, we examine the observed conditional probability Pr(D = 1|E = 0) and its complement Pr(D = 0|E = 0), which are partially determined by *p03* and *p02*, respectively. These 2 observed probabilities are also partially determined by *p01* and *p04*, respectively. The remaining 2 observed probabilities in equation 1 contain no information about *p02* or *p03* and then can be ignored.

For this linear programming problem, the feasible solution region (satisfying the linear constraints) in the 4-dimensional space of *p01*, *p02*, *p03*, *p04* is a 2-dimensional quadrilateral, which when projected onto the *p02**–p03* plane becomes a rectangle, as shown in Figure 2, with 0 ≤ *p02 ≤* Pr(D = 0|E = 0)) and 0 ≤ *p03 ≤* Pr(D = 1|E = 0) = {1 − Pr(D = 0|E = 0)}. Points of constant objective function value (RD_{C|E=0} = constant) are the parallel family of dashed lines at 45° slope. Shifting toward the upper left increases RD_{C|E=0}, whereas shifting toward the lower right decreases RD_{C|E=0}. Only that portion of each line within the rectangle represents feasible solutions. Therefore, maxRD_{C|E=0} occurs at the upper left vertex and equals Pr(D = 0|E = 0), and minRD_{C|E=0} occurs at the lower right vertex equaling −Pr(D = 1|E = 0). The bound width will therefore always be 1, as has been noted previously.^{26}

Of the 135,799 people in the CCP data, roughly 71% did not receive beta-blockers. Approximately 3.07% of those who received beta-blockers died, whereas 25.66% of those who did not receive beta-blockers died, so that RD_{A|E=0} = −0.2359. The observed proportions necessary for the linear programming solution are Pr(D = 1|E = 0) = 0.2566 and Pr(D = 0|E = 0) = 0.7434. These probabilities yield a maxRD_{C|E=0} = 0.7427 and minRD_{C|E=0} = −0.2566. The maxRD_{C|E=0} occurs when all recipients of beta-blockers who die do so because they are causative types rather than doomed, and all users of beta-blockers who do not die do so because they are immune rather than because beta-blockers are preventive. Conversely, minRD_{C|E=0} occurs when all recipients of beta-blockers who die are considered to be doomed rather than causative, and all recipients of beta-blockers who do not die do so because beta-blockers prevent their death rather than the recipients being immune to the beta-blockers.

#### Example 1a: Introducing Additional Constraints

Unfortunately, bounds as wide as those in example 1 will often be of no practical use and will never exclude the hypothesis that RD_{C|E=0} is 0. To decrease the bound width, one must introduce additional constraints. One constraint that may be substantively justified is that exposure does not cause disease in any individual or, equivalently, that *p02**= p12* = 0. The linear programming solution for the bounds is straightforward for this analysis. If *p02* = 0 then, from (equation 2), RD_{C|E=0} = −*p03*. Of the observed probabilities, only Pr(D = 1|E = 0) contains information about *p03*. To solve this linear programming problem, one simply finds the minimum and maximum feasible values for *p03*, which are clearly −Pr(D = 1|E = 0) and 0, respectively, as seen from equation 2 and the nonnegativity constraints on the *pet*. From introducing the constraint that exposure does not cause disease, it immediately follows that the bound width is no longer fixed at 1; instead, it is equal to Pr(D = 1|E = 0). Using the CCP data and the assumption that beta-blockers do not cause death, the deterministic bounds shrink from −0.2573, 0.7427 to −0.2573, 0. The introduction of this constraint represents a substantial decrease in the bound width for this estimate. Furthermore, the lower bound for the effect is close to RD_{A|E=0}, indicating that if the estimate is confounded to any appreciable amount, it must be away from the null.

#### Example 2: Bounds When a Covariate Is Measured

The linear programming approach can be extended to more complicated causal structures as well. Figure 3 illustrates a DAG in which the exposure–disease relationship is confounded by one measured binary covariate, Z, and one unmeasured covariate, U. The introduction of a measured confounder, Z, greatly increases the number of potential response types. Because the exposure and measured confounder may both cause disease, there may be effect modification of the risk difference. Thus, it is desirable to focus on Z-stratum-specific effects. Because Z is exogenous, it is reasonable to define the potential response-type proportions conditional on covariate Z as in Table 2. Four potential disease–response types are possible for each value of Z. Additionally, Z is related to E through 4 potential response types leading to 4^{3} = 64 combinations of response types.

The proportion of potential response types in the total population can be defined as *pijk* where the subscripts i, j, and k each range from 1 to 4 and represent the response-type relations for Z → E, E → D|Z = 0, and E → D|Z = 1, respectively. The lack of any arcs entering Z in Figure 3 implies that potential response proportions, *pijk*, are identical between levels of Z (ie, Z is exogenous in the DAG). However, as the target population of interest is the unexposed group, the Z-stratum-specific RDs are most clearly defined in terms of potential response proportions among the unexposed given some value of Z. We therefore transform the response-type proportions, *pijk*, to the response-type proportions among the unexposed population given Z = 0, *p*^{u|z=0}_{ijk} and the response-type proportions among the unexposed population given Z = 1, *p*^{u|z=1}_{ijk} as follows:

The potential response-type proportions that are defined to be zero are logical impossibilities. For instance, if i = 3 (the measured confounder has a preventive effect on exposure) and Z = 0, then E can never equal 0. Similarly, if i = 2 (the measured confounder has a causative effect on exposure) and Z = 1, then E can never equal 0. Lastly, if i = 1, then exposure will always occur regardless of the values Z takes.

These transformed potential response-type proportions can be used to define the causal stratum-specific RDs:

These are simply the sum of response-type proportions where exposure has a causative effect on disease in that stratum of the measured confounder minus the sum of response-type proportions where there is a preventive effect of exposure on disease in that stratum of the measured confounder.

Although the solutions to examples 1 and 1a may have been obvious, maximizing and minimizing the stratum-specific causal RDs in equation 5 in 64 dimensional spaces could prove too complex without the aid of linear programming. Similar to example 1, the 64 combinations of potential response-type proportions comprise the 8 cell probabilities in the observed E-Z-D contingency table. These observed proportions and the assumption that measured confounder Z is exogenous determine the constraints for the linear programming solution.

To examine the effect of beta-blockers on mortality among ethnic groups, we restrict the Cooperative Cardiovascular Project data to black and white study participants (Table 3). The stratum-specific associational RDs for blacks (Z = 0) and whites (Z = 1) are −0.2017 and −0.2285, respectively, so that evidence of appreciable RD modification by ethnicity is not strong, although it is statistically significant. The bounds around RD(0)_{C|E=0} are minRD_{C|E=0} = −0.229 and maxRD_{C|E=0} = 0.771, and the bounds around RD(1)_{C|E=0} are minRD_{C|E=0} = −0.259 and maxRD_{C|E=0} = 0.741. Both bound widths are 1, identical to the bound width found in the first example. This will always be the case; measuring one potential confounder will not decrease the bound width for the residual confounding that remains, although it may shift the bounds in one direction or the other, like in this example.

#### Example 2a: Additional Constraints

As in example 1, it is possible to narrow the bound width by introducing subject matter constraints. One constraint is that the measured confounder Z does not cause the exposure, leading to *p2jk* = 0. A similar constraint, that neither the measured confounder Z nor the exposure ever directly cause disease (monotonicity), leads to *pi2k**= pij2* = 0, because these are proportions of response types for which the disease is caused by the exposure if the population were forced to one or the other exposures. This constraint also forces *pi31* = *pi41* = *pi43* = 0, because these are proportions of response types for which the measured exogenous confounder causes disease within at least one stratum of the exposure. A third constraint is the absence of interdependent types (also referred to as interacting types^{19}) between the measured confounder and exposure. An interdependent potential response type is one in which the value of the potential outcome can be determined only by knowing the value of both the confounder and exposure. This response leads to all *pijk* = 0 except for those that relate exposure and confounder to disease with neither E nor Z having an effect (*pi11* and *pi44*), E not having an effect (*pi41* and *pi14*), or Z not having an effect (*pi22* and *pi33*). Rather than assume the absence of interdependent response types, a fourth assumption of no RD modification could be made, constraining RD(0)_{C|E=0} to equal RD(1)_{C|E=0}.

We illustrate these assumptions using the Cooperative Cardiovascular Project data. Bounds are computed for the 4 assumptions listed previously (Table 4). The bound width is smallest for both strata with the assumption of monotonicity. We note 3 important aspects of these results. First, the bounds in Table 4 provide evidence that if the observed effect is confounded to an important extent, it must be confounded in only one direction: away from the null. The crude RDs are close to the lower bound under no additional restrictions (Table 4, row 1) and nearly identical to the lower bound under the assumption that there is no effect modification of the RD. This result is consistent with randomized trials indicating that taking beta-blockers is associated with a 13% decrease in mortality over the following 2 weeks.^{27} Second, under the assumption of monotonicity, the bounds on the causal effect among whites exclude the null. If the assumption of monotonicity is correct, beta-blockers must have a preventive effect on 30-day mortality among whites. Third, there is no combination of potential response-type proportions that conforms to the assumption that being white does not cause anyone to receive beta-blockers. That is, these data would be impossible to obtain were the assumption true. This result is consistent with previous research that indicates that blacks are less likely to receive beta-blockers than whites.^{28,29}

## CONCLUSION

Sensitivity analysis methods for unmeasured confounders provide a corrected point estimate for each specified set of unknown parameters.^{3,30,31} The selection of these parameter values, and the representation and synthesis of the set of corrected values, is left to the subjective judgment of the authors and readers. A complementary nonparametric approach using linear programming, described in this review, provides deterministic bounds on the range of the unidentifiable causal parameter on the basis of the observed table and subject-matter knowledge in the form of assumptions about the potential outcomes. These assumptions represent background knowledge about the nature of the causal effect rather than about the joint distributions of the exposure and disease with the putative confounder.

These nonparametric bounds quantify the absolute limits on the true effect of the exposure given an unmeasured amount of confounding, when the assumptions characterized by the linear constraints and the DAG hold. There are a variety of scenarios in which these bounds may prove useful. Under realistic assumptions these bounds, as demonstrated in example 2a, may exclude hypotheses regarding effect sizes of interest. For instance, the hypothesis that beta-blockers have no effect on 30-day mortality among whites is excluded under the monotonicity assumption, allowing conclusions about the effect of treatment even in the presence of unmeasured confounding. Additionally, these bounds may be useful in providing insights for sensitivity analyses. In example 2, if confounding is present to an appreciable amount, the observed effect must be biased away from the null. Sensitivity analyses such as those by Lash et al^{30} would need to account only for confounders that could bias the effect in one direction. Another implementation of this approach allows the assessment of whether certain causal assumptions are untenable given the observed data. In example 2a, the assumption that being white does not cause one to be treated with beta-blockers relative to being black cannot be true given the observed data. This ability of this approach to detect untenable assumptions may have use in other methodologies that rely on the absence of certain causal types such as instrumental variables methods.^{34}

The bounds presented in this article are absolute bounds outside of which the effect cannot lie. However, like all models, this model is only as valid as the assumptions used to create it. If certain assumptions such as monotonicity cannot reasonably be met, they should not be implemented. We assume correct specification of the DAG, no sampling variability, and no misclassification of exposure or disease. Stochastic variability can be introduced to the extent that the observed population is a random sample of a hypothetical superpopulation. In this case, the cell frequencies are sampled and the resulting constraints have a random component, which can be modeled using Markov Chain Monte Carlo methods.^{32,33} Furthermore, if the assumption of no misclassification does not hold, these same counterfactual methods can be expanded to account for either misclassification of exposure or disease, perhaps using instrumental variable techniques.^{34}

We generated bounds for average causal effects using linear programming, which is restricted to linear effects such as risk differences. Similar bounds can easily be constructed for nonlinear measures of effect using a similar type of constrained optimization (ie, nonlinear programming). Software for either approach is readily available. Factorization of the joint probability distribution can also provide deterministic bounds,^{35} but these may be wider than the linear programming bounds in this article if the independence assumptions encoded in the DAG are represented by weak instead of strong ignorability.^{12,36}

In conclusion, deterministic bounds offer information on causal effects in the presence of uncontrolled confounding. These methods complement standard sensitivity analyses; that is, although sensitivity analyses may approach unmeasured confounding using a probabilistic approach, deterministic bounds present a range within which the causal effect *must* lie and may help to refine sensitivity analyses. Although these bounds may be quite wide in the absence of any prior knowledge, incorporation of that knowledge in the form of realistic assumptions concerning the existence of potential response types can narrow the bounds substantially.

## REFERENCES

*Encyclopedia of Biostatistics*. New York: John Wiley and Sons; 1998:900–907.

*J Chronic Dis*. 1981;34:433–438.

*J Natl Cancer Inst*. 1959;22:173–203.

*Statistical Methods in Cancer Research: Volume 1—The Analysis of Case–Control Studies*. IARC scientific publication No 32. Lyon: IARC; 1980.

*J Chronic Dis*. 1967;20:487–495.

*Epidemiology*. 1990;1:239–246.

*Am J Ind Med*. 1988;12:119–130.

*Biometrics*. 1998;54:948–963.

*Am J Epidemiol*. 1978;108:3–8.

*Biometrika*. 1984;71:191–194.

*Epidemiology*. 1999;10:37–48.

*Causality: Models, Reasoning and Inference*. Cambridge, UK: Cambridge University Press; 2000.

*Int J Epidemiol*. 2002;31:1030–1037.

*Biometrika*. 1973;60:467–476.

*Int J Epidemiol*. 1986;15:413–419.

*Int J Epidemiol*. 2002;31:422–429.

*Stat Sci*. 1999;14:29–46.

*Epidemiology*. 2001;11:313–320.

*Modern Epidemiology*, 2nd ed. Philadelphia: Lippincott Williams & Wilkins; 1998;343–358.

*J Am Stat Assoc*. 1997;92:1171–1176.

*JAMA*. 1995;273:1509–1514.

*J Rural Health*. 2004;20:99–108.

*N Engl J Med*. 2000;343:8–15.

*The American Economic Review*. 1990;80:319–323.

*Am J Cardiol*. 1997;80:35J–39J.

*N Engl J Med*. 1999;340:618–626.

*JAMA*. 2002;287:1288–1294.

*Epidemiology*. 2003;14:451–458.

*Epidemiology*. 2003;14:459–466.

*Computing Science and Statistics*. 1997;29:424–431.

*The Annals of Statistics*. 1997;25:305–327.

*J Am Stat Assoc.*91:444–455.

*Health Service Research Methodology: A Focus on AIDS*. Washington, DC: US Public Health Service, National Center for Health Services Research; 1989:113–159.

*J Am Stat Assoc*. 1996;91:456–458.