Letters

# RE: A Practical Example Demonstrating the Utility of Single-world Intervention Graphs

Cinelli, Carlos; Pearl, Judea

doi: 10.1097/EDE.0000000000000896

## To the Editor:

In a recent communication, Breskin et al1 aimed to demonstrate “how single-world intervention graphs can supplement traditional causal diagrams.” The example used in their demonstration involved selection bias due to attrition, namely, subjects dropping out from a randomized trial before the outcome is observed. Here, we use the same example to demonstrate the opposite conclusion: the derivation presented by Breskin et al is in fact longer and more complicated than the standard, three-step derivation facilitated by traditional causal diagrams. We further show that more natural solutions to attrition problems are obtained when they are viewed as missing-data problems encoded in causal diagrams.

The trial example of Breskin et al is shown in the causal diagram of Figure A. The task is to estimate the average causal effect E[Y|do(A = a)] in the general population, given complete data on A (vaccine assignment) and W (injection site pain), whereas data on Y (disease outcome) are available only for those subjects who did not drop out of the study (S = 0). U stands for unmeasured health status, and participants with poor health (U = 1) are assumed to be both more likely to experience pain and to get the disease.

FIGURE: Causal diagrams for modeling attrition. A, Causal diagram of the vaccine trial used in the study by Breskin et al.1 B, The graphical representation of the vaccine trial when viewed as a missing data problem.

The standard strategy of causal diagrams is to convert the query expression, E[Y|do(A = a)], into an equivalent expression that can be estimated from the available data.2,3 The derivation goes as follows:

$$
\begin{aligned}
E[Y \mid do(A=a)] &= E[Y \mid A=a] && (1)\\
&= \sum_{w} E[Y \mid A=a, W=w]\,P(W=w \mid A=a) && (2)\\
&= \sum_{w} E[Y \mid A=a, W=w, S=0]\,P(W=w \mid A=a) && (3)
\end{aligned}
$$

The first equality is licensed by randomization (or the null backdoor condition), the second by the law of total probability, and the third by d-separation, that is, Y ⫫ S | {A, W}. All components of the final expression can be estimated from the available data; the first factor from units who remained in the study (S = 0), and the second from all units entering the trial. As noted in the study by Breskin et al, the same derivation holds if the arrows A → S and W → Y are added to the diagram.
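As a concrete check, the derivation can be verified by simulation. The sketch below generates data from the diagram of Figure A with hypothetical parameter values (chosen for illustration, not taken from Breskin et al) and shows that the adjusted estimand in Eq. 3 recovers E[Y|do(A = a)], whereas the naive average among subjects who remained in the study does not.

```python
# Minimal simulation sketch of the vaccine-trial diagram in Figure A:
# A -> W, U -> W, U -> Y, A -> Y, W -> S. Parameter values are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

U = rng.binomial(1, 0.3, n)                  # unmeasured health status
A = rng.binomial(1, 0.5, n)                  # randomized vaccine assignment
W = rng.binomial(1, 0.2 + 0.3 * A + 0.4 * U) # injection-site pain
S = rng.binomial(1, 0.6 * W)                 # dropout (S = 1), driven by pain
Y = rng.binomial(1, 0.5 - 0.2 * A + 0.3 * U) # disease outcome

def adjusted_mean(a):
    """Eq. 3: sum_w E[Y | A=a, W=w, S=0] * P(W=w | A=a)."""
    total = 0.0
    for w in (0, 1):
        in_study = (A == a) & (W == w) & (S == 0)   # completers in stratum w
        total += Y[in_study].mean() * (W == w)[A == a].mean()
    return total

def truth(a):
    """Ground-truth E[Y | do(A=a)] under the generating model above."""
    return 0.5 - 0.2 * a + 0.3 * U.mean()

def naive_mean(a):
    """Biased estimator: outcome average among completers only."""
    return Y[(A == a) & (S == 0)].mean()
```

In this simulation the naive completer average is biased because conditioning on S = 0 selects on W, which is associated with the unmeasured U; the adjusted expression removes that bias because Y ⫫ S | {A, W} holds in the diagram.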

The extreme simplicity and transparency of this derivation, vis-a-vis the elaborate derivation introduced by Breskin et al, is an illustrative example of the utility of traditional causal diagrams in modeling attrition, censoring, selection bias, and missing data problems. Single-world intervention graphs may be useful for researchers who are determined to verify ignorability conditions such as Y(a) ⫫ S(a) | W(a), but d-separation renders such efforts unnecessary. A wide variety of selection bias and cross-population problems can be solved by the same query-conversion strategy that we described above, operating on traditional causal diagrams.2,3 General conditions for identifying causal effects under both confounding and selection bias are presented in the study by Correa et al.4

As a final remark, we note that the example presented by Breskin et al may be better formulated as a missing data problem. Such a formulation would allow us to specify explicitly which variables are still measured for every subject who drops out of the study. For instance, in the current example, missingness occurs only in the outcome variable Y, a fact that is not represented in the diagram of Figure A. Missingness graphs,5 on the other hand, allow us to formally encode this distinction, as shown in Figure B.

Here, the variable Ry replaces S and represents the “missingness mechanism” of the outcome variable Y, which is not observed directly. Instead, the variable Y* stands for what we can observe of Y, such that Y* = Y when Ry = 0, and Y* = missing when Ry = 1. In this case, the derivation would proceed as before, but this formalism has some benefits: (1) it explicitly tells us that the two factors in Eq. 3 can be estimated from the same study and (2) more complicated missingness mechanisms can be easily accommodated. A comprehensive review of graphical methods for missing data can be found in the study by Mohan and Pearl.6
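The missingness-graph formulation can be sketched in the same way. In the simulation below (again with hypothetical parameter values), no subject is removed from the data; instead only Y is masked through the proxy Y*, and the recovered estimand uses the fully observed A, W, and Ry together with Y* where Ry = 0.

```python
# Sketch of the missingness-graph view in Figure B: Ry replaces S, and
# Y* = Y when Ry = 0, Y* = missing when Ry = 1. Parameters are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

U = rng.binomial(1, 0.3, n)                  # unmeasured health status
A = rng.binomial(1, 0.5, n)                  # randomized assignment
W = rng.binomial(1, 0.2 + 0.3 * A + 0.4 * U) # injection-site pain
Ry = rng.binomial(1, 0.6 * W)                # missingness mechanism for Y
Y = rng.binomial(1, 0.5 - 0.2 * A + 0.3 * U) # true outcome, never seen directly

# The observed proxy: NaN encodes "missing" where Ry = 1.
Y_star = np.where(Ry == 0, Y.astype(float), np.nan)

def recovered_mean(a):
    """sum_w E[Y* | A=a, W=w, Ry=0] * P(W=w | A=a), from observed data only."""
    total = 0.0
    for w in (0, 1):
        obs = (A == a) & (W == w) & (Ry == 0)   # Y* is non-missing here
        total += Y_star[obs].mean() * (W == w)[A == a].mean()
    return total
```

Both factors in the recovered expression are computed from the same dataset, reflecting benefit (1) in the text; more elaborate missingness mechanisms would only change how Ry is generated.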

Carlos Cinelli
Judea Pearl
Department of Statistics and Computer
Science
University of California
Los Angeles, CA
carloscinelli@ucla.edu

## REFERENCES

1. Breskin A, Cole SR, Hudgens MG. A practical example demonstrating the utility of single-world intervention graphs. Epidemiology. 2018;29:e20-e21.
2. Pearl J, Bareinboim E. External validity: from do-calculus to transportability across populations. Stat Sci. 2014;29:579-595.
3. Bareinboim E, Pearl J. Causal inference and the data-fusion problem. Proc Natl Acad Sci. 2016;113:7345-7352.
4. Correa J, Tian J, Bareinboim E. Generalized adjustment under confounding and selection biases. In: AAAI Conference on Artificial Intelligence. 2018. Retrieved from https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17375/16207
5. Mohan K, Pearl J. Graphical models for recovering probabilistic and causal queries from missing data. In: Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger K, eds. Advances in Neural Information Processing Systems 27. Curran Associates, Inc.; 2014:1520-1528.
6. Mohan K, Pearl J. Graphical models for processing missing data. Journal of the American Statistical Association. 2018. Forthcoming.