Institute for Theoretical Computer Science, University of Luebeck, Luebeck, Germany, firstname.lastname@example.org (Textor)
Institute for Social Medicine, University of Luebeck, Luebeck, Germany (Hardt)
Department of Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany (Knüppel)
To the Editor:
Causal diagrams, also known as directed acyclic graphs,1,2 provide an entirely graphical yet mathematically rigorous methodology for minimizing bias in epidemiologic studies.3,4 Analyzing causal diagrams by hand can be cumbersome in practice, and the task lends itself well to automation by a computer program. Important first steps in this regard include the development of the DAG program by Knüppel and Stang5 and dagR by Breitling.6 We announce the release of DAGitty, which provides a graphical user interface tailored to drawing and analyzing causal diagrams. DAGitty overcomes some performance obstacles (pointed out by Breitling6) that affect earlier software when analyzing large diagrams.
The performance issues are 2-fold. First, previous software employed backtracking algorithms5 to enumerate and categorize all paths from exposure to outcome. This is a reasonable approach for small diagrams, but diagrams with tens of variables can already contain millions of paths. A full listing is of little interest to the human user, but can take hours or days to generate. Instead of a path list, DAGitty identifies the subdiagrams involved in causal and biasing paths and highlights them in different colors. This highlighting algorithm7 scales to very large diagrams. It provides a vivid impression of how causal and biasing effects “flow” in the diagram, that is, by which variables and causal arrows these effects are mediated.
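The contrast between listing paths and marking a subdiagram can be illustrated with a small Python sketch. It is not DAGitty's highlighting algorithm, which is described in reference 7; it only uses the standard fact that a variable lies on a directed (causal) path from exposure to outcome exactly when it is a descendant of the exposure and an ancestor of the outcome. All function names and the layered example diagram are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def reachable(start, adjacency):
    """Return every node reachable from start via adjacency (start included)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def causal_subdiagram(edges, exposure, outcome):
    """Arrows lying on some directed path from exposure to outcome.

    Needs two graph searches, so the cost is linear in the diagram size.
    """
    children, parents = defaultdict(set), defaultdict(set)
    for a, b in edges:
        children[a].add(b)
        parents[b].add(a)
    on_path = reachable(exposure, children) & reachable(outcome, parents)
    return {(a, b) for a, b in edges if a in on_path and b in on_path}


def count_directed_paths(edges, exposure, outcome):
    """Count directed paths by explicit backtracking (exponential in general)."""
    children = defaultdict(set)
    for a, b in edges:
        children[a].add(b)

    def walk(node):
        if node == outcome:
            return 1
        return sum(walk(child) for child in children[node])

    return walk(exposure)


def layered_dag(width, depth):
    """Hypothetical test diagram: X, then `depth` fully connected layers, then Y."""
    layers = [["X"]]
    layers += [[f"v{d}_{i}" for i in range(width)] for d in range(depth)]
    layers += [["Y"]]
    return [(a, b) for upper, lower in zip(layers, layers[1:])
            for a in upper for b in lower]


# A confounder C is added; its arrows are not part of the causal subdiagram.
edges = layered_dag(width=3, depth=8) + [("C", "X"), ("C", "Y")]
print("directed paths:", count_directed_paths(edges, "X", "Y"))
print("arrows on causal paths:", len(causal_subdiagram(edges, "X", "Y")))
```

In this layered example the number of directed paths grows as width to the power of depth, whereas the number of arrows to highlight grows only with the size of the diagram.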
The second problem with previous software has arisen when identifying minimally sufficient adjustment sets (MSA sets). According to causal diagram theory, adjustment for the covariates in an MSA set minimizes bias when estimating the total effect from exposure to outcome. A straightforward approach to find MSA sets is to check each covariate set to see whether it is an MSA set. In a diagram with 50 covariates, this means that 2^50 sets may have to be tested, a 16-digit number that is too large even for computers. To identify MSA sets more efficiently, we adapted an algorithm proposed recently for a related graph-theoretical problem.8 This algorithm is guaranteed to output the list of MSA sets reasonably quickly (ie, in polynomial time per MSA set output). Note, however, that very large or very regularly structured diagrams could in theory have millions of different MSA sets. If such diagrams become practically relevant, further research will be necessary to develop appropriate computational methods for helping the user to choose appropriate MSA sets.
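For illustration, the straightforward approach can be sketched in Python as follows. This is the brute-force check described above, not the algorithm adapted from reference 8. It assumes Pearl's back-door criterion as the definition of a sufficient set and tests d-separation through the moralized ancestral graph. The function names and the five-arrow example diagram are hypothetical.

```python
from itertools import combinations


def closure(nodes, neighbours):
    """Nodes reachable from `nodes` via the mapping `neighbours` (inclusive)."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for nxt in neighbours.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def d_separated(edges, x, y, z):
    """True if z d-separates x and y (moralized ancestral graph test)."""
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    keep = closure({x, y} | set(z), parents)      # ancestral subgraph
    moral = {v: set() for v in keep}
    for child in keep:
        ps = parents.get(child, set())
        for p in ps:                               # keep parent-child links
            moral[p].add(child)
            moral[child].add(p)
        for p, q in combinations(ps, 2):           # "marry" co-parents
            moral[p].add(q)
            moral[q].add(p)
    seen, stack = {x}, [x]
    while stack:                                   # undirected search avoiding z
        for n in moral[stack.pop()]:
            if n not in seen and n not in z:
                seen.add(n)
                stack.append(n)
    return y not in seen


def is_backdoor_set(edges, x, y, z):
    """Back-door criterion: no descendants of x in z, and z blocks all back-door paths."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    if (closure({x}, children) - {x}) & set(z):
        return False
    without_out_arrows = [(a, b) for a, b in edges if a != x]
    return d_separated(without_out_arrows, x, y, set(z))


def minimal_adjustment_sets(edges, x, y):
    """Test every covariate subset by increasing size: 2**n checks in the worst case."""
    covariates = sorted({v for e in edges for v in e} - {x, y})
    found = []
    for size in range(len(covariates) + 1):
        for subset in map(set, combinations(covariates, size)):
            if any(m <= subset for m in found):
                continue                           # contains a smaller valid set
            if is_backdoor_set(edges, x, y, subset):
                found.append(subset)
    return found


# Hypothetical example: C confounds X and Y and is also a collider between A and B.
example = [("A", "X"), ("A", "C"), ("B", "C"), ("B", "Y"),
           ("C", "X"), ("C", "Y"), ("X", "Y")]
print(minimal_adjustment_sets(example, "X", "Y"))
print(2 ** 50, "subsets for 50 covariates")
```

With only three covariates the sketch tests at most 8 subsets. Each additional covariate doubles that number, which is why this approach does not carry over to diagrams with 50 covariates.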
The described algorithms enable DAGitty's graphical interface to instantly reflect changes made to the diagram, such as adding a new arrow or inverting an arrow with unclear causal direction. This way, users can interactively assess the effects of their modifications on minimally sufficient adjustment sets and the flow of causal and biasing effects. We anticipate that these interactive possibilities will help users to develop an intuition about causal diagram theory, and to compare and decide among various causal diagrams.
DAGitty is available under an open-source license, allowing free access, redistribution, and modification. It runs out of the box in most modern web browsers and is available for online use and download at: www.dagitty.net.
We thank Michael Elberfeld for discussions, and Sabine Schipf and Ines Polzer for providing example causal diagrams.
1. Pearl J. Causality: Models, Reasoning, and Inference. Cambridge: Cambridge University Press; 2000.
2. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10:37–48.
3. Shrier I, Platt RW. Reducing bias through directed acyclic graphs. BMC Med Res Methodol. 2008;8:70.
4. Glymour MM, Greenland S. Causal diagrams. In: Rothman KJ, Greenland S, Lash TL, eds. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008:183–209.
5. Knüppel S, Stang A. DAG program: identifying minimal sufficient adjustment sets [Letter]. Epidemiology. 2010;21:159.
6. Breitling L. dagR: a suite of R functions for directed acyclic graphs [Letter]. Epidemiology. 2010;21:586–587.
7. Textor J, Liśkiewicz M. Adjustment criteria in causal diagrams: an algorithmic perspective. In: Cozman F, Pfeffer A, eds. Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence. Corvallis, OR: AUAI Press; in press.
8. Takata K. Space-optimal, backtracking algorithms to list the minimal vertex separators of a graph. Discrete Applied Math. 2010;158:1660–1667.
© 2011 Lippincott Williams & Wilkins, Inc.