We thank the editors for the opportunity to comment on this interesting exchange and thank Banack and Kaufman [1] for providing additional clarification of their approach.
Obese people average higher mortality rates than normal weight people [2], but mortality among obese heart failure patients is lower than mortality among normal weight heart failure patients [3]. This combination of statistical results is consistent with (at least) two very different causal structures. One possibility, which we call the “opposite effects” hypothesis, is that heart failure qualitatively transforms the consequences of obesity: that is, obesity increases mortality risk among people without heart failure but is protective among people with heart failure. Because heart failure patients are only a small fraction of the whole population, under the “opposite effects” hypothesis the population average effect of obesity would be harmful. As Curtis et al [3] note, however, we often wish to know the effect of obesity on patients with heart failure because “recommendations for weight management derived from the general population may not be appropriate for patients with heart failure.”
An alternative causal structure is that obesity harms everyone, even heart failure patients, and the lower mortality observed among obese patients with heart failure is an artifact of selecting the study sample from this subset of the population, in combination with unmeasured confounders of the relationship between heart failure and death. This is a special form of selection bias [4] and could arise as follows. Obesity approximately doubles the risk of heart failure [5] but is only one of many harmful factors that increase the risk of heart failure; therefore, normal weight people who develop heart failure are more likely than their obese counterparts to have one of these other harmful risk factors. If the effects of these other risk factors on mortality among heart failure patients are larger than the effect of obesity, obesity can appear protective in the selected heart failure population in an analysis not controlling for the other risk factors. Distinguishing between these two possibilities has obvious clinical importance: in the first scenario, heart failure patients need not diet, whereas in the second scenario, dieting might be beneficial.
The selection bias alternative is illustrated by the causal directed acyclic graph (DAG) shown in Figure A. In this DAG, heart failure is a “collider” on the “backdoor path” obesity→heart failure←unmeasured factors→death. It can be shown that selecting a sample of patients with heart failure induces collider stratification bias in the estimated effect of obesity on death in an analysis that does not adequately control for the unmeasured factors [6], although such bias is not necessarily large [7].
The obesity paradox is an example of a pervasive and extremely important problem: if a risk factor affects disease incidence, how can we evaluate the influence of that risk factor on prognosis among patients with prevalent disease? Other examples of this problem arise in research on knee osteoarthritis (higher weight is associated with increased arthritis incidence and also slower progression among patients with existing arthritis) [8]; the effect of smoking on infant mortality (smokers are more likely to have low-birth-weight infants, but these babies fare better than similarly low-birth-weight infants of nonsmokers) [9]; and cognitive reserve in dementia (although education delays diagnosis of dementia, highly educated individuals with dementia decline more quickly than dementia patients with less education) [10]. The challenges of this problem have been extensively discussed in the literature on direct/indirect effects decomposition (mediation analysis) [4,11,12].
Because this problem is so common, we applaud Banack and Kaufman [1] for their efforts to evaluate whether “selection bias” is a plausible explanation for the observed statistical associations presented in the article by Curtis et al [3]. We disagree, however, that applying the episens macro, which corrects for differential probability of selecting exposed versus unexposed cases, provides new information. Banack and Kaufman [1] calculate a 2 × 3 table of differential probabilities of selection into being a heart failure patient among obese, overweight, and normal weight dead and living people, based on National Health and Nutrition Examination Survey (NHANES) data. For example, they report that 11.3% of obese dead people are “selected” into heart failure, whereas only 2.1% of obese living people are “selected” into heart failure. Weighting the six subsets of the heart failure population by the inverse of these selection probabilities simply reproduces the full population. If the sample of heart failure patients in the article by Curtis et al is drawn from the same population, then this inverse weighting should reproduce the estimated effect of obesity on mortality in the full population. This holds exactly in the NHANES data provided in Banack and Kaufman’s online appendix. As Nguyen et al [13] insightfully note, this does not address the crux of the obesity “paradox.” Curtis et al [3] did not suggest that their findings cast doubt on the effects of obesity in the whole population; rather, they noted only the possible effects among heart failure patients.
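The reweighting argument can be checked with a small sketch. The population counts and the normal weight selection probabilities below are invented purely for illustration (only the 11.3% and 2.1% figures for obese people come from the text), and the overweight category is omitted for brevity. The sketch is not the episens macro; it only shows that dividing each heart failure cell by its selection probability returns the full-population cell, and therefore the full-population RR rather than an effect specific to heart failure patients.

```python
import math

# Hypothetical full-population counts: (BMI group, vital status) -> people.
population = {
    ("obese", "dead"): 300.0,
    ("obese", "alive"): 700.0,
    ("normal", "dead"): 200.0,
    ("normal", "alive"): 800.0,
}

# Probability of being "selected" into heart failure within each cell.
# Obese values are those quoted in the text; normal weight values are
# made up for illustration only.
p_select = {
    ("obese", "dead"): 0.113,
    ("obese", "alive"): 0.021,
    ("normal", "dead"): 0.120,
    ("normal", "alive"): 0.010,
}


def risk_ratio(counts):
    """Mortality risk ratio, obese versus normal weight."""
    risk = {}
    for group in ("obese", "normal"):
        dead = counts[(group, "dead")]
        alive = counts[(group, "alive")]
        risk[group] = dead / (dead + alive)
    return risk["obese"] / risk["normal"]


# Heart failure patients are the selected part of each population cell.
heart_failure = {cell: n * p_select[cell] for cell, n in population.items()}

# Inverse-probability-of-selection weighting of the heart failure sample.
reweighted = {cell: n / p_select[cell] for cell, n in heart_failure.items()}

rr_population = risk_ratio(population)
rr_heart_failure = risk_ratio(heart_failure)
rr_reweighted = risk_ratio(reweighted)

print("RR in full population:       ", round(rr_population, 3))
print("RR among heart failure cases:", round(rr_heart_failure, 3))
print("RR after inverse weighting:  ", round(rr_reweighted, 3))
print("Reweighting recovers population RR:",
      math.isclose(rr_reweighted, rr_population))
```

With these made-up inputs the crude RR among heart failure patients falls below 1 while the reweighted RR equals the population value, which is why the reweighting cannot by itself distinguish the two causal structures.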
To evaluate the plausibility of the “opposite effects” versus “selection bias” explanations, we need more information than is reported in the original article by Curtis et al or than is assumed by Banack and Kaufman. What assumptions would we need to make, and are these assumptions plausible? VanderWeele [14] has provided simple models of selection bias, assuming a DAG similar to that in Figure A. Under these simple models, the magnitude of selection bias could be estimated based on the independent effect of an unobserved confounder U on mortality among heart failure patients, controlling for obesity, and the prevalence difference of U comparing obese versus normal weight heart failure patients [14].
However, it is not obvious how to estimate plausible ranges for the prevalence difference of U among obese versus normal weight heart failure patients. Thus, we use the more intuitive setup proposed by Hafeman [15]. If we assume, as shown in Figure A, that in the full population, obesity and U are unrelated (in other words, the effect of obesity on mortality in the full population is not confounded), then the relative bias in the effect of obesity among heart failure patients, absent adjustment for U, is a function of four unknown parameters:
1. Prevalence of U in the population,
2. Effect of U on heart failure risk in obese individuals,
3. Effect of U on heart failure risk in nonobese individuals, and
4. Effect of U on mortality risk of heart failure patients.
Notably, there is no bias if U has the same relative effect on heart failure risk among obese and normal weight persons. Bias also depends on the prevalence of heart failure and the effects of obesity on heart failure and mortality overall, all of which can be estimated from available data. With these inputs, we first calculate the prevalence of U in obese and normal weight patients with heart failure (p1 and p0, respectively) and then calculate the biased relative risk (RR) for the effect of obesity on mortality in an analysis not adjusting for U as RRm.obese × [1 + (RRm.u − 1) × p1]/[1 + (RRm.u − 1) × p0], where RRm.obese and RRm.u are the true causal effects of obesity and U on mortality in heart failure patients, respectively [14]. This formula assumes that the effect of U on mortality is the same in obese and normal weight heart failure patients.
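For readers who prefer displayed notation, the calculation can be sketched as follows. The expressions for p1 and p0 are an assumption on our part: they apply Bayes' rule using the stated independence of obesity and U in the full population, with p denoting the population prevalence of U and RR_{hf.u|obese}, RR_{hf.u|normal} denoting the relative effects of U on heart failure risk in each weight group (these symbols are introduced here for illustration). The eAppendix script may parameterize this step differently, for example by also using the prevalence of heart failure.

```latex
\begin{align*}
p_1 &= \frac{p \, RR_{hf.u \mid \text{obese}}}{1 + (RR_{hf.u \mid \text{obese}} - 1)\, p}, \\
p_0 &= \frac{p \, RR_{hf.u \mid \text{normal}}}{1 + (RR_{hf.u \mid \text{normal}} - 1)\, p}, \\
RR_{\text{biased}} &= RR_{m.obese} \times
  \frac{1 + (RR_{m.u} - 1)\, p_1}{1 + (RR_{m.u} - 1)\, p_0}.
\end{align*}
```

Under this sketch, equal relative effects of U on heart failure in the two weight groups give p1 = p0, so the bias factor equals 1, which matches the no-bias condition stated above.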
How large would these unmeasured factors have to be to generate the observed data patterns under the “selection bias” causal structure? Using the R script shown in the eAppendix (http://links.lww.com/EDE/A734), we calculated the RRs for the effect of obesity on mortality among heart failure patients that would be observed in an analysis not adjusting for U, under a range of input assumptions. Curtis et al [3] reported that among heart failure patients, obese people had an RR for mortality of 0.70 compared with normal weight. If obesity is indeed harmful for heart failure patients, then selection bias is consistent with the observed RR of 0.70 only if U has very large effects on mortality and heart failure.
For illustration, suppose the true RR of obesity for mortality among heart failure patients is 1.25, and that U quintuples the risk of heart failure among normal weight persons, while having no effect on the risk of heart failure among the obese. Even if U doubles the mortality risk of heart failure patients, the predicted RR for obesity on mortality never falls much below 1.0, regardless of the prevalence of U. Only if U also quintuples the risk of mortality among heart failure patients would the bias be large enough to generate an RR for obesity on mortality of 0.70. Although there are other possible combinations of input parameters that could generate this RR under the DAG in Figure A, they are similarly extreme (see the eAppendix, http://links.lww.com/EDE/A734, for calculations across a wide range of input assumptions).
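The illustration can be reproduced approximately with a short Python sketch. This is not the authors' R script. It assumes that obesity and U are independent in the full population, so that the prevalence of U among heart failure patients in each weight group follows from Bayes' rule, and it then applies the bias formula given earlier. The function and variable names are ours and purely illustrative.

```python
def prevalence_among_cases(p_u, rr_hf_u):
    """Prevalence of U among heart failure patients in one weight group.

    Assumes U has population prevalence p_u (independent of obesity) and
    multiplies heart failure risk by rr_hf_u in that weight group.
    """
    return p_u * rr_hf_u / (1.0 + (rr_hf_u - 1.0) * p_u)


def biased_rr(rr_m_obese, rr_m_u, p_u, rr_hf_u_obese, rr_hf_u_normal):
    """RR for obesity on mortality expected when U is not adjusted for."""
    p1 = prevalence_among_cases(p_u, rr_hf_u_obese)   # obese patients
    p0 = prevalence_among_cases(p_u, rr_hf_u_normal)  # normal weight patients
    bias = (1.0 + (rr_m_u - 1.0) * p1) / (1.0 + (rr_m_u - 1.0) * p0)
    return rr_m_obese * bias


def lowest_rr(rr_m_u):
    """Smallest biased RR over population prevalences of U from 1% to 99%.

    Scenario from the text: true RR of 1.25, U has no effect on heart
    failure among the obese and quintuples it among normal weight persons.
    """
    prevalences = [i / 100 for i in range(1, 100)]
    return min(biased_rr(1.25, rr_m_u, p, 1.0, 5.0) for p in prevalences)


if __name__ == "__main__":
    for rr_m_u in (2.0, 5.0):
        print(f"U multiplies mortality by {rr_m_u:g}: "
              f"lowest biased RR = {lowest_rr(rr_m_u):.3f}")
```

Under these assumptions the lowest biased RR is roughly 0.96 when U doubles mortality and roughly 0.69 when U quintuples it, in line with the pattern described in the text.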
Two caveats are in order. First, we use a single binary confounder (U) to stand in for the net associations of the set of unmeasured variables that influence both heart failure and mortality. Thus, when considering a plausible range of associations, we should consider the potentially synergistic action of multiple unobserved variables. Second, Nguyen et al [13] point out that patients may gain or lose weight after developing heart failure, so the appropriate DAG is as shown in Figure B; however, the bias due to selection in this case is likely to be even smaller.
Whether such a large effect of unmeasured factors is plausible is a matter for substantive experts, but if these values are implausible, selection bias is unlikely to fully account for the results reported by Curtis et al. Attention should then move to alternative explanations, such as the opposite effects hypothesis, under which losing weight would increase mortality among heart failure patients. A middle ground is also possible: the effects of obesity are much smaller among heart failure patients than in the population at large, and selection bias further contributes to the divergence of effect estimates. Conventional confounding or, more intriguingly, different variants of heart failure could also contribute to the divergence of results for the general population compared with heart failure patients. Conclusive evidence would require a randomized trial of an effective weight-loss intervention among heart failure patients. One of the most valuable responses to the current debate would be the reporting of empirical evidence on the parameters suggested above for situations when only observational evidence is available, and “selection bias” and “opposite effects” are competing explanations.
ABOUT THE AUTHORS
M. MARIA GLYMOUR is an associate professor of epidemiology at UCSF and conducts research on lifecourse determinants of stroke and dementia, with a focus on improving causal inference in social epidemiology. ERIC VITTINGHOFF is a professor of biostatistics at UCSF, is co-author of the textbook Regression Methods in Biostatistics, and helps teach a UCSF biostatistics course on causal methods.
1. Banack HR, Kaufman JS. The “obesity paradox” explained. Epidemiology. 2013;24:461–462
2. Adams KF, Schatzkin A, Harris TB, et al. Overweight, obesity, and mortality in a large prospective cohort of persons 50 to 71 years old. N Engl J Med. 2006;355:763–778
3. Curtis JP, Selter JG, Wang Y, et al. The obesity paradox: body mass index and outcomes in patients with heart failure. Arch Intern Med. 2005;165:55–61
4. Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology. 2004;15:615–625
5. Kenchaiah S, Evans JC, Levy D, et al. Obesity and the risk of heart failure. N Engl J Med. 2002;347:305–313
6. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10:37–48
7. Greenland S. Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology. 2003;14:300–306
8. Zhang Y, Niu J, Felson DT, Choi HK, Nevitt M, Neogi T. Methodologic challenges in studying risk factors for progression of knee osteoarthritis. Arthritis Care Res (Hoboken). 2010;62:1527–1532
9. Hernández-Díaz S, Schisterman EF, Hernán MA. The birth weight “paradox” uncovered? Am J Epidemiol. 2006;164:1115–1120
10. Stern Y, Albert S, Tang MX, Tsai WY. Rate of memory decline in AD is related to education and occupation: cognitive reserve? Neurology. 1999;53:1942–1947
11. Cole SR, Hernán MA. Fallibility in estimating direct effects. Int J Epidemiol. 2002;31:163–165
12. VanderWeele TJ. Marginal structural models for the estimation of direct and indirect effects. Epidemiology. 2009;20:18–26
13. Nguyen U-S, Niu J, Choi H, Zhang Y. Effect of obesity on mortality: Comment on article by Banack and Kaufman. Epidemiology. 2013;25:2–3
14. VanderWeele TJ. Bias formulas for sensitivity analysis for direct and indirect effects. Epidemiology. 2010;21:540–551
15. Hafeman DM. Confounding of indirect effects: a sensitivity analysis exploring the range of bias due to a cause common to both the mediator and the outcome. Am J Epidemiol. 2011;174:710–717