From the Department of Epidemiology, UCLA School of Public Health, and the Department of Statistics, UCLA College of Letters and Science, Los Angeles, CA.
Address correspondence to: Sander Greenland, 22333 Swenson Drive, Topanga, CA 90290-3434.
Submitted September 11, 2000; final version accepted January 10, 2001.
In certain special situations, simplification of an exposure measure into a dichotomy results in no bias from nondifferential misclassification when estimating the attributable fraction for “any exposure.” This fact has led to recommendations to use a broad definition of exposure when estimating attributable fractions. I here review the assumptions underlying exposure simplification, focusing on the assumptions that the source and target populations have the same exposure distribution and that complete risk removal is possible. I argue that attributable fraction estimates based on dichotomization can be especially sensitive to violations of these assumptions, and hence misleading for projecting the impact of exposure reduction. I conclude that it is important to obtain and use detailed exposure and covariate information for attributable-fraction estimation.
In special situations, the population attributable fraction does not change when exposure is collapsed into a dichotomy. 1–3 This collapsing-invariance property can entail robustness to bias from nondifferential misclassification in the dichotomous attributable-fraction estimates, provided everyone classified as “unexposed” has the same risk. 3 In unpublished policy debates, these facts have been used to argue against the need to gather or use detailed exposure information, most recently in discussion surrounding a pooling project on magnetic fields and childhood cancer. 4,5 Such arguments are mistaken because they overlook assumptions that are usually doubtful.
Assumptions for Use of Collapsing Invariance
All attributable-fraction estimates assume that their component relative risks accurately reflect exposure effects in the target population (population of policy concern), 1,3 and that the source population (the source of the subjects from which the estimate is derived) has the same exposure distribution as the target population. While the latter assumption is well recognized, the detail in which it must be satisfied to justify use of dichotomous estimates does not seem widely appreciated. Other assumptions needed to justify dichotomous estimates are that the only interventions of interest move everyone to the reference level of exposure, and that no confounder adjustments are needed; Wacholder et al. 3 argued, however, that violation of the latter assumption would have negligible impact.
I believe collapsing invariance is of extremely limited applicability because rarely are the above assumptions satisfied. First, source populations of studies usually do not represent the target; even when they are part of the target, they are subject to many restrictions that alter their exposure distribution. Second, feasible exposure interventions (antismoking campaigns, emission controls, etc.) can rarely move everyone to reference exposure levels. As illustrated below, the consequence of either problem may be severe and may outweigh any benefit of broad exposure definition.
Formulas and Examples
My points follow from basic categorical formulas; 1 issues of sampling variation do not arise because the points concern only population distributions. Suppose exposure has K + 1 categories 0 (unexposed), 1, ..., K, with pk the proportion of the population in category k and Rk the causal risk ratio comparing categories k and 0 (R0 is necessarily 1). Then Rp = Σk pkRk is the population risk ratio (the exposure-weighted mean of the Rk), and the fractional excess caseload attributable to exposure above category 0 is AFp = (Rp − 1)/Rp. (AFp is often confused with the “etiologic fraction,” ie, the fraction of cases whose disease was caused by exposure, but the two fractions can be arbitrarily far apart even if exposure is never preventive. 6) More generally, the fractional change in caseload attributable to an intervention that shifts the exposure proportions pk to qk is AFpq = (Rp − Rq)/Rp, where Rq = Σk qkRk. 1
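The formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the toy inputs are hypothetical and do not come from any study discussed here.

```python
def population_risk_ratio(props, risk_ratios):
    """Rp = sum over k of p_k * R_k (R_0 is 1 for the reference category)."""
    if len(props) != len(risk_ratios):
        raise ValueError("need one risk ratio per exposure category")
    if abs(sum(props) - 1.0) > 1e-9:
        raise ValueError("exposure proportions must sum to 1")
    return sum(p * r for p, r in zip(props, risk_ratios))


def attributable_fraction(props, risk_ratios):
    """AFp = (Rp - 1) / Rp: caseload change if everyone moved to category 0."""
    rp = population_risk_ratio(props, risk_ratios)
    return (rp - 1.0) / rp


def generalized_attributable_fraction(props, new_props, risk_ratios):
    """AFpq = (Rp - Rq) / Rp for an intervention shifting p_k to q_k."""
    rp = population_risk_ratio(props, risk_ratios)
    rq = population_risk_ratio(new_props, risk_ratios)
    return (rp - rq) / rp


# Toy inputs, invented purely for illustration (not study data).
toy_p = [0.5, 0.3, 0.2]
toy_r = [1.0, 2.0, 4.0]
af_p = attributable_fraction(toy_p, toy_r)
# Moving everyone to category 0 makes Rq = 1, so AFpq reduces to AFp.
af_pq_full = generalized_attributable_fraction(toy_p, [1.0, 0.0, 0.0], toy_r)
print(round(af_p, 3), round(af_pq_full, 3))
```

With these toy inputs Rp = 1.9, so both fractions equal 0.9/1.9, which shows that AFp is the special case of AFpq in which the intervention moves everyone to the reference category.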
The estimates below (as well as the much smaller AFp estimate from the pooling project 4) are hypothetical, for it is controversial whether the observed associations reflect any magnetic-field effect. Nonetheless, in response to public concerns, officials have requested such estimates.
To illustrate the irrelevance of collapsing invariance for projections, Table 1 presents odds ratios, exposure among controls, and attributable fractions based on a Swedish study of magnetic fields and childhood leukemia. 7 The source population for this study was restricted to transmission-line corridors. (This study produced some of the largest estimates seen at high exposures 4; its estimate in the middle category was 0.74 with 95 percent confidence limits of 0.17 and 3.20, but is here rounded up to 1 on the assumption that exposure is never preventive.) The odds ratios will be taken as the Rk. Applying the Swedish odds ratios to the Swedish distribution, one sees collapsing invariance: Using the exposure dichotomy, Rp = 1.144; using the trichotomy, Rp = 1.144 as well. Hence the Swedish AFp is 0.144/1.144 = 13 percent whether computed from the dichotomy or the trichotomy. Nonetheless, because the study subjects were selected to have an elevated exposure distribution, these estimates probably overstate the AFp for all of Sweden, despite the collapsing invariance.
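The Swedish invariance can be checked numerically. In the sketch below the category proportions and the top-category odds ratio are assumptions: they are back-calculated from figures quoted in the text (the odds ratio of 2.2 for >0.1 μT, Rp = 1.144, and one-third of the >0.1 μT group lying above 0.3 μT), not copied from Table 1.

```python
# ASSUMED inputs, back-calculated from figures quoted in the text
# (not read from Table 1).
swedish_p3 = [0.88, 0.08, 0.04]   # <=0.1, 0.1-0.3, >0.3 microtesla
rr3 = [1.0, 1.0, 4.6]             # middle category rounded up to 1
swedish_p2 = [0.88, 0.12]         # <=0.1 versus >0.1 microtesla
rr2 = [1.0, 2.2]


def population_risk_ratio(props, risk_ratios):
    """Rp = sum over k of p_k * R_k."""
    return sum(p * r for p, r in zip(props, risk_ratios))


rp_tri = population_risk_ratio(swedish_p3, rr3)
rp_di = population_risk_ratio(swedish_p2, rr2)

# The dichotomous risk ratio is the exposure-weighted mean of the risk
# ratios in the collapsed categories, weighted by the SOURCE distribution.
collapsed_rr = (
    sum(p * r for p, r in zip(swedish_p3[1:], rr3[1:])) / sum(swedish_p3[1:])
)

print(round(rp_tri, 3), round(rp_di, 3), round(collapsed_rr, 2))
print(round((rp_di - 1.0) / rp_di, 2))  # AFp from either classification
```

Under these assumed inputs both classifications give Rp = 1.144, because the collapsed risk ratio of 2.2 is exactly the weighted mean of 1 and 4.6 under the Swedish weights. The invariance therefore depends on the weights used in the collapse matching the population to which the estimate is applied.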
Also shown in Table 1 is the exposure distribution from a survey of U.S. residences. 8 When the Swedish odds ratios are applied to the target U.S. distribution using the dichotomy, Rp = 1.336, whereas using the trichotomy, Rp = 1.180.
The U.S. attributable-fraction estimate from the dichotomy is thus 0.336/1.336 = 25 percent, reflecting a severe upward bias compared to the 0.180/1.180 = 15 percent estimate from the trichotomy. Collapsing invariance fails here because the Swedish and U.S. exposure distributions differ. The Swedish odds ratio of 2.2 for >0.1 μT measures the risk elevation in a mixed category with one-third of its members >0.3 μT (and therefore at elevated risk), whereas the U.S. category of >0.1 μT is a mixture with fewer than one-fifth of its members >0.3 μT; it thus should have much less risk elevation than the corresponding Swedish category (assuming the Swedish associations are causal and generalizable).
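The same check applied to the U.S. distribution shows where the dichotomy goes wrong. Again the inputs are assumptions back-calculated from the quoted results (Rp of 1.336 and 1.180, and an odds ratio of 2.2 for >0.1 μT), not values read from Table 1.

```python
# ASSUMED inputs, back-calculated from figures quoted in the text
# (not read from Table 1).
us_p3 = [0.72, 0.23, 0.05]   # <=0.1, 0.1-0.3, >0.3 microtesla
rr3 = [1.0, 1.0, 4.6]        # Swedish odds ratios, trichotomy
us_p2 = [0.72, 0.28]         # <=0.1 versus >0.1 microtesla
rr2 = [1.0, 2.2]             # Swedish odds ratio, dichotomy


def attributable_fraction(props, risk_ratios):
    """AFp = (Rp - 1) / Rp with Rp = sum over k of p_k * R_k."""
    rp = sum(p * r for p, r in zip(props, risk_ratios))
    return (rp - 1.0) / rp


af_di = attributable_fraction(us_p2, rr2)
af_tri = attributable_fraction(us_p3, rr3)

# Risk ratio the >0.1 category would carry if it were collapsed using
# U.S. weights instead of Swedish weights.
us_collapsed_rr = (
    sum(p * r for p, r in zip(us_p3[1:], rr3[1:])) / sum(us_p3[1:])
)

print(round(af_di, 2), round(af_tri, 2), round(us_collapsed_rr, 2))
```

Under these assumptions the U.S.-weighted risk ratio for >0.1 μT is about 1.64 rather than 2.2, which is the source of the gap between the 25 percent and 15 percent estimates.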
If we added nondifferential exposure misclassification to both studies in the preceding example, we would often find that more bias ensues when using the dichotomy, even if no one was ever misclassified into the lowest exposure category. In the next example, the dichotomy would lead to more bias for almost any plausible set of classification probabilities.
To illustrate the failure of invariance when estimating effects of partial exposure reduction, suppose we wish to estimate the fractional caseload drop AFpq in Sweden that might ensue from reducing all exposures above 0.3 μT to the 0.1–0.3 μT range. Using the Swedish data in Table 1 we get Rp = 1.144 and Rq = 1, and so AFpq = (1.144 − 1)/1.144 = 13 percent. This Swedish AFpq equals the Swedish AFp because eliminating exposures >0.3 μT would eliminate all risk elevation (assuming the association is causal). In contrast, dichotomizing exposure at 0.1 μT yields Rq = Rp = 1.144 and hence AFpq = (1.144 − 1.144)/1.144 = 0, which is severely biased because the dichotomy is completely insensitive to changes in the distribution of exposures within the >0.1 μT category. Thus, the insensitivity that makes dichotomous estimates robust to certain types of nondifferential misclassification 3 also makes those estimates insensitive to real intervention effects.
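The partial-reduction calculation can be sketched the same way. The proportions and the top-category odds ratio are again assumptions back-calculated from figures quoted in the text, not values read from Table 1.

```python
# ASSUMED inputs, back-calculated from figures quoted in the text
# (not read from Table 1).
rr3 = [1.0, 1.0, 4.6]      # <=0.1, 0.1-0.3, >0.3 microtesla
p3 = [0.88, 0.08, 0.04]    # Swedish distribution before intervention
q3 = [0.88, 0.12, 0.00]    # everyone >0.3 moved into the 0.1-0.3 range

rr2 = [1.0, 2.2]           # <=0.1 versus >0.1 microtesla
p2 = [0.88, 0.12]
q2 = [0.88, 0.12]          # the dichotomy cannot register the shift


def generalized_attributable_fraction(props, new_props, risk_ratios):
    """AFpq = (Rp - Rq) / Rp for an intervention shifting p_k to q_k."""
    rp = sum(p * r for p, r in zip(props, risk_ratios))
    rq = sum(q * r for q, r in zip(new_props, risk_ratios))
    return (rp - rq) / rp


af_pq_tri = generalized_attributable_fraction(p3, q3, rr3)
af_pq_di = generalized_attributable_fraction(p2, q2, rr2)
print(round(af_pq_tri, 2), round(af_pq_di, 2))
```

Because the intervention leaves the proportion above 0.1 μT unchanged, the dichotomous p and q are identical and the dichotomous AFpq is zero by construction, whatever the true effect of the shift.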
Arguing from collapsing invariance, Wacholder et al. suggested that AFp estimates under nondifferential exposure misclassification be based on a broad exposure definition, dichotomizing persons into those with no (or negligible) exposure and all other persons. 3 They noted that certain problems would render moot their argument, including differential misclassification and interest in reduction rather than elimination of exposure, and that AFp varies with the exposure distribution. The examples show how this variation (which should be anticipated) leads to irrelevance of collapsing invariance.
Methods are available that use detailed exposure and covariate information, including model-based methods requiring no categorization. 1,5,9–13 AFp projections can use these methods to account for the joint exposure-covariate distribution in the target in as much detail as practical. To address measurement error, I recommend correction methods 14 or sensitivity analysis 15 over dichotomization.
1. Walter SD. Estimation and interpretation of attributable risk in health research. Biometrics 1976; 32: 829–849.
2. Benichou J. Methods of adjustment for estimating the attributable risk in case-control studies. Stat Med 1991; 10: 1753–1773.
3. Wacholder S, Benichou J, Heineman EF, Hartge P, Hoover RN. Attributable risk: advantages of a broad definition of exposure. Am J Epidemiol 1994; 140: 303–309.
4. Greenland S, Sheppard AR, Kaune WT, Poole C, Kelsh MA. A pooled analysis of magnetic fields, wire codes, and childhood leukemia. Epidemiology 2000; 11: 624–634.
5. Greenland S. Estimation of population attributable fractions from fitted incidence ratios and exposure survey data, with an application to electromagnetic fields and childhood leukemia. Biometrics 2001; 57: 182–188.
6. Greenland S, Robins JM. Conceptual problems in the definition and interpretation of attributable fractions. Am J Epidemiol 1988; 128: 1185–1197.
7. Feychting M, Ahlbom A. Magnetic fields and cancer in children residing near Swedish high-voltage power lines. Am J Epidemiol 1993; 138: 467–481.
8. High Voltage Transmission Research Center. Survey of Residential Magnetic Field Sources (in two volumes). Palo Alto, CA: Electric Power Research Institute, 1993.
9. Deubner DC, Wilkinson WE, Helms MJ, Tyroler HA, Hames CG. Logistic model estimation of death attributable to risk factors for cardiovascular disease in Evans County, Georgia. Am J Epidemiol 1980; 112: 135–143.
10. Bruzzi P, Green SB, Byar DP, Brinton LA, Schairer C. Estimating the population attributable risk for multiple risk factors using case-control data. Am J Epidemiol 1985; 122: 904–914.
11. Greenland S, Drescher K. Maximum likelihood estimation of the attributable fraction from logistic models. Biometrics 1993; 49: 865–872.
12. Oja H, Alho OP, Läärä E. Model-based estimation of the excess fraction (attributable fraction): Day care and middle ear infection. Stat Med 1996; 15: 1519–1534.
13. Graham P. Bayesian inference for a generalized population attributable fraction. Stat Med 2000; 19: 937–956.
14. Carroll RJ, Ruppert D, Stefanski LA. Measurement error in nonlinear models. New York: Chapman and Hall, 1995.
15. Greenland S. Basic methods for sensitivity analysis and external adjustment. In: Rothman KJ, Greenland S, eds. Modern Epidemiology, 2nd ed. Philadelphia: Lippincott, 1998.
© 2001 Lippincott Williams & Wilkins, Inc.