# Attributable Fractions: Bias from Broad Definition of Exposure

In certain special situations, simplification of an exposure measure into a dichotomy results in no bias from nondifferential misclassification when estimating the attributable fraction for “any exposure.” This fact has led to recommendations to use a broad definition of exposure when estimating attributable fractions. I here review the assumptions underlying exposure simplification, focusing on the assumptions that the source and target populations have the same exposure distribution and that complete risk removal is possible. I argue that attributable fraction estimates based on dichotomization can be especially sensitive to violations of these assumptions, and hence misleading for projecting the impact of exposure reduction. I conclude that it is important to obtain and use detailed exposure and covariate information for attributable-fraction estimation.

From the Department of Epidemiology, UCLA School of Public Health, and Department of Statistics, UCLA College of Letters and Science, Los Angeles, CA.

Address correspondence to: Sander Greenland, 22333 Swenson Drive, Topanga, CA 90290-3434.

Submitted September 11, 2000; final version accepted January 10, 2001.

In special situations, the population attributable fraction does not change when exposure is collapsed into a dichotomy. ^{1–3} This collapsing-invariance property can entail robustness to bias from nondifferential misclassification in the dichotomous attributable-fraction estimates, provided everyone classified as “unexposed” has the same risk. ^{3} In unpublished policy debates, these facts have been used to argue against the need to gather or use detailed exposure information, most recently in discussion surrounding a pooling project on magnetic fields and childhood cancer. ^{4,5} Such arguments are mistaken because they overlook assumptions that are usually doubtful.

## Assumptions for Use of Collapsing Invariance

All attributable-fraction estimates assume that their component relative risks accurately reflect exposure effects in the target population (population of policy concern), ^{1,3} and that the source population (the source of the subjects from which the estimate is derived) has the same exposure distribution as the target population. While the latter assumption is well recognized, the detail in which it must be satisfied to justify use of dichotomous estimates does not seem widely appreciated. Other assumptions needed to justify dichotomous estimates are that the only interventions of interest move everyone to the reference level of exposure, and that no confounder adjustments are needed; Wacholder *et al.* ^{3} argued however that violation of the latter assumption would have negligible impact.

I believe collapsing invariance is of extremely limited applicability because the above assumptions are rarely satisfied. First, source populations of studies usually do not represent the target; even when they are part of the target, they are subject to many restrictions that alter their exposure distribution. Second, feasible exposure interventions (antismoking campaigns, emission controls, etc.) can rarely move everyone to reference exposure levels. As illustrated below, the consequence of either problem may be severe and may outweigh any benefit of broad exposure definition.

### Formulas and Examples

My points follow from basic categorical formulas;^{1} issues of sampling variation do not arise because the points concern only population distributions. Suppose exposure has *K* + 1 categories 0 (unexposed), 1, . . ., *K*, with *p*_{k} the proportion of the population in category *k* and *R*_{k} the causal risk ratio comparing categories *k* and 0 (*R*_{0} is necessarily 1). Then *R*_{p} = Σ_{k} *p*_{k} *R*_{k} is the population risk ratio (the exposure-weighted mean of the *R*_{k}) and the excess caseload attributable to exposure above category 0 is *AF*_{p} = (*R*_{p} − 1)/*R*_{p} (*AF*_{p} is often confused with the “etiologic fraction”, *ie*, the fraction of cases whose disease was caused by exposure, but the two fractions can be arbitrarily far apart even if exposure is never preventive ^{6}). More generally, the fractional change in caseload attributable to an intervention that shifts the exposure proportions *p*_{k} to *q*_{k} is *AF*_{pq} = (*R*_{p} − *R*_{q})/*R*_{p}, where *R*_{q} = Σ_{k} *q*_{k} *R*_{k}. ^{1}
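As a minimal sketch of the formulas above, the following Python functions compute *R*_{p}, *AF*_{p}, and *AF*_{pq} from a list of category proportions and a list of risk ratios. The function names and the three-category numbers are hypothetical and chosen only for illustration; they do not come from any study cited here.

```python
from typing import Sequence


def population_risk_ratio(p: Sequence[float], r: Sequence[float]) -> float:
    """Return R_p = sum_k p_k * R_k for proportions p and risk ratios r."""
    if len(p) != len(r):
        raise ValueError("p and r must have one entry per exposure category")
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("exposure proportions must sum to 1")
    return sum(pk * rk for pk, rk in zip(p, r))


def af_p(p: Sequence[float], r: Sequence[float]) -> float:
    """AF_p = (R_p - 1) / R_p: everyone is moved to category 0."""
    rp = population_risk_ratio(p, r)
    return (rp - 1.0) / rp


def af_pq(p: Sequence[float], q: Sequence[float], r: Sequence[float]) -> float:
    """AF_pq = (R_p - R_q) / R_p: the distribution shifts from p to q."""
    rp = population_risk_ratio(p, r)
    rq = population_risk_ratio(q, r)
    return (rp - rq) / rp


# Hypothetical three-category illustration (values invented for this sketch).
p = [0.5, 0.3, 0.2]  # current exposure distribution
q = [0.5, 0.5, 0.0]  # after moving category 2 into category 1
r = [1.0, 2.0, 4.0]  # causal risk ratios, with R_0 = 1

print(f"R_p   = {population_risk_ratio(p, r):.3f}")
print(f"AF_p  = {af_p(p, r):.3f}")
print(f"AF_pq = {af_pq(p, q, r):.3f}")
```

Setting `q` to put everyone in category 0 makes `af_pq` reduce to `af_p`, which mirrors the sense in which *AF*_{pq} generalizes *AF*_{p}.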

The estimates below (as well as the much smaller *AF*_{p} estimate from the pooling project ^{4}) are hypothetical, for it is controversial whether the observed associations reflect any magnetic-field effect. Nonetheless, in response to public concerns, officials have requested such estimates.

#### Example 1

To illustrate the irrelevance of collapsing invariance for projections, Table 1 presents odds ratios, exposure among controls, and attributable fractions based on a Swedish study of magnetic fields and childhood leukemia. ^{7} The source population for this study was restricted to transmission-line corridors. (This study produced some of the largest estimates seen at high exposures ^{4}; its estimate in the middle category was 0.74 with 95 percent confidence limits of 0.17 and 3.20, but is here rounded up to 1 on the assumption that exposure is never preventive.) The odds ratios will be taken as the *R*_{k}. Applying the Swedish odds ratios to the Swedish distribution, one sees collapsing invariance: using the exposure dichotomy, *R*_{p} = 1.144, and using the trichotomy, *R*_{p} = 1.144 as well. Hence the Swedish *AF*_{p} is 0.144/1.144 = 13 percent whether computed from the dichotomy or the trichotomy. Nonetheless, because the study subjects were selected to have an elevated exposure distribution, these estimates probably overstate the *AF*_{p} for all of Sweden, despite the collapsing invariance.
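The invariance in Example 1 can be checked numerically with a short Python sketch. Table 1 is not reproduced in this text, so the inputs below are assumptions: the category proportions (0.88, 0.08, 0.04) and the odds ratio of 4.6 for >0.3 μT are back-calculated from figures quoted in the text (*R*_{p} = 1.144, an odds ratio of 2.2 for >0.1 μT, and one-third of that category lying above 0.3 μT). They may differ from the published table by rounding.

```python
# ASSUMED inputs: back-calculated from figures quoted in the text,
# not copied from the published Table 1.
p_tri = [0.88, 0.08, 0.04]  # proportions at <=0.1, 0.1-0.3, >0.3 microtesla
r_tri = [1.0, 1.0, 4.6]     # odds ratios taken as the R_k


def risk_ratio(p, r):
    """Population risk ratio R_p = sum_k p_k * R_k."""
    return sum(pk * rk for pk, rk in zip(p, r))


# Collapse the two upper categories into a single ">0.1" category.
# Its risk ratio is the exposure-weighted mean of the collapsed R_k.
p_exposed = p_tri[1] + p_tri[2]
r_exposed = (p_tri[1] * r_tri[1] + p_tri[2] * r_tri[2]) / p_exposed
p_di = [p_tri[0], p_exposed]
r_di = [1.0, r_exposed]

rp_tri = risk_ratio(p_tri, r_tri)
rp_di = risk_ratio(p_di, r_di)
af_tri = (rp_tri - 1.0) / rp_tri
af_di = (rp_di - 1.0) / rp_di

print(f"collapsed risk ratio for >0.1: {r_exposed:.2f}")
print(f"trichotomy: R_p = {rp_tri:.3f}, AF_p = {af_tri:.3f}")
print(f"dichotomy:  R_p = {rp_di:.3f}, AF_p = {af_di:.3f}")
```

The two estimates agree only because the dichotomous risk ratio was formed as a weighted mean over the same distribution to which it is then applied.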

#### Example 2

Also shown in Table 1 is the exposure distribution from a survey of U.S. residences. ^{8} When the Swedish odds ratios are applied to the target U.S. distribution using the dichotomy, *R*_{p} = 1.336, whereas using the trichotomy, *R*_{p} = 1.180.

The U.S. attributable-fraction estimate from the dichotomy is thus 0.336/1.336 = 25 percent, reflecting a severe upward bias compared to the 0.180/1.180 = 15 percent estimate from the trichotomy. Collapsing invariance fails here because the Swedish and U.S. exposure distributions differ. The Swedish odds ratio of 2.2 for >0.1 μT measures the risk elevation in a mixed category with one-third of its members >0.3 μT (and therefore at elevated risk), whereas the U.S. category of >0.1 μT is a mixture with fewer than one-fifth of its members >0.3 μT; it thus should have much less risk elevation than the corresponding Swedish category (assuming the Swedish associations are causal and generalizable).
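A sketch of the Example 2 arithmetic follows. As before, Table 1 is not reproduced here, so the U.S. proportions (0.72, 0.23, 0.05) and the Swedish odds ratio of 4.6 for >0.3 μT are assumptions back-calculated from the values quoted in the text (*R*_{p} of 1.336 and 1.180, and the dichotomous odds ratio of 2.2). The variable names are hypothetical.

```python
# ASSUMED inputs: back-calculated from figures quoted in the text,
# not copied from the published Table 1.
r_tri = [1.0, 1.0, 4.6]        # Swedish odds ratios, three categories
r_di = [1.0, 2.2]              # Swedish odds ratios, dichotomy at 0.1 microtesla
p_us_tri = [0.72, 0.23, 0.05]  # U.S. proportions at <=0.1, 0.1-0.3, >0.3
p_us_di = [p_us_tri[0], p_us_tri[1] + p_us_tri[2]]


def risk_ratio(p, r):
    """Population risk ratio R_p = sum_k p_k * R_k."""
    return sum(pk * rk for pk, rk in zip(p, r))


def af(p, r):
    """Attributable fraction AF_p = (R_p - 1) / R_p."""
    rp = risk_ratio(p, r)
    return (rp - 1.0) / rp


af_us_di = af(p_us_di, r_di)
af_us_tri = af(p_us_tri, r_tri)

# Risk ratio the U.S. ">0.1" category would have under the trichotomous
# odds ratios; it is well below the Swedish 2.2 because a smaller share
# of that category lies above 0.3 microtesla.
implied_us_ratio = (
    p_us_tri[1] * r_tri[1] + p_us_tri[2] * r_tri[2]
) / p_us_di[1]

print(f"dichotomy:  AF_p = {af_us_di:.3f}")
print(f"trichotomy: AF_p = {af_us_tri:.3f}")
print(f"implied U.S. risk ratio for >0.1: {implied_us_ratio:.2f}")
```

Under these assumed inputs the gap between the two estimates comes entirely from applying the Swedish mixed-category ratio to a category with a different internal composition.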

If we added nondifferential exposure misclassification to both studies in the preceding example, we would often find that more bias ensues when using the dichotomy, even if no one was ever misclassified into the lowest exposure category. In the next example, the dichotomy would lead to more bias for almost any plausible set of classification probabilities.

#### Example 3

To illustrate the failure of invariance when estimating effects of partial exposure reduction, suppose we wish to estimate the fractional caseload drop *AF*_{pq} in Sweden that might ensue from reducing all exposures above 0.3 μT to the 0.1–0.3 μT range. Using the Swedish data in Table 1 we get *R*_{p} = 1.144 and *R*_{q} = 1, and so *AF*_{pq} = (1.144 − 1)/1.144 = 13 percent. This Swedish *AF*_{pq} equals the Swedish *AF*_{p} because eliminating exposures >0.3 μT would eliminate all risk elevation (assuming the association is causal). In contrast, dichotomizing exposure at 0.1 μT yields *R*_{q} = 1.144 and hence *AF*_{pq} = (1.144 − 1.144)/1.144 = 0, which is severely biased because the dichotomy is completely insensitive to changes in the distribution of exposures within the >0.1 μT category. Thus, the insensitivity that makes dichotomous estimates robust to certain types of nondifferential misclassification ^{3} also makes those estimates insensitive to real intervention effects.
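The partial-reduction calculation in Example 3 can be sketched the same way. The Swedish proportions (0.88, 0.08, 0.04) and the odds ratio of 4.6 for >0.3 μT are again assumptions back-calculated from the text rather than values read from Table 1, and the helper names are hypothetical.

```python
# ASSUMED inputs: back-calculated from figures quoted in the text,
# not copied from the published Table 1.
r_tri = [1.0, 1.0, 4.6]  # odds ratios at <=0.1, 0.1-0.3, >0.3 microtesla
r_di = [1.0, 2.2]        # odds ratios for the dichotomy at 0.1 microtesla
p = [0.88, 0.08, 0.04]   # Swedish distribution before the intervention
q = [0.88, 0.12, 0.00]   # after moving everyone >0.3 into 0.1-0.3


def risk_ratio(dist, r):
    """Population risk ratio: sum_k dist_k * R_k."""
    return sum(dk * rk for dk, rk in zip(dist, r))


def af_pq(p_dist, q_dist, r):
    """Fractional caseload change AF_pq = (R_p - R_q) / R_p."""
    rp = risk_ratio(p_dist, r)
    rq = risk_ratio(q_dist, r)
    return (rp - rq) / rp


def dichotomize(dist):
    """Collapse the two upper categories into one '>0.1' category."""
    return [dist[0], dist[1] + dist[2]]


af_trichotomy = af_pq(p, q, r_tri)
# The dichotomy cannot see the shift: p and q collapse to the same distribution.
af_dichotomy = af_pq(dichotomize(p), dichotomize(q), r_di)

print(f"trichotomy: AF_pq = {af_trichotomy:.3f}")
print(f"dichotomy:  AF_pq = {af_dichotomy:.3f}")
```

Because `dichotomize(p)` and `dichotomize(q)` are identical, any intervention that only rearranges exposure within the >0.1 μT category yields a dichotomous *AF*_{pq} of zero, whatever the true effect.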

## Discussion

Arguing from collapsing invariance, Wacholder *et al.* suggested that *AF*_{p} estimates under nondifferential exposure misclassification be based on a broad exposure definition, dichotomizing persons into those with no (or negligible) exposure and all other persons. ^{3} They noted that certain problems would render moot their argument, including differential misclassification and interest in reduction rather than elimination of exposure, and that *AF*_{p} varies with the exposure distribution. The examples show how this variation (which should be anticipated) leads to irrelevance of collapsing invariance.

Methods are available that use detailed exposure and covariate information, including model-based methods requiring no categorization. ^{1,5,9–13} *AF*_{p} projections can use these methods to account for the joint exposure-covariate distribution in the target in as much detail as practical. To address measurement error, I recommend correction methods ^{14} or sensitivity analysis ^{15} over dichotomization.

**Keywords:**

Attributable fraction; attributable risk; bias; biometry; epidemiologic methods; risk assessment; statistics