# An Overview of Methods for Calculating the Burden of Disease Due to Specific Risk Factors

There are a number of measures that quantify the public health burden due to specific risk factors for specific diseases. Although these measures are of importance for policymakers, epidemiologists do not often calculate them or may be unfamiliar with some of the issues involved when they do. The primary measure of interest is the attributable fraction (AF), representing the fraction of cases or deaths from a specific disease that would not have occurred in the absence of exposure to a specific risk factor either in the exposed population or the population as a whole. AFs can be multiplied by the total number of cases of a given disease to obtain a “body count”: the absolute number of preventable cases due to a specific risk factor. Two other measures of public health burden, used in conjunction with AFs, are attributable years-of-life-lost and attributable disability-adjusted life-years. We provide an overview of the AF and related measures and discuss some of the specific issues involved in calculating AFs. These issues include methods for calculating the variance of AFs (such as Monte Carlo sensitivity methods), biases arising from some formulas for the AF, sources of data for calculating AFs, dependence of AFs on basic decisions about what exposure–disease associations are causal, and extrapolation from the source population to the target population.


From *Rollins School of Public Health, Emory University, Atlanta, GA; and the †Public and Environmental Health Research Unit, London School of Hygiene and Tropical Medicine, London, U.K.

Submitted 27 April 2005; accepted 8 March 2006.

*Editors’ note: A commentary on this article appears on page 498.*

Supplemental material for this article is available with the online version of the journal at www.epidem.com; click on “Article Plus.”

Correspondence: Kyle Steenland, Rollins School of Public Health, Emory University, 1518 Clifton Rd, Atlanta, GA 30322. E-mail: nsteenl@sph.emory.edu.


The attributable fraction (AF)—the amount of disease that can be attributed to a specific risk factor—has long held an important place in public health. The AF combines relative risk and the prevalence of exposure to measure the public health burden of a risk factor by estimating the proportion of cases of a disease that would not have occurred absent an exposure. It thereby has important policy implications, pointing to the potential impact of an intervention. Two measures used in conjunction with AF—years of life lost (YLLs) and disability-adjusted life-years (DALYs)—can be used to estimate the years of life lost or lived with disability due to specific exposures. AFs have a long history in epidemiologic and public health literature. Nonetheless, many epidemiologists rarely calculate attributable fractions, and many, if not most, are unfamiliar with YLLs and DALYs. We seek to present an overview of the definition and the use of attributable fractions as well as briefly present basic definitions of YLLs and DALYs. In some instances, we use, by way of example, the calculation of the fraction of cancers attributable to occupational exposures, but most of our discussion is not specific to any exposure or disease.

All of these measures are increasingly being used by public health organizations as guidelines for prioritizing interventions. For example, the World Health Organization (WHO) has made measurement of the public health burden a major thrust of its thinking about future interventions/prevention. In its Global Burden of Disease work, WHO calculates DALYs attributable to most of the major known (and intervenable) risk factors for common diseases, including high blood pressure, high cholesterol, tobacco, alcohol, undernutrition, obesity, indoor smoke, unsafe sex, unsafe water, and occupation.^{1} The Centers for Disease Control and Prevention (CDC) is currently working with Harvard University investigators to develop detailed U.S. data for attributable DALYs (by time and geographic region) along the lines of the WHO effort.^{2} The CDC already calculates YLLs or DALYs attributable to some specific risk factors (eg, smoking, alcohol, sexual behavior^{3–5}). We believe epidemiologists need to be familiar with how public health burden is measured and to begin to use these measures more frequently when presenting results.

## DEFINITION AND ESTIMATION OF ATTRIBUTABLE FRACTION

### Definitions of Attributable Fraction

The AF is the proportion of cases that would not have occurred in the absence of exposure either among the exposed population or among the total population. Greenland and Robins^{6} call this the excess fraction. Ideally, the AF should be estimated from a lifetime follow-up of exposed and nonexposed cohorts in the population of interest. In practice, the AF is usually based on one or more epidemiologic studies of specific exposed and nonexposed populations with incomplete follow-up, and results are often applied to a larger general population.

There is some ambiguity in the definition about cases that would have occurred in the absence of exposure but that occur earlier due to exposure. These are included in what Greenland and Robins^{6} call the “etiologic fraction” but not in the AF. However, the “etiologic fraction” is not estimable from epidemiologic data.

When exposure is dichotomous (exposed/nonexposed), the AF among the exposed (*AFexp*) may be calculated using risks or relative risks,

$$AF_{exp} = \frac{R_1 - R_0}{R_1} = \frac{RR - 1}{RR} \tag{1}$$

where *R1* is the risk of disease among the exposed, *R0* is the risk among the nonexposed, and *RR = R1/R0* is the risk ratio.^{7,8} Here one assumes no confounding or that *RRs* are adjusted for confounding.

The risk ratios in the AF may be replaced by rate ratios (or odds ratios) when the rate ratios (or odds ratios) approximate the corresponding risk ratios, which is usually true for rare diseases.^{6} Then formula (1) becomes

$$AF_{exp} = \frac{I_1 - I_0}{I_1} = \frac{RR - 1}{RR}$$

where *I1* and *I0* are the incidence rates in the exposed and nonexposed, respectively.

This definition of *AFexp* can be extended to the combined exposed and nonexposed populations to define a population AF:

$$AF_{pop} = p_c \, \frac{RR - 1}{RR} \tag{2}$$

where *pc* is the proportion of *cases* exposed in the combined population.^{8} This expression is a valid estimate in the presence of confounding affecting exposed versus nonexposed if the *RR* is adjusted for confounding.

An alternative expression for the AF is:

$$AF_{pop} = \frac{p_p (RR - 1)}{p_p (RR - 1) + 1} = \frac{I - I_0}{I} \tag{3}$$

where *pp* is the proportion of the *total population* exposed and where *I* is the incidence rate in the combined population of exposed and nonexposed.^{8,9} The first of the 2 expressions in formula (3) is the most commonly used formula for the *AFpop*. The equivalence of the 2 expressions in formula (3) can be derived by noting that *I = pp(RR)I0 + (1 − pp)I0*, so that *I − I0 = pp(RR − 1)I0*.
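The relationship between formulas (1), (2), and (3) can be illustrated with a minimal Python sketch. The function names and the input values (RR = 2, 10% of the population exposed) are hypothetical and chosen only for illustration; the conversion from population to case exposure prevalence assumes no confounding.

```python
def af_exposed(rr):
    """AF among the exposed, formula (1): (RR - 1) / RR."""
    return (rr - 1.0) / rr


def af_pop_case_based(pc, rr):
    """Population AF from the proportion of cases exposed, formula (2)."""
    return pc * (rr - 1.0) / rr


def af_pop_population_based(pp, rr):
    """Population AF from the proportion of the population exposed, formula (3)."""
    excess = pp * (rr - 1.0)
    return excess / (excess + 1.0)


def case_exposure_proportion(pp, rr):
    """Proportion of cases exposed implied by pp and RR (assumes no confounding)."""
    return pp * rr / (pp * rr + (1.0 - pp))


# Hypothetical inputs, not taken from any study.
rr = 2.0
pp = 0.10
pc = case_exposure_proportion(pp, rr)

print(round(af_exposed(rr), 4))                   # 0.5
print(round(af_pop_case_based(pc, rr), 4))        # 0.0909
print(round(af_pop_population_based(pp, rr), 4))  # 0.0909
```

In the absence of confounding, formulas (2) and (3) return the same population AF, as the last two lines show.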

### Attributable Fraction in the Presence of Confounding or Effect Modification

Although the point is often ignored, formula (3) is strictly valid only when there is no confounding or effect modification affecting the RR. In the presence of confounding, this estimate is biased, sometimes appreciably so, even if the RR in this formula has been adjusted for confounding.^{9–12} By applying formula (3) to each confounder stratum in the population separately (using confounder-specific RRs), and weighting the resulting AFs by the proportion of cases in each stratum, it is possible to obtain an unbiased estimate of *AFpop*.^{10} Benichou^{10} refers to this method as the “weighted-sum method” with “case-load weighting.” This method is also unbiased in the presence of heterogeneity of the RR (effect modification) or of the AF across confounder strata. Another weighted-sum method, referred to by Benichou as the “precision weighting,” uses as weights the inverse of the variance of each stratum-specific AF, but this method assumes homogeneity of the AF across strata.
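As a rough illustration of the weighted-sum method with case-load weighting, the sketch below applies formula (3) within each confounder stratum and weights the stratum-specific AFs by each stratum's share of cases. The strata, prevalences, and RRs are hypothetical, and the function names are ours rather than Benichou's.

```python
def af_formula3(pp, rr):
    """Stratum-specific population AF, formula (3)."""
    excess = pp * (rr - 1.0)
    return excess / (excess + 1.0)


def af_weighted_sum(strata):
    """Case-load weighted sum of stratum-specific AFs.

    Each stratum is a dict with:
      case_share - proportion of all cases falling in the stratum
      pp         - exposure prevalence in the stratum
      rr         - stratum-specific relative risk
    """
    total_share = sum(s["case_share"] for s in strata)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("case shares must sum to 1")
    return sum(s["case_share"] * af_formula3(s["pp"], s["rr"]) for s in strata)


# Hypothetical two-stratum example (eg, older and younger age groups).
strata = [
    {"case_share": 0.3, "pp": 0.5, "rr": 2.0},
    {"case_share": 0.7, "pp": 0.1, "rr": 2.0},
]
print(round(af_weighted_sum(strata), 4))  # 0.1636
```

Because the RR enters stratum by stratum, the same function accommodates effect modification simply by giving each stratum its own `rr`.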

Bruzzi et al^{13} earlier described an analogous weighted-sum approach in which a weighted sum of RRs is used to calculate the AF. The $RR_i$ for exposure versus nonexposure (or for each level of exposure vs nonexposure) are calculated at each level $i$ of confounders/effect modifiers. These $RR_i$ may be estimated, for example, from a single regression model that includes a term (or terms) for exposure (or levels of exposure) as well as for any polychotomous confounders and may include interaction terms when effect modification is present (saturated model).^{13} Estimates of the $RR_i$ for exposed versus nonexposed (or each level of exposure vs nonexposed) for each stratum of confounders/effect modifiers are then combined with the proportion of cases in these strata to estimate an overall *AFpop*, $AF = 1 - \sum_i (p_{ci}/RR_i)$, where $p_{ci}$ is the proportion of cases in each exposure/confounder stratum and where $RR_1 = 1$ for the lowest exposure/lowest confounder(s) stratum. Algebraically, this formula can be shown to be equivalent to $(I - I_0)/I$ in formula (3).

A recently published^{14} close variation of the methods of Bruzzi et al, which similarly accounts for possible confounding and effect modification, calculates relative risks (the referent being nonexposed at the lowest level of confounder) for each level of exposure and confounder(s)/effect modifier(s); these $RR_i$ are weighted by the proportion of the population with a specific level of exposure and confounder ($p_i$). The weighted relative risk is then multiplied by the baseline rate of disease ($I_0$, the rate for nonexposed at the referent level of confounder) to derive a disease rate $R$ for the entire population ($R = I_0 \sum_i RR_i \, p_i$). The disease rate $R$ for the entire population is then compared with a counterfactual weighted average disease rate ($R^*$) for the same population assuming the referent level of exposure ($R^* = I_0 \sum_i RR_i^* \, p_i$, where $RR_i^*$ is the relative risk for the nonexposed at each confounder level) to derive the AF, $AF = (R - R^*)/R$, which again is equivalent to formula (3).

The magnitude of the bias of using formula (3) in the presence of confounding without appropriate stratification can be important.^{15} Flegal et al^{12} have provided an example in which the AF for deaths due to obesity in the United States is biased upward by 17% when formula (3) is used for calculating the AF in the presence of confounding by age and sex compared with the “weighted-sum” method. Greenland^{9} gives another example in which there is a downward bias of 15%. The amount of bias is related to the strength of confounding. Effect modification of the relationship between exposure and disease (RRs) can also result in bias when formula (3) is used as opposed to the weighted-sum method. Flegal and colleagues give examples of this phenomenon, but it is not possible to generalize about the size or the direction of this bias.

### Attributable Fraction Across Different Levels of Exposure

An overall AF can be calculated based on a study with different levels of exposure (eg, none, low, medium, high):

$$AF_{pop} = \sum_i p_{ci} \, \frac{RR_i - 1}{RR_i} \tag{4}$$

where *i* indexes the exposure level.

Analogously, the population exposure prevalence (*pp*) may again be used instead of the case exposure prevalence, albeit with the same problems of potential bias discussed previously when confounding or effect modification is present:

$$AF_{pop} = \frac{\sum_i p_{pi} (RR_i - 1)}{1 + \sum_i p_{pi} (RR_i - 1)} \tag{4a}$$

Formulas (4) and (4a) give the same result for the overall AF as simply adding together the individual AFs for each exposure category versus the referent (ie, the AF has a distributive property, see Wacholder et al^{16}). Wacholder et al also note (as a corollary) that the AF is not affected (ie, is unbiased) by nondifferential misclassification of the nonexposed as exposed (although its variance will increase). In contrast, nondifferential misclassification of the exposed as nonexposed will bias the AF downward. They therefore argue for a strict definition of nonexposure and a broad definition of exposure.
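The multilevel formulas (4) and (4a) can be checked numerically with a short sketch. The exposure distribution and RRs below are hypothetical, and the conversion from population to case distribution assumes no confounding.

```python
def af_levels_population(pp, rr):
    """Formula (4a): AF from population prevalence at each exposure level."""
    excess = sum(p * (r - 1.0) for p, r in zip(pp, rr))
    return excess / (1.0 + excess)


def af_levels_cases(pc, rr):
    """Formula (4): AF from the proportion of cases at each exposure level."""
    return sum(p * (r - 1.0) / r for p, r in zip(pc, rr))


def case_distribution(pp, rr):
    """Proportion of cases at each level implied by pp and RR (no confounding)."""
    total = sum(p * r for p, r in zip(pp, rr))
    return [p * r / total for p, r in zip(pp, rr)]


# Hypothetical levels: none, low, medium, high.
pp = [0.70, 0.15, 0.10, 0.05]
rr = [1.0, 1.5, 2.0, 4.0]
pc = case_distribution(pp, rr)

print(round(af_levels_population(pp, rr), 4))  # 0.2453
print(round(af_levels_cases(pc, rr), 4))       # 0.2453
```

Each term of the sum in `af_levels_cases` is the AF for one exposure category versus the referent, which is the distributive property noted by Wacholder et al.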

### Attributable Fraction Across Different Exposures

AFs for different exposures may be combined to determine an overall AF (eg, the overall occupational AF for leukemia due to exposure to ethylene oxide, benzene, and radiation). If 3 exposures are statistically independent (ie, experiencing one makes an individual no more or less likely to experience the others), and their joint effects are multiplicative, then,^{8}

$$AF_{overall} = 1 - (1 - AF_1)(1 - AF_2)(1 - AF_3) \tag{5}$$

Simply summing the exposure-specific AFs may result in an overall AF >1, because it “double counts” cases that could be avoided by removing either one or the other exposure.^{17} If exposure-specific AFs are small, however, a simple sum will approximate *AFoverall*.
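A small sketch makes the difference between the product-complement combination and a simple sum concrete. It assumes independent exposures with multiplicative joint effects, as stated above; the three AF values are hypothetical.

```python
def combine_afs(afs):
    """Overall AF for independent exposures with multiplicative joint effects:
    1 minus the product of (1 - AF_i)."""
    remaining = 1.0
    for af in afs:
        remaining *= (1.0 - af)
    return 1.0 - remaining


# Hypothetical exposure-specific AFs.
afs = [0.05, 0.10, 0.02]
print(round(combine_afs(afs), 4))  # 0.1621
print(round(sum(afs), 4))          # 0.17, slightly larger than the combined AF

# With large AFs a simple sum can exceed 1, whereas the combination cannot.
print(round(combine_afs([0.6, 0.6]), 4))  # 0.84
```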

### Source and Target Populations: Portability

The previous section implicitly assumed that we want to estimate the fraction of cases attributable to exposure in precisely the same population in which we have results from an epidemiologic study. An example would be a population-based case–control study representative of the population for which we sought to estimate the attributable fraction. However, often we seek to use the results from one or more epidemiologic studies (source populations) to estimate the attributable fraction for a different (target) population. For example, a case–control study (or studies) done in one area of a country may be the source population, but the entire population of the country may be the target population, for which the case–control study is not necessarily representative.^{18} Similarly, a cohort study of a specific exposed population may be the source of an RR, which is then used to estimate an AF in the general population.^{19,20} Alternatively, AFs based on data in one country may be applied to another country.^{1} Target populations may even be hypothetical future populations, for example, if we want to estimate future disease burdens (eg, future mesothelioma due to past asbestos exposures^{21}). In this regard, Murray and Lopez^{22} make the distinction between attributable burden (the current burden) and avoidable burden (the future burden); the latter is potentially preventable through intervention.

This distinction between source and target population raises some important questions.

First, because the attributable fraction depends not only on the relative risk due to exposure, but also the fraction of the population exposed, and somewhat on the distribution of confounders in the population, all these factors must be the same in the source and target populations for the AF estimated from the source population to apply also to the target population. Thus, the AF is not generally portable from one population to another.

In many cases, there is no RR available for the target population, but the exposure prevalence in the target population is known. In this situation, it is common to use the RR from the source population (on the assumption that this is a reasonable approximation of the RR in the target population) and the exposure prevalence in the target population. Hence, using formula (3), we have:

$$AF_{pop,\,target} = \frac{p_{p,\,target} \, (RR_{source} - 1)}{p_{p,\,target} \, (RR_{source} - 1) + 1} \tag{6}$$

If there are other risk factors associated with exposure in the target population (confounders), then again one should use the “weighted-sum” method described here. This means using the confounder-adjusted RR from the source, separately for each target population stratum as defined by these confounders, and then obtaining an overall weighted-average AF from the stratum-specific AFs. This method assumes no effect modification of RRs by these risk factors in the source population, as discussed subsequently.
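The sketch below shows why the AF is not portable even when the RR is: the same source RR yields different AFs once the exposure prevalence changes. It applies formula (3) with a source-population RR and a target-population prevalence; all values are hypothetical, and the RR is assumed to carry over unchanged to the target population.

```python
def af_target(pp_target, rr_source):
    """Formula (3) evaluated with the source RR and the target exposure prevalence."""
    excess = pp_target * (rr_source - 1.0)
    return excess / (excess + 1.0)


rr_source = 2.0  # hypothetical RR estimated in the source population

for label, prevalence in [("source", 0.10), ("target", 0.30)]:
    print(label, round(af_target(prevalence, rr_source), 4))
# source 0.0909
# target 0.2308
```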

If the source RR estimates are by level of exposure, then the process described here can be applied to an extension of formulas (4) or (4a) analogous to the extension of formula (3) to formula (6), thus allowing not only proportion of exposed to vary between source and target populations, but also proportions at different levels of exposure.

There may be problems of portability for the RRs as well as for the proportions exposed. For example, if relative risks for occupational exposures are from cohort studies, they may implicitly assume a rather strict (high) definition of “exposed.” Applying such relative risks to a target population using general census data on occupation to estimate the proportion exposed would be inappropriate because the relative risks would be inappropriate for this different definition of exposure. This is so because cohort studies may represent more highly exposed workers (with correspondingly high RRs), whereas the average level of exposure among all exposed workers in the population is lower (with correspondingly lower RRs). Similar problems may occur for chronic disease when the latency between exposure and disease differs in source and target population (ie, longer latency in the source population resulting in higher RRs for disease requiring long latency).

RRs may also not be portable because of effect modification by other population characteristics such as smoking or diet. For example, if there were more smokers in the target population than in the study population, and there was a positive interaction between smoking and the exposure, the overall RR would be higher in the target population than in the study population. If there is such effect modification, to obtain an unbiased AF estimate in the target population, it is necessary to use the RRs specific to each level of the effect modifier in the source population for calculating AF in the target population through the “weighted-sum” method.^{12}

### Attributable Fractions Based on Different Study Designs

In general, the formulas for the AF in the first section can be based on data from any of the 3 common study designs (cohort, case–control, prevalence), although in the case of prevalence (cross-sectional) studies, the AF will represent the proportion of prevalent rather than incident cases that could be avoided if exposure were absent.

One common method of calculating AFs is to take the RRs from cohort studies of exposed (source) populations and to estimate the prevalence of the exposure in question in the target population using secondary sources. As noted above, exposed cohorts will frequently have exposure patterns not typical of the exposed among the general population, adding extra difficulty to source–target extrapolations beyond, say, population-based case–control studies. An additional problem sometimes arises in source–target extrapolation from cohort studies when the exposure prevalence estimate for the target population comes from a survey of those “currently exposed” when, for chronic diseases with a required latency, what is needed is an estimate of the “ever exposed.” Sometimes, the percentage “ever exposed” can be estimated from the “currently exposed.”^{19}

Case–control studies may also provide the basis for AFs. If they are population-based, they may also provide a good estimate of the proportion of cases that are exposed in the general population (which may be the target population).

Some disease–exposure associations are so strong that (RR − 1)/RR ≈ 1, so *AF = pc (RR − 1)/RR ≈ pc*. In this case, the AF is approximately the proportion of cases exposed. This might be assumed, for example, in estimating the proportion of liver cancer due to occupational exposure to vinyl chloride monomer or the proportion of mesothelioma due to exposure to asbestos.

### Variance of the Attributable Fraction

Calculating the variance of the AF is often not straightforward, because it involves 2 estimated parameters, possible confounding, and possible extrapolation from source to target population. We focus on the variance of the AF for the entire population (*AFpop*, formulas 2 and 3), which is usually the AF of interest (rather than the *AFexp*). When confounding affects the estimation of RRs used in calculating *AFpop* (formula 2 or 3), there are in general no closed-form formulas for the variance, and some approximate form must be used. There are a number of variance formulas depending on the method of calculating the AF and whether one is concerned with a target population that is the same or differs from the source population.

*AFpop (Formula 2).* An approximate expression for the variance^{23,24} of the *AFpop* in the source population that uses the prevalence of exposure among cases is

Formula (7) can be used to get the upper and lower confidence limits of the AF (ULAF and LLAF). The limits of $\ln(1 - AF_{pop})$ from formula (7) are $\ln(1 - AF_{pop}) \pm 1.96\sqrt{\mathrm{var}(\ln(1 - AF_{pop}))}$; *ULAFpop* is then $1 - \exp(LL(\ln(1 - AF_{pop})))$ and *LLAFpop* is $1 - \exp(UL(\ln(1 - AF_{pop})))$, where LL and UL denote the lower and upper limits on the log scale. Formula (7) can be used whether or not the RR is adjusted for confounding, assuming approximate homogeneity of the RR across confounder strata. It is based on a single summary RR and can be used when the RR is a Mantel-Haenszel RR or is derived from a regression model. The exposure status of the cases in the source population must be known. Formula (7) can be used for data from case–control, cohort, or prevalence studies.^{24}
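The back-transformation from the log-complement scale to confidence limits for the AF can be sketched as follows. The variance of ln(1 − AF) is taken as a given input here (it would come from formula (7)); the numeric values are hypothetical.

```python
import math


def af_confidence_limits(af, var_log_complement, z=1.96):
    """Wald limits for the AF computed on the ln(1 - AF) scale.

    var_log_complement is the estimated variance of ln(1 - AF),
    supplied by the caller. The upper limit on the log scale maps to
    the lower limit of the AF, and vice versa.
    """
    centre = math.log(1.0 - af)
    half_width = z * math.sqrt(var_log_complement)
    lower_af = 1.0 - math.exp(centre + half_width)
    upper_af = 1.0 - math.exp(centre - half_width)
    return lower_af, upper_af


# Hypothetical inputs: AF = 0.20, var(ln(1 - AF)) = 0.01.
lower, upper = af_confidence_limits(0.20, 0.01)
print(round(lower, 3), round(upper, 3))
```

Working on the ln(1 − AF) scale keeps the upper limit below 1, which a symmetric interval on the AF scale itself would not guarantee.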

*AFpop (Formula 3): Use of Exposure Prevalence Data From a Separate Target Population, No Confounding in the Target Population.* When *pp* is used rather than *pc* (*AFpop* formula 3), *pp* may come from ancillary survey data on a target population rather than from the study population. A formula to estimate the variance (or standard deviation) of an *AFpop* in the target population based on *pp* and incorporating the variance of a *pp* taken from a survey is available^{25}:

where *T* = number in survey and *O = pp/(1 − pp)*. This formula assumes that the RR in the source population is applicable (portable) to the target population and that there is no confounding in the target population. This formula can be used to obtain 95% Wald limits (which must be converted back to the original AF scale), but it is unwieldy and gets more complicated if either *RR* or *O* is estimated through a regression analysis or a complex survey.

*Other Methods for the Variance of AFpop in the Source Population in the Presence of Confounding.* Benichou^{10} gives several references for calculating the variance of the AF (either formula 2 or 3) calculated using the weighted-sum approach. These methods involve in general assuming specific distributions of RRs, depending on the study design used to derive the RRs, and then using the delta method to obtain approximate variances within each stratum. Benichou also provides references for obtaining variances when the RRs are obtained from regression models, again based on the delta method. Other authors^{26,27} have suggested bootstrap methods in conjunction with logistic regression models.

*A Monte Carlo Approach to Calculating the AFpop and Its Variance.* An alternative approach to using these formulas is to use a Monte Carlo approach for calculating the variance of the AF.^{25} This approach can incorporate the usual random error (sampling error) in the RR and exposure prevalence as well as other potential sources of uncertainty such as uncontrolled confounding or extrapolation from a source to a target population.

Consider the simple situation in which the only source of uncertainty is random error and one seeks only to estimate the variance of the AF in the source population, and there is no confounding. A distribution for an exposure/disease-specific RR (eg, log normal) and for the exposure prevalence of cases (AF formula 2) or the population exposure prevalence (AF formula 3) may be specified (eg, binomial). Then repeated draws from these distributions will yield repeated AFs, and then a point estimate and distribution for the AF can be obtained as well as a Monte Carlo 95% interval.
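The simple case described above can be sketched with Python's standard library. The sketch assumes a log-normal distribution for the RR, a binomial distribution for the number exposed in a prevalence survey, and formula (3) for the AF; the inputs, the number of draws, and the function name are hypothetical choices for illustration and not a reproduction of any published analysis.

```python
import math
import random


def monte_carlo_af(rr, se_log_rr, exposed, surveyed, draws=2000, seed=1):
    """Monte Carlo median and 95% interval for the AF (formula 3).

    rr, se_log_rr     - point estimate of the RR and standard error of ln(RR)
    exposed, surveyed - survey counts giving the exposure prevalence
    Only random (sampling) error is simulated; confounding is ignored.
    """
    rng = random.Random(seed)
    p_hat = exposed / surveyed
    afs = []
    for _ in range(draws):
        # Draw an RR from a log-normal distribution centred on the estimate.
        rr_draw = rng.lognormvariate(math.log(rr), se_log_rr)
        # Draw a prevalence from a binomial distribution (sum of Bernoulli trials).
        k = sum(rng.random() < p_hat for _ in range(surveyed))
        pp_draw = k / surveyed
        excess = pp_draw * (rr_draw - 1.0)
        afs.append(excess / (excess + 1.0))
    afs.sort()
    median = afs[draws // 2]
    lower = afs[int(0.025 * draws)]
    upper = afs[int(0.975 * draws) - 1]
    return median, lower, upper


# Hypothetical inputs: RR = 2.0 (SE of ln RR = 0.2), 40 of 400 surveyed exposed.
median, lower, upper = monte_carlo_af(2.0, 0.2, 40, 400)
print(round(median, 3), round(lower, 3), round(upper, 3))
```

Further sources of uncertainty can be layered on by drawing additional parameters inside the loop before the AF is computed.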

If there is confounding in the source population, and one wishes to use formula (3) based on the study population exposure prevalence, the AF can be calculated using the weighted-sum method described previously. This can again be done by Monte Carlo simulation, specifying distributions of RR and exposure prevalence in each confounder stratum, and combining strata with the chosen weights. Again, after repeated draws and repeated calculations of the AF, Monte Carlo limits can be obtained.

Additional uncertainty regarding inference to a target population can be incorporated into the Monte Carlo simulations. Greenland^{25} gives some examples of how this might be done.

An alternative to simulation, in which a distribution of RRs and exposure prevalence is assumed, is a bootstrap approach in which the variance of the AF can be obtained through repeated resampling with replacement from the original data.^{25} However, this requires that the investigator has access to the original data.
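A minimal percentile-bootstrap sketch is given below for a cohort with individual-level records. The data are fabricated purely for illustration (200 exposed with 40 cases, 800 nonexposed with 80 cases), the AF is computed as (I − I0)/I using risks, and confounding is ignored; a real analysis would resample the investigator's own data and might resample within strata.

```python
import random


def af_from_cohort(records):
    """AF = (I - I0) / I from (exposed, case) records; None if not estimable."""
    n = len(records)
    cases = sum(1 for _, is_case in records if is_case)
    unexposed = [is_case for is_exposed, is_case in records if not is_exposed]
    if cases == 0 or not unexposed:
        return None
    risk_all = cases / n
    risk_unexposed = sum(unexposed) / len(unexposed)
    return (risk_all - risk_unexposed) / risk_all


def bootstrap_af(records, resamples=1000, seed=1):
    """Percentile 95% interval from resampling records with replacement."""
    rng = random.Random(seed)
    n = len(records)
    estimates = []
    for _ in range(resamples):
        sample = rng.choices(records, k=n)
        af = af_from_cohort(sample)
        if af is not None:
            estimates.append(af)
    estimates.sort()
    m = len(estimates)
    return estimates[int(0.025 * m)], estimates[int(0.975 * m) - 1]


# Illustrative (made-up) cohort: tuples of (exposed, case).
records = ([(True, True)] * 40 + [(True, False)] * 160
           + [(False, True)] * 80 + [(False, False)] * 720)

point = af_from_cohort(records)
lower, upper = bootstrap_af(records)
print(round(point, 4))  # 0.1667
print(round(lower, 3), round(upper, 3))
```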

*Additional Source of Variance in Summary Attributable Fractions Across Risk Factors.* All AF calculations require the assumption that the RRs used are causal. Investigators may want to combine AFs from several risk factors to derive a summary AF across all of them, and judgment about which AFs are causal becomes an important source of variance. For example, investigators may seek to calculate an AF for all cancers due to occupation. A more liberal interpretation of associations likely to be causal will result in a higher summary AF. In 2 recent studies, the AF for cancer due to occupational exposures among men was 5% in one study^{19} versus 14% in the other,^{20} primarily due to the inclusion of more exposure/disease associations judged to be causal in the latter study (Appendix, available with the online version of the journal). Both studies agreed on 19 well-established associations, but the study with the higher AF accepted another 19 less well-established associations as causal. It would be possible to incorporate this additional uncertainty in estimating the variance of the AF, resulting in wider confidence limits, using a Monte Carlo approach that assigns distributions to beliefs about the causality of each association.

## MEASURES USED IN CONJUNCTION WITH ATTRIBUTABLE FRACTIONS: YEARS-OF-LIFE-LOST AND DISABILITY-ADJUSTED LIFE-YEARS

Measures of YLLs or of DALYs are often calculated for specific diseases without regard to the exposures causing the disease. We review these measures and note their use, under appropriate assumptions, to encompass YLLs for a specific risk factor that causes a specific disease (simply by multiplying the measures by the corresponding AF).

This use of YLLs and DALYs is becoming more common.^{1–5}

### Years of Life Lost

YLLs are the years of life lost due to premature mortality from a specific cause of death. YLLs were originally used to allow meaningful comparisons of burden across different causes of death, without any reference to specific exposures or risk factors. Their key feature is that they take time lost to premature mortality into account. YLLs are sometimes called YPLL or years of potential life lost. There are many ways of calculating YLLs.^{28} All of them involve multiplying the number of deaths in a given population by some life expectancy (eg, the life expectancy at the average age of death for the cause of interest in the target population).

$$YLL = N \times L \tag{9}$$

where *N* = total deaths from the specific cause and *L* = standard life expectancy after the average age at death from the cause of interest.

Differences arise regarding what one assumes for age-specific life expectancy. WHO uses age-specific life expectancies from Japan, which has the longest overall life expectancy of any country (life expectancy at birth is 80 for men and 83 for women). However, others (such as the CDC) truncate that life expectancy at some specific age: the CDC truncates life expectancy at age 75 when calculating YLLs. This calculation gives more weight to YLLs at earlier ages. Such truncation not only results in different absolute numbers of YLLs (lower with earlier truncation), but also shifts the proportion of YLLs for different diseases depending on the distributions of the age of death for these diseases. For example, with no truncation, heart disease had the highest proportion of YLLs in the United States in 1986 (41%), whereas cancer had 33%.^{28} With truncation at age 75, heart disease decreased to 32%, whereas cancer was 34%. The absolute number of YLLs decreased approximately 50% due to this truncation.

Once one has calculated the YLLs for a given cause, the YLLs due to a specific risk factor (attributable YLLs) can be calculated simply by multiplying the YLLs by the AF for the risk factor (or risk factors) in question for that cause.

Formula (9) is an approximate one because life expectancy is not a linear function of age. One can obtain a more accurate estimate of YLL by use of life table methods. In this approach, the numbers of deaths in each age group *i* (*Ni*, usually for 1- or 5-year intervals) are multiplied by the life expectancy at that age, *Li*, and summed:

$$YLL = \sum_i N_i L_i$$

Miller and Hurley^{29} have provided a clear exposition of how to calculate YLLs in this way.
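The life table calculation and the attributable YLLs can be sketched in a few lines. The age groups, death counts, life expectancies, and AF below are hypothetical placeholders rather than values from any life table.

```python
def yll_life_table(deaths_by_age, life_expectancy_by_age):
    """Sum over age groups of deaths times remaining life expectancy at that age."""
    return sum(n * life_expectancy_by_age[age] for age, n in deaths_by_age.items())


def attributable_yll(yll, af):
    """YLLs attributable to a risk factor: the AF times the cause-specific YLLs."""
    return af * yll


# Hypothetical deaths from one cause and remaining life expectancy by age at death.
deaths_by_age = {50: 10, 60: 20, 70: 30}
life_expectancy_by_age = {50: 30.0, 60: 22.0, 70: 14.0}

yll = yll_life_table(deaths_by_age, life_expectancy_by_age)
print(yll)                                   # 1160.0
print(round(attributable_yll(yll, 0.10), 1))  # 116.0
```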

### Disability-Adjusted Life-Years

DALYs offer the advantage of enabling calculation of a burden of disease that includes both morbidity and mortality and that can be compared across different diseases. For a specific disease,

$$DALY = YLL + YLD \tag{10}$$

where *YLL* is the years of life lost due to premature mortality, as discussed previously, and *YLD* is the years of life lived with disability due to disease incidence. *YLD* is defined as:

$$YLD = I \times DW \times L \tag{11}$$

where *I* = number of incident cases, *DW* = disability weight, and *L* = mean duration of disease. If data on incidence are not available, they are sometimes estimated from mortality data and case-fatality rates.

Like YLLs, YLDs may be calculated for a specific disease independent of risk factors. The resulting DALYs are then multiplied by the AF for a specific risk factor to determine the DALYs attributable to that risk factor (there is a simplifying assumption here that disease progression is the same for any given risk factor, although in some cases, a disease category might be really a mixture of heterogeneous diseases, each with a different clinical course and each caused by different risk factors).
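A short sketch shows how YLDs, DALYs, and attributable DALYs fit together. It assumes the simple (undiscounted) definitions given above, and every input value, including the disability weight, is hypothetical.

```python
def yld(incident_cases, disability_weight, mean_duration):
    """Years lived with disability: incident cases x disability weight x duration."""
    return incident_cases * disability_weight * mean_duration


def daly(yll, yld_value):
    """DALYs for a disease: years of life lost plus years lived with disability."""
    return yll + yld_value


def attributable_daly(af, daly_value):
    """DALYs attributable to a risk factor: the AF times the disease DALYs."""
    return af * daly_value


# Hypothetical disease: 1160 YLLs, 200 incident cases, weight 0.2, 5-year duration.
disease_yld = yld(200, 0.2, 5.0)
disease_daly = daly(1160.0, disease_yld)
print(round(disease_yld, 1))                            # 200.0
print(round(disease_daly, 1))                           # 1360.0
print(round(attributable_daly(0.10, disease_daly), 1))  # 136.0
```

Multiplying the whole DALY total by one AF reflects the simplifying assumption, noted above, that disease progression does not depend on the risk factor.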

The disability weights (DWs) seek to weight the disability of living with different diseases (in some cases, the weight is reduced for treated disease). They are necessarily based on subjective judgment, generally originating from a panel of experts. Table 1 gives some examples of disability weights with the highest possible being 1.0.

DALYs calculated for a range of specific diseases without reference to risk factors are available from WHO.^{30}

WHO uses a somewhat more complicated method for calculating YLLs and YLDs than that presented in formulas (9) and (11). This method downweights or discounts years farther away from the year of death on the theory that these years are less valuable. Formulas (9) and (11) become

$$YLL = \frac{N}{r}\left(1 - e^{-rL}\right)$$

and

$$YLD = \frac{I \times DW}{r}\left(1 - e^{-rL}\right)$$

where *r* = 0.03 is the discount rate.
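The effect of discounting can be sketched by replacing the duration *L* with its continuously discounted equivalent, (1 − e^(−rL))/r. This is a minimal illustration of discounting at a constant rate only; the function names and inputs are hypothetical, and varying `r` gives the kind of sensitivity analysis suggested in the text.

```python
import math


def discounted_years(duration, r=0.03):
    """Present value of `duration` years discounted continuously at rate r."""
    if r == 0:
        return duration
    return (1.0 - math.exp(-r * duration)) / r


def yll_discounted(n_deaths, life_expectancy, r=0.03):
    """Discounted YLLs: deaths times discounted remaining life expectancy."""
    return n_deaths * discounted_years(life_expectancy, r)


def yld_discounted(incident_cases, disability_weight, mean_duration, r=0.03):
    """Discounted YLDs: cases x disability weight x discounted duration."""
    return incident_cases * disability_weight * discounted_years(mean_duration, r)


# Hypothetical: 100 deaths, each with 20 years of remaining life expectancy.
print(round(yll_discounted(100, 20.0, r=0.0), 1))   # 2000.0 (no discounting)
print(round(yll_discounted(100, 20.0, r=0.03), 1))  # about 1504
```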

The disability weights and the discount rate can highly influence DALYs.^{31} Given the subjective nature of the disability weight and discount rate, as a practical matter, one might consider conducting sensitivity analyses that vary these 2 variables.

DALYs are similar to other measures taking quality of life into account such as quality-adjusted life-years (QALYs).^{32,33} QALYs adjust each year lived for the existence of specific health impairment (rather than considering disease-specific disability, as in DALYs) usually based on some rating scale (eg, impairment of mobility or usual activities, existence of pain or anxiety/depression). Ratings typically come from surveys of patients or caregivers. QALYs are age-specific, being the product of life expectancy times the impairment factor (1 for no impairment and 0 for total impairment, ie, death).^{34} They are frequently used to compare interventions (eg, one drug vs another to treat breast cancer), and to calculate a cost–utility ratio between 2 interventions ([cost of intervention A − cost of intervention B]/[QALYs resulting from intervention A − QALYs resulting from intervention B]).
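The cost–utility ratio in the preceding paragraph is a one-line calculation; the sketch below adds only a guard against a zero difference in QALYs. Costs and QALYs are hypothetical.

```python
def cost_utility_ratio(cost_a, cost_b, qaly_a, qaly_b):
    """Incremental cost per QALY gained for intervention A versus B."""
    qaly_difference = qaly_a - qaly_b
    if qaly_difference == 0:
        raise ValueError("interventions yield the same QALYs; ratio is undefined")
    return (cost_a - cost_b) / qaly_difference


# Hypothetical: A costs 50,000 and yields 6.0 QALYs; B costs 20,000 and yields 4.5.
print(cost_utility_ratio(50000.0, 20000.0, 6.0, 4.5))  # 20000.0 per QALY gained
```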

As with AFs, estimation of YLLs or DALYs using these methods is based on assumptions that are typically only approximately met and not testable.^{35} For example, they assume that among the exposed who succumb to the disease in question, vulnerability to other causes of death is not also heightened. These limitations should not be forgotten, but neither should they prevent use of these measures as broad-brush descriptors of disease burden.

## DISCUSSION

We have attempted to provide an overview of the issues in estimating the public health burden of specific risk factors for disease. We believe that estimation of the public health burden is useful for policymakers and for the public, because it indicates the potential benefits of intervention, and that it should be encouraged among epidemiologists.

Several issues make estimation of risk factor burden (whether as AFs, attributable YLLs, or attributable DALYs) somewhat more difficult than the usual epidemiologic estimation of standard measures such as rate ratios and odds ratios. One issue is the standard question of causal inference (ie, are the RRs causal?). AFs have the advantage of forcing epidemiologists to make their judgments about causal inference more explicit (ie, does a given study or set of studies provide sufficient evidence for causality to justify calculating public health burden, with the implied possible intervention to reduce exposure?).

A related issue is whether the risk factor for which the AF is calculated is in fact susceptible to an intervention.^{36} Depending on the feasibility of the intervention, the AF may be either meaningless, poorly defined, or overestimated. For example, being over age 60 will be a risk factor with a very high AF for most chronic diseases, but it makes no sense to calculate such an AF because we cannot intervene to prevent aging. Obesity (body mass index >30) is a seemingly straightforward risk factor. However, an AF for obesity is poorly defined because it is not clear what intervention would result in eliminating this risk factor. If it were increased physical activity, then the risk factor of interest (and corresponding RRs) should be lack of physical activity, not obesity (which is itself presumably an intermediate variable). In addition, it is unlikely that interventions to prevent obesity will completely eliminate obesity, so that in practice an AF for obesity will be an overestimate of what can be achieved by current interventions (although in theory a new pill might be invented which would make it possible to eliminate obesity altogether).

An issue specific to some AFs is the additional complexity stemming from the problem of portability of parameters from the source population to the target population. This issue is analogous to the “external validity” or “generalizability” of findings from any epidemiologic study, but it takes on a more quantitative aspect because a quantitative measure of burden must be estimated for the target population.

Another issue involves the nature of the AF itself. Although the AF is usually thought of as a measure to assess public health burden, it is worth recalling that the AF represents the fraction of cases attributable to exposure rather than the absolute number of cases, and the absolute number may be the more important statistic (DALYs, unlike AFs, are an absolute measure). When comparing the public health burden due to a single exposure across different diseases, it will be necessary to multiply the AF by the total number of cases of each disease in the population to obtain the actual number of cases attributable to the exposure. An exposure with a high AF for a disease of low incidence may cause fewer cases of that disease than of a more common disease for which the exposure has a low AF. Wacholder^{37} gives the example of the BRCA mutation, which has a high attributable fraction for ovarian cancer and a low attributable fraction for breast cancer but will cause more cases of breast cancer because breast cancer is much more common than ovarian cancer. Wacholder, in an argument analogous to the traditional one regarding the risk difference versus the risk ratio, proposes that the “attributable community risk” (ACR = I – I_{0}, the numerator of formula 3a) may be more relevant to public health than the AF itself.
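The contrast between relative and absolute burden can be made concrete with a small sketch. All numbers below are hypothetical and are not estimates for any real exposure or disease; the function names are likewise illustrative.

```python
def attributable_cases(af: float, total_cases: float) -> float:
    """Absolute number of cases attributable to the exposure."""
    return af * total_cases


def acr(incidence_total: float, incidence_unexposed: float) -> float:
    """ACR = I - I_0: excess incidence in the whole population."""
    return incidence_total - incidence_unexposed


# Hypothetical exposure: high AF for a rare disease,
# low AF for a common disease.
rare = attributable_cases(af=0.50, total_cases=1_000)
common = attributable_cases(af=0.05, total_cases=50_000)
print(f"Rare disease:   {rare:.0f} attributable cases")
print(f"Common disease: {common:.0f} attributable cases")

# Hypothetical incidences per 100,000 person-years.
print(f"ACR: {acr(incidence_total=120.0, incidence_unexposed=100.0):.1f}")
```

With these inputs the exposure accounts for more cases of the common disease despite the much lower AF, which is the pattern described above for breast versus ovarian cancer.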

The AF (or the ACR) provides a bridge by which results of epidemiologic studies can be made relevant to public health policy. It also forces epidemiologists to examine the relevance of their work: if relative risks and proportions exposed are small, implications for public health may be minor and the usefulness of research questionable. Another question arising naturally in the calculation of the AF is whether the risk factor can be eliminated by intervention, and if so, what is the benefit to society from eliminating or reducing the risk factor? The downside of urging epidemiologists to confront these questions routinely is that it takes epidemiologists, as impartial scientists, and thrusts them more clearly into the political arena of public health.^{38} The advantage is that it forces them to think of the impact of their work on society as a whole.^{37,39,40} Public health agencies are increasingly using measures of public health burden to make the case for (or against) specific interventions; epidemiologists are somewhat behind.

## REFERENCES

*Lancet*. 2002;360:1347–1360.

*Am J Prev Med*. 2005;28:415–423.

*Sex Transm Infect*. 2005;81:38–40.

*MMWR Morb Mortal Wkly Rep*. 2004;53:866–870.

*Am J Epidemiol*. 1988;128:1185–1197.

*Acta Unio Int Contra Cancrum*. 1953;9:531–541.

*Am J Epidemiol*. 1974;99:325–332.

*Stat Med*. 1984;3:131–141.

*Stat Methods Med Res*. 2001;10:195–216.

*Am J Public Health*. 1998;88:15–19.

*Am J Epidemiol*. 2004;160:331–338.

*Am J Epidemiol*. 1985;122:904–914.

*JAMA*. 2005;293:1861–1867.

*Am J Epidemiol*. 1983;117:598–604.

*Am J Epidemiol*. 1994;140:303–309.

*Epidemiologic Methods for Occupational and Environmental Health Studies*. Ann Arbor, MI: Ann Arbor Science Publications; 1983;177–184.

*Int J Cancer*. 1988;42:851–856.

*Am J Ind Med*. 2003;43:461–482.

*Scand J Work Environ Health*. 2001;27:161–213.

*Lancet*. 1995;345:535–539.

*Epidemiology*. 1999;10:594–605.

*Modern Epidemiology*. Philadelphia: Lippincott-Raven; 1998.

*Stat Med*. 1987;6:701–708.

*Int J Epidemiol*. 2004;33:1389–1397.

*Stat Med*. 2000;19:1089–1099.

*Epidemiology*. 1991;2:363–366.

*Epidemiology*. 1990;1:322–329.

*J Epidemiol Community Health*. 2003;57:200–206.

*Health Policy*. 2004;70:137–149.

*Annu Rev Public Health*. 2002;23:115–134.

*Bull World Health Organ*. 2000;78:981–994.

*Stat Med*. 1991;10:79–93.

*Am J Epidemiol*. 2005;162:618–620.

*Epidemiology*. 2005;16:1–3.

*Lancet*. 1998;352:810–813.

*Am J Public Health*. 1996;86:678–683.

*Epidemiology*. 2005;16:124–129.