A question often arising in epidemiology is how much of the effect of an exposure on outcome is explained by a third, intermediate variable. For example, a common problem in social epidemiology involves estimating how much of the effect of social class on a given health outcome is mediated by psychologic or lifestyle factors. The problem also arises in clinical trials when using surrogate markers, also called intermediate endpoints, to assess effects of interventions or possible risk factors. An example of this scenario is the question of how much of the effect of a cholesterol-lowering drug on coronary heart disease can be assessed by the serum cholesterol level at a point between administration of the drug and the onset of the disease.1 A further example is the extent to which human papillomavirus infection mediates the relationship between number of sexual partners and cervical dysplasia.2–4 As always, the causal direction is most obvious in longitudinal studies (cause before effect); in cross-sectional studies, the causal pathway has to be postulated. The method we propose applies to both situations.
Current terminology defines an intermediate variable as one lying at least in part on the causal pathway between exposure and response. This is in contrast to a confounder, for which the effects of exposure should usually be corrected. As emphasized by current texts in epidemiology,5 it is misleading to correct for an intermediate variable; however, as a result, intermediate variables often are ignored. It can nonetheless be useful to ask how much of the effect of the exposure variable A on the response C has been mediated by an intermediate variable B.
The usual approach to this question is to compare a multiple regression model containing B with one without, where B could represent multiple mediators. This is achieved by comparing differences in either effect measures or overall model fit. This approach was used by Lynch et al6 in a study of acute myocardial infarction using Cox proportional hazard models. To assess the impact of risk factor adjustment on the age-adjusted relative hazard (RH), the proportion of excess relative risk accounted for by risk adjustment was calculated using the formula (RHmodel A − RHmodel B)/(RHmodel A − 1), where model B included the risk factors.
A similar method was used by van de Mheen and colleagues7 in a longitudinal study investigating the influence of childhood socioeconomic conditions and selection processes on adult inequalities in health. Explanatory factors were added to a logistic regression model with childhood or adult socioeconomic level and confounders only. The contribution of these explanatory factors was measured by the percentage reduction in the odds ratios (ORs) of childhood socioeconomic groups compared with the first model. The formula used was (ORmodel A − ORmodel B)/(ORmodel A − 1).
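The two formulas above have the same shape: the change in the effect measure after adjustment, divided by the excess of the unadjusted measure over 1. A minimal sketch of that calculation follows; the function name and the numeric values are hypothetical and are not taken from either study.

```python
def proportion_explained(effect_unadjusted: float, effect_adjusted: float) -> float:
    """(effect_A - effect_B) / (effect_A - 1) for relative hazards or odds ratios."""
    if effect_unadjusted == 1.0:
        raise ValueError("unadjusted effect equals 1: no excess risk to explain")
    return (effect_unadjusted - effect_adjusted) / (effect_unadjusted - 1.0)


# Hypothetical values, chosen only to illustrate the arithmetic
rh_model_a = 2.0  # exposure effect without the explanatory factors
rh_model_b = 1.5  # exposure effect after adding the explanatory factors
share = proportion_explained(rh_model_a, rh_model_b)
print(f"{share:.2f}")
```

With these illustrative inputs, half of the excess relative risk would be attributed to the added factors.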
Validation of surrogate markers in clinical trials has been the subject of lively methodological debate in the last decade. A key reference is by Freedman et al.1 These authors described the dependence of a binary effect measure C on a discrete surrogate marker B and a binary exposure variable A by an additive (no-interaction) logistic model
where the regression coefficient αa is the usual log odds ratio of the effect of A on C corrected for B. They compared that with a logistic model
for the uncorrected effect, and defined the proportion of the exposure effect explained by the intermediate end point as (1 − αa/α′a), to be estimated by replacing αa and α′a by their usual maximum likelihood estimates.
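Written out, the two models described above would take a form such as the following. Only αa and α′a appear in the text; the intercepts α0 and α′0 and the marker coefficient αb are notation assumed here for illustration, and for a marker B with more than 2 levels the single term αbB would be replaced by level-specific terms.

```latex
\begin{align}
\operatorname{logit} P(C = 1 \mid A, B) &= \alpha_0 + \alpha_a A + \alpha_b B, \\
\operatorname{logit} P(C = 1 \mid A)    &= \alpha'_0 + \alpha'_a A, \\
\text{proportion explained}             &= 1 - \frac{\alpha_a}{\alpha'_a}.
\end{align}
```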
Freedman et al1 acknowledged the nonuniqueness of this measure due to the arbitrary choice of the logistic link function and the arbitrary choice of the log OR as effect measure. Apparently, however, they did not find it problematic that the 2 above-mentioned models cannot be mathematically consistent (a mixture of logistic regressions is not itself a logistic regression) except in special situations.8,9
Buyse and Molenberghs10 pointed out that when B and C are jointly normally distributed, the analysis of the proportion explained in Freedman et al1 corresponds to elementary analysis of a bivariate normal distribution.
Wang and Taylor9 proposed a measure of proportion explained, defined in its simplest version for binary A, B, and C as
where the denominator is the treatment effect on the probability scale disregarding the surrogate marker B. The numerator expresses that part of the treatment effect that is mediated through B, namely the difference between (1) the probability P(C = 1|A = 0) of success when exposure A = 0 and (2) the probability S, which expresses what the probability of success when exposure A = 0 would be if the values of the surrogate B are distributed as those in the exposure group A = 1. In formulas,
Wang and Taylor9 also noted that their discussion considerably simplifies when B and C are jointly normal; indeed, in this case, their proposed measure coincides with that of Freedman et al.1
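For binary A, B, and C, the quantities in the verbal definition of the Wang and Taylor measure can be computed directly from stratified counts. The sketch below assumes that S is the sum over b of P(C = 1|A = 0, B = b) P(B = b|A = 1), and that both numerator and denominator are taken as the A = 0 value minus the comparison value; this is one reading of the description, not a transcription of the published formula. The counts are hypothetical.

```python
# counts[(a, b)] = (subjects with C = 1, all subjects); hypothetical data
counts = {
    (0, 0): (10, 60),
    (0, 1): (20, 40),
    (1, 0): (8, 40),
    (1, 1): (36, 60),
}
LEVELS_B = (0, 1)


def group_size(a):
    """Number of subjects with exposure level a."""
    return sum(counts[(a, b)][1] for b in LEVELS_B)


def p_c_given_ab(a, b):
    """P(C = 1 | A = a, B = b)."""
    events, n = counts[(a, b)]
    return events / n


def p_b_given_a(b, a):
    """P(B = b | A = a)."""
    return counts[(a, b)][1] / group_size(a)


def p_c_given_a(a):
    """P(C = 1 | A = a), disregarding B."""
    return sum(counts[(a, b)][0] for b in LEVELS_B) / group_size(a)


# S: risk under A = 0 if B were distributed as in the A = 1 group
s = sum(p_c_given_ab(0, b) * p_b_given_a(b, 1) for b in LEVELS_B)
f_measure = (p_c_given_a(0) - s) / (p_c_given_a(0) - p_c_given_a(1))
print(f"S = {s:.3f}, F = {f_measure:.3f}")
```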
The surrogate marker discussion originated in clinical trials in which the exposure measure is binary. The exposition in this article acknowledges the conventional preference in epidemiology for discrete measures of exposure, confounding, mediation, and response. We shall arrive at methods covering these common situations, but there is a considerable technical gain in initially assuming that all 3 variables (exposures, mediators, response) are jointly normally distributed, or at least that the conditional distribution of (mediator, response) given exposure is bivariate normal. In that context, we propose to measure the mediating effect by the indirect effect as a fraction of the total effect. In the case of normally distributed variables, our measure coincides with those of Wang and Taylor9 and Freedman et al.1 We then embed the joint normal distribution as basic latent variables in a structural equation model. This allows for the (possibly ordered) categorical variables, possibly but not necessarily measured with error, that are so commonly encountered in epidemiology. We will call this measure the mediation proportion. Calculation of confidence intervals for the mediation proportion also becomes routine.
The structural equation models are regression equations with less-restrictive assumptions, allowing for measurement error in the explanatory as well as the dependent variables. The models consist of factor analyses that permit direct and indirect effects between factors. Structural equation models offer several advantages. The central research question often involves quantities (“latent variables”) that are not directly observable, for example, visual acuity, general health, or cynical hostility. In these models, each observed ordered categorical variable is linked through a threshold model to an underlying normally distributed variable, which avoids the more or less arbitrary construction of scales and selection of cut-points. The assumption of normal distribution of the latent variable is primarily a technical device with no intention of restricting the practical application and interpretation. Because structural equations are not widely used in epidemiologic practice, we include some introductory exposition of this methodology.
This article addresses situations in which the causal pathway already is specified, usually by insight or postulate based on substantive considerations, and possibly aided by the natural temporal order of events. In particular, we do not consider time-dependent confounders that change from confounder to mediator depending on the temporal order of events, nor do we consider variables that are partially caused by the exposure and correlated with the outcome.11 Finally, we make the basic assumption of no effect modification, as defined in the scale of the underlying linear normal statistical models. A technical advantage of reasoning in the framework of the underlying models is that these allow the study of the situation with and without intermediate variables within a single statistical model. Because the averaging operations across omitted variables are always linear, we are less prone to the many problems and paradoxes so carefully discussed for the binary-binary-binary case by Robins and Greenland,12 and by Cox and Wermuth,13 and followed up in the modern theory of graphical models.14
The methodology is first illustrated on a randomized clinical trial15 in ophthalmology studying the effects of interferon-α on visual acuity in patients with age-related macular degeneration. This example, where the 3 entering variables are all binary, was earlier discussed by Buyse and Molenberghs10 and by Wang and Taylor.9
The methodology is further illustrated using the study in social epidemiology that initiated our interest in this concept. It is well known that social class affects health. The question asked here is how much of this effect is mediated through the psychologic variable “cynical hostility.” We calculate the mediation proportion and finally enlarge the model also to include the intermediate variable self-efficacy, illustrating how the method applies to several pathways within the same model.
Assume exposure A, intermediate variable B, and response C, in which the conditional joint distribution of B and C is bivariate normal. No normality assumption is necessary for the exposure A as it enters the analysis only as a regressor.
The effect of exposure A on response C would usually be reported by calculating the regression of C on A: C = βA. We want to decompose β into a sum of a “direct” effect of A on C and an “indirect” effect of A on C via B. Now the effect of A on B is expressed by the regression B = γ2A and the effects of A and B on C by the regression C = γ1A + γ3B (Fig. 1). Then β = γ1 + γ2γ3 (see Appendix A, available with the electronic version of this article, for the derivation) so that we have obtained the desired decomposition of the total effect β of A on C into the direct effect γ1 and the indirect effect γ2γ3. That is, a unit increase in A would cause an increase of γ1 + γ2γ3 in the expectation of C, where γ2γ3 is the part that goes through B. We define the mediation proportion as the dimensionless proportion of the effect of A on C mediated through B, or γ2γ3/(γ1 + γ2γ3).
The mediation proportion is the relative change in the regression coefficient of A, (β − γ1)/β, when we include the intermediate variable B in the model. The interpretation is easiest when all entering regression coefficients γ1, γ2, γ3 and, hence, β are positive.
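The decomposition β = γ1 + γ2γ3 can be checked numerically with ordinary least squares on simulated data. The following is a sketch under assumed coefficient values (0.3, 0.5, and 0.8, chosen only for illustration) and uses NumPy; it is not the fitting procedure used for the analyses in this article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
# Simulated data with assumed true coefficients (illustrative only)
a = rng.normal(size=n)
b = 0.5 * a + rng.normal(size=n)            # gamma2 = 0.5
c = 0.3 * a + 0.8 * b + rng.normal(size=n)  # gamma1 = 0.3, gamma3 = 0.8


def ols(y, *regressors):
    """Least-squares slopes of y on the regressors (intercept fitted, then dropped)."""
    x = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    return coef[1:]


(beta,) = ols(c, a)            # total effect of A on C
(gamma2,) = ols(b, a)          # effect of A on B
gamma1, gamma3 = ols(c, a, b)  # direct effect of A, and effect of B, on C

indirect = gamma2 * gamma3
mediation_proportion = indirect / (gamma1 + indirect)
print(f"beta = {beta:.4f}, gamma1 + gamma2*gamma3 = {gamma1 + indirect:.4f}")
print(f"mediation proportion = {mediation_proportion:.3f}")
```

For least-squares fits on the same sample, the two printed effects agree up to rounding error, which is the sample analogue of the decomposition.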
In epidemiologic practice, not all variables of interest are continuous or normally distributed. We need a framework that allows for confounders as well as for measurement errors on the exposure and intermediate variables and permits generalizations to several intermediate variables and exposures.
In the remainder of this work, we show how structural equation models may contribute to handling these generalizations, and we illustrate by the analysis of the clinical trial from ophthalmology and the cross-sectional survey of social class and health.
STRUCTURAL EQUATION MODELS
Structural equation models are generalizations of linear regression models. They allow latent concepts (such as health, visual acuity, cynical hostility, intelligence, or power) that usually are only indirectly observed, for example, by items in a questionnaire or clinical measurements. We briefly outline the idea behind the structural equation models. A more detailed description can be found in several references.16–19
The model consists of a system of structural equations containing random variables and structural parameters. The random variables are either latent, observed or “disturbance terms,” with the latter modeling the variation among subjects. The system of structural equations is divided into 2 parts: the measurement model and the model for the latent variables.
The measurement model consists of the structural equations that describe the relations between latent and observed variables. Latent variables relate to concepts and are thus hypothetical variables that are not directly observed. Visual acuity, cynical hostility, self-efficacy, and symptom load are such latent variables and are assumed normally distributed. Because we cannot measure a latent variable, we instead measure other variables (termed “indicators”) that we assume are correlated with the latent variable. In the case of cynical hostility these are typically items from a questionnaire. The indicators are related to the latent variables through factor analytic models.
In the simplest description the indicators are assumed to be normally distributed, but other types of indicators such as ordered categorical variables (the typical response from a questionnaire) can be handled within the framework of structural equation models by threshold models (see Appendix B, available with the electronic version of this article). The idea is to superimpose the distribution of answers in each category onto a normal distribution.
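The idea of superimposing category frequencies onto a normal distribution can be illustrated for a single item. The sketch below assumes a standard normal latent variable and derives the cut-points from the cumulative answer proportions alone; the counts are hypothetical, and the fitting software may estimate thresholds differently.

```python
from statistics import NormalDist

# Hypothetical answer frequencies for one 4-category questionnaire item
category_counts = [120, 340, 380, 160]

total = sum(category_counts)
std_normal = NormalDist()  # latent variable assumed standard normal

thresholds = []
cumulative = 0
for count in category_counts[:-1]:  # K categories need K - 1 cut-points
    cumulative += count
    thresholds.append(std_normal.inv_cdf(cumulative / total))


def category_of(latent_value):
    """Observed category (0-based) implied by a value of the latent variable."""
    return sum(latent_value > t for t in thresholds)


print([round(t, 3) for t in thresholds])
```

Each threshold is the point on the latent scale below which the observed cumulative proportion of answers falls.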
The model for the latent variables consists of a system of linear regression models,
where γi are regression coefficients and ζj error terms. Because nothing is observed in the model for the latent variables, we connect the latent variables to the measurement model through the indicators to estimate the structural parameters (routinely done in standard software such as M-plus, MECOSA or SAS). Note that the model allows for random variation in the intermediate variable B, as well as controlling for confounders.
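For the three-variable case, a system consistent with the regressions given earlier (B on A, and C on A and B) would read as follows. Intercepts and confounder terms are omitted, and the subscripts on the error terms ζ are a labeling assumed here for illustration.

```latex
\begin{align}
B &= \gamma_2 A + \zeta_B, \\
C &= \gamma_1 A + \gamma_3 B + \zeta_C .
\end{align}
```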
APPLICATION TO OPHTHALMOLOGY DATA
To illustrate the concepts, we first reanalyze an example used by both Buyse and Molenberghs10 and Wang and Taylor.9 This is a randomized clinical trial in ophthalmology on the effects of interferon-α in patients with age-related macular degeneration.15 The treatment group received interferon-α, and the control group received placebo. A patient's visual acuity was assessed through the ability to read lines on a vision chart. The response is whether the patient had lost at least 3 lines of vision at 1 year. The intermediate variable is loss of at least 2 lines of vision at 6 months, and the exposure is whether the patient received placebo or interferon-α. Contrary to what was hypothesized from pilot studies, interferon-α showed no benefit and even a detrimental effect.15 Adopting the notation used in earlier articles,9,10 the variables are defined as
The data are presented in Table 1. Visual acuity is assumed to vary continuously, and we describe it as normally distributed, measured indirectly through loss of ability to read lines on a vision chart. The exposure variable is binary and directly measured. The model is illustrated in Figure 2, which uses the conventional notation of structural equation models (squares symbolize measured variables and ovals symbolize latent variables).
The model was fitted using M-plus20 (see Appendix C, available with the electronic version of this article). Estimates are listed in Table 2. All estimates are positive: a person exposed to interferon-α has higher risk of losing visual acuity at both 6 months and at 1 year, and poorer visual acuity at 6 months increases the risk of losing more visual acuity at 1 year. The mediation proportion of the effect of interferon-α on visual acuity at 1 year, mediated through visual acuity at 6 months, is 0.878 (95% confidence interval [CI] = 0.201–1.555; delta-method; see Appendix A, available with the electronic version of this article):
Thus, the visual acuity at 6 months is a good surrogate marker for the visual acuity at 1 year.
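A delta-method interval for the mediation proportion can be sketched from the three coefficient estimates and their covariance matrix. The function name, the estimates, and the covariance matrix below are hypothetical (they are not the values in Table 2), and the gradient is derived here from the ratio γ2γ3/(γ1 + γ2γ3) rather than copied from Appendix A.

```python
import numpy as np


def mediation_proportion_ci(gamma, cov, z=1.96):
    """Point estimate and delta-method interval for g2*g3 / (g1 + g2*g3).

    gamma: (gamma1, gamma2, gamma3); cov: their 3x3 covariance matrix.
    """
    g1, g2, g3 = gamma
    indirect = g2 * g3
    total = g1 + indirect
    estimate = indirect / total
    # Gradient of the proportion with respect to (gamma1, gamma2, gamma3)
    grad = np.array([-indirect, g1 * g3, g1 * g2]) / total**2
    se = float(np.sqrt(grad @ np.asarray(cov) @ grad))
    return estimate, estimate - z * se, estimate + z * se


# Hypothetical estimates and covariance matrix, for illustration only
gamma_hat = (0.2, 0.5, 0.8)
cov_hat = np.diag([0.01, 0.004, 0.009])
est, lower, upper = mediation_proportion_ci(gamma_hat, cov_hat)
print(f"{est:.3f} ({lower:.3f} to {upper:.3f})")
```

An interval of this kind is symmetric around the point estimate, as is the reported interval of 0.201–1.555 around 0.878, and it can extend beyond 1 when the direct effect is imprecisely estimated.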
For comparison, Freedman's measure for proportion of treatment explained was P = 0.445 (−0.30 to 4.35),9,10 and Wang and Taylor's measure was F = 0.690 (0.17 to 3.12).9 Our analysis indicates a stronger role of the pathway via the mediator than previous analyses. The discrepancy between the mediation proportion and Wang and Taylor's measure in this example stems from the underlying assumptions. We assume that visual acuity can be approximated by a normal variable, whereas Wang and Taylor assume that it can be approximated by a binary variable. If visual acuity is thought to deteriorate smoothly over the course of time, we find the description by a continuous variable more appropriate. In general, if the underlying data-generating mechanism of a binary or a categorical variable is believed to be continuous, it seems more natural to base the model on a continuous variable.
THE EFFECT OF SOCIAL CLASS ON SYMPTOM LOAD MEDIATED THROUGH CYNICAL HOSTILITY
The study that triggered our interest in the problem is based on a random sample of 40- and 50-year-old individuals from the Danish general population. This sample is part of the Danish Longitudinal Study on Work, Unemployment and Health, and is drawn from the Institute of Local Government Studies in Denmark's longitudinal register at Statistics Denmark. The data are based on a postal questionnaire, which included variables on demographics, socioeconomic factors, somatic and mental health, social relations, health behavior, occupational health, social capital, and psychologic factors.
We focus on determining the role of cynical hostility (defined as a persistent negative attitude toward others involving cognitive, affective, and behavioral components) as a step in the pathway between social class and self-reported symptom load. We restrict attention to exposure (social class), mediator (cynical hostility), response (self-reported symptom load), and confounders (age and sex). A full substantive analysis would usually include more variables in more complex modeling. Cynical hostility was measured by 8 items derived from the Cook-Medley Hostility Scale.21–24 Social class based on occupation was coded in accordance with the standards of the Danish National Institute of Social Research. The classification is similar to the British Registrar General's Classification I-V, with an additional social class VI representing people on transfer income, including unemployment benefits, sickness benefits, disability pension, and other social benefits. Symptom load is measured by 13 questions on physiological and mental symptoms within the last 4 weeks.
The analyses were performed stratified by sex and adjusted for age. Only results for men are shown, as the analyses are merely for illustration. Our question was simple: How much of the effect of social class on symptom load is mediated through cynical hostility?
The Mediation Proportion of One Intermediate Variable
Consider model (2). Suppose A is social class, C is symptom load, and B is cynical hostility. Symptom load and cynical hostility are modeled as latent variables. Cynical hostility is measured by the 8 items x1–x8 and symptom load by the 13 items y1–y13. Social class is considered as directly observed. The variables x1,...,x8, y1,...,y13 and social class are included in the model using the threshold models described in Appendix B (available with the electronic version of this article) because they are all ordered categorical. A higher level of cynical hostility, symptom load, and social class means that the person is feeling more hostile, has had more symptoms in the last 4 weeks, and has lower socioeconomic position, respectively. Age is included as a confounder in all entering regression models. The model is illustrated in Figure 3.
The model was fitted using M-plus20 (see Appendix C, available with the electronic version of this article). Estimates are listed in Table 3. The mediation proportion of the effect of social class on symptom load mediated through cynical hostility is 0.264 (95% CI = 0.218–0.315).
Assuming the postulated causal pathways, we would conclude that around a fourth of the effect of social class on symptom load can be explained for men by the fact that lower social class increases risk of being hostile and that being hostile increases risk of poor health.
The Mediation Proportion of Two Intermediate Variables
To further illustrate the possibilities in the structural equation approach to the measuring of the contribution of intermediate variables, we include a second intermediate variable. Self-efficacy (regarded as a self-confident view of one's capability to deal with certain life stressors) was measured by the 10 items in the version of the Generalized Self-Efficacy Scale.25 It is included as a latent variable, measured by items x9–x18 using the threshold models described in Appendix B (available with the electronic version of this article) because they are all ordered categorical. A higher level of (lack of) self-efficacy means that the person is feeling less self-confident. The model is illustrated in Figure 4.
Given the model, we quantify a given pathway from one variable to another by multiplying the regression coefficients in each individual path contained in that pathway. The total effect of one variable on another is the sum of all pathways leading from the former to the latter variable. The indirect effect of a third variable that lies in 1 or more pathways between the 2 main variables is calculated by taking the sum of the pathways going through this third variable. Then, we can quantify how much of the effect is mediated through that third variable by dividing the indirect effect by the total effect.
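These path-tracing rules can be sketched for a small directed acyclic model with one exposure, 2 intermediate variables, and one response. The node names, the arrow from self-efficacy to hostility, and all coefficient values below are hypothetical and are not the estimates in Table 4.

```python
# Hypothetical path coefficients, keyed by (from, to)
edges = {
    ("class", "efficacy"): 0.30,
    ("class", "hostility"): 0.25,
    ("efficacy", "hostility"): 0.20,
    ("efficacy", "symptoms"): 0.40,
    ("hostility", "symptoms"): 0.35,
    ("class", "symptoms"): 0.20,
}


def pathways(source, target, visited=()):
    """Yield (nodes, product of coefficients) for every directed path."""
    if source == target:
        yield (*visited, target), 1.0
        return
    for (start, end), coef in edges.items():
        if start == source:
            for nodes, product in pathways(end, target, (*visited, source)):
                yield nodes, coef * product


all_paths = list(pathways("class", "symptoms"))
total_effect = sum(effect for _, effect in all_paths)


def mediated_share(via, exclude=()):
    """Share of the total effect on paths through `via` that avoid `exclude`."""
    through = sum(
        effect
        for nodes, effect in all_paths
        if via in nodes[1:-1] and not any(node in nodes for node in exclude)
    )
    return through / total_effect


for nodes, effect in all_paths:
    print(" -> ".join(nodes), round(effect, 4))
print(round(mediated_share("efficacy", exclude=("hostility",)), 3))
```

The recursion terminates because the assumed graph has no cycles; calling `mediated_share("efficacy")` without the exclusion would also count the path that continues through hostility.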
In the larger model the effect of social class on symptom load is the sum of one direct and 3 indirect effects, together forming all the pathways from social class to the response health, as illustrated in Figure 4. The total effect of social class on symptom load can be expressed as
The indirect effect of social class through self-efficacy, but not through cynical hostility, is the product of the 2 regression coefficients on that pathway (γ1γ5, Fig. 4), such that the part of the effect of social class on symptom load mediated through self-efficacy can be measured in the following way:
The mediation proportions of the effect of social class on symptom load mediated through indirect pathways are listed in Table 4. Within this model, we would conclude that approximately a fourth of the effect of social class on symptom load is mediated for men through lack of self-efficacy, with 20–25% mediated directly through cynical hostility, and approximately 2% mediated first through lack of self-efficacy and then through hostility.
DISCUSSION
The mediation proportion provides an estimate of how much of the effect of a given exposure is mediated through an intermediate variable. The approach is through structural equation models, where the relative size of the regression coefficients along the various pathways measures the effect of an intermediate variable on the association of interest. The methods are standard in path analysis but do not seem to have been routinely applied to epidemiologic problems of the kind addressed here.
In our first example on detrimental effects of interferon-α on visual acuity, there was a temporal ordering that made the causal pathways unambiguous. Our analysis indicated a substantially stronger role of the mediator than previous analyses. The discrepancy between the mediation proportion and Wang and Taylor's measure in this example stems from our modeling of a continuous distribution of visual acuity.
In the simplest setting of jointly normally distributed variables, our measure coincides with the measure of Wang and Taylor. The differences come about in the way the measures are generalized and, indeed, Wang and Taylor include situations not covered by the mediation proportion (eg, the general link function incorporated in their measure, thereby covering a broader range of distributions). However, the mediation proportion also covers other situations (eg, latent variables indirectly measured by a set of several related variables, handling of categorical variables, exploration of various indirect pathways within the same model).
We also used the mediation proportion to assess how much of the effect of social class on symptom load is mediated through cynical hostility, with the option of including 2 or more intermediate variables. This example was cross-sectional, and the postulated causal pathways could not be motivated by temporal order. Our statistical analysis, as is always the case, could not compensate for the lack of time relationship in the observed data. However, given a substantively motivated model, the mediation proportion offers an easily interpretable measure that does not depend on arbitrary constructions of scales or choices of cut-points. The structural equation approach is a way to model latent concepts not directly observable. By pretending that we are directly measuring these variables through other variables such as items in a questionnaire, we run a risk of getting unpredictably biased results. The proposed framework allows for error terms on intermediate variables and indicator variables, thus reducing the problem of bias. Moreover, the threshold models yield useful ways of handling ordered categorical variables. The application of latent variables and structural equation models is a practical way to acknowledge the inherent problems of measuring variables that are only partially related to the concepts of real concern.
A limitation of the proposed method is the assumption of no effect modification. This should be considered only a partial solution to the general problem of mediating effects. Preliminary analysis of data should be performed to check whether this is an acceptable assumption.
We are grateful to David Cox, Jeremy Taylor, and John W. McDonald for advice and suggestions for further references.
REFERENCES
1. Freedman LS, Graubard BI, Schatzkin A. Statistical validation of intermediate endpoints for chronic disease. Stat Med.
2. Schatzkin A, Freedman LS, Dorgan J, et al. Surrogate end points in cancer research: a critique. Cancer Epidemiol Biomarkers Prev.
3. Schiffman MH, Schatzkin A. Test reliability is critically important to molecular epidemiology: an example from studies of human papillomavirus infection and cervical neoplasia. Cancer Res. 1994;54(7 Suppl):1944s–1947s.
4. Schatzkin A, Freedman LS, Schiffman MH, et al. Validation of intermediate end points in cancer research. J Natl Cancer Inst.
5. Rothman KJ, Greenland S. Modern Epidemiology. Philadelphia: Lippincott Raven; 1998.
6. Lynch JW, Kaplan GA, Cohen RD, et al. Do cardiovascular risk factors explain the relation between socioeconomic status, risk of all-cause mortality, cardiovascular mortality, and acute myocardial infarction? Am J Epidemiol.
7. van de Mheen HD, Stronks K, Mackenbach JP. A lifecourse perspective on socio-economic inequalities in health. In: Bartley M, Blane D, Davey Smith G, eds. The Sociology of Health Inequalities. Oxford, UK: Blackwell; 1998:193–216.
8. Lin DY, Fleming TR, De Gruttola V. Estimating the proportion of treatment effect explained by a surrogate marker. Stat Med.
9. Wang Y, Taylor JMG. A measure of the proportion of treatment effect explained by a surrogate marker. Biometrics.
10. Buyse M, Molenberghs G. Criteria for the validation of surrogate endpoints in randomized experiments. Biometrics.
11. Weinberg CR. Toward a clearer definition of confounding. Am J Epidemiol.
12. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology.
13. Cox DR, Wermuth N. A note on the quadratic exponential binary distribution. Biometrika.
14. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiological research. Epidemiology.
15. Pharmacological Therapy for Macular Degeneration Study Group. Interferon alfa-2a is ineffective for patients with choroidal neovascularization secondary to age-related macular degeneration. Results of a prospective randomized placebo-controlled clinical trial. Arch Ophthalmol.
16. Bollen KA. Structural Equations with Latent Variables. New York: John Wiley; 1989.
17. Arminger G, Wittenberg J, Schepers A. MECOSA 3 User Guide. Friedrichsdorf, Germany: ADDITIVE GmbH; 1996.
18. Budtz-Jørgensen E, Keiding N, Grandjean P, et al. Estimation of health effects of prenatal mercury exposure using structural equation models. Environ Health [serial online]. 2002;1:2. Available at: http://www.ehjournal.net/home/. Accessed January 15, 2004.
19. Budtz-Jørgensen E, Keiding N, Grandjean P, et al. Statistical methods for the evaluation of health effects of prenatal mercury exposure. Environmetrics.
20. Muthén LK, Muthén BO. Mplus, User's Guide. Los Angeles: Muthén & Muthén; 1998.
21. Cook WW, Medley DM. Proposed hostility and pharisaic-virtue scales for the MMPI. J Appl Psychol.
22. Greenglass ER, Julkunen J. Construct validity and sex differences in Cook-Medley hostility. Pers Individ Dif.
23. Greenglass ER, Julkunen J. Cook-Medley hostility, anger, and the Type A behavior in Finland. Psychol Rep.
24. Everson SA, Kauhanen J, Kaplan GA, et al. Hostility and increased risk of mortality and acute myocardial infarction: the mediating role of behavioral risk factors. Am J Epidemiol.
25. Jerusalem M, Schwarzer R. Self-efficacy as a resource factor in stress appraisal processes. In: Schwarzer R, ed. Self-Efficacy: Thought Control of Action. Washington, DC: Hemisphere; 1992:195–213.
© 2005 Lippincott Williams & Wilkins, Inc.