To the Editor:
Many studies aim to evaluate the association between an outcome variable Y and an exposure variable X. When a direct measurement of exposure X is expensive or impractical, researchers may use a proxy of X, designated here as X*. We address the consequences of using a surrogate exposure measure. Suppose that the (unobserved) correlation between the true exposure X and the surrogate X* (rXX*) is 0.95, and the observed correlation between the surrogate exposure and the outcome (rYX*) is 0.4. Obviously, it is not possible to know the true correlation (rYX) between Y and X; however, because of theoretical constraints among the correlations between 3 variables, rYX cannot be outside the interval 0.094 to 0.666. This is graphically displayed in Figure 1, which represents the admissible values¹ for rYX under the assumed conditions. Furthermore, when rYX* is less than 0.312, the associated interval crosses the horizontal axis. Thus, even if the observed correlation is positive, the true correlation rYX could be negative.
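The interval quoted above can be reproduced with a few lines of code. The following is a minimal sketch, assuming the standard constraint that a 3 × 3 correlation matrix must be positive semidefinite; the function name `admissibility_interval` and its arguments are illustrative and do not come from the cited work.

```python
import math


def admissibility_interval(r_xx_star, r_yx_star):
    """Bounds on r_YX implied by r_XX* and r_YX*.

    Assumes only that the 3 x 3 correlation matrix of (Y, X, X*)
    is positive semidefinite; no model for the error is imposed.
    """
    center = r_xx_star * r_yx_star
    half_width = math.sqrt((1 - r_xx_star**2) * (1 - r_yx_star**2))
    return center - half_width, center + half_width


# Values from the text: r_XX* = 0.95, r_YX* = 0.4
low, high = admissibility_interval(0.95, 0.4)
print(f"{low:.3f} to {high:.3f}")  # prints "0.094 to 0.666"
```

The center of the interval is the product of the two known correlations, and its half-width shrinks to zero as either correlation approaches 1 in absolute value.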
A hypothetical example of trend reversal was proposed by Weinberg et al,² in which rXX* was 0.533 and rYX* was −0.007. Figure 2 presents the admissibility interval for rYX (−0.850, +0.842) when rYX* is −0.007 (very close to zero). The observed negative correlation rYX* would have to be stronger than −0.846 to guarantee that the sign of rYX is negative. Because the correlation rYX* was only −0.007, a trend reversal can occur. The value of −0.846 for rYX* can be obtained from the formula
$$r_{YX^*} = \pm\sqrt{1 - r_{XX^*}^2}$$
substituting rXX* with 0.533 and assigning the sign of the correlation rYX*.¹ Because the exposure and the surrogate play a symmetric role, it is possible to reverse the reasoning, such that the direction of the trend is preserved whenever the correlation between the outcome and the true value of the exposure is stronger than 0.846. This result holds without any other assumptions.
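The thresholds quoted in this letter (0.846 for rXX* = 0.533 and 0.312 for rXX* = 0.95) are consistent with a cutoff of the square root of 1 − rXX*². The sketch below assumes that form; the helper names `sign_threshold` and `sign_is_guaranteed` are hypothetical and chosen only for illustration.

```python
import math


def sign_threshold(r_xx_star):
    """Smallest |r_YX*| above which r_YX must share the sign of r_YX*.

    Assumed form: sqrt(1 - r_XX*^2), which matches the values in the text.
    """
    return math.sqrt(1 - r_xx_star**2)


def sign_is_guaranteed(r_xx_star, r_yx_star):
    """True when the observed correlation is strong enough to fix the sign."""
    return abs(r_yx_star) > sign_threshold(r_xx_star)


print(round(sign_threshold(0.533), 3))    # 0.846
print(round(sign_threshold(0.95), 3))     # 0.312
print(sign_is_guaranteed(0.533, -0.007))  # False: trend reversal is possible
```

Equivalently, the sign is fixed whenever the squared correlations rXX*² and rYX*² sum to more than 1, which makes the symmetric role of the exposure and the surrogate explicit.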
Researchers must use a proxy exposure variable that is highly correlated with the true measure to observe a valid correlation between the outcome and the proxy. We give a precise, quantitative meaning to this statement. The tightness of the ellipse of admissibility depends on the correlation between the proxy and the true measure.
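For readers who want to see where the ellipse comes from, the following is a sketch of the standard argument, assuming only that the correlation matrix of (Y, X, X*) is positive semidefinite. The shorthand a, b, c is introduced here for brevity and is not notation from the cited work.

```latex
% Shorthand (ours): a = r_{XX^*}, \; b = r_{YX^*}, \; c = r_{YX}
\begin{align*}
\det
\begin{pmatrix}
1 & c & b \\
c & 1 & a \\
b & a & 1
\end{pmatrix}
  &= 1 + 2abc - a^2 - b^2 - c^2 \;\ge\; 0 \\
\Longleftrightarrow\quad
b^2 + c^2 - 2abc &\le 1 - a^2
  \qquad \text{(an ellipse in the $(b, c)$ plane)} \\
\Longleftrightarrow\quad
(c - ab)^2 &\le (1 - a^2)(1 - b^2) \\
\Longleftrightarrow\quad
ab - \sqrt{(1 - a^2)(1 - b^2)} \;\le\; c
  &\le ab + \sqrt{(1 - a^2)(1 - b^2)} .
\end{align*}
% The interval excludes zero exactly when
% a^2 b^2 > (1 - a^2)(1 - b^2), i.e. a^2 + b^2 > 1, i.e. |b| > \sqrt{1 - a^2}.
```

As a approaches 1, the right-hand side 1 − a² tends to zero and the ellipse collapses onto the line c = b, which is the sense in which a better proxy gives a tighter region.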
In the early 1990s, a number of articles dealt with the topic of nondifferential misclassification of an exposure, and of the consequent bias in the estimate of the true effect on an outcome variable.²⁻⁴ In particular, Weinberg et al² gave a proposition (based on the covariance between the outcome Y, the true exposure X, and its measurement X*) to ensure that trend reversal cannot occur. Because covariance and correlation are strictly related (but correlation is bounded), a more general, quantitative statement is possible. The sign of the true correlation between Y and X and the sign of the observed correlation between Y and X* will not differ (so that the direction of the true trend is preserved) so long as the absolute value of rYX* is greater than
$$\sqrt{1 - r_{XX^*}^2}.$$
We would emphasize that we consider this topic without any assumptions, eg, no assumptions of linearity or of the relative positions of X and X* in the causal pathway to Y. Such assumptions would likely narrow the range of possible correlations between X and Y. However, adjustments (as well as methods of correction) are dependent on a set of assumptions that may not be correct. The results presented here show in a quantitative way that improving the quality of exposure measurements is the best strategy for accurately estimating the association between X and Y.
Department of Cognitive Sciences and Education
University of Trento
Department of Sociology and
University of Trento
1. Canal L, Micciolo R. Admissibility intervals for linear correlation coefficient. Stat Papers
2. Weinberg CR, Umbach DM, Greenland S. When will nondifferential misclassification of an exposure preserve the direction of a trend? Am J Epidemiol
3. Dosemeci M, Wacholder S, Lubin JH. Does nondifferential misclassification of exposure always bias a true effect toward the null value? Am J Epidemiol
4. Flegal KM, Keyl PM, Nieto FJ. Differential misclassification arising from nondifferential errors in exposure measurements. Am J Epidemiol