Regression Calibration Is Valid When Properly Applied

Liao, Xiaomei; Spiegelman, Donna; Carroll, Raymond J.

doi: 10.1097/EDE.0b013e31828b284b

To the Editors:

Guo et al1 address an important, previously unsolved problem: how to adjust for measurement error when a confounder is unavailable in the validation study. They propose a multiple imputation approach to solve this problem, under the assumption that f(Y,X,Z|W) follows a multivariate normal distribution with linear means and a constant variance-covariance matrix.

The authors compare their approach to two alternative estimators. The “regression prediction” estimator is what is usually called the “regression calibration” estimator.2 If Z is a confounder, that is, a risk factor for the outcome and a correlate of the exposure, X, regression calibration is not valid when Z is not available in the validation study.3,4 This was clearly stated in the Discussion section of Rosner et al3: “To implement this method, a validation study must be performed to relate true exposure for the covariates measured with error to observed exposure for the covariates measured with error and to the covariates measured without error.” It has also long been known that if Z is a risk factor for the outcome but not a correlate of exposure (ie, not a confounder), then it drops out of the measurement error model, f(X|W,Z) = f(X|W), and regression calibration will be unbiased even when Z is missing from the validation study.5 Guo et al1 focus on the case when Z is a confounder but missing from the validation study. The authors show through extensive simulations that a method known from basic theory to be biased is in fact biased.
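The two cases above can be illustrated with a small simulation. This is only a sketch under assumptions that are not part of the letter: a linear outcome model rather than a logistic one, unit variances, classical error W = X + e, and hypothetical names (`simulate`, `rho`, `beta_x`, `beta_z`). It fits the calibration model in a validation sample with and without Z and compares the resulting exposure coefficients.

```python
import numpy as np

def simulate(rho, n_main=100_000, n_val=100_000, beta_x=1.0, beta_z=0.5, seed=0):
    """Compare regression calibration with and without Z in the validation study.

    rho is the correlation between the true exposure X and the covariate Z.
    All distributions and parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    def draw(n):
        # (X, Z) bivariate normal with unit variances and correlation rho
        cov = [[1.0, rho], [rho, 1.0]]
        x, z = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
        w = x + rng.normal(size=n)                      # W = X + e
        y = beta_x * x + beta_z * z + rng.normal(size=n)  # linear outcome
        return y, x, z, w

    def ols(y, *cols):
        # Least-squares fit with an intercept; returns [intercept, slopes...]
        design = np.column_stack([np.ones_like(y), *cols])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        return coef

    y, _, z, w = draw(n_main)      # main study: X is not observed
    _, xv, zv, wv = draw(n_val)    # validation study: X is observed

    # Calibration model E(X | W, Z), fitted with Z available
    a = ols(xv, wv, zv)
    xhat_with_z = a[0] + a[1] * w + a[2] * z

    # Calibration model E(X | W), fitted as if Z were missing from validation
    a = ols(xv, wv)
    xhat_without_z = a[0] + a[1] * w

    return {
        "naive": ols(y, w, z)[1],
        "rc_with_z": ols(y, xhat_with_z, z)[1],
        "rc_without_z": ols(y, xhat_without_z, z)[1],
    }

if __name__ == "__main__":
    for rho in (0.0, 0.6):
        print(rho, simulate(rho))
```

Under these assumptions, the estimate that omits Z from the calibration model should sit near the true coefficient when `rho = 0` and near 0.78 times the true coefficient when `rho = 0.6` (the large-sample limit is 2(1 − rho²)/(2 − rho²) for unit variances), whereas the estimate that includes Z should stay close to the truth in both cases.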

Guo et al1 consider an additional estimator, which they call the “classical calibration” estimator. It is easy to show that this estimator is valid only in the trivial case when there is no measurement error (see eAppendix 1). In the first set of simulations by Guo et al1, β0 = 0 and β1 = 1.1. Because this scenario is so close to the classical measurement error case, theory tells us that a large bias should be expected, as found in the simulations presented in Table 1 (see eAppendix 1, case 1). Even if the mismeasured variable is generated by the linear measurement error model W = β0 + β1X + e, what we need for calibration is E(X|W) = α0 + α1W. Using the inverted measurement error model,

(W − β0)/β1,

to estimate E(X|W), as Guo et al1 did, results in a biased estimator (see eAppendix 1, case 2). Because the classical calibration estimator is biased and will typically approach the naive estimator, just about any alternative approach is likely to have better performance. Thus, Guo et al1 applied two methods to a setting where the methods are theoretically biased and then showed that their method, designed for the case considered, was valid.
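The bias of the classical calibration estimator can be sketched in a few lines. The derivation below is an illustration only and is not taken from the eAppendix: it assumes a linear outcome model with no additional covariates, Y = γ0 + γ1X + ε, an error e that is independent of X and ε, and known coefficients β0 and β1. The symbols γ0, γ1, σx², σe², and the subscripts cc, naive, and rc are introduced here for illustration.

```latex
% Assumed setup (illustrative): Y = \gamma_0 + \gamma_1 X + \varepsilon,
% W = \beta_0 + \beta_1 X + e, Var(X) = \sigma_x^2, Var(e) = \sigma_e^2.
\begin{align*}
\hat{X}_{cc} &= \frac{W - \beta_0}{\beta_1} = X + \frac{e}{\beta_1}, \\
\gamma_{cc} &= \frac{\operatorname{Cov}(Y, \hat{X}_{cc})}{\operatorname{Var}(\hat{X}_{cc})}
  = \gamma_1 \, \frac{\sigma_x^2}{\sigma_x^2 + \sigma_e^2 / \beta_1^2}, \\
\gamma_{naive} &= \frac{\operatorname{Cov}(Y, W)}{\operatorname{Var}(W)}
  = \gamma_1 \, \frac{\beta_1 \sigma_x^2}{\beta_1^2 \sigma_x^2 + \sigma_e^2},
  \qquad \gamma_{cc} = \beta_1 \, \gamma_{naive}, \\
\alpha_1 &= \frac{\beta_1 \sigma_x^2}{\beta_1^2 \sigma_x^2 + \sigma_e^2},
  \qquad
  \gamma_{rc} = \frac{\operatorname{Cov}(Y, \alpha_1 W)}{\operatorname{Var}(\alpha_1 W)}
  = \frac{\gamma_{naive}}{\alpha_1} = \gamma_1 .
\end{align*}
```

Under these assumptions, γ_cc equals γ1 only when σe² = 0, and it differs from the naive slope only by the factor β1, so with β1 = 1.1 it lies within 10% of the naive estimate. Calibrating with E(X|W) = α0 + α1W instead recovers γ1.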

An additional point: the multiple imputation method can be viewed as Monte Carlo integration of the likelihood for f(Y,X,Z|W) over the observed data, with f(Y,Z|W) identified in the main study and f(X|W) identified in the validation study. It is clear that the more restrictive surrogacy assumption Guo et al1 imposed is required for identifiability of their method (see eAppendix 2); it is not clear whether the linear, homoscedastic multivariate normality assumption for f(Y,X,Z|W) is also needed. The simulations suggest that possibly it is not; in that case, however, a misspecified likelihood is being used, which will deliver an inconsistent estimator.

Finally, to maintain some uniformity of terminology, it is worth mentioning that in the measurement error literature an external validation study is one in which Y is missing but Z is present if it is a confounder, and an internal validation study is one in which Y is present along with Z.6–8

Xiaomei Liao

Departments of Epidemiology and Biostatistics

Harvard School of Public Health

Boston, MA

Donna Spiegelman

Departments of Epidemiology and Biostatistics

Harvard School of Public Health

Boston, MA

Raymond J. Carroll

Department of Statistics

Texas A&M University

College Station, TX


1. Guo Y, Little RJ, McConnell DS. On using summary statistics from an external calibration sample to correct for covariate measurement error. Epidemiology. 2012;23:165–174
2. Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement Error in Nonlinear Models. 2nd ed. London: Chapman & Hall; 2006
3. Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol. 1990;132:734–745
4. Spiegelman D, McDermott A, Rosner B. Regression calibration method for correcting measurement-error bias in nutritional epidemiology. Am J Clin Nutr. 1997;65(4 suppl):1179S–1186S
5. Armstrong BG, Whittemore AS, Howe GR. Analysis of case-control data with covariate measurement error—application to diet and colon cancer. Stat Med. 1989;8:1151–1163
6. Spiegelman D, Gray R. Cost-efficient study designs for binary response data with Gaussian covariate measurement error. Biometrics. 1991;47:851–869
7. Spiegelman D. Reliability studies. In: Colton T, Armitage P, eds. Encyclopedia of Biostatistics. Sussex, UK: Wiley; 1998:3771–3775
8. Spiegelman D. Validation studies. In: Colton T, Armitage P, eds. Encyclopedia of Biostatistics. Sussex, UK: Wiley; 1998:4694–4700

© 2013 by Lippincott Williams & Wilkins, Inc