Underlying Disease Model
Survival time to cardiovascular event occurrence (T) followed an exponential distribution with constant hazard λ0 exp(βW), that is, with mean survival time 1/[λ0 exp(βW)], where W is the true long-term PM2.5 exposure and β is the log hazard ratio.
Censoring was based on 2 mechanisms: cardiovascular disease-free survival for 10 years (the study duration) and loss to follow-up (drop-out). Ten percent of subjects dropped out during the 10 study years, with drop-out times drawn from a uniform distribution so that the probability of drop-out was constant across study years.
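The data-generating process above can be sketched as follows; the baseline hazard `lam0` and log hazard ratio `beta` are illustrative placeholders, not the values used in the study.

```python
import math
import random

def simulate_subject(w, beta, lam0=0.01, study_years=10.0,
                     dropout_prob=0.10, rng=random):
    """Simulate one subject's observed follow-up time and event status.

    Survival time T is exponential with constant hazard lam0*exp(beta*w);
    censoring is administrative at 10 years plus, for 10% of subjects,
    drop-out at a Uniform(0, 10) time so drop-out risk is constant.
    """
    hazard = lam0 * math.exp(beta * w)
    t_event = rng.expovariate(hazard)              # time to CV event
    t_censor = study_years                         # administrative censoring
    if rng.random() < dropout_prob:                # loss to follow-up
        t_censor = min(t_censor, rng.uniform(0.0, study_years))
    time = min(t_event, t_censor)
    event = 1 if t_event <= t_censor else 0        # 1 = event observed
    return time, event
```

Applying this to each subject's assigned exposure yields the (time, event) pairs used in the fitted disease model.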
Predicted Exposure Model
For each of the 6 exposure models, 2 versions of predicted PM2.5 were estimated from the simulated (“measured”) PM2.5 concentration at the 22 monitor locations. Nearest-monitor exposures (W*) assigned the measured PM2.5 at the closest monitor to each individual residential location. Kriged exposures (W**) interpolated measured PM2.5 by universal kriging methods. Kriging prediction consists of 2 parts: estimating the spatial structure and interpolating the spatial surface. We estimated parameters from the monitor measurements under a semi-variogram with a spherical covariance and a second-order polynomial mean model. Given the estimated parameters, we predicted PM2.5 at individual residence locations.
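A compact sketch of the universal kriging predictor is given below. For brevity it takes the spherical covariance parameters (partial sill, range, nugget) as given rather than estimating them from a fitted semi-variogram, and the function names are our own.

```python
import numpy as np

def spherical_cov(d, psill, rng_, nugget=0.0):
    """Spherical covariance as a function of distance d."""
    c = np.where(d < rng_,
                 psill * (1.0 - 1.5 * d / rng_ + 0.5 * (d / rng_) ** 3),
                 0.0)
    return c + np.where(d == 0, nugget, 0.0)

def poly2_design(xy):
    """Second-order polynomial mean model in coordinates (x, y)."""
    x, y = xy[:, 0], xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def universal_krige(obs_xy, obs_z, pred_xy, psill, rng_, nugget=0.0):
    """Universal kriging prediction with known covariance parameters."""
    d_oo = np.linalg.norm(obs_xy[:, None] - obs_xy[None, :], axis=2)
    d_op = np.linalg.norm(obs_xy[:, None] - pred_xy[None, :], axis=2)
    C = spherical_cov(d_oo, psill, rng_, nugget)   # obs-obs covariance
    c0 = spherical_cov(d_op, psill, rng_)          # obs-prediction covariance
    X, X0 = poly2_design(obs_xy), poly2_design(pred_xy)
    n, p = X.shape
    # Solve the universal kriging system [[C, X], [X^T, 0]] [w; mu] = [c0; X0^T]
    A = np.block([[C, X], [X.T, np.zeros((p, p))]])
    b = np.vstack([c0, X0.T])
    w = np.linalg.solve(A, b)[:n]
    return w.T @ obs_z
```

With a zero nugget, the predictor interpolates the monitor measurements exactly, which is the property exploited to map measured PM2.5 onto subject residences.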
Fitted Disease Model
The effect of PM2.5 on time to cardiovascular event was estimated by the Cox proportional hazards model, conditional on true (W′ = W) or predicted exposure (W′ = W* or W**): λ(t | W′) = λ0(t) exp(βW′), where λ0(t) is an unspecified baseline hazard.
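As a concrete illustration, the single-covariate Cox partial likelihood can be maximized with a few Newton-Raphson steps. This is a minimal sketch assuming no tied event times, not the study's actual fitting code, which would typically rely on standard survival software.

```python
import math

def cox_beta(times, events, w, iters=25):
    """Estimate beta in h(t | w) = h0(t) exp(beta * w) by Newton-Raphson
    on the Cox partial likelihood (continuous times, no ties)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    beta = 0.0
    for _ in range(iters):
        score = info = 0.0
        s0 = s1 = s2 = 0.0
        # Walk from the largest time down, so the running sums s0, s1, s2
        # always cover the risk set at each event time.
        for i in reversed(order):
            r = math.exp(beta * w[i])
            s0 += r
            s1 += r * w[i]
            s2 += r * w[i] ** 2
            if events[i]:
                score += w[i] - s1 / s0
                info += s2 / s0 - (s1 / s0) ** 2
        beta += score / info          # Newton step on the log partial likelihood
    return beta
```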
Assessment of Validity and Reliability
To evaluate the validity and reliability of the health effect estimates, we computed bias, variance, mean square error, and 95% coverage probability. Coverage probability is the proportion of 95% confidence intervals for the health effect estimate that include the true β.
To relate the properties of the health effect estimates to the properties of the exposures and their predictions, we also assessed the validity and reliability of the predicted exposures. We summarized the average prediction error (predicted minus true PM2.5), the variance of true and predicted exposures at subject locations, the mean square prediction error, and 1 minus the ratio of mean square prediction error to true exposure variance (constrained to be greater than or equal to 0), which is an estimate of R2.
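These summary measures reduce to a few lines of code; the helper functions below are our own illustrative versions.

```python
def health_effect_summary(betas, ses, true_beta, z=1.96):
    """Bias, variance, MSE, and 95% CI coverage over simulated estimates."""
    n = len(betas)
    mean_b = sum(betas) / n
    bias = mean_b - true_beta
    var = sum((b - mean_b) ** 2 for b in betas) / n
    mse = bias ** 2 + var
    cover = sum(1 for b, s in zip(betas, ses)
                if b - z * s <= true_beta <= b + z * s) / n
    return bias, var, mse, cover

def exposure_r2(true_x, pred_x):
    """1 - MSPE / var(true exposure), floored at 0."""
    n = len(true_x)
    mspe = sum((p - t) ** 2 for t, p in zip(true_x, pred_x)) / n
    mean_t = sum(true_x) / n
    var_t = sum((t - mean_t) ** 2 for t in true_x) / n
    return max(0.0, 1.0 - mspe / var_t)
```

The floor at 0 matters because a poor predictor can have a mean square prediction error larger than the true exposure variance.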
Summary statistics of true and predicted long-term PM2.5 exposure in the first simulation are shown in the eTable (http://links.lww.com/A980). Across the exposure models, the true PM2.5 exposure means for the 2000 residential sites were between 17.1 and 26.1 μg/m3. Predicted PM2.5 had smaller standard deviations and ranges than true PM2.5, particularly for kriging prediction. Relative to the variation in the true exposure data, kriged PM2.5 varied much less in the second and third true exposure models (TEM 2 and TEM 3), which assumed little spatial correlation. A more detailed picture emerges from the true versus predicted exposure scatter plots displayed in Figure 2. Predicted PM2.5 was more correlated with true PM2.5 when the true exposure models had more spatial structure. Although models 1 and 4 (TEM 1 and TEM 4) were based on different fits to the same data, model 1 predictions were more correlated with true exposures.
Table 2 gives summary statistics for the quality of predictions over all simulations. Across the models, kriged PM2.5 had smaller average prediction error and variance than nearest-monitor PM2.5. Mean square prediction error was smaller for kriging as well, except for model 4. Likewise, estimated R2 was generally higher for kriged predictions.
Table 2 also summarizes the quality of the health effect estimates. Kriged PM2.5 health effect estimates were less biased and more variable than those from nearest-monitor PM2.5. Estimates with the highest bias came from exposures with the least spatial structure. However, because variance dominated bias, nearest-monitor predictions had smaller mean square error than kriging predictions; this trend was consistent across all 6 exposure models. The same pattern was found for higher (RR = 1.34) and lower (RR = 1.14) true effects (results not shown). Coverage for kriged exposure was generally better than for nearest monitor, but was mostly below the target 95%. With respect to the relationship between predicted exposure and health effect estimates, spatial exposure structures with higher estimated R2 produced higher coverage of health effect estimates.
Figure 3 shows scatter plots of the relationship between health effect estimates based on true and predicted exposures. Predictions from the first exposure model (TEM 1) gave the best health effect estimates from kriged exposure. Similarly, the fourth, fifth, and sixth exposure models with larger spatial correlation (TEM 4, TEM 5, and TEM 6) produced a stronger association between health effect estimates for true and predicted exposures than the exposure models with no or little spatial dependence (TEM 2 and TEM 3). Nearest-monitor prediction underestimated true effects on average, whereas kriging underestimated on average only when there was little spatial correlation in the exposure. Note that because there are estimates in the second and fourth quadrants of the nearest-monitor versus kriged exposure scatter plots, there are multiple realizations for each exposure model in which the 2 prediction approaches would lead to health effect estimates with opposite signs.
Figure 4 shows a sensitivity analysis of the relationship between spatial correlation (represented by the range) and the properties of the health effect estimates (mean square error and coverage) across exposure models with different covariance parameters and mean structures. The models that produced the best health effect estimates (good coverage and small mean square error) had large range parameters relative to the partial sill. As the range increased, coverage increased toward 95% and mean square error decreased; moreover, the difference between prediction approaches for both criteria became smaller. Adding mean structure had a dramatic effect on mean square error but a relatively small impact on coverage. In particular, with mean structure in the model there was only a small difference in mean square error between kriging and nearest-monitor exposures for all range parameters.
We explored 2 exposure prediction approaches in a simulation study to estimate the effect of long-term PM2.5 exposure on cardiovascular events. Kriging produced more accurate exposure predictions, as reflected in its smaller average mean square prediction error. However, kriged exposure predictions were also less variable, which has implications for the health effect estimates. Judged by coverage probability, health effect estimates were better overall with kriged exposure. In contrast, judged by mean square error, kriging generally performed worse than nearest-monitor prediction because the smaller exposure variance produced greater variance in the health effect estimates. Differences between the 2 approaches diminished for more spatially dependent exposure structures, where the dependence is represented by a longer range or the polynomial mean model.
Health effect estimates for both exposure prediction approaches had better properties as spatial correlation in the covariance model increased and when a spatially varying mean structure was included. Coverage converged to 95% and mean square error decreased as the range became larger, and this improvement was consistent across different partial sill parameters (Fig. 4). When the spatially varying mean model was included, coverage improved slightly, whereas mean square error decreased more dramatically; in particular, the mean square error of kriging prediction declined sharply and even became smaller than that of nearest-monitor prediction.
The mean square error and coverage criteria led to generally different conclusions about the preference for nearest-monitor versus kriging predictions. Coverage is a very important property of health effect estimates because interval estimates are the basis for inference. Coverage of 95% means that the 95% confidence interval estimated in an observational study has the correct interpretation, that is, that the interval has a 95% probability of covering the true effect. Although nearest-monitor prediction generally had smaller estimated mean square error, this was due to less variable but mostly biased effect estimates that produced coverage generally below 95%. When we examined the expanded set of exposure models with different range, partial sill, and mean parameters, all but 5 models showed better coverage for kriged than for nearest-monitor exposures. Four of these 5 exceptions had medium range (120 km) and large partial sill (46.9 and 90), with and without mean structure; in these models, coverage for nearest-monitor exposure was slightly better (by 1%–4%). In the remaining models, coverage for kriging was much better (by >50%) when spatial correlation was small (range of 10 km) and remained slightly better (by 1%–3%) as spatial correlation increased up to a range of 500 km (Fig. 4).
Health effect estimates from the 2 prediction methods showed features typically attributed to classical and Berkson measurement errors, depending on the spatial structure. Nearest-monitor health effect estimates were always attenuated, as seen from the lower fitted regression line for nearest monitor compared with true exposure (Fig. 3). Attenuation was also present for kriged exposure when there was no spatial correlation or little range (TEM 2 and TEM 3). These underestimated effects behave like attenuation due to classical measurement error. In contrast, health effect estimates from kriged predictions of exposure models with strong spatial structure showed little or no attenuation, but the estimates were more variable than for true exposures. This is behavior typically attributed to a Berkson measurement error model, although in this application with a spatially correlated surface the error structure is not independent and identically distributed, so the Berkson analogy is only approximate. Because we also showed that kriged exposures in models with poor spatial structure gave health effect estimates that were both attenuated and more variable, our results suggest that there is additional deviation from the Berkson error structure due to uncertainty in estimating the prediction model parameters. Szpiro et al12 propose a method to correct the standard error estimates that accounts for this additional source of uncertainty. In future work, we plan to evaluate the performance of their methods for our examples, and we expect the corrected coverage probabilities will be closer to the target 95%.
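The contrast between the two error types can be seen in a toy linear-regression analogue (a deliberate simplification of the survival setting, with made-up parameter values): classical error attenuates the slope toward the null, whereas pure Berkson error leaves the slope approximately unbiased but noisier.

```python
import random

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

random.seed(3)
n, beta = 20000, 1.0

# Classical error: the analysis uses a noisy measurement w = x + u.
x = [random.gauss(0, 1) for _ in range(n)]
y = [beta * xi + random.gauss(0, 0.5) for xi in x]
w_noisy = [xi + random.gauss(0, 1) for xi in x]
b_classical = ols_slope(w_noisy, y)    # attenuated (toward beta/2 here)

# Berkson error: the true exposure scatters around the assigned value w.
w = [random.gauss(0, 1) for _ in range(n)]
x_true = [wi + random.gauss(0, 1) for wi in w]
y_b = [beta * xi + random.gauss(0, 0.5) for xi in x_true]
b_berkson = ols_slope(w, y_b)          # near beta, but with larger variance
```

With spatially correlated prediction errors, as in the kriging setting, neither pure form holds exactly, which is the point made above.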
There are other ways to incorporate prediction uncertainty in health analyses. In a 2-stage approach, exposure can be simulated to capture the full exposure distribution. Then multiple realizations of the simulated exposures can be used in health analyses and the estimates combined as with multiple imputation. However, the approach is not true multiple imputation because the outcome is not included in the simulation (imputation) step. Gryparis et al13 have shown that this “exposure simulation” approach leads to notable attenuation bias in the health effect estimate. A priori, many researchers would argue that the best approach would use a joint model for both the exposure and the disease outcome. It is notable that a few recent simulation studies have not achieved consistently good coverage for the health effect estimate obtained from a joint model, fit either with maximum likelihood14 or using a fully Bayesian model,13 contrary to expectations. In addition, joint models allow feedback between health and exposure, and this feedback could influence inference.15 It is a defensible scientific choice to keep the exposure modeling separate from the health effect estimation, thus ensuring there is no feedback between the health and exposure models. Tools are being developed to correct second-stage health effect estimates for exposure prediction uncertainty.12
One of the features of our approach is that we considered a monitoring network and exposure models based on existing data to ensure the relevance of our results. Health effect estimates from exposure models derived from analysis of LA data (TEM 1 and TEM 4) showed good properties. Although the Los Angeles area has a great deal of spatial variability in PM2.5 and a relatively large number of monitors compared with other areas, the monitoring network is still relatively sparse from a spatial statistics perspective. Thus, it may be challenging in practice to model the spatial structure. More work is needed to determine the best approaches to spatial prediction for air pollution exposure data. As a sensitivity analysis for one aspect of this question, we looked at the impact of monitor density on our simulation results. First, we increased the number of monitors in increments of 5, up to 42, in the first true exposure model (TEM 1). Locations of additional monitors were uniformly distributed within the area covered by the existing 22 monitors. The denser network did produce marginally better exposure predictions and health effect estimates. With 20 additional monitors, mean square error for health effect estimates decreased by 16% and 15% for nearest-monitor and kriging predictions, respectively. However, the coverage probability was almost identical to that with the original number of monitors. We also explored the influence of a sparser monitoring network, a more frequent occurrence in air pollution epidemiology, by reducing the number of monitors in increments of 5, down to 12 in total. With only 12 monitors (10 fewer than the original network), our health effect estimates yielded similar coverage probability estimates and a small increase in mean square error estimates for kriged exposures. In contrast, nearest-monitor prediction performed much worse: mean square error increased by over 50%, and coverage probability dropped by about 20%.
This sensitivity analysis suggests that a large number of monitors may not be necessary for kriging prediction in a health effect analysis, at least when there is adequate spatial variability and the form of the spatial model is not in question. Health effect estimates were more sensitive to monitor reduction when individual exposure was predicted using nearest monitor.
We observed that for the kriged exposure, a small percentage of health effect estimates were large outliers (omitted from Fig. 3). On further evaluation of those simulations, we found that the exposure model covariance parameter estimates were unreasonably large; the most extreme outliers occurred under spatial structures with very small range, where the covariance parameters were poorly estimated. As part of our analysis, we examined both universal and ordinary kriging fitting procedures for all exposure models. Estimation using ordinary kriging (ie, with a constant mean) produced consistently poorer covariance parameter estimates across all exposure models, whereas the universal kriging approach gave more stable parameter estimates even when there was no underlying mean structure in the true exposure model. Although it is not practical to investigate a large number of mean and covariance models in a simulation study, further research is needed to determine whether other spatial statistical models or fitting algorithms would produce more reasonable parameter estimates in such cases.
To focus this research on the features of spatially dependent exposure and prediction that affect health effect parameter estimates, we made some simplifying assumptions. Most important, we assumed the ambient air pollution concentration was identical to the ambient-source exposure. This assumption ignores the divergence of exposure from concentration due to time-activity patterns and local sources, as well as the attenuation of ambient exposure by infiltration. Future research will need to incorporate these exposure features. We also assumed a multivariate normal distribution for the underlying true exposure. In sensitivity analyses, we relaxed this assumption and introduced skewness in the exposure data by changing the underlying distribution to a log-normal. In the few cases we examined, the pattern of health effect estimates persisted, and kriging still performed better than nearest-monitor prediction. Finally, we assumed there was no additional spatial structure in the disease model. There may be spatial dependence in the health outcome beyond the spatial structure induced by the exposure distribution. Work is needed to incorporate spatial dependence in the disease model and to determine how separable the 2 sources of spatial structure may be in the health analysis.
Although the focus of this paper is specific to the field of air pollution epidemiology, this research is broadly relevant to inference in epidemiologic studies in the presence of limited exposure data. We present a problem in which there are no exposure measurements for study subjects but one in which it is possible to use an external dataset to develop an exposure prediction model to plug into the health effect analysis. We evaluate the merits of 2 exposure prediction approaches directly in terms of the parameter of interest in the epidemiologic study. We learned that it is not sufficient to assess the accuracy and precision of the exposure predictions when the target of inference is the health effect parameter in the disease model. We showed that the bias and precision of the resulting health effect estimates depend on the approach to exposure prediction and the features of the underlying exposure distribution. Although the details of our results are specific to the air pollution epidemiology context, our general approach and overarching conclusions are relevant to other epidemiologic study settings.
In summary, based on the generally better coverage and lower bias properties of the health effect parameter estimate of interest, we conclude that kriging exposure prediction is preferable to nearest-monitor prediction for estimation of long-term air pollution health effects, particularly when the underlying true exposure is less spatially correlated. Both prediction approaches perform well when there is good spatial structure in the underlying exposure distribution.
We thank Kristin A. Miller for her constructive suggestions and comments.
1. Dockery DW, Pope CA III, Xu X, et al. An association between air pollution and mortality in six US cities. N Engl J Med
2. Pope CA III, Burnett RT, Thun MJ, et al. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. JAMA
3. Basu R, Woodruff TJ, Parker JD, Saulnier L, Schoendorf KC. Comparing exposure metrics in the relationship between PM2.5 and birth weight in California. J Expo Anal Environ Epidemiol
4. Miller KA, Siscovick DS, Sheppard L, et al. Long-term exposure to air pollution and incidence of cardiovascular events in women. N Engl J Med
5. Ritz B, Wilhelm M, Zhao Y. Air pollution and infant death in Southern California, 1989–2000. Pediatrics
6. Wilhelm M, Ritz B. Local variations in CO and particulate air pollution and adverse birth outcomes in Los Angeles County, California, USA. Environ Health Perspect
7. Finkelstein MM, Jerrett M, Sears MR. Environmental inequality and circulatory disease mortality gradients. J Epidemiol Community Health
8. Jerrett M, Burnett RT, Ma R, et al. Spatial analysis of air pollution and mortality in Los Angeles. Epidemiology
9. Kunzli N, Jerrett M, Mack WJ, et al. Ambient air pollution and atherosclerosis in Los Angeles. Environ Health Perspect
10. Leem JH, Kaplan BM, Shim YK, et al. Exposures to air pollutants during pregnancy and preterm delivery. Environ Health Perspect
11. Banerjee S, Carlin BP, Gelfand AE. Hierarchical Modeling and Analysis for Spatial Data. Boca Raton, FL: Chapman & Hall/CRC Press; 2004:21–68.
12. Szpiro AA, Sheppard L, Lumley T. Accounting for errors from predicting exposures in environmental epidemiology and environmental statistics. UW Biostatistics Working Paper Series. 2008. Working paper 330.
13. Gryparis A, Paciorek CJ, Zeka A, Schwartz J, Coull BA. Measurement error caused by spatial misalignment in environmental epidemiology. Harvard University Biostatistics Working Paper Series. 2007. Working paper 59.
14. Madsen L, Ruppert D, Altman NS. Regression with spatially misaligned data. Environmetrics
15. Wakefield J, Shaddick G. Health-exposure modeling and the ecological fallacy. Biostatistics
© 2009 Lippincott Williams & Wilkins, Inc.