Air pollution epidemiology studies of ambient outdoor pollution frequently make use of predicted exposures through a two-stage procedure. First, a spatial or spatiotemporal model is developed using monitoring data from regulatory networks to predict exposures. Next, health outcomes are regressed on the predicted exposure and confounder variables in a second model. Using predicted rather than true exposures in the health model introduces measurement error that can affect the coefficient estimates.1–8
Traditional measurement error is classified into classical error, in which the measured value is an error-prone version of true exposure, and Berkson error, in which the measured value is a smoothed version of true exposure.9 For predicted spatial exposures in air pollution epidemiology, two similar types of measurement error arise. Reduced variability in exposure predictions, due to smoothing in the exposure regression model, induces Berkson-like error that can underinflate standard errors and, for some exposure models, induce bias in the point estimate.2,5,8 Classical-like error arises from having a finite number of monitors for building an exposure model, which results in variability in the exposure model parameter estimates and can induce bias and impact standard errors.2
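The distinction between the two traditional error types can be illustrated with a small simulation. The sketch below is a toy example under assumed values (a true slope of -2, normally distributed exposures, and equal exposure and error variances, none of which come from this study); it is not a representation of the spatial setting considered here.

```python
import random

def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n=20000, beta=-2.0, seed=1):
    """Return the slopes estimated under classical and Berkson error."""
    rng = random.Random(seed)

    # Classical error: measured value = true exposure + independent noise.
    truth = [rng.gauss(14, 1.5) for _ in range(n)]
    outcome = [beta * x + rng.gauss(0, 5) for x in truth]
    measured = [x + rng.gauss(0, 1.5) for x in truth]
    classical = ols_slope(measured, outcome)

    # Berkson error: true exposure = smoothed value + independent noise.
    smooth = [rng.gauss(14, 1.5) for _ in range(n)]
    truth_b = [w + rng.gauss(0, 1.5) for w in smooth]
    outcome_b = [beta * x + rng.gauss(0, 5) for x in truth_b]
    berkson = ols_slope(smooth, outcome_b)

    return classical, berkson
```

With equal exposure and error variances, the classical-error slope is attenuated toward roughly half of the true value, whereas the Berkson-error slope remains approximately unbiased but is estimated with additional residual variability.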
Regression calibration9 is a standard approach to measurement error correction, which uses validation data to estimate the relationship between the true exposure and the error-prone measurement and then corrects bias in regression coefficients estimated from the primary data. Although mathematically one can view the two-stage exposure and health modeling paradigm as a specific form of regression calibration, with the monitor data used as external validation data for calibrating geographic covariates,10 the spatial or spatiotemporal nature of the data and practical sample size considerations imply that further correction is needed for the residual classical- and Berkson-like error.11
The paradigm for modeling the exposure surface determines the appropriate method for accounting for the induced measurement error. If the exposure surface is assumed to be random with fixed monitor and cohort locations, such as in the classical formulation of kriging problems,12 then a parametric bootstrap, or a similar approximate procedure, can be used to correctly remove bias and adjust standard errors.2 A second approach to measurement error correction under this paradigm is the spatial simulation extrapolation method developed by Alexeeff et al.,7 which introduces additional measurement error into simulated datasets and then calculates a back-transformed value that corresponds to no measurement error.
An alternative paradigm considers the exposure surface as fixed, with the locations of monitors and subjects providing the sampling variability.5,8 In studies with long-term exposures, the fixed surface paradigm better reflects the frequentist sampling nature of the study, in that replicate experiments would involve different subjects at different locations, and possibly different monitor locations, but not a different long-term pollution concentration surface. Szpiro and Paciorek5 outlined the conditions necessary for consistent estimation of a health association using predicted exposures under the fixed surface paradigm with unlimited monitoring data. An important criterion is spatial compatibility, which means that the monitor and cohort locations are samples from identical, or at least very similar, spatial distributions. This presents a challenge to air pollution epidemiology studies in which monitoring data are limited by regulatory siting procedures while cohort data are collected independently. Szpiro and Paciorek5 suggest excluding exposure or outcome data as one way to improve spatial compatibility.
Improving spatial compatibility reduces bias from Berkson-like error, but bias may still exist from classical-like error with finite monitoring data. For both types of error, corrections must still be made for incorrect estimates of the standard error.5 Under the fixed surface paradigm, a nonparametric bootstrap that resamples monitors and subjects provides an appropriate method to estimate bias and corrected standard errors.5
While these measurement error correction methods are well developed for long-term spatial exposures, they have not been applied to spatiotemporal exposures that vary across space and in time. In the spatiotemporal setting, the fixed surface paradigm extends to a setting in which the three-dimensional exposure “surface” is fixed, meaning that pollutant concentrations do not change in repeat experiments, but the location, time, and duration of observations may change due to monitor placement in space and time.
In this article, we extend these measurement error correction methods to an analysis of ambient fine particulate matter (PM2.5; particles less than 2.5 μm in aerodynamic diameter) and birth weight in a cohort from the US state of Georgia. Exposure to ambient PM2.5 during pregnancy has been associated with low birth weight in many studies, although the estimates reported in the literature are heterogeneous.13–16 In birth outcome studies, estimates of time-varying exposures are needed for specific pregnancy windows using reported gestational age and birth date. Other than the study of Alexeeff et al.,7 who applied spatial simulation extrapolation to spatial exposure models to correct for measurement error in a Massachusetts birth cohort, published studies of birth weight and PM2.5 have not accounted for exposure measurement error. Here, we extend bootstrap-based measurement error correction methods to spatiotemporal exposures at the trimester scale and compare the results obtained from applying different correction approaches.
We obtained vital record data for births from the Office of Health Information and Policy (OHIP), Georgia Division of Public Health. The study cohort included all singleton, full-term (gestational age ≥37 weeks) births without major structural congenital birth defects and with conception date between 1 January 2002 and 31 December 2005 to mothers ages 15 to 44 (n = 442,436). Gestational age was defined according to date of last menstrual period, and date of conception was estimated by adding 14 days to the date of last menstrual period. We limited the study cohort to births to non-Hispanic white, non-Hispanic black, and Hispanic mothers. Each record was geocoded to one of the 4,788 Census 2000 block groups in Georgia by OHIP using maternal residential address at delivery. After removing records with incomplete covariate information, 403,881 birth records were available for analysis (Figure 1). This study was approved by the Institutional Review Board of Emory University.
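The date-based cohort rules above can be sketched as follows. This is an illustrative simplification: the function names are hypothetical, gestational age is assumed to be counted in completed weeks from the last menstrual period, and the other eligibility criteria (plurality, birth defects, maternal age, race/ethnicity) are omitted.

```python
from datetime import date, timedelta

CONCEPTION_START = date(2002, 1, 1)
CONCEPTION_END = date(2005, 12, 31)

def conception_date(lmp):
    """Estimated conception date: 14 days after the last menstrual period."""
    return lmp + timedelta(days=14)

def gestational_weeks(lmp, birth):
    """Completed weeks of gestation, counted from the last menstrual period."""
    return (birth - lmp).days // 7

def meets_date_criteria(lmp, birth):
    """Full-term birth with estimated conception inside the study window."""
    conceived = conception_date(lmp)
    in_window = CONCEPTION_START <= conceived <= CONCEPTION_END
    return in_window and gestational_weeks(lmp, birth) >= 37
```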
To predict trimester-specific ambient PM2.5 exposures for each maternal record, we fit a spatiotemporal model for PM2.5 with 2-week temporal and 1 km grid cell spatial resolution. The PM2.5 modeling region included all of Georgia and portions of surrounding states (Figure 2). We divided the monitoring domain into 598,320 1 km grid cells to accommodate gridded covariate information.
We obtained daily 24-hour average PM2.5 concentrations from the EPA Air Quality System (http://www.epa.gov/aqs) for 74 Federal Reference Method monitors (Figure 2). We aggregated the daily measurements to 2-week average concentrations, dropping 2-week averages derived from fewer than four measurements. For modeling, monitor values were assigned to the center of the grid cell containing the monitor. We obtained measurements of average elevation within each grid cell from the National Elevation Database (http://ned.usgs.gov). We computed the average percent forest cover using data from the National Land Cover Database (http://www.mrlc.gov) and the total road length within each grid cell from ESRI Streetmap road data (Environmental Systems Research Institute Inc., Redlands, CA). We calculated the distance to emissions point sources using data from the 2002 National Emissions Inventory report. Two-week average values of temperature, wind speed, and relative humidity were calculated at each grid cell for the entire study period, as described by Hu et al.17
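The aggregation of daily measurements to 2-week averages might look like the sketch below. The anchoring of the 2-week periods to a single start date and the function name are assumptions made for illustration; only the minimum of four measurements per period is taken from the text.

```python
from collections import defaultdict
from datetime import date

def two_week_averages(daily, start, min_obs=4):
    """Average daily values within consecutive 14-day periods.

    daily:   list of (date, concentration) pairs for one monitor
    start:   first day of period 0 (assumed common to all monitors)
    min_obs: periods with fewer measurements than this are dropped
    Returns a dict mapping period index to the mean concentration.
    """
    buckets = defaultdict(list)
    for day, value in daily:
        offset = (day - start).days
        if offset < 0:
            continue  # measurement precedes the modeling period
        buckets[offset // 14].append(value)
    return {
        period: sum(values) / len(values)
        for period, values in sorted(buckets.items())
        if len(values) >= min_obs
    }
```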
The hierarchical spatiotemporal model for the outdoor ambient PM2.5 concentration $y(s,t)$ at location $s$ and time $t$ can be written as follows:
$$y(s,t) = \beta_0(s) + \sum_{i=1}^{m} \beta_i(s) f_i(t) + \sum_{l=1}^{L} \gamma_l M_l(s,t) + \nu(s,t),$$
where $\beta_0(s)$ is a long-term spatial mean, $\sum_{i=1}^{m} \beta_i(s) f_i(t)$ is a time trend with spatially-varying coefficients, $M_l(s,t)$ are spatiotemporal covariates, and $\nu(s,t)$ is a space-time residual term.18–21 This model is implemented in the SpatioTemporal package in R.22 The long-term mean component included percent forest cover and distance to emissions sources as covariates and a spatially correlated error structure, parameterized as an exponential covariance with range and sill. The time trend was estimated from observations at the Air Quality System monitors.21,23 The trend coefficients had a mean depending on elevation, forest cover, local and highway road length, and a spatially correlated error structure.
We evaluated the predictive accuracy of the model using leave-one-out cross-validation, a procedure that involves leaving out one monitor location at a time and refitting the model using all remaining monitors. We quantified cross-validation performance via mean-squared error-based R2,18,19 root mean-squared error (RMSE), and mean absolute error (MAE). We computed these measures with 2-week values, which incorporate spatial and temporal variability, and with nonoverlapping 12-week averages, to better reflect the trimester exposures of interest.
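The three performance measures can be computed from paired observed and cross-validated predicted values as sketched below. The R2 definition used here (one minus the ratio of mean-squared error to the variance of the observations, truncated at zero) is our reading of a mean-squared error-based R2 and should be treated as an assumption.

```python
import math

def cv_metrics(observed, predicted):
    """MSE-based R2, RMSE, and MAE for cross-validated predictions."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mean_obs = sum(observed) / n
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    return {
        # Unlike a squared correlation, this penalizes biased predictions.
        "r2": max(0.0, 1.0 - mse / var_obs),
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
    }
```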
We predicted 2-week average concentrations of ambient PM2.5 at grid cell locations across the modeling domain for the period 1 January 2002 to 31 December 2006. Ambient concentrations at each census block group were assigned based on the value of the grid cell containing the block group, and so each birth record linked to a single block group had the same exposure. Trimester average exposures were computed for each record by assigning trimester start and end dates to the nearest 2-week modeling period and averaging predicted exposures for the corresponding block group between these dates.
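A minimal sketch of the trimester averaging step is given below. It assumes that window dates are snapped to the nearest 2-week period boundary and that the periods between the two snapped boundaries are averaged with equal weight; the function names and the tie-breaking rule are hypothetical choices.

```python
from datetime import date

PERIOD_DAYS = 14

def snap_to_period(day, origin):
    """Index of the 2-week period boundary nearest to a date (ties round up)."""
    offset = (day - origin).days
    return (offset + PERIOD_DAYS // 2) // PERIOD_DAYS

def window_average(predictions, window_start, window_end, origin):
    """Mean predicted concentration over an exposure window.

    predictions: 2-week predictions for one block group, in time order,
                 with predictions[0] covering the period starting at origin
    """
    first = snap_to_period(window_start, origin)
    last = snap_to_period(window_end, origin)
    if first < 0 or last > len(predictions) or first >= last:
        raise ValueError("window falls outside the prediction period")
    values = predictions[first:last]
    return sum(values) / len(values)
```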
The spatial distribution of births (Figure 1) encompasses a much broader area of the state than is covered by the PM2.5 monitors (Figure 2), although most monitors are in counties with high birth counts. The siting guidelines for PM2.5 monitors favor areas of high population density,24 but also include criteria addressing areas of high pollution levels and locations near important sources. Therefore, we do not regard the spatial distributions of monitor and subject locations as equivalent. To improve spatial compatibility between the distributions of monitor locations and maternal residences, we restricted our primary analysis to births to mothers residing in a county with a monitor, which left 180,440 records for analysis. As a sensitivity analysis, we also fit the health model using records from all counties, which we expect to suffer from measurement error due to spatial incompatibility between the subjects and the monitors.
We modeled birth weight as a continuous outcome in a linear regression model with trimester PM2.5 exposure as the covariate of interest. Following the study of Darrow et al.,25 we included indicators for maternal education (less than 9th grade, 9th through 12th grade, high school diploma or equivalent, some college or higher), maternal race, reported tobacco and alcohol use during pregnancy, and county as potential confounders. We included natural cubic splines for the potential confounders census block group percent below poverty (3 degrees of freedom [df]), conception date (16 df), and maternal age (3 df) and its interaction with maternal race. Due to their strong relationship with birth weight, we included indicators for infant sex, gestational week, and being firstborn.
We conducted a sensitivity analysis that varied the extent of temporal adjustment in the birth weight model by increasing the degrees of freedom in the spline for conception date. We explored possible nonlinearity in the association between PM2.5 and birth weight by representing PM2.5 via a spline term in the birth weight model.
The restriction of the study cohort to subjects residing in a county with a monitor provides one correction for measurement error, by approximately matching the distribution of subject and monitor locations to improve spatial compatibility. To address remaining measurement error, we implemented a nonparametric bootstrap that extends the method of Szpiro and Paciorek5 to spatiotemporal data.
The nonparametric bootstrap resamples monitor locations to reflect variation in the predicted exposure surface derived from different monitor locations and resamples birth records to capture sampling variability in the epidemiologic analysis arising from different subjects. The procedure can be outlined as follows:
- Resample 74 PM2.5 monitors, together with their observations, with replacement.
- Estimate spatiotemporal exposure model parameters using the sampled monitors.
- Make new predictions of 2-week average PM2.5 concentrations at each grid cell center.
- Resample 180,440 birth records with replacement, and compute the average trimester exposure for each record.
- Fit a regression model using the sampled birth records and their new predicted exposures.
This procedure was repeated 5,000 times. If a monitor was selected more than once, we jittered its location by up to 300 m to avoid model-fitting problems arising from exact colocation. eAppendix 1 (https://links.lww.com/EDE/B164) provides code demonstrating this procedure.
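The structure of the resampling steps can be sketched with a toy example. The code below is not the code in eAppendix 1: it replaces the spatiotemporal exposure model with a simple linear regression of concentration on a single geographic covariate, replaces the health model with an unadjusted linear regression, and omits the jittering of duplicated monitor locations. All function and variable names are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for a regression of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def nonparametric_bootstrap(monitors, births, n_boot=200, seed=0):
    """Bootstrap health-model slopes from resampled monitors and births.

    monitors: list of (covariate, observed concentration) pairs
    births:   list of (covariate, birth weight) pairs
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_boot):
        # Resample monitors, with their observations, with replacement.
        sampled_monitors = rng.choices(monitors, k=len(monitors))
        # Re-estimate the (toy) exposure model from the sampled monitors.
        intercept, coef = fit_line(
            [cov for cov, _ in sampled_monitors],
            [conc for _, conc in sampled_monitors],
        )
        # Resample birth records and predict exposure for each of them.
        sampled_births = rng.choices(births, k=len(births))
        exposure = [intercept + coef * cov for cov, _ in sampled_births]
        # Fit the (toy) health model to the sampled records.
        _, beta = fit_line(exposure, [bw for _, bw in sampled_births])
        slopes.append(beta)
    return slopes
```

Because both monitors and birth records are resampled, the spread of the returned slopes reflects variability from the exposure model fit as well as from the health model sample.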
We used the average PM2.5 coefficient from the bootstrap datasets to estimate bias in the association estimate from the primary model.26 The empirical standard error of the bootstrap point estimates was used to estimate the standard error of the bias-corrected estimate.
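In code, this bias correction amounts to the standard bootstrap calculation sketched below, in which the estimated bias is the bootstrap mean minus the original estimate. The function name is hypothetical, and the sketch assumes that the bias is subtracted additively from the original estimate.

```python
import statistics

def bootstrap_correct(original_estimate, bootstrap_estimates):
    """Bias-corrected estimate and bootstrap standard error.

    Returns (corrected estimate, estimated bias, standard error).
    """
    boot_mean = statistics.mean(bootstrap_estimates)
    bias = boot_mean - original_estimate
    # Equivalent to 2 * original_estimate - boot_mean.
    corrected = original_estimate - bias
    # Empirical standard deviation of the bootstrap point estimates.
    se = statistics.stdev(bootstrap_estimates)
    return corrected, bias, se
```

For example, bootstrap estimates that are on average closer to zero than the original estimate imply bias toward the null, so the corrected estimate moves away from zero.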
We conducted a sensitivity analysis to examine an alternative measurement error correction method that assumes a random spatiotemporal surface paradigm for PM2.5 concentrations. For this analysis, we implemented the parameter bootstrap, which is a computational approximation to the parametric bootstrap.2 The parameter bootstrap holds locations and birth records fixed and simulates new exposures and monitor observations. This procedure can be outlined as follows:
- Simulate new monitor observations and “true” grid cell concentrations using the spatiotemporal exposure model with parameter estimates from the original data.
- Estimate the spatiotemporal exposure model parameters using the simulated monitor observations.
- Make new predictions of 2-week average PM2.5 concentrations at each grid cell center.
- Simulate a new birth weight for each record from the “true” exposure at the corresponding grid cell and all other covariates, using the coefficients from the original health model fit.
- Fit a regression model using the simulated birth weights and all covariates as in the original analysis.
The bootstrap point estimates are used to bias-correct the original estimate and estimate its standard error in the same manner as the nonparametric bootstrap.
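A toy version of these simulation steps is sketched below to contrast them with the nonparametric bootstrap. As a strong simplification, the exposure model is a linear regression on one covariate with independent residuals (no spatial or temporal correlation), and the health model has no confounders. The parameter tuples and function names are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for a regression of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def parameter_bootstrap(monitor_cov, subject_cov, exposure_fit, health_fit,
                        n_boot=200, seed=0):
    """Bootstrap slopes with locations and records held fixed.

    exposure_fit: (intercept, slope, residual SD) of the original exposure fit
    health_fit:   (intercept, slope, residual SD) of the original health fit
    """
    a, b, sigma_x = exposure_fit
    alpha, beta, sigma_y = health_fit
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_boot):
        # Simulate monitor observations and "true" subject exposures.
        sim_obs = [a + b * c + rng.gauss(0, sigma_x) for c in monitor_cov]
        true_x = [a + b * c + rng.gauss(0, sigma_x) for c in subject_cov]
        # Re-estimate the exposure model and predict at subject locations.
        a_sim, b_sim = fit_line(monitor_cov, sim_obs)
        predicted = [a_sim + b_sim * c for c in subject_cov]
        # Simulate birth weights from the "true" exposures.
        weights = [alpha + beta * x + rng.gauss(0, sigma_y) for x in true_x]
        # Fit the health model using the predicted exposures.
        _, slope = fit_line(predicted, weights)
        slopes.append(slope)
    return slopes
```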
Mean birth weight for infants in the restricted cohort was 3,355 g (SD: 465 g), compared with 3,368 g (SD: 466 g) in the statewide cohort. Mothers in the restricted cohort had a higher prevalence of self-reported tobacco use during pregnancy and were more likely to be non-Hispanic black than mothers in the statewide cohort (Table 1).
The number of 2-week observations at PM2.5 monitoring sites during the study period ranged from 5 to 156, with a median of 89 observations. The mean of the 2-week concentrations at each monitoring site ranged from 10.0 to 18.1 μg/m3, with an average concentration of 14.2 μg/m3 (SD: 1.4 μg/m3). The estimated PM2.5 time trend showed regular seasonal variability across the study period, with slightly higher amplitude in later years (see eFigure 1; https://links.lww.com/EDE/B164). The estimated range parameters for the long-term mean, time trend coefficients, and residual term were 65, 191, and 659 km, respectively, indicating a large amount of spatial smoothing in all model components. Full model coefficients are provided in eTable 1 (https://links.lww.com/EDE/B164). Cross-validated model performance was excellent at the 2-week scale (R2 = 0.81, RMSE = 1.9 μg/m3, MAE = 1.4 μg/m3) and at the 12-week scale (R2 = 0.82, RMSE = 1.4 μg/m3, MAE = 1.0 μg/m3). eFigure 2 (https://links.lww.com/EDE/B164) shows a scatterplot of predicted and observed 2-week values at monitor locations.
Predicted PM2.5 exposure was highest during the third trimester for the restricted cohort (mean of 14.6 μg/m3), because births were most frequent in the summer months, when PM2.5 concentrations were also highest. In all trimesters, the mean exposure was higher for the restricted cohort than for the statewide cohort (Table 2). Correlations between exposures in consecutive trimesters, without controlling for seasonality, were small (0.26–0.39), while exposures in the first and third trimester were negatively correlated. Differences in predicted exposure across space can be seen in Figure 3, which shows the long-term average prediction at each block group.
In births to mothers living in a county with a PM2.5 monitor, a difference of 1 μg/m3 in ambient PM2.5 during the first, second, and third trimesters was associated with differences in birth weight of −1.5 g (95% confidence interval [CI]: −3.1, 0.1), −1.6 g (95% CI: −3.2, 0.1), and −2.4 g (95% CI: −3.9, −0.8), respectively, when no bootstrap correction was made (Table 3). The nonparametric bootstrap-corrected estimates for each trimester were −1.5 g (95% CI: −3.2, 0.1), −1.6 g (95% CI: −3.3, 0.0), and −2.5 g (95% CI: −4.2, −0.8), respectively. The bootstrap-corrected standard error for the third trimester was 0.85 g, compared with 0.78 g in the uncorrected analysis. In contrast to the nonparametric bootstrap, the parameter bootstrap results showed little bias in the point estimates but much larger underinflation in the standard errors. For the third trimester, the parameter bootstrap-corrected estimate was −2.4 g with a standard error of 0.92 g.
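The reported intervals are consistent with normal-approximation (Wald) confidence intervals, although the interval construction is not stated explicitly, so the following check rests on that assumption. It uses the rounded third trimester values given above.

```python
def wald_ci(estimate, se, z=1.96):
    """Normal-approximation 95% confidence interval (assumed construction)."""
    return estimate - z * se, estimate + z * se

# Bias-corrected third trimester estimate and bootstrap standard error.
lower, upper = wald_ci(-2.5, 0.85)
```

With these inputs, the limits are approximately -4.17 and -0.83, which agree with the reported interval of (−4.2, −0.8) after rounding.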
In the sensitivity analysis that included births from all counties, the estimated associations for all trimesters were attenuated. The estimates for second and third trimester exposure were similar (−0.62 [95% CI: −1.7, 0.45] and −0.66 [95% CI: −1.7, 0.35] g per 1 μg/m3, respectively), and the association for the first trimester was weaker and had opposite sign (0.28 g per 1 μg/m3). The standard errors for these estimates were much smaller (0.51–0.55) than those from the analysis of the restricted cohort. As a further sensitivity analysis, we applied the bootstrap correction methods to the statewide dataset. The results, provided in eTable 2 (https://links.lww.com/EDE/B164), are similar to the uncorrected statewide estimates.
Sensitivity analyses for increased temporal adjustment did not yield substantively different estimates for the association of interest (see eTable 3; https://links.lww.com/EDE/B164). There was no evidence that the PM2.5–birth weight association was nonlinear.
We have presented an approach for correcting for measurement error induced by using predicted spatiotemporal pollutant exposures in regression models. Using Georgia birth records, we demonstrated these methods in an analysis of the association between birth weight and ambient fine particulate matter. To improve spatial compatibility between the monitor and subject location distributions, we restricted the analysis to births to mothers residing in a county with a monitor. We observed a strong association between third trimester exposure and reduced birth weight. This association was robust to further correction for measurement error from a nonparametric bootstrap.
The large difference in point estimates between the primary analysis of the restricted cohort and the sensitivity analysis of the statewide cohort highlights the importance of accounting for spatial compatibility in air pollution epidemiology studies. However, the small number of monitors makes it difficult to be certain of the true extent of spatial incompatibility and our success in correcting it by restricting the cohort. Theoretical results of Szpiro and Paciorek5 established the importance of spatial compatibility when using unpenalized spatial splines, but this study suggests that it is an important condition for spatiotemporal kriging models as well.
Within the restricted cohort analysis, the nonparametric bootstrap showed different amounts of measurement error between trimesters. The slight bias in the first trimester estimate is within the range of simulation error expected for the number of bootstrap replications. Although the magnitudes of the absolute bias in the second and third trimester estimates differed (0.08 and 0.14 g, respectively), their relative biases were quite similar (5.1% and 5.9%, respectively). Bias due to classical-like measurement error has been shown to be multiplicative for unpenalized spatial spline exposure models under the fixed surface paradigm,5 and these results suggest this may also hold for spatiotemporal kriging models.
The contrasting results from the parameter bootstrap and nonparametric bootstrap corrections demonstrate the importance of the assumed underlying framework for the exposure surface across space. The trimester-length averaging period is long enough that we believe long-term geographical and commercial features and seasonal meteorology determine the average concentrations and therefore the fixed spatiotemporal surface paradigm is appropriate. Under this paradigm, we do not assume that the statistical model can fully represent the underlying spatiotemporal surface. Resampling of the monitor locations and times leads to spatially- and temporally-varying bias, but the finite variability in the underlying surface limits variation between predicted exposures derived from different monitor locations and times. Thus, we expect the nonparametric bootstrap to partition measurement error into both bias and increased variability of the point estimate. In contrast, the random surface paradigm assumes a correctly specified exposure model that varies across studies due to different realizations of a spatiotemporal process. We expect the parameter bootstrap, which simulates new observations from the model, to identify additional variability in the point estimate but little bias.
The birth weight results of this analysis are consistent with previously reported findings that corrected for exposure measurement error. In an analysis of a Massachusetts birth cohort from 2008, Alexeeff et al.7 found that a 1 μg/m3 difference in ambient PM2.5 exposure during the third trimester was associated with 3.5 g lower birth weight. This difference increased in magnitude to 4.9 g lower birth weight when they applied their spatial simulation extrapolation measurement error correction method. Unlike our approach, they corrected exposures estimated from averages of independent monthly spatial exposure models instead of correcting predictions from a spatiotemporal model.
Although our correction methods address measurement error introduced from the use of predicted exposures, they do not account for all sources of measurement error, particularly factors that influence the relationship between ambient level and personal exposure. For example, maternal residential mobility may introduce measurement error when exposures are assigned based on address at delivery,27,28 although a recent study suggests this impact may be limited.29 For mothers who do not move during pregnancy, the assignment of each birth record to a location at the resolution of census block group potentially introduces exposure misclassification because this location may not well reflect the daily activity pattern of the mother. Although the spatiotemporal modeling framework provides more spatially refined estimates of exposure than using assignment to values at a central site, the 1 km grid cannot capture fine-scale gradients below this resolution. However, particulate matter concentrations (PM2.5 and coarser fractions) tend to be more spatially homogeneous than primary pollutants such as NO2, so this may not substantively impact inference.30–32 Less daily mobility in the final weeks before birth may reduce this source of measurement error in the third trimester analysis. In general, these sources of measurement error could affect both point estimates and standard errors, and are difficult to correct for without further individual-level information.
We believe we have adequately accounted for residual temporal confounding, as the results did not change substantively when additional degrees of freedom were added to the temporal adjustment; however, some residual within-county spatial confounding might be present. The regression model for birth weight does not include several potentially confounding maternal factors for which information was not available, including illicit drug use, stress, and socioeconomic differences beyond education level and census tract level poverty measures.33 We controlled for gestational age in the analyses because preterm birth is a complex disease with multiple causes that exhibits strong socioeconomic and spatial patterning at the population level.34 As reduced birth weight shares many of these same causes, we hope to have limited the potential for bias from spatial confounding by controlling for gestational age. Furthermore, gestational age is very strongly correlated with birth weight, and controlling for gestational age greatly improves the precision of the association estimates. The restriction of the cohort to full-term, live births without structural defects could potentially introduce some collider stratification bias in an estimate of the association between particulate matter and birth weight. However, most of the variability in gestational age is caused by factors other than ambient particulate matter concentrations, so the magnitude of this potential collider bias would likely be small.
While the restriction to births to mothers residing in counties with a monitor reduces measurement error by improving spatial compatibility, it also results in the analysis being performed on a subset of the population. Regulatory monitors tend to be sited near urban areas, and the restricted cohort differed from the statewide cohort in racial composition and tobacco use. As such, the effect estimates from the restricted cohort analysis likely better represent the associations in urban areas than for Georgia as a whole. Methods exist for generalizing these results to the full state population,35,36 but they rely on strong assumptions about the relationships between PM2.5, birth weight, and measured covariates that we do not make here. The tradeoff between reducing measurement error and analyzing only a subset of the larger population is an important challenge facing birth cohort studies. Supplemental air quality monitoring provides one approach to mitigating this challenge,37 but can be prohibitively difficult for large cohorts. Satellite-based measurements of surrogate exposure are becoming more widely used in areas with limited monitoring, but do not directly solve the problem of limited monitoring data because satellite measurements require ground-based monitoring data for calibration.17,38 Furthermore, other sources of measurement error discussed above, such as variability in daily activity patterns, would remain.
In summary, we extended bootstrap correction methods to spatiotemporal exposures to further reduce measurement error and improve inference about health effects of environmental exposures. We analyzed a restricted cohort to reduce measurement error, and we demonstrated the importance of spatial compatibility between subject and monitor locations. Our results support an association between third trimester ambient fine particulate matter exposure and reduced birth weight.
1. Gryparis A, Paciorek CJ, Zeka A, Schwartz J, Coull BA. Measurement error caused by spatial misalignment in environmental epidemiology. Biostatistics. 2009;10:258–274.
2. Szpiro AA, Sheppard L, Lumley T. Efficient measurement error correction with spatially misaligned data. Biostatistics. 2011;12:610–623.
3. Basagaña X, Aguilera I, Rivera M, et al. Measurement error in epidemiologic studies of air pollution based on land-use regression models. Am J Epidemiol. 2013;178:1342–1346.
4. Lopiano KK, Young LJ, Gotway CA. Estimated generalized least squares in spatially misaligned regression models with Berkson error. Biostatistics. 2013;14:737–751.
5. Szpiro AA, Paciorek CJ. Measurement error in two-stage analyses, with application to air pollution epidemiology. Environmetrics. 2013;24:501–517.
6. Bergen S, Sheppard L, Sampson PD, et al. A national prediction model for PM2.5 component exposures and measurement error-corrected health effect inference. Environ Health Perspect. 2013;121:1017–1025.
7. Alexeeff SE, Carroll RJ, Coull B. Spatial measurement error and correction by spatial SIMEX in linear regression models when using predicted air pollution exposures. Biostatistics. 2016;17:377–389.
8. Bergen S, Szpiro AA. Mitigating the impact of measurement error when using penalized regression to model exposure in two-stage air pollution epidemiology studies. Environ Ecol Stat. 2015;22:601–631.
9. Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement Error in Nonlinear Models. 2nd ed. Boca Raton, FL: Chapman & Hall/CRC; 2006.
10. Spiegelman D. Regression calibration in air pollution epidemiology with exposure estimated by spatio-temporal modeling. Environmetrics. 2013;24:521–524.
11. Szpiro AA, Paciorek CJ. Rejoinder. Environmetrics. 2013;24:531–536.
12. Cressie N. Statistics for Spatial Data. Revised ed. Hoboken, NJ: John Wiley & Sons; 1993.
13. Dadvand P, Parker J, Bell ML, et al. Maternal exposure to particulate air pollution and term birth weight: a multi-country evaluation of effect and heterogeneity. Environ Health Perspect. 2013;121:367–373.
14. Shah PS, Balkhair T; Knowledge Synthesis Group on Determinants of Preterm/LBW births. Air pollution and birth outcomes: a systematic review. Environ Int. 2011;37:498–516.
15. Stieb DM, Chen L, Eshoul M, Judek S. Ambient air pollution, birth weight and preterm birth: a systematic review and meta-analysis. Environ Res. 2012;117:100–111.
16. Fleischer NL, Merialdi M, van Donkelaar A, et al. Outdoor air pollution, preterm birth, and low birth weight: analysis of the World Health Organization Global Survey on Maternal and Perinatal Health. Environ Health Perspect. 2014;122:425–430.
17. Hu X, Waller LA, Lyapustin A, et al. Estimating ground-level PM2.5 concentrations in the Southeastern United States using MAIAC AOD retrievals and a two-stage model. Remote Sens Environ. 2014;140:220–232.
18. Keller JP, Olives C, Kim SY, et al. A unified spatiotemporal modeling approach for predicting concentrations of multiple air pollutants in the Multi-Ethnic Study of Atherosclerosis and Air Pollution. Environ Health Perspect. 2015;123:301–309.
19. Lindström J, Szpiro AA, Sampson PD, et al. A flexible spatio-temporal model for air pollution with spatial and spatio-temporal covariates. Environ Ecol Stat. 2014;21:411–433.
20. Sampson PD, Szpiro AA, Sheppard L, Lindström J, Kaufman JD. Pragmatic estimation of a spatio-temporal air quality model with irregular monitoring data. Atmos Environ. 2011;45:6593–6606.
21. Szpiro AA, Sampson PD, Sheppard L, Lumley T, Adar SD, Kaufman J. Predicting intra-urban variation in air pollution concentrations with complex spatio-temporal dependencies. Environmetrics. 2009;21:606–631.
22. Lindström J, Szpiro AA, Sampson PD, Bergen S, Oron AP. SpatioTemporal: Spatio-Temporal Model Estimation. R package version 1.1.1. 2012.
23. Fuentes M, Guttorp P, Sampson PD. Using transforms to analyze space-time processes. In: Held L, Finkenstadt B, Isham V, eds. Statistical Methods for Spatio-Temporal Systems. Boca Raton, FL: Chapman & Hall/CRC; 2007:77–150.
24. Georgia Department of Natural Resources. 2009 Ambient Air Surveillance Report; 2010.
25. Darrow LA, Klein M, Strickland MJ, Mulholland JA, Tolbert PE. Ambient air pollution and birth weight in full-term infants in Atlanta, 1994–2004. Environ Health Perspect. 2010;119:731–737.
26. Davison AC, Hinkley DV. Bootstrap Methods and Their Application. New York, NY: Cambridge University Press; 1997.
27. Miller A, Siffel C, Correa A. Residential mobility during pregnancy: patterns and correlates. Matern Child Health J. 2010;14:625–634.
28. Bell ML, Belanger K. Review of research on residential mobility during pregnancy: consequences for assessment of prenatal environmental exposures. J Expo Sci Environ Epidemiol. 2012;22:429–438.
29. Pereira G, Bracken MB, Bell ML. Particulate air pollution, fetal growth and gestational length: the influence of residential mobility in pregnancy. Environ Res. 2016;147:269–274.
30. Sellier Y, Galineau J, Hulin A, et al; EDEN Mother–Child Cohort Study Group. Health effects of ambient air pollution: do different methods for estimating exposure lead to different results? Environ Int. 2014;66:165–173.
31. Brauer M, Lencar C, Tamburic L, Koehoorn M, Demers P, Karr C. A cohort study of traffic-related air pollution impacts on birth outcomes. Environ Health Perspect. 2008;116:680–686.
32. Dionisio KL, Baxter LK, Chang HH. An empirical assessment of exposure measurement error and effect attenuation in bipollutant epidemiologic models. Environ Health Perspect. 2014;122:1216–1224.
33. Hao H, Chang HH, Holmes HA, et al. Air pollution and preterm birth in the U.S. state of Georgia (2002–2006): associations with concentrations of 11 ambient air pollutants estimated by combining Community Multiscale Air Quality Model (CMAQ) simulations with stationary monitor measurements. Environ Health Perspect. 2016;124:875–880.
34. Lorch SA, Enlow E. The role of social determinants in explaining racial/ethnic disparities in perinatal outcomes. Pediatr Res. 2016;79:141–147.
35. Stuart EA, Cole SR, Bradshaw CP, Leaf PJ. The use of propensity scores to assess the generalizability of results from randomized trials. J R Stat Soc Ser A Stat Soc. 2011;174:369–386.
36. Lesko CR, Cole SR, Hall HI, et al; CNICS Investigators. The effect of antiretroviral therapy on all-cause mortality, generalized to persons diagnosed with HIV in the USA, 2009-11. Int J Epidemiol. 2016;45:140–150.
37. Cohen MA, Adar SD, Allen RW, et al. Approach to estimating participant pollutant exposures in the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air). Environ Sci Technol. 2009;43:4687–4693.
38. Kloog I, Koutrakis P, Coull BA, Lee HJ, Schwartz J. Assessing temporally and spatially resolved PM2.5 exposures for epidemiological studies using satellite aerosol optical depth measurements. Atmos Environ. 2011;45:6267–6275.