What this study adds
A large number of studies have utilized birth records to examine associations between preterm birth and gestational exposure to ambient air pollution. One main criticism of these studies is that reported gestational age is subject to error, potentially resulting in outcome misclassification. Using birth records that contain gestational age estimated using both the last menstrual period and clinical assessment, this study finds that the positive associations between preterm birth and air pollution exposures are robust against uncertainty in gestational age. Moreover, estimated associations were most elevated when a more stringent definition of preterm birth was used.
Preterm birth (PTB), defined as gestational age (GA) less than 37 completed weeks, is a known predictor of increased infant mortality and morbidity, as well as long-term health consequences.1–5 While numerous epidemiologic studies have found positive associations between PTB and maternal exposure to ambient air pollution, recent systematic reviews and meta-analyses have reported substantial heterogeneity in associations across studies.6–8 The most recent US Environmental Protection Agency Integrated Science Assessment concluded that relationships between air pollution and reproductive outcomes were “suggestive of a causal relationship.”9–11
In most previous studies, associations between ambient air pollution exposure and PTB were investigated by retrospectively linking live birth certificates and exposures based on maternal residential address. Compared to prospective birth cohorts, the use of birth records is cost-effective for acquiring sufficient sample size with large spatial-temporal coverage to estimate small but public health-relevant associations at the population level. Limitations of using birth records are well recognized, including bias in response (e.g., under-reporting of maternal alcohol and cigarette use12,13), lack of important confounders (e.g., diet, physical activity, and body mass index), and random recording errors. For PTB studies, uncertainty in GA leads to several unique challenges.14 First, uncertainty in GA can lead to outcome misclassification, particularly around the 37-week cutoff. Second, GA is used to back-calculate conception date and construct the exposure profile during pregnancy.
In the United States after 2000, birth certificates provide two sources of information on GA, and both sources are subject to errors. The first estimate uses the reported date of the last menstrual period (LMP), which may suffer from recall errors and inter-individual variability in timing between LMP and conception.15 A second clinical estimate is based on a combination of various clinical measurements and physician judgment. However, its accuracy can depend on whether these measurements are based on newborn assessment or prenatal ultrasounds and on the quality of the clinical examination.16 Several studies have compared LMP-based and clinical estimates of GA from birth certificates, often showing only moderate concordance.17–20 Previous studies of air pollution and PTB have utilized GA defined a priori by the investigators using either LMP21–23 or the clinical estimate.24–26 Often, when the preferred source of GA information is missing, the other GA estimate is used.
Few studies have evaluated the effects of uncertainty in GA estimates when examining associations with ambient air pollutant exposure or considered the use of different GA estimates as a sensitivity analysis. Recently, Rappazzo et al.27 found that results can be sensitive to using clinical or LMP GA estimates in an analysis of fine particulate matter and PTB in New Jersey, Pennsylvania, and Ohio, United States. In this study, we evaluated the impact of GA definitions on air pollution risk associations using birth certificates in Atlanta, Georgia, between 2002 and 2006. We expand the work of Rappazzo et al. by considering additional GA definitions and ambient air pollutants, and we implement a multiple imputation approach to incorporate GA uncertainty in analyses.
Health and air quality data
We obtained individual-level birth certificate data for the 20-county Atlanta metropolitan area (Barrow, Bartow, Carroll, Cherokee, Clayton, Cobb, Coweta, DeKalb, Douglas, Fayette, Forsyth, Fulton, Gwinnett, Henry, Newton, Paulding, Pickens, Rockdale, Spalding, and Walton counties) from the Office of Health Indicators for Planning, Georgia Department of Public Health. Georgia birth certificates recorded two estimates of GA in completed weeks: a clinical estimate and an LMP-based estimate. GA estimates were used to back-calculate conception date, which was assumed to occur at the second gestational week based on obstetric convention. We included singleton pregnancies with conception dates between January 1, 2002, and February 28, 2006, to avoid the fixed-cohort bias (n = 587,937).28,29 Additional exclusion criteria included (1) maternal residential address at delivery unsuccessfully geocoded to the 2000 Census block group (n = 12,562), (2) birth weight less than 400 g (n = 213), (3) GA estimates below 27 weeks or above 44 weeks (n = 1,442), (4) mother’s age less than 15 years or greater than 44 years (n = 851), (5) presence of one or more identified congenital anomalies (n = 2,086), and (6) PTBs with a procedure code for induction of labor (n = 5,335).
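To illustrate the back-calculation described above, the following minimal Python sketch derives an estimated conception date from a delivery date and a GA in completed weeks. It assumes that "the second gestational week" means 2 weeks after the start of gestation; the function names and example dates are hypothetical and are not taken from the study code.

```python
from datetime import date, timedelta

# Assumption: conception occurs 2 weeks after the start of gestation.
CONCEPTION_WEEK = 2

# Study window for conception dates, as stated in the text.
WINDOW_START = date(2002, 1, 1)
WINDOW_END = date(2006, 2, 28)


def estimate_conception_date(birth_date: date, ga_weeks: int) -> date:
    """Back-calculate conception date from delivery date and GA (completed weeks)."""
    gestation_start = birth_date - timedelta(weeks=ga_weeks)
    return gestation_start + timedelta(weeks=CONCEPTION_WEEK)


def in_study_window(conception: date) -> bool:
    """True if the estimated conception date falls inside the study window."""
    return WINDOW_START <= conception <= WINDOW_END


# Hypothetical birth: the two GA estimates give two different conception dates.
birth = date(2004, 10, 7)
print(estimate_conception_date(birth, 40))  # using one GA estimate
print(estimate_conception_date(birth, 38))  # using the other GA estimate
```

Because the clinical and LMP-based GA can differ, the same birth can yield two different estimated conception dates, which is why exposure windows are sensitive, in principle, to the choice of GA estimate.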
Exposure to ambient air pollution during pregnancy was calculated using a previously developed gridded data fusion product at a 12-km spatial resolution.30 Specifically, numerical model simulations from the Community Multi-scale Air Quality Model (CMAQ) were bias-corrected with monitoring measurements in Georgia. Each birth was linked to a CMAQ grid cell based on the maternal address census block group at delivery. Exposures during the first and second trimester were obtained by averaging daily concentration estimates for five pollutants: 1-hour maximum carbon monoxide (CO) and nitrogen oxides (NOx); 24-hour average particulate matter less than 2.5 μm in aerodynamic diameter (PM2.5); and the PM2.5 components elemental carbon (EC) and organic carbon (OC). Trimester exposures were calculated separately based on either the clinical or LMP-based GA.
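The trimester averaging step can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the exact trimester windows, so the sketch assumes consecutive 13-week (91-day) windows counted from the estimated conception date, and it represents the daily concentrations for a grid cell as a hypothetical dictionary keyed by date.

```python
from datetime import date, timedelta
from statistics import mean

# Assumption: each trimester is a 91-day window starting at conception.
TRIMESTER_DAYS = 91


def trimester_average(daily_conc: dict, conception: date, trimester: int) -> float:
    """Average daily concentration over the given trimester (1 or 2)."""
    start = conception + timedelta(days=(trimester - 1) * TRIMESTER_DAYS)
    days = [start + timedelta(days=k) for k in range(TRIMESTER_DAYS)]
    values = [daily_conc[d] for d in days if d in daily_conc]
    if not values:
        raise ValueError("no concentration data in the requested window")
    return mean(values)


# Hypothetical daily series for one grid cell: a step change after 91 days.
first_day = date(2003, 1, 1)
series = {
    first_day + timedelta(days=k): (1.0 if k < 91 else 3.0) for k in range(200)
}

# Shifting the conception date by one week changes the average only slightly.
print(trimester_average(series, first_day, 1))
print(trimester_average(series, first_day + timedelta(days=7), 1))
```

Averaging over a window of roughly 13 weeks dampens the effect of a 1- or 2-week shift in the estimated conception date, which is one plausible reason trimester-wide exposures based on the two GA estimates can be highly correlated.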
We considered four different PTB definitions. A birth was designated as a PTB if (1) the LMP-based GA was <37 weeks, (2) the clinical GA was <37 weeks, (3) either the LMP-based or the clinical GA was <37 weeks, or (4) both the LMP-based and the clinical GA were <37 weeks. For PTB definitions (3) and (4), we used the average of trimester exposures calculated using conception dates estimated from LMP-based and clinical GA as the exposure.
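The four definitions can be written compactly as below. This is a minimal sketch rather than the study code: the function and key names are hypothetical, and pairing definitions (1) and (2) with the exposure derived from their own GA estimate is our reading of the text.

```python
PRETERM_CUTOFF_WEEKS = 37  # PTB: GA below 37 completed weeks


def ptb_by_definition(lmp_ga: int, clinical_ga: int) -> dict:
    """Return PTB status under each of the four definitions."""
    lmp_ptb = lmp_ga < PRETERM_CUTOFF_WEEKS
    clinical_ptb = clinical_ga < PRETERM_CUTOFF_WEEKS
    return {
        "lmp": lmp_ptb,                       # definition (1)
        "clinical": clinical_ptb,             # definition (2)
        "either": lmp_ptb or clinical_ptb,    # definition (3)
        "both": lmp_ptb and clinical_ptb,     # definition (4)
    }


def exposure_for_definition(definition: str, lmp_exposure: float,
                            clinical_exposure: float) -> float:
    """Pick the trimester exposure that goes with a PTB definition."""
    if definition == "lmp":
        return lmp_exposure
    if definition == "clinical":
        return clinical_exposure
    if definition in ("either", "both"):
        # Definitions (3) and (4) use the average of the two exposures.
        return (lmp_exposure + clinical_exposure) / 2
    raise ValueError(f"unknown definition: {definition}")


# Hypothetical birth with discordant diagnoses (LMP 36 weeks, clinical 38 weeks).
print(ptb_by_definition(36, 38))
```

A birth with GA estimates on opposite sides of the 37-week cutoff is a case under definitions (1) or (2) and (3), but never under definition (4).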
We first analyzed how PTB outcome uncertainty varied across demographic variables and air pollution exposures. Among births diagnosed as PTB using either the clinical or the LMP-based GA, we defined a discordant indicator equal to one when these two PTB diagnoses differed. Using logistic regression, we regressed the discordant indicator on a set of demographic covariates. Associations between discordance and exposures were then evaluated one at a time by adding each air pollution exposure to the model with demographic covariates. We excluded concordant full-term births in this analysis to avoid comparing the subset of PTBs to a reference group dominated by full-term births.
For each PTB definition, we used logistic regression to estimate associations between pollutant exposures during the first and second trimesters and PTB. In the air pollution and PTB models, we adjusted for maternal education (less than 9th grade, 9th to 12th grade, high school graduate, college), race (Asian, black, Hispanic, white, other), tobacco use during pregnancy, residential county, a smooth function of poverty level as measured by block group–level percent below poverty, and a smooth function of estimated conception date. Smooth functions were parameterized using natural cubic splines with five and twelve degrees of freedom for poverty and conception date, respectively. Other variables including maternal age, alcohol use, and number of previous births were examined as potential confounders but did not impact the air pollution association estimates and were ultimately removed.
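For reference, the model described above can be written schematically as follows. The notation is ours and is only one way to express the stated adjustment set; in particular, treating residential county as a set of indicator effects is an assumption.

```latex
\operatorname{logit}\,\Pr(Y_i = 1)
  = \beta_0 + \beta_1 x_i + \boldsymbol{\gamma}^{\top}\mathbf{z}_i
  + \alpha_{c(i)} + f_1(\mathrm{pov}_i) + f_2(t_i)
```

Here $Y_i$ is the PTB indicator for birth $i$ under a given PTB definition, $x_i$ is the trimester-average pollutant exposure, $\mathbf{z}_i$ collects the indicators for maternal education, race, and tobacco use, $\alpha_{c(i)}$ is the effect of the residential county, $f_1$ is a natural cubic spline of block group–level percent below poverty with five degrees of freedom, and $f_2$ is a natural cubic spline of estimated conception date $t_i$ with twelve degrees of freedom. The quantity of interest is $\beta_1$, the log OR per unit increase in exposure.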
We also directly incorporated the additional uncertainty in the PTB definition using a multiple imputation approach. Binary PTB status was imputed through draws from a binomial distribution defined based on the two estimates of GA. Specifically, the probability of PTB, P, is defined as the proportion of weeks less than 37 among the GA range given by the clinical and the LMP-based estimates. For example, if the two GA estimates for a birth were 33 and 39 weeks, the probability of PTB is P = 4/7 (4 weeks of being PTB among seven total weeks). Concordant PTB status from LMP-based and clinical estimates of GA had P = 1 and concordant full-term births had P = 0. We took draws from the resulting binomial distributions for each birth to obtain 25 imputed data sets and performed separate logistic regressions to estimate air pollution associations with the aforementioned covariates for each set. The resulting 25 coefficient estimates and standard errors for pollutant effects were combined using the method by Rubin.31
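The imputation probability and the pooling step can be sketched as follows. This is a minimal illustration, not the study code: the function names are hypothetical, the per-imputation logistic regressions are omitted, and the pooled variance uses the standard form of Rubin's rules (within-imputation variance plus (1 + 1/m) times the between-imputation variance).

```python
import random
from statistics import mean, variance

PRETERM_CUTOFF_WEEKS = 37


def ptb_probability(ga_a: int, ga_b: int) -> float:
    """Proportion of weeks below 37 in the range spanned by the two GA estimates."""
    low, high = sorted((ga_a, ga_b))
    weeks = range(low, high + 1)
    return sum(week < PRETERM_CUTOFF_WEEKS for week in weeks) / len(weeks)


def impute_ptb(ga_pairs, n_imputations=25, seed=0):
    """Draw binary PTB status for every birth in each imputed data set."""
    rng = random.Random(seed)
    probs = [ptb_probability(a, b) for a, b in ga_pairs]
    return [[int(rng.random() < p) for p in probs] for _ in range(n_imputations)]


def rubin_combine(estimates, std_errors):
    """Pool coefficient estimates and standard errors across imputations."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])   # average sampling variance
    between = variance(estimates)                    # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Worked example from the text: GA estimates of 33 and 39 weeks give P = 4/7.
print(ptb_probability(33, 39))
```

In a full analysis, each of the 25 imputed outcome vectors would be fit with the same logistic regression, and `rubin_combine` would be applied to the resulting pollutant coefficients and standard errors.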
The study cohort consisted of 267,801 singleton births from the 20-county Atlanta metropolitan area. Of these births, 8.31% (n = 22,262) were preterm using LMP estimates of GA; 7.40% (n = 19,828) were preterm using clinical estimates; 9.67% (n = 25,903) were preterm based on either the LMP or clinical determination; and 6.04% (n = 16,187) were preterm when there was concordance between LMP and clinical estimates. Hence, the two PTB diagnoses agreed in only 62.5% of PTBs identified using either the LMP or the clinical estimate of GA. Table 1 provides additional summary statistics of the study cohort characteristics stratified by preterm status. Supplementary Table S1 (http://links.lww.com/EE/A23) provides summary statistics among PTBs based on the four definitions and shows negligible differences in maternal characteristics.
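The counts above are internally consistent, as the following short check shows. It uses only the numbers reported in this paragraph; the variable names are ours.

```python
cohort = 267_801
lmp_ptb = 22_262
clinical_ptb = 19_828
both_ptb = 16_187

# Inclusion-exclusion: births that are preterm under at least one estimate.
either_ptb = lmp_ptb + clinical_ptb - both_ptb


def percent(count: int, total: int, digits: int = 2) -> float:
    """Percentage of `total` represented by `count`, rounded."""
    return round(100 * count / total, digits)


print(either_ptb)                          # births preterm under either estimate
print(percent(lmp_ptb, cohort))            # LMP-based PTB rate
print(percent(clinical_ptb, cohort))       # clinical PTB rate
print(percent(both_ptb, either_ptb, 1))    # agreement among "either" PTBs
```

The agreement percentage is the share of births classified as preterm under both estimates among those classified as preterm under at least one.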
Overall GA estimates were similar between LMP and clinical definitions, but larger disagreements occurred among PTBs. Among all births, 54.1% of GA estimates were identical; 32.4% of estimates differed by 1 week; 10.2% of estimates differed by 2 weeks; and 3.3% of estimates differed by 3 weeks or more. However, among births with either an LMP or clinical PTB diagnosis, only 39.9% of GA estimates were identical and 12.8% differed by 3 weeks or more.
Trimester-wide average pollutant exposures were similar across the three different assessment methods: using the conception date derived from LMP, clinical estimate, or an average of the previous two. Table 2 summarizes the mean exposure level for each pollutant and trimester, as well as the interquartile range (IQR) for the LMP definition. Correlations between exposures based on LMP and clinical estimates were very high, ranging from 0.976 to 0.999, indicating uncertainty in GA had minimal impacts on trimester-average exposures.
Among births with at least one PTB diagnosis (either clinical or LMP), higher odds of disagreement between diagnoses were associated with maternal race/ethnicity (Hispanic versus non-Hispanic, Asian versus White, and White versus Black), married mothers, and tobacco use during pregnancy. Higher trimester-wide exposures to CO and NOx were associated with lower odds of disagreement. Specific odds ratios (ORs) and 95% confidence intervals for this disagreement analysis are given in Supplementary Table S2 (http://links.lww.com/EE/A23).
Figure 1 shows the estimated associations between PTB and average pollutant concentration during trimester 1 and trimester 2 using various PTB definitions. Log ORs and 95% confidence intervals for all exposure and PTB definition combinations are given in Supplementary Table S3 (http://links.lww.com/EE/A23). Adjusting for demographic covariates and spatial-temporal trends, exposure to CO, EC, NOx, and OC during the first trimester was consistently associated with increased odds of PTB using all PTB definitions. CO, EC, NOx, OC, and PM2.5 exposures in the second trimester were associated with PTB under most PTB definitions. Second trimester exposure to NOx, on a per-IQR level, was most strongly associated with PTB.
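For readers converting between the log ORs in the supplementary table and ORs per IQR, the usual rescaling is sketched below. The function name and the numerical inputs are hypothetical, and the interval assumes a normal approximation on the log-odds scale.

```python
import math


def odds_ratio_per_iqr(beta: float, se: float, iqr: float, z: float = 1.96):
    """Convert a per-unit log OR and its standard error to an OR per IQR with a 95% CI."""
    point = math.exp(beta * iqr)
    lower = math.exp((beta - z * se) * iqr)
    upper = math.exp((beta + z * se) * iqr)
    return point, lower, upper


# Hypothetical inputs: a per-unit coefficient chosen so that the OR per IQR is 1.07.
iqr = 2.0
beta = math.log(1.07) / iqr
print(odds_ratio_per_iqr(beta, se=0.01, iqr=iqr))
```

Scaling by the IQR puts pollutants measured in different units on a comparable footing, which is what allows the statement that one pollutant was "most strongly associated" on a per-IQR basis.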
Estimated ORs per IQR exposure using the clinical PTB definition were generally similar to estimates using the LMP PTB definition. Using the most stringent definition of PTB (agreeing diagnoses) consistently yielded the largest ORs. In contrast, ORs obtained from PTB defined using either clinical or LMP-based GA (i.e., least stringent definition) tended to be the lowest among the PTB definitions. For example, average PM2.5 during the second trimester was associated with ORs: LMP OR = 1.07, Clinical OR = 1.08, Either OR = 1.04, and Both OR = 1.13. Across pollutants and trimesters, differences in OR estimates for these two PTB definitions ranged from 0.0% to 8.3%. Similar patterns were observed in stratified analyses by maternal race (black versus non-black), maternal ethnicity (Hispanic versus non-Hispanic), and maternal marital status (Supplementary Figures S1, S2, and S3; http://links.lww.com/EE/A23).
Using imputed PTB status gave point estimates that tended to lie between estimates based on either LMP or clinical diagnoses and estimates based on agreeing diagnoses. More importantly, confidence intervals from the imputed estimates were between 7.4% and 43.8% wider than those from the other PTB definitions. Median increases in interval length across exposures were 30.1%, 25.1%, 10.6%, and 40.5% comparing imputed PTB status to LMP-based, clinical, both, or either PTB diagnosis, respectively.
We observed positive associations between several pollutants and PTB in both the first and second trimester using different PTB definitions. Using the most stringent definition of PTB (agreeing diagnoses) resulted in elevated associations, while using the least stringent definition of PTB (either diagnosis) resulted in the weakest associations. This observation may be attributed to increased specificity, which reduces the number of full-term births misclassified as preterm and thus leads to less effect attenuation. It is also possible that the larger air pollution OR for the more stringent PTB definition is due to the lower baseline rates of PTB.
Uncertainty in GA can contribute to both outcome misclassification and exposure measurement error when timing of exposure during gestation is important. A previous study of PM2.5 and PTB by Rappazzo et al.27 found that substantially more births were classified as PTB using LMP estimates. This is consistent with our data. However, the degree of difference in estimated air pollution associations across different PTB definitions in our study was smaller. This may be because (1) we used trimester-wide averages while Rappazzo et al. used weekly averages, and (2) the comparison by Rappazzo et al. was carried out using two different cohorts because not all birth certificates contained both LMP and clinical GA estimates.
Previous studies have predominantly defined PTB using only the LMP-based GA or only the clinical GA. In our study, these two PTB definitions gave nearly identical OR estimates, suggesting that the choice of LMP-based or clinical GA for defining PTB may have limited impact on between-study heterogeneity. Instead, several other factors have been suggested as potential contributors to the observed heterogeneity in studies of air pollution and birth outcomes.6,7,32 These include differences in particulate matter composition, the distributions of effect modifiers, residual confounding due to the use of various proxy measures of socioeconomic status, the magnitude of air quality measurement errors, and statistical methodologies.
Using a stringent definition of PTB (e.g., concordant LMP and clinical diagnoses), we may minimize the number of true full-term births classified as preterm, but some true PTBs will be classified as full term. However, we consider this pattern of misclassification preferable due to its increased specificity. In our study, more than 90% of births were classified as full term using any definition of PTB. Incorrectly classifying full-term births as preterm would have a large impact by diluting the smaller PTB group with full-term births. Conversely, incorrectly classifying PTBs as full term would have negligible impact due to the large number of full-term births.
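The asymmetry described above can be illustrated with a small calculation. The sketch assumes nondifferential outcome misclassification, and every numerical input (baseline risk, true OR, sensitivity, specificity) is hypothetical and chosen only for illustration; none is estimated from the study data.

```python
def odds(p: float) -> float:
    return p / (1 - p)


def observed_risk(true_risk: float, sensitivity: float, specificity: float) -> float:
    """Probability of being classified as PTB given the true PTB risk."""
    return sensitivity * true_risk + (1 - specificity) * (1 - true_risk)


def observed_odds_ratio(risk_unexposed: float, true_or: float,
                        sensitivity: float, specificity: float) -> float:
    """OR that would be observed after nondifferential outcome misclassification."""
    odds_exposed = true_or * odds(risk_unexposed)
    risk_exposed = odds_exposed / (1 + odds_exposed)
    p0 = observed_risk(risk_unexposed, sensitivity, specificity)
    p1 = observed_risk(risk_exposed, sensitivity, specificity)
    return odds(p1) / odds(p0)


# Hypothetical scenario: rare outcome with a modest true association.
BASELINE_RISK = 0.06
TRUE_OR = 1.10

# Stringent definition: some true PTBs are missed, no false PTBs.
or_missed_cases = observed_odds_ratio(BASELINE_RISK, TRUE_OR, 0.80, 1.00)
# Lenient definition: all true PTBs are found, some full-term births are miscounted.
or_false_cases = observed_odds_ratio(BASELINE_RISK, TRUE_OR, 1.00, 0.97)

print(or_missed_cases, or_false_cases)
```

Under these hypothetical inputs, missing one in five true PTBs leaves the OR close to its true value, whereas a small false-positive rate among the much larger full-term group pulls the OR noticeably toward the null.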
We found that trimester-wide average exposures were not sensitive to the GA estimate (LMP-based, clinical, or their average) used to define the exposure window. Hence, GA uncertainty likely contributes minimal exposure measurement error relative to other sources such as maternal residential mobility33,34 and spatiotemporal exposure modeling of air pollution concentration.35
We found several demographic variables (e.g., married versus unmarried mother, and maternal race White versus Black) to be associated with a higher rate of discordant diagnoses. These associations may reflect differences in GA across subpopulations, where very preterm GA is likely to have fewer discordant diagnoses. For example, among births with at least one PTB diagnosis, the average LMP GA was 35.1 weeks for married mothers versus 34.8 weeks for unmarried mothers and 35.2 weeks for White mothers versus 34.7 weeks for Black mothers.
Even though the birth certificate provides two estimates of GA, the true GA cannot be ascertained given the retrospective nature of the study design. We hence implemented a multiple imputation approach to incorporate the uncertainty and variability associated with the estimated GA and, consequently, the PTB diagnosis. Multiple imputation has been utilized to address outcome misclassification when validation data are available to estimate sensitivity and specificity.36 Given the large study sample size, we found robust associations between air pollutant exposure and PTB in our analyses with imputation, despite increases in confidence interval widths. This result suggests that findings from previous studies may not be qualitatively different despite the presence of outcome misclassification.
Several additional issues regarding PTB misclassification warrant future investigations. First, our imputation model assumes that the true GA is between the clinical and LMP estimates from the birth records; the true GA may be outside this range. Second, we focused solely on the first and second trimester exposure where the exposure window has fixed length and is only referenced by the estimated conception date. For time-varying and short-term exposures, further methods development is needed in order to handle outcome misclassification when time-to-event models are used to analyze PTB or log-linear models are used to analyze time-series of PTB counts.
Our study does not call into question results from previous ambient air pollution and PTB research using either LMP or clinical birth record estimates of GA, although reported associations may be underestimated compared to those obtained using a more stringent definition of PTB. Furthermore, associations reported in previous studies are likely not due to outcome misclassification, based on our findings using a multiple imputation approach to incorporate uncertainty in PTB diagnosis. The impacts of PTB uncertainties should be further examined in other study regions and time periods. We encourage exploring different definitions of PTB when possible and recommend the use of a PTB definition based on both clinical and LMP-based criteria. While requiring agreement between clinical and LMP PTB determinations will reduce power due to a decreased number of cases, studies leveraging birth records can likely achieve sufficient sample size. In our study, we did not observe a substantial increase in standard error when using agreeing LMP and clinical definitions compared with using either the LMP or the clinical definition of PTB.
Conflicts of interest statement
The authors declare that they have no conflicts of interest with regard to the content of this report.
1. Hack M, Klein NK, Taylor HG. Long-term developmental outcomes of low birth weight infants. Future Child. 1995;5:176–196.
2. Institute of Medicine. Preterm Birth: Causes, Consequences, and Prevention. Washington, D.C.: The National Academies Press; 2006.
3. Goldenberg RL, Culhane JF, Iams JD, Romero R. Epidemiology and causes of preterm birth. Lancet. 2008;(9606):75–94.
4. Moster D, Lie RT, Markestad T. Long-term medical and social consequences of preterm birth. N Engl J Med. 2008;359:262–273.
5. Saigal S, Doyle LW. An overview of mortality and sequelae of preterm birth from infancy to adulthood. Lancet. 2008;371:261–269.
6. Li X, Huang S, Jiao A, et al. Association between ambient fine particulate matter and preterm birth or term low birth weight: an updated systematic review and meta-analysis. Environ Pollut. 2017;227:596–605.
7. Stieb DM, Chen L, Eshoul M, Judek S, et al. Ambient air pollution, birth weight and preterm birth: a systematic review and meta-analysis. Environ Res. 2012;117:100–111.
8. Shah PS, Balkhair T. Air pollution and birth outcomes: a systematic review. Environ Int. 2011;37(2):498–516.
9. US Environmental Protection Agency. Integrated Science Assessment for Particulate Matter. Research Triangle Park, NC: National Center for Environmental Assessment; 2009.
10. US Environmental Protection Agency. Integrated Science Assessment for Carbon Monoxide. Research Triangle Park, NC: National Center for Environmental Assessment; 2010.
11. US Environmental Protection Agency. Integrated Science Assessment for Ozone and Related Photochemical Oxidants. Research Triangle Park, NC: National Center for Environmental Assessment; 2013.
12. Howland RE, Mulready-Ward C, Madsen A, et al. Reliability of reported maternal smoking: comparing the birth certificate to maternal worksheets and prenatal and hospital medical records, New York City and Vermont, 2009. Matern Child Health J. 2015;19(9):1916–1924.
13. Northam S, Knapp TR. The reliability and validity of birth certificates. J Obstet Gynecol Neonat Nurs. 2006;35:3–12.
14. Parker JD, Schoendorf KC. Implications of cleaning gestational age data. Paediatr Perinat Epidemiol. 2002;16:181–187.
15. Alexander GR. The accurate measurement of gestational age: a critical step toward improving fetal death reporting and perinatal health. Am J Publ Health. 1997;87:1323–1327.
16. Lynch CD, Zhang J. The research implications of the selection of gestational age estimation method. Paediatr Perinat Epidemiol. 2007;21:86–96.
17. Wingate MS, Alexander GR, Buekens P, Vahratian A. Comparison of gestational age classifications: date of last menstrual period vs. clinical estimate. Ann Epidemiol. 2007;17(6):425–430.
18. Qin C, Hsia J, Berg C. Variation between last-menstrual-period and clinical estimates of gestational age in vital records. Am J Epidemiol. 2008;167(6):646–652.
19. Mustafa G, David RJ. Comparative accuracy of clinical estimate versus menstrual gestational age in computerized birth certificates. Publ Health Rep. 2001;116:15–21.
20. Ananth CV. Menstrual versus clinical estimate of gestational age dating in the United States: temporal trends and variability in indices of perinatal outcomes. Paediatr Perinat Epidemiol. 2007;21(Suppl 2):22–30.
21. Hao H, Chang HH, Holmes HA, et al. Air pollution and preterm birth in the U.S. state of Georgia (2002–2006): associations with concentrations of 11 ambient air pollutants estimated by combining Community Multiscale Air Quality Model (CMAQ) simulations with stationary monitor measurements. Environ Health Perspect. 2016;124:875–880.
22. Rappazzo K, Daniels J, Messer L, et al. Exposure to fine particulate matter during pregnancy and risk of preterm birth among women in New Jersey, Ohio, and Pennsylvania, 2000–2005. Environ Health Perspect. 2014;122(9):992–997.
23. Zhao N, Qui J, Zhang Y, et al. Ambient air pollutant PM10 and risk of preterm birth in Lanzhou, China. Environ Int. 2015;76:71–77.
24. Chang HH, Reich BJ, Miranda ML. Time-to-event analysis of fine particle air pollution and preterm birth: results from North Carolina, 2001–2005. Am J Epidemiol. 2012;175(2):91–98.
25. Hyder A, Lee HJ, Ebisu K, et al. PM2.5 exposure and birth outcomes: Use of satellite- and monitor-based data. Epidemiology. 2014;25(1):58–67.
26. Mendola P, Wallace M, Hwang B, et al. Preterm birth and air pollution: critical windows of exposure for women with asthma. J Allergy Clin Immunol. 2017;183(2):432–440.
27. Rappazzo K, Lobdell L, Messer L, et al. Comparison of gestational dating methods and implications for exposure-outcome associations: an example with PM2.5 and preterm birth. Occup Environ Med. 2017;74:138–143.
28. Barnett A. Time-dependent exposure and the fixed-cohort bias. Environ Health Perspect. 2011;119(10):a422–a423.
29. Strand L, Barnett A, Tong S. Methodological challenges when estimating the effects of season and seasonal exposures on birth outcomes. BMC Med Res Methodol. 2011;11:49.
30. Friberg MD, Zhai X, Holmes HA, et al. Method for fusing observational data and chemical transport model simulations to estimate spatiotemporally resolved ambient air pollution. Environ Sci Technol. 2016;50:3695–3705.
31. Rubin DB. Multiple Imputation for Nonresponse in Surveys. New York, NY: Wiley & Sons; 1987.
32. Dadvand P, Parker J, Bell ML, et al. Maternal exposure to particulate air pollution and term birth weight: a multi-country evaluation of effect and heterogeneity. Environ Health Perspect. 2013;121(3):367–373.
33. Pennington AF, Strickland MJ, Klein M, et al. Measurement error in mobile source air pollution exposure estimates due to residential mobility during pregnancy. J Expo Sci Environ Epidemiol. 2017;27:513–520.
34. Pereira G, Bracken MB, Bell ML. Particulate air pollution, fetal growth and gestational length: the influence of residential mobility in pregnancy. Environ Res. 2016;147:269–274.
35. Keller JP, Chang HH, Strickland MJ, Szpiro AA. Measurement error correction for predicted spatiotemporal air pollution exposures. Epidemiology. 2017;28(3):338–345.
36. Edwards JK, Cole SR, Troester MA, Richardson DB. Accounting for misclassified outcomes in binary regression models using multiple imputation with internal validation data. Am J Epidemiol. 2013;(9):904–912.