Although ozone exposure has well-documented influences on respiratory health, including increased airway resistance and lung function decrements,^{1,2} determining and quantifying its causal effect on premature mortality remains a challenge. In recent regulatory impact analyses of air pollution control measures,^{3–5} the U.S. Environmental Protection Agency (EPA) excluded the ozone–mortality relationship from primary benefits estimates, stating that the epidemiologic literature was too uncertain to infer causality and provide reasonable quantitative estimates.

For ozone–mortality time-series studies, there are a few primary concerns. First, ozone concentrations are frequently correlated with concentrations of other air pollutants, which themselves have been causally linked with mortality. Because of the strong and consistent epidemiologic literature linking particulate matter (PM) and mortality,^{6–9} any observed relationship between ozone and mortality could simply reflect PM effects that were not adequately captured in the analysis. Other air pollutants such as carbon monoxide (CO), sulfur dioxide (SO_{2}), or nitrogen dioxide (NO_{2}) may be influential as well. Second, because ozone formation is greatest on hot and humid days, which are independently associated with increased mortality,^{10,11} proper statistical controls for weather in the analysis are crucial (and often difficult to model).

In addition, because ozone is highly reactive in indoor environments (where people spend most of their time), ambient ozone concentrations tend to be higher than personal ozone exposures.^{12,13} This makes epidemiologic findings based on central site monitors difficult to interpret. Air conditioning further reduces indoor ozone concentrations, especially on the hot and humid days when ozone concentrations are generally highest.^{12–14} Finally, annually averaged ozone–mortality risks are hard to interpret due to seasonal differences in ozone levels, an inadequate dynamic range during the winter, and differences among personal exposure–ambient concentration or ambient pollutant relationships by season.

Because of these factors, as well as differences in the analytical methods used to address them, simply pooling the results of the published literature has limited interpretability. An alternative approach is to exclude studies that did not adequately address all potentially influential factors. However, given the current literature, this would result in the exclusion of most studies. For example, the U.S. EPA^{3} evaluated 31 published studies and chose 4 studies in the United States^{15–18} that were potentially applicable for regulatory impact analyses. None of these studies controlled for fine particulate matter (PM_{2.5}), because monitoring data were not available at the times of the analyses. Similarly, an evaluation of 50 published estimates^{19} yielded 6 estimates (4 in the United States and 2 in Europe) that met basic screening criteria for a pooled estimate. These pooled estimates are therefore hard to interpret or to apply in regulatory impact analyses, and substantial uncertainties remain.

One approach for dealing with a varied and uncertain literature is to conduct a metaregression, controlling for site or study characteristics that could account for between-study variability. Known confounders or effect modifiers can be characterized external to the study, thus increasing the number of studies eligible for the analysis and allowing insight about the relationship between ozone and mortality. In this study, we identified the published literature on short-term effects of ozone on mortality and conducted a metaregression to determine the influence of other air pollutants, concentration–exposure relationships, weather, and study design on the magnitude of this relationship.

#### METHODS

We constructed our database by identifying time-series studies from recent meta-analyses,^{19,20} the EPA PM criteria document,^{9} a database provided by the U.S. EPA (Lisa Conner, U.S. EPA, personal communication, 2003), and a Medline search in October 2003 using the keywords “ozone” and “mortality.” We excluded studies that were not time-series studies of all-cause, all-age mortality risks or were not peer-reviewed and publicly available.

This approach yielded 71 potential studies,^{6,8,15–18,21–85} with some studies reporting multiple city-specific estimates. We removed estimates in which city-specific relative risks or their variances were not reported by the authors,^{8,27,44,51,53,56,64,71,72,82,85} estimates that were exact duplicates (same authors, cities, study years, and estimates) or superseded by more recent work by the same authors,^{23,34,40,46,56,60,68,76} and studies that lacked all-age, year-round relative risks.^{30,32,36,63,66,77,83} We also omitted publications from the National Morbidity and Mortality Air Pollution Study (NMMAPS),^{6,70} because these data were being reanalyzed separately by other investigators, and we omitted studies for which we were unable to obtain air pollution data necessary for the metaregression.^{31,42,43,49,50,57,61,73,74,80,81} Because only 2 of the remaining studies were in developing countries,^{24,25} we excluded these studies to avoid highly influential observations but tested the sensitivity of our findings to their inclusion. Twenty-eight studies remained, from which 48 city-specific relative risk estimates were available. When possible, we also extracted season-specific relative risk estimates, yielding 14 summer/ozone season values^{16,21,22,37,39,41,55,58,75,77,78,83} and 10 winter/nonozone season values.^{16,21,22,37,39,41,55,75,77}

We selected ozone relative risk estimates that were derived without controlling for other air pollutants. In studies that presented estimates with numerous lag times and averaging times, we selected same-day ozone concentrations when available. This approach was consistent with the majority of the literature, although it did not always yield the primary estimate reported by the authors. All relative risks were converted into percentage increases in mortality per 10 μg/m^{3} of 1-hour maximum ozone. Estimates that were based on ozone measured in parts per billion were converted to μg/m^{3}, assuming standard temperature and pressure (1 ppb = 1.96 μg/m^{3}). We applied conversion factors to move among 1-hour maximum, 8-hour maximum, and 24-hour average concentrations, with a ratio of 4:3:2 assumed at all sites based on the difference between the median and 95th percentile concentrations across sites for 1996–2000 monitoring data in the United States.^{86} Because these conversions could vary across sites and years, we evaluated indicator variables within the regression models to ensure that neither conversion was influential.
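The unit and averaging-time conversions described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `to_percent_per_10ug_1h_max` and the dictionary keys are hypothetical, the constants (1.96 μg/m^{3} per ppb and the 4:3:2 ratio) are those quoted in the text, and linear rescaling of percentage effects is assumed, which is a reasonable approximation only for small effects.

```python
# Illustrative helper; names are hypothetical, constants are quoted from the text.
UG_PER_PPB = 1.96  # 1 ppb ozone = 1.96 ug/m^3

# Assumed 4:3:2 ratio of 1-hour max : 8-hour max : 24-hour average
AVERAGING_RATIO = {"1h_max": 4.0, "8h_max": 3.0, "24h_avg": 2.0}


def to_percent_per_10ug_1h_max(percent, increment, units, metric):
    """Rescale a percent increase in mortality per `increment` of ozone
    (in `units`, measured as `metric`) to a percent increase per
    10 ug/m^3 of 1-hour maximum ozone, using linear scaling."""
    if units == "ppb":
        increment_ug = increment * UG_PER_PPB
    elif units == "ug/m3":
        increment_ug = increment
    else:
        raise ValueError("units must be 'ppb' or 'ug/m3'")
    # A given increment in the reported metric corresponds to a larger
    # increment in 1-hour maximum (e.g. 10 ug/m^3 of 24-hour average
    # corresponds to 20 ug/m^3 of 1-hour maximum under the 4:3:2 ratio).
    increment_1h = increment_ug * AVERAGING_RATIO["1h_max"] / AVERAGING_RATIO[metric]
    return percent * 10.0 / increment_1h


# A 2.9% increase per 100 ppb of 1-hour maximum ozone
epa_example = to_percent_per_10ug_1h_max(2.9, 100.0, "ppb", "1h_max")
```

Under these assumptions, a 2.9% increase per 100 ppb of 1-hour maximum ozone rescales to roughly 0.15% per 10 μg/m^{3}, and an estimate reported per 10 μg/m^{3} of 24-hour average ozone is halved when expressed per 10 μg/m^{3} of 1-hour maximum.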

We categorized studies as GAM-affected or not, in which GAM-affected means that a study used generalized additive models without updating the default convergence criteria, an approach that could influence effect estimates and standard errors.^{87} We also classified studies by their approach for temperature control using categories proposed elsewhere^{20}: 1) studies that used a linear term for temperature (potentially missing the U-shaped relationship between temperature and mortality), 2) those that added dummy variables for extreme hot/cold days to the linear term, or 3) those that incorporated nonlinear temperature terms.

To capture the relationship between ozone concentrations and concentrations of other criteria pollutants (PM_{10}, PM_{2.5}, CO, SO_{2}, and NO_{2}) for each study, we constructed univariate regressions with ozone as the independent variable at each study site. This approach yields the magnitude of the change in each copollutant concentration associated with a unit change in ozone concentration, rather than only the correlation. Because we did not have access to the raw data used within the studies, we relied on publicly available monitoring data gathered from the U.S. EPA Air Quality System,^{88} the European Environmental Agency Air Base,^{89} the U.K. National Air Quality Information Archive,^{90} and the Canadian National Air Pollution Surveillance System.^{91} When data were available for the years of the epidemiologic studies within the cities or counties studied, those data were used. If those data were unavailable, we first expanded the years surrounding the study dates, followed by the counties surrounding the study site. PM_{2.5} data were not available for European cities. We used 1-hour maximum concentrations for ozone and CO, and daily average concentrations for all other pollutants. Regressions were run across the entire year as well as restricted to ozone season (May through October) and nonozone season (November through April). In addition, we calculated annual average 1-hour maximum ozone concentrations to capture potential nonlinear concentration–response relationships.
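A site-level regression of this kind can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the daily concentrations are invented for demonstration, and ordinary least squares is assumed because the text does not specify the estimator.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def is_ozone_season(month):
    """Ozone season as defined in the text: May through October."""
    return 5 <= month <= 10


# Invented daily values for one hypothetical site: (month, ozone, NO2)
daily = [(6, 40.0, 50.0), (7, 60.0, 46.0), (7, 80.0, 40.0),
         (8, 100.0, 37.0), (8, 120.0, 30.0), (1, 30.0, 55.0)]

# Restrict to ozone season, then regress NO2 on ozone
season = [(o3, no2) for month, o3, no2 in daily if is_ozone_season(month)]
ozone = [o3 for o3, _ in season]
no2 = [v for _, v in season]
slope, intercept = ols_slope_intercept(ozone, no2)
```

In this invented example the slope is negative: higher ozone days coincide with lower NO_{2}. The slope, rather than the correlation, is the quantity that would be carried forward as a site-level covariate.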

We considered 2 variables as proxies for the ozone personal exposure–ambient concentration relationship. Cooling degree days theoretically captures changes in activity patterns and ventilation used to reduce exposure to high temperatures, and residential central air-conditioning prevalence indicates whether climate control is likely to be based primarily on air-conditioning use (which would decrease indoor ozone levels) or on open windows (which would increase indoor ozone levels). Because daily temperature data were not available for all sites, we estimated cooling degree days using monthly mean temperatures.^{92} For the cities where daily data were available,^{93,94} the estimated cooling degree days were highly correlated with the actual (r = 0.99).
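One simple way to approximate annual cooling degree days from monthly mean temperatures is sketched below. It is not the published method cited in the text, which may differ: it assumes a base temperature of 18.3°C (65°F), which the text does not state, and it treats every day in a month as being at the monthly mean, so months whose mean is below the base contribute nothing even if some days were warmer. The function name is hypothetical.

```python
# Days per month in a non-leap year
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

# Assumed base temperature (65 F = 18.3 C); not specified in the text
BASE_TEMP_C = 18.3


def estimate_cooling_degree_days(monthly_mean_temps_c, base_temp_c=BASE_TEMP_C):
    """Approximate annual cooling degree days (degree C days) from 12
    monthly mean temperatures, January through December."""
    if len(monthly_mean_temps_c) != 12:
        raise ValueError("expected 12 monthly mean temperatures")
    return sum(
        days * max(0.0, temp - base_temp_c)
        for days, temp in zip(DAYS_IN_MONTH, monthly_mean_temps_c)
    )


# Invented monthly means for a hypothetical temperate city
temps = [0, 2, 6, 11, 16, 20, 23, 22, 18, 12, 6, 1]
cdd = estimate_cooling_degree_days(temps)
```

Only June, July, and August exceed the assumed base in this invented example, so they alone contribute to the total.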

Air-conditioning prevalence data for the years approximately corresponding to the epidemiologic studies were obtained for U.S. cities^{95} and Canadian provinces.^{96,97} Data were available for selected cities in Canada and all provinces in 2001,^{98} and the ratio between city and province prevalence was used to adjust the earlier province data when available. Air-conditioning prevalence data for Europe were not available. Because there is no theoretical reason to suspect a linear relationship between cooling degree days and ozone–mortality relative risk, we used an indicator variable for cooling degree days above or below the median across sites. We similarly used an indicator variable for air-conditioning prevalence above or below the median among the sites with data, assuming all sites in Europe have prevalence below the median. We used actual air-conditioning prevalence for analyses on subsets of the data.

For our metaregression, we applied a hierarchical linear model as derived by Raudenbush and Bryk^{99} and applied in a previous air pollution meta-analysis.^{100} For this application, the level-1 model is defined as $\beta_i = \mu_i + \varepsilon_i$, where $\beta_i$ is the reported effect for study $i$, $\mu_i$ is the true effect, and $\varepsilon_i \sim N(0, s_i^2)$, where $s_i^2$ is the reported variance of the effect estimate. In the level-2 model, $\mu_i = W_i'\gamma + \delta_i$, where $W_i$ is a vector of site or study characteristics (eg, air-conditioning prevalence, GAM-affected), $\gamma$ is a vector of regression coefficients to be determined, and $\delta_i \sim N(0, \tau^2)$ represents the unexplained between-study variability. $\tau^2$ is derived by maximizing the log of the likelihood function, which is proportional to

$$-\sum_i \log\left(s_i^2 + \tau^2\right) - \sum_i \frac{\left(\beta_i - W_i'\gamma^*\right)^2}{s_i^2 + \tau^2}$$

where $\gamma^*$ is the maximum likelihood estimate for the vector of derived coefficients, defined as $\left(\sum_i \lambda_i W_i W_i'\right)^{-1} \sum_i \lambda_i W_i \beta_i$, where $\lambda_i$ is $\tau^2/(s_i^2 + \tau^2)$.^{99,100} Posterior empiric Bayes estimates can be calculated as weighted averages of the reported effect $\beta_i$ and the best-fit model output $\hat{\sigma}_i = W_i'\gamma^*$, where $\beta_i$ is weighted by $\tau^2/(\tau^2 + s_i^2)$ and $\hat{\sigma}_i$ is weighted by $s_i^2/(\tau^2 + s_i^2)$. Thus, as the regression model explains more between-study variability, the posterior estimate is more heavily weighted toward the model output rather than the observed value, but studies with greater statistical power have posterior estimates closer to the reported values. The metaregression analysis was conducted using HLM (version 5.05; Scientific Software International, Lincolnwood, IL).
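The two-level model described above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not a reproduction of the HLM software: the function names are hypothetical, the data are invented, the marginal distribution of each reported effect is taken to be normal with variance equal to the reported variance plus the between-study variance, and the between-study variance is found by a crude grid search rather than a proper optimizer. The weights 1/(s² + τ²) used here give the same coefficient vector as weights τ²/(s² + τ²), because the constant factor τ² cancels.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_gamma(beta, s2, W, tau2):
    """Weighted least squares coefficients for a given tau^2."""
    p = len(W[0])
    w = [1.0 / (v + tau2) for v in s2]
    a = [[sum(wi * Wi[j] * Wi[k] for wi, Wi in zip(w, W)) for k in range(p)]
         for j in range(p)]
    c = [sum(wi * Wi[j] * bi for wi, Wi, bi in zip(w, W, beta)) for j in range(p)]
    return solve(a, c)


def fitted_values(gamma, W):
    return [sum(g * x for g, x in zip(gamma, Wi)) for Wi in W]


def log_likelihood(beta, s2, W, tau2):
    """Log-likelihood (up to a constant) with gamma profiled out."""
    fit = fitted_values(fit_gamma(beta, s2, W, tau2), W)
    return -0.5 * sum(math.log(v + tau2) + (b - f) ** 2 / (v + tau2)
                      for b, v, f in zip(beta, s2, fit))


def empiric_bayes(beta, s2, W, grid_steps=2000):
    """Return (tau2, gamma, fitted, posterior) for the two-level model."""
    upper = (max(beta) - min(beta)) ** 2  # crude search range for tau^2
    grid = [upper * k / grid_steps for k in range(grid_steps + 1)]
    tau2 = max(grid, key=lambda t: log_likelihood(beta, s2, W, t))
    gamma = fit_gamma(beta, s2, W, tau2)
    fit = fitted_values(gamma, W)
    # Shrink each reported effect toward its fitted value
    posterior = [(tau2 * b + v * f) / (tau2 + v)
                 for b, v, f in zip(beta, s2, fit)]
    return tau2, gamma, fit, posterior


# Invented data: six estimates, an intercept, and one 0/1 site indicator
beta = [0.10, 0.30, 0.25, 0.45, 0.05, 0.40]
s2 = [0.01, 0.02, 0.005, 0.03, 0.01, 0.02]
W = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
tau2, gamma, fit, posterior = empiric_bayes(beta, s2, W)
```

Each posterior estimate lies between the reported effect and the model's fitted value, and it moves closer to the reported effect as the reported variance shrinks relative to the between-study variance.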

For the regression models, given a relatively small sample size and large number of potential covariates, we used an approach similar to forward regression in which the empiric Bayes residuals are linearly regressed against the remaining predictors and the most significant terms are entered sequentially,^{101} with a threshold of t = 1.7 (corresponding to *P* = 0.10).

#### RESULTS

Single-pollutant central effect estimates ranged from a 1.1% decrease in daily mortality per 10-μg/m^{3} increase of 1-hour maximum ozone to a 1.7% increase, with many estimates between 0.1% and 0.5% (Table 1). Figure 1 shows that a clear “funnel”-shaped trend occurs with increasing sample size, suggesting that many extreme effect estimates are due to small sample size.

The maximum and minimum effect estimates have central estimates approximately 4 standard deviations from the mean, whereas all other values are within 2 standard deviations of the mean. Because these outliers might skew estimates of heterogeneity and influence regression results, we removed those estimates for the primary analyses. Of the 46 remaining estimates, 18 were reported by the authors to be “statistically significant” (*P* < 0.05). When rank-ordered by total number of deaths (the product of the length of the time-series and number of daily deaths), only 2 of 16 estimates in the lowest tertile are statistically significant, versus 4 of 15 in the second tertile and 12 of 15 in the highest tertile.

There is significant heterogeneity among the effect estimates as determined using Cochran's Q-statistic (*P* < 0.001), even after removing the 2 outliers. With no covariates in the metaregression, the resulting single-pollutant grand mean is a 0.21% increase in daily mortality per 10-μg/m^{3} increase of 1-hour maximum ozone (95% confidence interval [CI] = 0.16% to 0.26%). The grand mean is unchanged by inclusion of the 2 outliers or of the 2 Mexico City estimates.^{24,25}
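For readers unfamiliar with the heterogeneity test, Cochran's Q can be computed from the estimates and their variances as sketched below. The function name and the three data points are invented for illustration, and the pooled value shown is a fixed-effect inverse-variance mean, which is a simplification of the grand mean from the hierarchical model.

```python
import math


def cochran_q(estimates, variances):
    """Inverse-variance pooled mean, 95% CI, Cochran's Q, and its
    degrees of freedom."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_w
    # Q is the weighted sum of squared deviations from the pooled mean
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    se = math.sqrt(1.0 / total_w)
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "df": len(estimates) - 1,
    }


# Invented example: three estimates with equal variances
result = cochran_q([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
```

Under the null hypothesis of homogeneity, Q is compared against a chi-square distribution with the returned degrees of freedom, for example with a survival function such as `scipy.stats.chi2.sf(q, df)` if SciPy is available.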

The magnitude of the ozone–mortality relationship differs substantially across seasons (Fig. 2). The single-pollutant grand mean for summer estimates is a 0.43% increase in daily mortality per 10-μg/m^{3} increase of 1-hour maximum ozone (95% confidence interval = 0.29% to 0.56%) versus a grand mean for winter estimates of −0.02% (−0.17% to 0.14%).

Although univariate relationships between predictive variables and the ozone–mortality relationship have limited interpretability given correlations among predictors, we present selected stratified values in Table 2. Most predictors do not substantially influence the estimates, although same-day effects appear somewhat greater than lagged effects, and effects appear greater at sites with lower annual average ozone concentrations. Effect estimates are higher for studies using nonlinear temperature terms, as hypothesized, although only 3 of 46 estimates were derived using linear temperature terms. There is a modest inverse relationship between effect estimates and air-conditioning prevalence, although few estimates were available in locations with high central air-conditioning prevalence (Fig. 3).

There is a broad distribution across sites in the regression relationships between ozone and other pollutants, with only limited evidence of a correlation with the ozone–mortality effect estimate (Fig. 4). Positive slopes, which might indicate potential confounding, are seen only for annual and summer PM_{2.5}, although the relationships are weak. For the gaseous pollutants, the regression coefficients at many sites are negative, indicating that higher ozone concentrations are associated with lower levels of these pollutants. The relationship between PM and ozone is generally positive, especially during the summer (Fig. 4). However, these univariate relationships must be interpreted with caution given the influence of other factors.

Applying our hierarchical linear model to the full set of estimates, the 3 predictors entering the forward regression model for the ozone–mortality effect estimate are the lag time (in days), the residential central air-conditioning prevalence (above or below the median), and the ozone–NO_{2} regression coefficient (all year) (Table 3). This model indicates that ozone has a greater effect in cities with less air conditioning and where there is a positive relationship between ozone and NO_{2}. Also, same-day ozone effects are greater than lagged effects. This model yields posterior city-specific effect estimates that range from −0.1% to 0.4% (Fig. 5).

Clearly, our regression findings could be sensitive to many factors. First, to determine the influence of the assumed conversion factors, we evaluated whether indicator variables for reported units (ppb vs. μg/m^{3}) or measurement time (1-hour maximum, 8-hour maximum, 24-hour average) were significant if added to the final multivariate model. Only the 8-hour dummy variable was statistically significant (*P* = 0.01), indicating that studies using 8-hour maxima had slightly lower estimates than studies using 1-hour maxima or 24-hour averages.

To get a better sense of the influence of both air conditioning and PM_{2.5}, we restricted the analysis to the 27 U.S. and Canadian estimates for which data on these variables were available. Applying the model from Table 3 yields nearly identical results, with coefficients close to the original values. Using our forward regression approach, there is no evidence of a positive effect from the ozone–PM_{2.5} regression coefficient, and using the actual air-conditioning prevalence does not change our model findings.

Given concern about influential points, we tested whether the estimated effect of the ozone–NO_{2} regression coefficient changed when extreme values were deleted. If the minimum value is deleted, the optimal forward regression model includes only lag time and air-conditioning prevalence, and the metaregression coefficient for the ozone–NO_{2} term decreases from 0.77 to 0.48.

Our final sensitivity analysis considers only studies with number of deaths above the median to determine whether studies with limited statistical power might be influential. Applying the regression model defined in Table 3, the coefficients for lag time and air-conditioning prevalence are similar, whereas the metaregression coefficient for the ozone–NO_{2} term decreases from 0.77 to 0.22. Only the lag time enters into the forward regression model (*P* = 0.08).

#### DISCUSSION

Results from our primary model imply that between-study variability in ozone-related mortality can be partially explained by differences in the lag time, air-conditioning prevalence, and relationship between ambient ozone and nitrogen dioxide concentrations. For lag time and air conditioning, the results are robust and intuitive, and suggest that same-day ozone effects exceed lagged effects and that the ambient ozone–mortality relationship might be lower in cities with greater prevalence of residential central air conditioning (and therefore lower personal exposure to ozone).

The less robust influence of NO_{2}, along with the weak effect of PM_{2.5}, is harder to interpret. Given the evidence demonstrating a relationship between ambient PM_{2.5} and mortality, a stronger association for the PM_{2.5}–ozone regression coefficient may have been anticipated. Furthermore, univariate relationships (Fig. 4) appear stronger for PM_{2.5} than for NO_{2}, and correlation analyses show that summertime PM_{2.5} has the strongest mean correlation with ambient ozone among the criteria pollutants and seasons (Fig. 6). Our findings could be related to difficulties in identifying causal factors in a multivariate context, limitations in our ambient pollution data, or might indicate that the use of air pollution regression coefficients in hierarchical linear models is not the optimal approach for evaluating confounding. Although the magnitude of the relationship between ozone and copollutants is influential, so is the strength of the correlation, and the latter might better capture confounding. However, repeating our primary analysis using correlations rather than regression coefficients yielded identical findings.

Beyond the regression findings, we can reach some broad conclusions about the ozone–mortality relationship. Fewer than half of the studies in our analysis reported “statistically significant” findings, which is largely a function of the statistical power of the studies. This observation provides justification for a meta-analytic approach, which helps to combine evidence from individual studies lacking statistical power. In addition, we documented a substantial difference in the ozone–mortality relationship between the summer and winter (Fig. 2).

Clearly, our metaregression has many limitations. Although we attempted to capture the crucial dimensions of methodologic heterogeneity, there are many factors either difficult to quantify or unreported by the authors that could influence effect estimates. This is exemplified by the fact that estimates sometimes differed for studies conducted within the same city, although many of the regression covariates were identical (a factor that limited the predictive power of our regressions). More complex terms reflecting the degrees of freedom used in temperature spline models, for example, might capture some of this uncertainty. More broadly, our pooled estimates depend on the statistical methods applied in the past. For example, recent time-series studies have applied distributed lag models to evaluate the influence of longer time windows,^{7,102,103} something that cannot be done in a metaregression if the original studies did not follow this approach.

Furthermore, if the ozone–mortality relationship varies geographically, then studies included in the metaregression must be spatially representative to yield generalizable results. Although air conditioning appeared to modify the ozone effect, it is difficult to evaluate potential effect modification given few studies in settings and time periods with high central air-conditioning prevalence. Of our 46 estimates, only 4 were in settings with air-conditioning prevalence above 50%.^{33,47,59} All 4 of these estimates lacked statistical power, with 3 based on only 1 year of data.^{33,47} It is therefore difficult to make definitive conclusions about the influence of residential air conditioning on the ozone–mortality relationship. Further time-series studies should be conducted in warm settings with high air-conditioning prevalence to determine the importance of this factor, and studies such as NMMAPS should examine potential effect modification by air-conditioning prevalence. Given the growth of air-conditioning use in many locations, especially in the United States, understanding this influence would be crucial in developing concentration–response functions for prospective regulatory impact analyses. Furthermore, air-conditioning prevalence is only a rough surrogate of residential ventilation and personal exposure patterns, and so more refined indicators should be investigated.

Another limitation is the fact that multiple studies found that ozone was “statistically insignificant” without reporting quantitative estimates.^{53,56,71,72,82} Other time-series mortality studies may not have mentioned small ozone effects because the results did not reach statistical significance, and in general, such findings may be less likely to be published. Omitting all 3 categories of studies potentially biases our pooled estimates. To bound the influence of this factor, we follow an approach adopted previously^{19} and calculate the grand mean with no covariates, assuming that omitted studies have a central estimate of zero and the minimum variance among included studies. Adding 5 such estimates would reduce our pooled estimate to 0.17% (95% CI = 0.12% to 0.21%). Although the addition of more studies with no ozone effects would further reduce the central estimate, it would take 161 studies of minimum variance (and many more studies of greater variance) before the pooled estimate became statistically insignificant.
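The bounding calculation above can be sketched as follows. This is a simplified illustration, not the authors' procedure: it uses fixed-effect inverse-variance pooling instead of the grand mean from the hierarchical model, the function names are hypothetical, the two data points are invented, and the pooled estimate is assumed to be positive.

```python
import math


def pooled_with_null_studies(estimates, variances, n_null):
    """Pooled mean and 95% CI after adding n_null studies with an
    estimate of zero and the minimum variance among included studies."""
    weights = [1.0 / v for v in variances]
    w_null = 1.0 / min(variances)
    total_w = sum(weights) + n_null * w_null
    # Null studies add weight to the denominator but nothing to the numerator
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_w
    se = math.sqrt(1.0 / total_w)
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se


def null_studies_to_nonsignificance(estimates, variances, max_null=100000):
    """Smallest number of added null studies for which the lower 95%
    confidence bound reaches zero (None if max_null is not enough)."""
    for k in range(max_null + 1):
        _, lower, _ = pooled_with_null_studies(estimates, variances, k)
        if lower <= 0:
            return k
    return None


# Invented example: two positive estimates with equal variances
n_needed = null_studies_to_nonsignificance([0.2, 0.3], [0.01, 0.01])
```

Each added null study pulls the pooled mean toward zero while also narrowing the confidence interval, which is why many such studies can be required before the lower bound crosses zero.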

Finally, even if our metaregression accurately captures the relationship between ambient ozone concentrations and mortality risk, this may not reflect the actual exposure–risk relationship. Each of the studies in the current analysis used ambient ozone measurements as surrogates of personal ozone exposures. However, personal ozone exposures have consistently been much lower than corresponding ambient ozone levels of 12- and 24-hour durations.^{14,105–110} Hence, observed relative risk estimates from studies using ambient concentrations may be underestimating true risks associated with exposure to ozone, presuming that these are the health-relevant averaging times or that these relationships hold for shorter time periods.

Furthermore, exposure assessment studies have not provided conclusive evidence that ambient ozone concentrations are, in fact, strongly correlated with personal ozone exposures. Although only a limited number of exposure assessment studies have examined personal–ambient ozone associations, results suggest that the associations are stronger during the summer, when people spend more time outdoors and within well-ventilated indoor environments.^{14,105,106,110} The only study to examine hourly relationships between ambient and personal ozone showed weak personal–ambient ozone correlations indoors (r = 0.05) and stronger correlations outdoors (r = 0.8).^{111} Thus, the amount of exposure error between ambient ozone measurements and corresponding personal exposures may be greater during the winter as compared with the summer, which may contribute to the observed differences in the season-specific risk estimates. Together these results provide some evidence that ambient ozone monitors serve as better surrogates of actual exposure to ozone during warm seasons than during cold seasons.

Despite these limitations, we can draw some conclusions that are useful for public policy. First, our grand mean estimate appears comparable to estimates from previous meta-analyses. Thurston and Ito^{20} concluded that 6 studies with appropriate temperature characterization had a pooled relative risk of 1.056 per 100 ppb of 1-hour maximum ozone, corresponding to a 0.27% increase in daily mortality per 10-μg/m^{3} increase of 1-hour maximum ozone. The U.S. EPA estimated a 2.9% increase in deaths per 100-ppb increase in 1-hour maximum ozone^{4} or an approximate 0.15% increase in daily mortality per 10-μg/m^{3} increase of 1-hour maximum ozone. These values bound our central estimate of a 0.21% increase.

In addition, the relationship between ozone and mortality appears lower in settings with high residential central air-conditioning prevalence, in agreement with past ozone exposure studies^{12,13,107} and PM epidemiology.^{112} Finally, the robustness of the ozone–mortality relationship, even when controlling for key confounders and effect modifiers, indicates that inclusion of ozone-related mortality in future regulatory impact analyses may be warranted, although further investigation is needed into potential PM_{2.5} confounding in the summer and the personal exposure–ambient concentration relationships by season. Future studies should also explicitly incorporate air-conditioning prevalence or other personal exposure surrogates into the estimation of an appropriate national average ozone–mortality relationship.

#### ACKNOWLEDGMENTS

We thank Michelle Bell, Francesca Dominici, Kaz Ito, and their colleagues for their participation in this joint effort.