The model used to analyze the study should include the exposure of interest (air pollutant level prior to outcome occurrence, $AP_{a,i}$) and relevant covariates. Equation (1) illustrates a linear form:
$$E(Y_{a,i}) = \beta_0 + \beta_1 AP_{a,i} + \sum_k \gamma_k X_{k,a,i} \qquad (1)$$
where, for infant i in area a, $E(Y_{a,i})$ is the expected birth weight, $AP_{a,i}$ is a relevant prenatal air pollutant level, and $X_{k,a,i}$ are the covariates.
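To use future area-specific air pollutant levels as an indicator, we also fit the same model with the indicator variable added (the area-specific pollutant level measured after infant i's birth, say $AP^{f}_{a,i}$):
$$E(Y_{a,i}) = \beta_0 + \beta_1 AP_{a,i} + \delta AP^{f}_{a,i} + \sum_k \gamma_k X_{k,a,i}$$
where δ is the coefficient of the future pollutant level and the remaining terms are those of Eq. (1).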
If residual confounding and model misspecification are absent and the assumed causal relationships are adequate approximations, $AP^{f}_{a,i}$ should be unassociated with $Y_{a,i}$ after adjustment for covariates. An association between $AP^{f}_{a,i}$ and the outcome (δ ≠ 0) suggests residual confounding or other model misspecification. Thus, we use the statistic I to test for residual confounding:
$$I = \hat{\delta} / \mathrm{SE}(\hat{\delta})$$
where $\hat{\delta}$ is the estimated slope for the future pollutant level and $\mathrm{SE}(\hat{\delta})$ is its standard error. Under the null hypothesis of no model misspecification, I is approximately standard normally distributed.
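As an illustration of how the statistic I could be computed, the following minimal sketch fits a linear model that includes the future pollutant level and returns the estimated slope divided by its standard error. The helper names (`ols`, `indicator_statistic`), the synthetic data, and the hypothetical confounder `u` are assumptions made for this example; it is not the program supplied with the paper.

```python
import random

def ols(X, y):
    """Ordinary least squares; returns (coefficients, standard errors)."""
    n, p = len(X), len(X[0])
    # Augmented system [X'X | X'y | identity], reduced by Gauss-Jordan elimination.
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         + [1.0 if i == j else 0.0 for j in range(p)]
         for i in range(p)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        d = A[c][c]
        A[c] = [v / d for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    beta = [A[i][p] for i in range(p)]           # solution of the normal equations
    inv = [A[i][p + 1:] for i in range(p)]       # inverse of X'X
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - p)     # residual variance
    se = [(s2 * inv[i][i]) ** 0.5 for i in range(p)]
    return beta, se

def indicator_statistic(y, ap, ap_future, covariates):
    """I = estimated slope of the future pollutant level / its standard error."""
    X = [[1.0, a, f] + list(z) for a, f, z in zip(ap, ap_future, covariates)]
    beta, se = ols(X, y)
    return beta[2] / se[2]

# Synthetic example (all values hypothetical): u affects current exposure,
# future exposure, and the outcome, so omitting u creates confounding.
random.seed(1)
n = 2000
u = [random.gauss(0, 1) for _ in range(n)]
ap = [ui + random.gauss(0, 1) for ui in u]
ap_future = [ui + random.gauss(0, 1) for ui in u]
y = [3300 - 20 * a - 100 * ui + random.gauss(0, 50) for a, ui in zip(ap, u)]

i_adjusted = indicator_statistic(y, ap, ap_future, [[ui] for ui in u])
i_omitted = indicator_statistic(y, ap, ap_future, [[] for _ in u])
print("I with u adjusted:", round(i_adjusted, 2))
print("I with u omitted: ", round(i_omitted, 2))
```

In this synthetic setting the future exposure has no effect on the outcome, so a large |I| in the model that omits `u` reflects the induced confounding rather than a causal effect.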
We assess the indicator's ability to detect residual confounding using data from a spatial study of ambient air pollution and birth weight in Atlanta. Use of simulations allows us to specify the “true” causal relationships; using estimated parameters to calculate the true, expected birth weights makes the simulations realistic. We consider 2 ambient pollutants commonly assessed in the birth outcomes literature, PM2.5 (μg/m3) and NO2 (parts per billion). We assess relationships between pollutant levels during the first month of pregnancy and birth weight of full-term infants. Zip code-specific levels for each pollutant were a weighted average estimate.7
Each full-term infant (37–43 weeks' gestation) was assigned to the Zip code of the mother's residence. We calculated the ambient air pollution level for each infant's Zip code, averaged over the 4 weeks after the estimated conception date. Modeled covariates included gestational age (weekly indicators); maternal education, age (linear spline with 3 knots), tobacco use (yes/no), Medicaid (yes/no), and race/ethnicity (non-Hispanic white, African-American, and Hispanic); and the Zip code's percent of population below the poverty line. We included indicators for date-of-conception in 2-week intervals, so comparisons were spatial.
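A linear spline term such as the one used for maternal age can be coded as the variable itself plus one hinge term per knot. The sketch below shows one common construction; the function name and the knot locations (20, 25, and 35 years) are hypothetical, because the text does not report where the knots were placed.

```python
def linear_spline_basis(value, knots):
    """Return [value, max(0, value - k1), max(0, value - k2), ...].

    Entering these columns in a linear model lets the slope change at each knot
    while keeping the fitted curve continuous.
    """
    return [float(value)] + [max(0.0, float(value) - k) for k in knots]

# Hypothetical knot locations, for illustration only.
AGE_KNOTS = [20, 25, 35]

for age in (18, 30, 40):
    print(age, linear_spline_basis(age, AGE_KNOTS))
```

With 3 knots the age term contributes 4 columns to the design matrix, one slope before the first knot and one change in slope at each knot.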
We also calculated the air pollution level for each Zip code averaged over the 4-week period beginning 1 year after the conception date. Because these exposures occurred after birth (by 8 or more weeks), they cannot have affected birth weight. Future levels ($AP^{f}_{a,i}$) are included only in models that also include pollutant levels prior to birth and a subset of the covariates.
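The two exposure windows can be sketched as follows, assuming a daily pollutant series is available for each Zip code. The function names, the dictionary-of-dates input, and the use of 365 days for "1 year" are assumptions of this example rather than details given in the text.

```python
from datetime import date, timedelta

def window_mean(daily_levels, start, days=28):
    """Mean pollutant level over `days` consecutive days beginning at `start`."""
    values = [daily_levels[start + timedelta(days=k)] for k in range(days)]
    return sum(values) / len(values)

def exposure_pair(daily_levels, conception):
    """Return (prenatal level, future level) for one infant's Zip code.

    Prenatal: the 4 weeks beginning at the estimated conception date.
    Future:   the 4 weeks beginning 1 year (taken here as 365 days) later.
    """
    prenatal = window_mean(daily_levels, conception)
    future = window_mean(daily_levels, conception + timedelta(days=365))
    return prenatal, future

# Hypothetical daily series: 10 units throughout 2000, 20 units from 2001 onward.
first_day = date(2000, 1, 1)
series = {}
for k in range(730):
    day = first_day + timedelta(days=k)
    series[day] = 10.0 if day < date(2001, 1, 1) else 20.0

print(exposure_pair(series, date(2000, 3, 1)))
```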
We used the model in Eq. (1), with Zip code as the area. We fit this model once to the observed birth weights to obtain model-predicted weights for each infant, and then treated these predicted weights as the true expected values in the simulations. To assess the indicator's ability to detect confounding, we first generated a birth weight for each infant by adding a random Gaussian error to the true expected value. We then fit the correct model including the indicator and, to simulate confounding, a misspecified model (incorrectly omitting a factor) that also included the indicator. In most scenarios, we omitted an actual, measured factor (eg, smoking). In a few situations, we created 3 hypothetical factors and included them when fitting the model to obtain alternative true predicted weights; we subsequently omitted one of these variables to simulate additional patterns of confounding. We calculated the proportion of simulations in which I exceeded 1.96 in absolute value, thereby rejecting the null hypothesis of no confounding. We also calculated the area under the receiver operating characteristic curve (AUC) as a measure of discriminatory ability. We include a program for simulating the power to detect model misspecification due to omission of a confounder (eAppendix 2, http://links.lww.com/EDE/A504). The user can either specify parameters to generate hypothetical data or use actual observations to fit a model and base simulations on the fitted parameters.
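The two summary measures described above can be sketched as follows. This example assumes that the AUC contrasts |I| from fits of the misspecified model with |I| from fits of the correct model; the text does not spell out this detail, and the function names are hypothetical. The simulated I values are stand-ins drawn from normal distributions, not results from the Atlanta data.

```python
import random

def rejection_rate(i_values, critical=1.96):
    """Proportion of simulations in which |I| exceeds the critical value."""
    return sum(abs(i) > critical for i in i_values) / len(i_values)

def auc(correct_model_scores, misspecified_model_scores):
    """Area under the ROC curve via pairwise comparison (ties count one half).

    Equals the probability that a randomly chosen score from the misspecified
    model exceeds a randomly chosen score from the correct model.
    """
    wins = 0.0
    for m in misspecified_model_scores:
        for c in correct_model_scores:
            if m > c:
                wins += 1.0
            elif m == c:
                wins += 0.5
    return wins / (len(correct_model_scores) * len(misspecified_model_scores))

# Stand-in I values (hypothetical): mean 0 under the correct model,
# shifted mean when a confounder has been omitted.
random.seed(2)
i_correct = [random.gauss(0.0, 1.0) for _ in range(500)]
i_misspecified = [random.gauss(2.5, 1.0) for _ in range(500)]

print("rejection rate, correct model:     ", rejection_rate(i_correct))
print("rejection rate, misspecified model:", rejection_rate(i_misspecified))
print("AUC:", auc([abs(i) for i in i_correct], [abs(i) for i in i_misspecified]))
```

An AUC near 0.5 means that |I| does not separate the two kinds of fits, whereas an AUC near 1 means that it separates them almost perfectly.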
PM2.5 was negatively associated with birth weight in these spatial analyses (Table 1, scenario 1; estimated coefficient, −20.6 g per 10 μg/m3). Compared with the true model that generated the data, improperly omitting various variables led to varying degrees of simulated confounding. The estimated PM2.5 coefficient changed by about 70% when age was omitted and by 700% when race was omitted (Table 1, column 3). The indicator's ability to detect simulated confounding also varied substantially. For the situations considered, the indicator's ability was weak when confounding was weak-to-modest, for example, when age or tobacco was omitted (AUC = 0.51–0.56, Table 1, column 5). However, this was sample-size dependent: with quadrupling of the sample size, the ability to detect confounding (created when the poverty variable is incorrectly omitted) increased from 19% (Table 1) to >50% (data not shown). With stronger degrees of simulated confounding, the indicator consistently signaled that confounding might be a problem (eg, scenarios 5 and 6; AUC = 0.93–1.00, Table 1).
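The reported gain from quadrupling the sample size is in line with a simple normal approximation. The sketch below assumes that I is approximately normal with unit variance and that its expected value grows with the square root of the sample size (because the standard error shrinks proportionally); this is a back-of-the-envelope check under those assumptions, not the authors' calculation, and the function names are hypothetical.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def power(expected_i, critical=1.96):
    """P(|I| > critical) when I ~ Normal(expected_i, 1)."""
    return normal_cdf(expected_i - critical) + normal_cdf(-expected_i - critical)

def expected_i_for_power(target, low=0.0, high=10.0):
    """Find the expected value of I that gives the target power (bisection)."""
    for _ in range(60):
        mid = (low + high) / 2.0
        if power(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

# Expected I implied by 19% power, then doubled (sqrt(4) = 2) for 4x the sample.
base = expected_i_for_power(0.19)
print("expected I at original sample size:", round(base, 3))
print("approximate power with 4x the sample:", round(power(2.0 * base), 3))
```

Under these assumptions the approximate power after quadrupling the sample comes out above 50%, which agrees with the increase described in the text.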
Simulation results were similar for NO2 (Table 2) and also when we considered confounding by the hypothetical factors (Table 3). Although the ability to detect confounding again varied, the indicator consistently signaled possible residual confounding with these sample sizes when the degree of simulated confounding was moderate-to-strong (eg, when race or several variables were omitted [Table 2, scenarios 5 and 6]).
We extend a method to detect important residual confounding2,3 by describing and evaluating the method for spatial studies. The ability to detect residual confounding was excellent for some scenarios, such as when race was intentionally omitted. As with any statistical technique, the ability to detect residual confounding improves with stronger confounding and larger sample size. We omitted measured variables, such as race, merely to illustrate possible scenarios based on relationships of real factors. In actual applications, the factor creating confounding, if any, could be completely unrecognized and unmeasured. Although few researchers would omit race from a study of air pollution and birth weight, an investigator could conceivably be unaware of, and therefore omit, some other factor that affected air pollution and birth weight in a manner similar to race.
The validity of this approach depends on the assumptions. False-positive indications could arise if, for example, a factor affected both the outcome and future exposures but not the exposure of interest. Our simulations suggest that the method can discriminate situations where residual confounding is present from those where it is not, although the strength of this discrimination ability varies according to the situation.
1. Hernan MA, Hernandez-Diaz S, Werler MM, Mitchell AA. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol.
2. Flanders WD, Klein M, Strickland M, et al. A method of identifying residual confounding and other violations of model assumptions. Epidemiology.
3. Flanders WD, Klein M, Strickland M, et al. A method for detection of residual confounding in time-series and other observational studies. Epidemiology.
4. Strickland MJ, Klein M, Darrow LA, et al. The issue of confounding in epidemiological studies of ambient air pollution and pregnancy outcomes. J Epidemiol Community Health.
5. Greenland S, Pearl J, Robins J. Causal diagrams for epidemiologic research. Epidemiology.
6. Greenland S, Pearl J. Causal diagrams. In: Boslaugh S, ed. Encyclopedia of Epidemiology. Thousand Oaks, CA: Sage Publications; 2007:149–156.
7. Ivy D, Mulholland JA, Russell AG. Development of ambient air quality population-weighted metrics for use in time-series health studies. J Air Waste Manag Assoc.
© 2011 Lippincott Williams & Wilkins, Inc.