The expected distribution of total deaths according to x and z is given in Table 1.
The odds ratio for the association between the exposure x and the modifier z among deaths is, thus, from Table 1:
This confirms that the exposure–modifier interaction odds ratio from a case-only sample (ie, deaths only) tabulated as in Table 1 estimates the interaction rate ratio describing the modification of the effect of exposure on mortality by the presence of z.
Note that k0 and k1, which are functions of nuisance variables and parameters, cancel out. This is a major and very useful simplification, as the choice of specific models for these is one of the difficult and debated aspects of time-series regressions. However, the canceling of the main effect β of x is a limiting feature of the approach: main effects are not estimable, and only interactions can be estimated.
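The cancellation can be checked numerically. The following is a minimal sketch, assuming that the expected deaths in cell (x, z) of Table 1 are proportional to k_x · exp(βx + γz + λxz), where k_x collects the nuisance terms for days with exposure x and γ is a hypothetical main effect of z. This functional form, the function names, and all parameter values are illustrative assumptions, not the article's own notation or data.

```python
import math

def expected_deaths(x, z, k, beta, gamma, lam):
    """Expected deaths in cell (x, z) under the assumed log-linear form."""
    return k[x] * math.exp(beta * x + gamma * z + lam * x * z)

def case_only_odds_ratio(k, beta, gamma, lam):
    """Odds ratio between x and z among deaths, from the 2x2 table."""
    d = {(x, z): expected_deaths(x, z, k, beta, gamma, lam)
         for x in (0, 1) for z in (0, 1)}
    return (d[1, 1] * d[0, 0]) / (d[1, 0] * d[0, 1])

lam = 0.2  # hypothetical interaction parameter
# Two very different choices of nuisance terms and main effects ...
or_a = case_only_odds_ratio(k=(100.0, 40.0), beta=0.05, gamma=-0.3, lam=lam)
or_b = case_only_odds_ratio(k=(7.0, 900.0), beta=1.50, gamma=0.8, lam=lam)
# ... give the same odds ratio, exp(lam), because k, beta and gamma cancel.
```

Under these assumptions both `or_a` and `or_b` equal exp(λ), whatever values are chosen for k, β, and γ.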
Polytomous and Numerical x with Dichotomous z
The overwhelming majority of mortality time-series regression studies have focused on numerical explanatory variables of interest (pollution or temperature). As noted by Albert, 7 we can extend the above argument to derive the probability of z conditional on a numerical x among deaths. Imagine, for example, Table 1 extended to as many rows as there are unique x values or groups. For each value of x in the table, the probability of z being 1 reduces simply to:
Here, subscripts i and j have been omitted for simplicity. Equation 2 is a logistic model, implying that the interaction parameter λ is estimable from a logistic regression of z on x.
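As a hedged illustration of this point, the sketch below fits a logistic regression of z on x to expected death counts by Newton-Raphson and recovers λ. It assumes the logistic form P(z = 1 | x, death) = expit(α + λx); the death counts are generated from the same illustrative log-linear form k_x · exp(βx + γz + λxz), and every name and value in the code is hypothetical.

```python
import math

def expit(u):
    return 1.0 / (1.0 + math.exp(-u))

def fit_logistic(rows, iterations=25):
    """Newton-Raphson fit of P(z=1 | x) = expit(alpha + lam * x).

    rows: iterable of (x, z, weight), where weight is a number of deaths.
    """
    alpha, lam = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, z, w in rows:
            p = expit(alpha + lam * x)
            g0 += w * (z - p)            # score for alpha
            g1 += w * (z - p) * x        # score for lam
            v = w * p * (1.0 - p)        # information weight
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        alpha += (h11 * g0 - h01 * g1) / det
        lam += (h00 * g1 - h01 * g0) / det
    return alpha, lam

# Hypothetical parameters and nuisance terms k_x for x = 0..5.
beta, gamma, true_lam = 0.03, -0.5, 0.10
k = {0: 500.0, 1: 320.0, 2: 410.0, 3: 150.0, 4: 90.0, 5: 60.0}
rows = []
for x, kx in k.items():
    rows.append((x, 0, kx * math.exp(beta * x)))
    rows.append((x, 1, kx * math.exp(beta * x + gamma + true_lam * x)))

alpha_hat, lam_hat = fit_logistic(rows)
```

With expected counts as weights, the fitted slope `lam_hat` matches the assumed interaction parameter, and neither β nor the k_x values enter the fit. In practice any standard logistic regression routine would be used instead of this hand-written solver.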
For polytomous x with L levels, we reinterpret x as a vector of L-1 indicators, and λ as a vector of length L-1.
To extend the model to polytomous modifier z (say z = 1, …, K), we extend the logistic regression (model 2) to a polytomous (multinomial) logistic regression model:
This expression extends the definition of the function expit. This can again be motivated heuristically by imagining a table set out like Table 1, but with K columns rather than 2.
Most major statistical packages fit this model, using maximum likelihood. For dichotomous and numerical x, there will be K-1 interaction parameters (λ1 is set to 0), reflecting how the baseline (z = 1) effect of exposure on mortality is modified by each other level of z. For polytomous x of L levels, there will be (K-1)(L-1) parameters, probably best avoided by dichotomizing x or z, or assuming a numerical score for one of them (see below for numerical z).
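The structure of the polytomous model can be sketched as follows. The code assumes the usual baseline-category form, P(z = k | x, death) = exp(αk + λk x) / Σj exp(αj + λj x) with α1 = λ1 = 0, which is consistent with the description above but is an assumption about the exact parameterization; the function names and parameter values are hypothetical.

```python
import math

def modifier_probabilities(x, alphas, lambdas):
    """P(z = k | x, death) for k = 1..K under a baseline-category logit.

    alphas and lambdas hold the K-1 non-baseline parameters; the baseline
    level z = 1 has intercept and slope fixed at 0.
    """
    scores = [0.0] + [a + l * x for a, l in zip(alphas, lambdas)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def n_interaction_parameters(K, L=2):
    """(K-1)(L-1) parameters; L=2 covers dichotomous or numerical x."""
    return (K - 1) * (L - 1)

alphas = [0.1, -0.2, 0.3]        # hypothetical intercepts for z = 2, 3, 4
lambdas = [0.02, -0.01, -0.05]   # hypothetical interaction parameters
p0 = modifier_probabilities(0.0, alphas, lambdas)
p1 = modifier_probabilities(1.0, alphas, lambdas)
# Odds of z = 4 versus the baseline z = 1 change by exp(lambda_4) per unit x.
odds_ratio_4 = (p1[3] / p1[0]) / (p0[3] / p0[0])
```

Here `odds_ratio_4` equals exp(λ4), and `n_interaction_parameters(4, 3)` gives the (K-1)(L-1) count for a 4-level modifier and a 3-level exposure.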
An alternative and probably equivalent Poisson log-linear model for polytomous modifiers in case-only analyses has also been proposed. 5 In this context, we prefer the polytomous logistic formulation because of its simple motivation as an extension of the binary logistic model, and its natural handling of numerical continuous explanatory variables (x) of interest.
We have found no presentation of methods for a numerical modifier z in the case-only literature, presumably because genotype is intrinsically categorical. In the full model 1, a numerical–numerical interaction term is defined as the product of x and z. The interaction coefficient λ is, however, invariant to centering x and z around alternative origins (hence, the product (x − x0)(z − z0)), which is sometimes useful to reduce correlation of the product with x or z.
We propose using constrained polytomous regression to fit this model in the case-only approach. Thus, we will use the approach of the previous subsection, but constrain the parameters λk to reflect a linear increase or decrease in the modification of the effect of x by z as z increases.
Assume first that there are only a few (K) distinct values of z: z1, …, zK. We can then apply model 3 but with a constraint to the parameters λk as follows:
This reproduces the baseline (λ1 = 0) of the unconstrained model, and substituting it in model 3 gives
The case-only model 4 can thus be deduced from the full-data model 1 if z is centered around its lowest value z1, using the same heuristic argument we used above for the unconstrained logistic model. Constraints are accommodated in some statistical packages (eg, Stata, College Station, TX). If there are too many distinct values of z to allow each to be a level in such a model, an approximation is obtained by grouping z, and using group means as zk. However, we have not investigated the impact of this approximation, and expect to encounter lost precision and power.
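A minimal sketch of the constraint and of the grouping approximation follows. It assumes the linear constraint takes the form λk = λ(zk − z1), which is inferred from the statements that the baseline λ1 = 0 is reproduced and that z is centered around z1; the function names, the cut point, and the numbers are hypothetical.

```python
def constrained_lambdas(lam, z_scores):
    """Interaction parameters under the assumed constraint lam * (z_k - z_1)."""
    z1 = z_scores[0]
    return [lam * (zk - z1) for zk in z_scores]

def group_means(z_values, cut_points):
    """Mean of z within each interval defined by ascending cut points."""
    groups = [[] for _ in range(len(cut_points) + 1)]
    for z in z_values:
        index = sum(z >= c for c in cut_points)  # interval containing z
        groups[index].append(z)
    return [sum(g) / len(g) for g in groups if g]

# Four ordered groups scored 1..4 with a hypothetical trend of -0.4 per score.
trend_lambdas = constrained_lambdas(-0.4, [1, 2, 3, 4])

# A many-valued z grouped at a hypothetical cut point; group means serve as z_k.
scores = group_means([1, 2, 3, 10, 11, 12], cut_points=[5])
```

The single trend parameter λ then replaces the K-1 free parameters of the unconstrained model, which is what a constrained polytomous regression estimates.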
Extension to Multiple Time Series
It is not uncommon to study multiple time series (say from several cities) simultaneously. 9,10 Usually, the main focus of such studies is the modification of exposure effects by factors measured at a city level (eg, average SES). This type of modification is not amenable to study using the case-only approach, because we can no longer assume that the distribution of exposure (weather or pollution) is independent of the modifier (SES) in the population.
However, sometimes there are modifier subgroups within cities, and a researcher may be interested in making an estimate of the exposure–modifier interaction that draws information from all the cities. This can be achieved by combining city-specific estimates (either by case-only or conventional methods) using meta-analytic techniques. It can also be achieved by including deaths from all cities in a case-only logistic regression stratified by city (ie, in which an indicator for city is included).
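One way to combine city-specific estimates is fixed-effect inverse-variance weighting, sketched below. The text does not specify which meta-analytic technique is intended, so this choice is an assumption, and the city estimates and standard errors are invented for illustration only.

```python
import math

def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted mean of city-specific estimates and its SE."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

city_lambdas = [-0.30, -0.10, -0.25]   # hypothetical interaction estimates
city_ses = [0.10, 0.20, 0.10]          # hypothetical standard errors
pooled, pooled_se = pool_fixed_effect(city_lambdas, city_ses)
```

If the city-specific interactions appear heterogeneous, a random-effects combination may be more appropriate than this fixed-effect sketch.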
EXAMPLE: HIGH TEMPERATURE AND SOCIOECONOMIC STATUS IN SAO PAULO
Daily mortality (in persons age 65 years and above) and temperature measurements were obtained for the period 1991-1994 (1,461 days). We were interested in the modification of the effects of high temperature by the SES of area of residence (58 areas), classified in quartiles of areas. 3 High temperature was considered as a numerical variable, defined as the number of degrees that the 2-day mean temperature (index and previous day) rose above 20°C, and zero if this mean was below 20°C. We estimated this by conventional and case-only analysis:
- By conventional time-series regression (model 1) with a set of 3 indicator variables for SES. Potentially confounding variables were particulate air pollution (PM10) the day before, humidity, holidays, cold temperature in the previous week (degrees below 20°C), and day of the week. Long-term changes over time were modeled as smoothing splines with 7 degrees of freedom per year (in Stata, convergence tolerance 10^−8).
- By case-only methods as described above. Because the potential modifier SES was polytomous (4 levels), we used a polytomous logistic regression model 3 with SES as outcome and high temperature as explanatory variable. The constrained polytomous model 4 was used to estimate the trend across SES groups, by scoring the groups zk = 1, 2, 3, 4.
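The high-temperature variable defined above (degrees by which the 2-day mean exceeds 20°C, otherwise zero) can be computed as in the following sketch. The function names are hypothetical and the temperatures in the usage lines are invented.

```python
def high_temperature(temp_today, temp_previous_day, threshold=20.0):
    """Degrees by which the 2-day mean exceeds the threshold, else 0."""
    two_day_mean = (temp_today + temp_previous_day) / 2.0
    return max(0.0, two_day_mean - threshold)

def high_temperature_series(daily_temps, threshold=20.0):
    """Heat term for each day from the second onward.

    The first day is skipped because it has no previous-day temperature.
    """
    return [high_temperature(today, previous, threshold)
            for previous, today in zip(daily_temps, daily_temps[1:])]

heat = high_temperature_series([18.0, 22.0, 26.0])  # invented temperatures
```

In the case-only analysis, each death would carry the heat value of its day of death as the explanatory variable, with the SES group as the outcome.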
Key results are shown in Table 2. The interaction parameters (λk and the trend λ) and their standard errors, as estimated by the two methods, are very similar. None of the individual parameters were much larger than their standard errors, although the point estimate indicated a lower heat effect in the highest SES group. The middle column (βk) gives in its first row the baseline main effect of high temperatures (% increment in log rate per degree), which is that in the first SES group; ie, β1 = β from model 1. The heat effect in each of the other groups is derived from this baseline and the corresponding interaction term from the first column (βk = β + λk). These estimates are only possible from the conventional analysis.
The key assumption of independence of time-varying factors and time-fixed modifiers is more secure than is the analogous assumption for gene–environment interactions. However, there may be situations in which the assumption is violated. For example, persons of high SES might move out of the city in times of heat, and their deaths might be uncounted. This would cause an association between proportion in the high SES group (the modifier) in the population at risk and temperature, thus invalidating the independence assumption, and causing a spurious reduction in mortality during heat among high SES persons.
If the deaths among migrating high-SES persons were counted, but not affected by heat because they had moved to cooler places, then this would also cause an apparent reduced effect of heat on mortality in that group. This is not a violation of the assumption of independence, as the observed temperature series is not associated with the proportion in the high SES group, which does not change appreciably over time. We could say that the high SES group is truly less affected by heat in the city, the mechanism being its migration to cooler places. The true temperature series, however, is different in the high SES group, so true temperature is associated with high SES; we could say that the reduced effect of heat in high SES persons is an artifact due to bias. We suggest below that this can better be viewed as a measurement error issue—temperature is not correctly recorded for high SES persons.
A sufficient (though not necessary) condition for the independence assumption is that the distribution of the modifier over persons at risk does not change over time. Plausibility arguments may indicate whether this is likely to be approximately met. If not, it might be possible to test the independence assumption if information is available on variation in time of z in the population (eg, variations over time in proportion in the high SES group).
The validity of the case-only approach does not require assumptions about other time-varying factors w, which are not restricted in model 1. Recall that these risk factors for daily mortality are specified in the conventional model 1, but not specified in the case-only model 2 or 3, because their effects “cancel out” in the case-only odds ratios. They may be correlated with the factor of interest x, and can in particular include terms representing interactions between x and other time-varying factors.
Neither the case-only nor the conventional model 1 requires specification of time-fixed factors other than z. In the conventional model, the main effects of such factors are of no interest in the absence of denominators, and they do not confound effects of time-varying factors. Furthermore, there seems no reason to expect such factors to confound the interaction of interest, even if they were associated with the time-fixed factor of interest (or included interaction terms with it). Robustness of the case-only approach to the presence of unspecified time-fixed risk factors is suggested by analogy with the time-varying effects, and can also be seen by noting that the effect of time-fixed factors would apply equally to all days; ie, in Table 1, both rows in each column would be affected equally, so odds ratios would not change.
There are some modeling difficulties that the case-only approach does not simplify. It is critical to the argument for the validity of this approach that there are no other interactions in model 1 of time-varying factors with time-fixed factors. These would not “cancel out” in Table 1 and, hence, could confound the interaction of interest. The following two types of such interactions would almost inevitably confound:
- a. Interaction of the putative modifier under investigation with other time-varying factors. (In the example, SES modifies, say, the pollution effect.)
- b. Interaction of the time-varying factor of interest with another time-fixed variable. (In the example, the temperature effect is modified by, say, housing density.)
These types of interactions would also confound in the conventional approach. An alternative conventional approach that is robust to interactions of type a is to fit completely separate models of form 1 to each modifier group (SES = 1, 2, 3, 4 in the example), and compare the coefficients for x (heat in the example). This approach is equivalent to fitting an interaction between SES and each term in model 1, thus allowing the possibility that the effects of risk factors other than hot temperature, including those captured in the time smooth, are modified by SES. The robustness of this model comes at the cost of reduced power and precision if these other interactions do not in fact exist. Alternatively, explicit interaction terms could be added to the model. This could also incorporate interactions of type b.
Confounding interactions of type a could also be modeled in the case-only approach. For example, air pollution could be entered as an additional regressor in the case-only model 2 or 3. Inclusion of an annual sine-cosine pair might be sufficient to control for a seasonal pattern not related to temperature that might plausibly be modified by SES. However, details have not been worked out, and incorporation of interactions of type b seems less obvious.
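An annual sine-cosine pair of the kind mentioned here could be constructed as in the sketch below. The period of 365.25 days and the function name are assumptions, and, as noted in the text, the details of using such terms in the case-only model have not been worked out.

```python
import math

def annual_sine_cosine(day_index, period=365.25):
    """Sine and cosine terms with a one-year period for a given day index."""
    angle = 2.0 * math.pi * day_index / period
    return math.sin(angle), math.cos(angle)

# Two extra regressor columns for each day of a hypothetical 4-year series.
seasonal_terms = [annual_sine_cosine(day) for day in range(1461)]
```

The two columns would be entered alongside x as additional regressors in the case-only logistic model, so that a smooth seasonal pattern modified by SES is not attributed to temperature.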
In conclusion, the case-only approach may ignore risk factors varying over time, provided their effect is not modified by the time-fixed variable of interest (or correlates of it). Also, time-fixed risk factors may be ignored, provided they do not modify the effect of the time-varying factor of interest or its correlates.
If conditions for the validity of the case-only approach are met, is it useful? We noted above that the main motivation is reduced modeling complexity and dependence on model assumptions. Concerns about such assumptions have been highlighted by recognition of convergence and inference problems of generalized additive models in the time-series context. 11 Analysis of modification of effects in a manner not dependent on those assumptions can provide reassurance that conclusions are not sensitive to them. However, the inability of the case-only approach to provide estimates of the main effects would lead most investigators to consider it as a supplement to rather than as a replacement of conventional methods.
In the Sao Paulo data of the example, conventional analyses in fact showed rather little sensitivity of the estimate of heat–SES interaction to model specification. If all time-varying factors other than heat were omitted, point estimates of heat–SES interactions changed little (eg, −1.18 vs. −1.11 for category 4 vs. 1). It may be that the conditions for validity of the case-only approach also limit confounding of the interaction in the conventional model. However, the resulting model fitted very poorly. Scale overdispersion was 1.37 and residuals were highly autocorrelated, both sources of concern for most analysts. If a simple correction was made to standard errors for overdispersion, the standard error of λ was higher in the reduced conventional model (0.086) than in the case-only model (0.072).
A further potential advantage of the case-only approach is its practical simplification of data analysis. In the example described, analysis with conventional methods was not particularly complex, but in multi-city studies, the computational complexity of conventional methods has led investigators to carry out two-step analyses, with city-specific time-series analyses followed by “meta-analytic” combination of effect parameters from each city. 9 Investigation of modification by factors varying within-city would add further complexity. Case-only analysis would be computationally much simpler (many fewer parameters), and could probably be carried out without the need to use two steps. The case-crossover approach provides another alternative to the conventional analysis with similar computational simplification. 12,13 Unlike the case-only approach, it has the ability to estimate main effects, but it has some problems of its own. 14
It seems likely that the assumption that modifiers are fixed in time could be relaxed to allow modifiers that change in time much more slowly than does the exposure of interest, by stratifying by time periods. Examples are age, chronic disease, and some long-term treatments. Details, however, remain to be worked out.
There are limitations that affect case-only and conventional approaches equally. For example, errors in measuring the principal exposure may distort patterns of effect modification: the lower heat effect in high SES areas in Sao Paulo might result from temperature in these areas in fact being lower than the central meteorological station measurements—an issue similar to that discussed above in the context of the independence assumption. Also, in both models the interaction measures departure from the multiplicative model, rather than the additive model, which is arguably more fundamental. However, given the usually small rate ratios involved, this distinction typically will be minor.
In conclusion, the case-only approach provides a method for analyses of effect modification in time-series regressions with fewer assumptions than the conventional approach. Furthermore, it provides computational simplification that is potentially useful in complex data sets for which conventional methods are problematic.
Nelson Gouveia provided the data from Sao Paulo, and Sam Pattenden made helpful comments.
1. Schwela D. Air pollution and health in urban areas. Rev Environ Health. 2000; 15: 13–42.
2. Goldberg MS, Burnett RT, Bailar JC III, et al. Identification of persons with cardiorespiratory conditions who are at risk of dying from the acute effects of ambient air particles. Environ Health Perspect. 2001; 109(suppl): 487–494.
3. Gouveia N, Fletcher T. Time series analysis of air pollution and mortality: effects by cause, age and socioeconomic status. J Epidemiol Community Health. 2000; 54: 750–755.
4. Khoury MJ, Flanders WD. Nontraditional epidemiologic approaches in the analysis of gene-environment interaction: case-control studies with no controls! Am J Epidemiol. 1996; 144: 207–213.
5. Umbach DM, Weinberg CR. Designing and analysing case-control studies to exploit independence of genotype and exposure. Stat Med. 1997; 16: 1731–1743.
6. Hamajima N, Yuasa H, Matsuo K, et al. Detection of gene-environment interaction by case-only studies. Jpn J Clin Oncol. 1999; 29: 490–493.
7. Albert PS, Ratnasinghe D, Tangrea J, et al. Limitations of the case-only design for identifying gene-environment interactions. Am J Epidemiol. 2001; 154: 687–693.
8. Weinberg CR, Umbach DM. Choosing a retrospective design to assess joint genetic and environmental contributions to risk. Am J Epidemiol. 2000; 152: 197–203.
9. Dominici F, Samet J, Zeger SL. Combining evidence on air pollution and daily mortality from the 20 largest cities: a hierarchical modelling strategy. J Royal Stat Soc A. 2000; 163: 263–302.
10. Katsouyanni K, Touloumi G, Samoli E, et al. Confounding and effect modification in the short-term effects of ambient particles on total mortality: results from 29 European cities within the APHEA2 project. Epidemiology. 2001; 12: 521–531.
11. Dominici F, McDermott A, Zeger SL, et al. On the use of generalized additive models in time-series studies of air pollution and health. Am J Epidemiol. 2002; 156: 193–203.
12. Bateson TF, Schwartz J. Control for seasonal variation and time trend in case-crossover studies of acute effects of environmental exposures. Epidemiology. 1999; 10: 539–544.
13. Neas LM, Schwartz J, Dockery D. A case-crossover analysis of air pollution and mortality in Philadelphia. Environ Health Perspect. 1999; 107: 629–631.
14. Lumley T, Levy D. Bias in the case-cross-over design: implications for studies of air pollution. Environmetrics. 2000; 11: 689–704.
Keywords: epidemiologic methods; statistics; air pollution; weather; environment
© 2003 Lippincott Williams & Wilkins, Inc.