**ArticlePlus**

Click on the links below to access all the ArticlePlus for this article.

Please note that ArticlePlus files may launch a viewer application outside of your web browser.

Ambient temperature is a major cause of fluctuations in mortality over time.^{1,2} To study this association from time-series data, investigators have used a wide variety of statistical methods recently dominated by explicit “time-series regression” approaches. Many of the developments in these methods for air pollution epidemiology, in particular, for modeling temperature as a potential confounder, have direct relevance for studies in which temperature is the focus.^{3} Furthermore, several studies of effects of temperature have extended the air pollution methodology for this purpose.^{2,4–16} However, there has been no systematic discussion of methods for studies seeking to elucidate temperature effects. By providing such a discussion, we believe that the selection of models is liable to be less ad hoc and better suited to its purpose.

Specifically, we seek to review, clarify, and extend methods for investigating associations of mortality with temperature focusing on 2 key issues: the nonlinear form of the relationship and the spread of effect over several days or weeks (distributed lags). We begin the article by describing the data we use to illustrate the problems and models. After describing the general time-series regression model, we focus on specific components to address nonlinearity in single-lag models and multiple lags in linear-threshold models. We propose a general modeling framework for multilag nonlinear relationships. This framework places into a single context several methods that have been used previously^{2,4–16} and suggests some new alternatives. Finally, we discuss model choice.

For easier reading of the main text, we have placed more technical content in an appendix, available with the online version of this article.

## EXAMPLE DATA SET: LONDON

Table 1 shows univariate descriptive statistics for daily cardiovascular disease (CVD) mortality, weather, and potential confounder data from London for 1993 through 2003. Relative humidity, black smoke (an indicator of particulate air pollution), and counts of laboratory confirmed influenza (A and B) were potentially confounding variables for which data were available.

The left 2 graphs on Figure 1 show the patterns of temperature and of mortality over time showing the expected seasonal patterns. These figures and all results shown and programs used were obtained using Stata software (Stata Corp., College Station, TX). The right graphs show scatter plots of the crude relationship of mortality with temperature; the top right panel displays one point per day and the bottom right panel shows data aggregated to centile bins as a crude transparent smoothing device. The figures on the right suggest a reverse “J” shape with increases in mortality at temperatures lower and higher than approximately 17 degrees. However, the plots on the left indicate that there is potential confounding by other seasonal factors.

## TIME-SERIES REGRESSION MODEL

To model more formally, and in particular to control for confounding, we follow usual practice^{3} in assuming an overdispersed Poisson regression model for Y_{i}, mortality count on day *i* on observed temperature *t _{i}*, and other time-varying factors:

Our main focus in this article is the functional form of the temperature term *f*(*t _{i}* | β), for which we discuss options in depth subsequently. For the remainder of the model (confounder control), we merely summarize a single variant we used for the London illustration, which is broadly in line with current practice. We acknowledge that this approach places optimal choice of these terms (an important discussion) beyond the scope of this article. For our London analyses, specific potentially confounding time-varying risk factors (eg, pollution) comprised:

- Black smoke (linear, lags 0, 1, and 2);
- Weekly reports of influenza virus isolated by laboratories routinely making assays for the National Health Service (linear, lags 0, 1, and 2 weeks); and
- Relative humidity (natural cubic spline 3 df, lag 0).

The smooth function of time (date), to allow for changes due to seasonal effects and demographic shift or other slow change not captured in the covariates, comprised a combination of trigonometric terms to 6 harmonics to capture repeated annual (ie, seasonal) patterns and a natural cubic spline of date with 2 degrees of freedom to capture slow change over the 11 years.

In Figure 2, we show the same-day temperature-mortality relationship aggregating temperature to centile bins as before unadjusted and with confounders modeled as described previously.

## MODELING THE TEMPERATURE EFFECT

Generally, like in the London data, mortality is found to be lowest at average temperatures and higher at low and high temperatures. Thus, a simple log linear function is not usually adequate. The relationships are often described as U-, V-, or J-shaped.^{9} In air pollution studies, investigators usually model temperature as a general smooth curve, and this also serves well (at least for graphic presentation) if temperature is the focus of interest.

We prefer the natural cubic spline because it is constrained to be linear at the boundaries of the temperature range, in which data are usually too sparse to support more detail. For all smooth curves, investigators must choose the degree of smoothness (usually degrees of freedom) and, for some (such as natural cubic splines), the placing of “knots.” Choice of smoothness (number of knots) is usually more important than precise knot placement, with many knots tending to show many wiggles (some possibly spurious) and too few knots losing real wiggles. (We discuss approaches to this and other model choices subsequently.)

General smooth curves are useful for describing relationships, but a simpler quantitative summary is also useful, in particular if relationships are to be compared across places or population subgroups. Classification of temperature into bins is possible (ie, indicator basis variables). Another simple model attractive for interpretability is the 3-piece linear spline (segmented linear) function in which the middle section is constrained to have zero slope. In this model:

with

and

The basis variables *t _{C,i}* and

*t*represent excesses of temperature below the cold-effect threshold (

_{H,i}*τ*) and above the heat-effect threshold (

_{C}*τ*). The attraction of this model is the interpretability of its 4 parameters:

_{H}- The high threshold temperature
*τ*above which mortality rises;_{H} - The coefficient
*β*of the high temperature effect;_{H} - The low threshold temperature
*τ*below which mortality rises; and_{C} - The coefficient
*β*of the low temperature effect._{C}

We call this model the linear-thresholds model. The simpler “V” model^{5,17} is a special case, with *τ _{H}* =

*τ*; as is the hockey-stick model,

_{c}^{2}with

*β*= 0 or

_{C}*β*= 0. An obvious extension is to allow the “middle” section to have nonzero slope, although this loses some of the interpretational advantages of the simpler model. The algorithm of Muggeo

_{H}^{18}can be used to estimate thresholds; although in smaller and irregular data (especially rounded temperature data), this can converge to a local rather than the global maximum. Evaluation of profile likelihoods (ie, maximized for slopes) over a grid of threshold values is a more robust alternative but takes more processor time. Confidence intervals (CIs) for thresholds can be estimated from standard errors using Muggeo's method or from the grid of profile likelihoods described previously.

Figure 3 shows a linear-thresholds model fit to (lag 0) temperature alongside a natural cubic spline with 5 df (default knot placing). Mortality is shown in the graph relative to the minimum fitted mortality over the observed temperature (see the online appendix).

For the thresholds model, the deviance was minimized at *τ̂ _{C}* = 12.1,

*τ̂*= 22.3. The likelihood profile 95% CI for

_{H}*τ*was 9.6 to 12.9 and for

_{C}*τ*, it was 21.2 to 24.4 found as the minimum and maximum values of

_{H}*τ*within 3.84 of the minimum deviance. Mortality increased by 0.7% (0.5 to 0.9) per degree below 12.1 degrees and by 8.6% (7.1 to 10.2) per degree above 22.3 degrees. Here and typically, the slopes are highly dependent on the thresholds. For example, the heat slope was correlated at r = 0.73 with the heat threshold (given by Muggeo's method).

_{H}, τ_{C}### A General Framework for Modeling Nonlinear Functions: Basis Variables

To aid the extension to distributed lag models below, we use a framework—“basis variables”—for describing a wide range of nonlinear functions. The linear thresholds and other segmented linear models, cubic and natural cubic spline functions, as well as indicators (dummy variables) can all be represented in the general notation

where the x_{b} are called basis functions.

The form of basis functions depends on the basis used (eg, x_{b} = x^{b} for a polynomial). In the regression context, we fit these nonlinear functions by 1) calculating variables (basis variables) from the original variable and then 2) including each of these basis variables in the regression model. The number of basis variables B is called the degrees of freedom for the curve.

Nonparametric curves such as LOESS, smoothing splines, or penalized splines cannot be represented directly as basis variables. Because later development uses the basis variable framework, we restrict our attention to curves that can be represented in this way.

## DISTRIBUTED LAG MODELS

Several studies have found that the cold effect is spread over a week or more after a cold day (the delay varying by cause of death), whereas the heat effect is more immediate.^{2,7,14} Some form of distributed lag model seems a natural choice to describe such patterns.

Distributed lag models were introduced to epidemiologists in the air pollution literature,^{19} but for linear models. To extend this standard theory most simply to apply in our nonlinear context, we describe first the distributed lag model applied to the linear-thresholds model^{2} with thresholds (*Τ _{H}, Τ_{c}*) constrained to be the same at all lags. The distributed lag model extension of expression

^{2}is then:

The function is now of t_{∼i} representing history of temperatures on days up to day i; {t_{i},t_{i-1},t_{i-2},…}. In this unconstrained distributed lag model, L+1 low and L+1 high temperature variables, with lags 0 to L, are simultaneously entered into the model. The parameters *β _{C,l}* and

*β*represent the effect of low and high temperatures delayed by

_{H,l}*l*days.

This model with lags up to L = 27 in London maximum likelihood gave *τ̂ _{C}* = 22.3 (19.9 to 23.1) and

*τ̂*= 22.7 (22.1 to 23.1). (The thresholds, estimated by maximum likelihood, are different from those estimated using lag one only because of the additional information in lags 1–28.) The top graphs on Figure 4 show the 2 X 28 coefficients of this model for cold (left) and heat (right).

_{H}The coefficients of linear distributed lag models have 2 equally valid interpretations. First, if temperature increases by 1 degree over the heat threshold (or decreases 1 degree below the cold threshold), on 1 day, then mortality will change on that day and on each of the next 27 days as estimated by the 28 coefficients (points on the graph). Second, if temperature had increased 1 degree over the heat threshold for one day (say, day *i*) and each the previous 27 days, then each of those days would have an impact on mortality on day *i*, also represented by the 28 coefficients.

From either of these viewpoints, one summary of interest is the combined effect of a 1 degree increase over all 28 lags:

(and analogously for heat). In the London data, this total effect of a 1 degree decrease in temperature below the cold threshold over 28 days was 1.9% (95% CI = 1.4 to 2.5), and above the heat threshold was 21.3% (15.3 to 27.7).

Although this unconstrained distributed lag model is useful to obtain overall estimates of effect and to control confounding by temperature when investigating pollution effects,^{13,16} it is problematic for elucidating the pattern of delay in effects because each individual coefficient is imprecisely estimated and is usually highly correlated with estimates of other coefficients. To overcome this problem, investigators have put constraints on the *β _{l}*, obtaining a constrained distributed lag model.

Most simply, the coefficients *β _{l}* for a range of lags (say, 1–3) can be constrained to be equal (

*β*

_{1}=

*β*

_{2}=

*β*

_{3}=

*β*

_{1–3}). We call this the lag-stratified distributed lag model. It is equivalent to entering the mean of

*t*

_{C}over days 1 to 3, in place of the 3 separate days' temperatures in the regression, except that the regression coefficient in the mean model is 3 times that of the constrained model. The constrained coefficient represents the increment in log mortality per degree increase in temperature on one of lags 1 through 3; the coefficient of the mean represents the increment per degree sustained over lags 1 through 3.

The lower graph in Figure 4 shows a lag-stratified distributed lag model applied to the London data with coefficients assumed equal in lag intervals 0–1, 2–6, 7–13, and 14–27. These lag-lengthening intervals were chosen a priori to reflect evidence from previous work of greater change in coefficients over shorter than longer lags.^{12–14,17}

Other possible constraints allow smooth change in the *β _{l}* by making the

*β*follow a polynomial,

_{l}^{19}cubic spline, or penalized spline

^{20}function in lag

*l*. The quartic constraint is illustrated in Figure 4. When interpreting coefficients from constrained distributed lag models (including lag-stratified models), we should be aware that they reflect information from the constraint as well as the data. For example, the degree of certainty with which the constrained distributed lag models for heat in Figure 4 indicate the persistence of the effect up to lag 7 seems speculative given the individual coefficients (top left). Estimated overall effect is, on the other hand, largely insensitive to constraints.

In comparison with some of the more flexible models we discuss subsequently, the linear-thresholds distributed lag models, in particular the lag-stratified model, have advantages in ease of presentation and interpretation and the ease with which epidemiologic measures such as relative risks and attributable fraction can be attached to them. Table 2 illustrates this. The left column shows the relative risk increment per degree of “cold” or “heat” sustained over each lag interval considered. Thus, the figure of 9.2% for heat in lags 2 through 6 is 5 times the increment per day in this period (seen in Fig. 4 as approximately 2%).

The calculation of percent of deaths attributable to heat and cold (Table 2, right column) complements this usefully, reflecting the number and degree of “cold” and “hot” days as well as the risk on those days. This is obtained from the standard formula^{21} as the mean over all deaths of

, and similarly for heat. These estimates are, of course, subject to the validity of the model.

### Nonlinear Distributed Lag Models

To easily adapt standard distributed lag methods, we were obliged to simplify the temperature-mortality relationship at each lag to linear-threshold form. This may not be sufficiently flexible for all purposes. We now propose a genera1 framework that includes and systematizes the linear thresholds model and some other curvilinear distributed lag models previously proposed^{7,11,16} and suggest extensions and a programming strategy. This framework requires that:

- The dependence of mortality on temperature at each lag can be represented as a sum of “basis” functions of temperature {t
_{b}, b = 1,B} and coefficients, and - The dependence of those coefficients on lag can be similarly represented (basis {
*l*_{p},*P*= 1,P}).

This framework allows temperature-mortality and distributed lag dependences to be chosen independently from a menu, including natural cubic spline, polynomial, linear thresholds, or strata (indicators).

All such models can be fit using standard software with some preprocessing (the online appendix includes more detail, and Stata ado files are available on request from the author). Specifically, we compute BXP derived temperature variables

from the data for each day i and use these as linear terms in model 1. The

comprise sums of products of the basis functions for temperature and lag. We thus call them cross-basis functions. The coefficients of the derived temperature variables are not usually easily interpreted directly, but interpretable graphs, summaries, and statistical inference can be obtained from their estimates and standard errors (see the online appendix).

Table 3 displays the main options suggested by the cross-basis framework with references where they have had published description or use.

We illustrate in Figure 5 an “NCS-NCS” cross-basis model fit to the London data with a 5-df natural cubic spline for the temperature-mortality curves and a 4-df natural cubic spline for the coefficient-lag curves, hence 5 × 4 = 20 degrees of freedom (ie, derived temperature variables

) overall. For the lag spline, interior knots were chosen at equal intervals logarithmically over 0 to L so as to allow more flexibility at lower lags. Although the 3-dimensional graph in Figure 5 shows the entire relationship of mortality against *l* and *t*, these graphs are not always easy to follow and cannot show confidence intervals, so graphs of “slices” and other derived measures of risk, shown in Figure 6, are often more useful.

Starting interpretation of Figure 6 with the graph of total impact over all 28 lags (bottom left graph), we see the familiar broadly “U”-shaped relationship with minimum at 18.8°C. We chose this multilag “minimum mortality temperature” as baseline for scaling relative risks in all the cross-basis graphs (see the online appendix). Risks would rise according to the model to approximately 1.5 times this minimum at extremes of low and high temperature if these temperatures were sustained over 28 days.

Considering now the temperature-mortality relationship at different lags (from top left), we see that heat (temperature above 18.8°C) shows an adverse effect at lags 0 and 3, but by lags 11 and 21, a slight apparent deficit of deaths is observed. This long lag deficit is also seen in the graph of mortality by lag at 25°C (bottom right). It has been noted in other studies using other models^{7,12} and interpreted as evidence for a degree of “harvesting” heat-causing deaths in persons who would have otherwise died anyway 2 or 3 weeks later. The deficits at longer lags are subtracted from the “overall impact” estimate of the bottom graph.

The absence of the long lag heat deficit in the linear thresholds distributed lag model (Fig. 4) is noteworthy and seems likely to be due to the high heat threshold estimated (by maximum likelihood) in that model. Linear threshold-distributed lag models with lower heat thresholds showed the long lag deficits. It is itself noteworthy that the estimated minimum mortality temperature in the NCS-NCS model (18.8°C) is lower than the 2 estimated thresholds in the linear threshold-distributed lag model (22.3 and 22.7), suggesting sensitivity of this aspect of the relationship to model choice.

The pattern for cold, seen either from the graphs of mortality by temperature (left) or by lag (right), was however similar to that shown by the linear threshold-distributed lag model.

### Choosing Among Modeling Options

Our review and proposed framework has indicated a wide variety of possible models, each with choices of how to implement. In particular, analysts must choose:

- Form of temperature-mortality curve at a given lag (polynomial, strata, natural cubic spline, linear thresholds);
- Form of change in coefficients by lag (unconstrained, polynomial, strata, natural cubic spline);
- Maximum lag; and
- Extent and place of wiggles:

- for lag-stratified models: strata intervals,
- for NCS models: number and placing of knots, and
- for polynomial models: order.

Each of these choices will depend on the objectives of the analysis as well as the model fit. In general, simpler models (eg, lag-stratified thresholds) have advantages of easy interpretability, but more complex models (eg, natural cubic spline) may be required to better fit the data or clarify subtler points of the relationship.

The simpler variants seem particularly attractive in multicity studies in which one seeks to compare associations across cities (although parameters from any cross-basis model could in theory be input into a multivariate meta-analysis framework to explore cross-city variation). However, the more flexible models may be useful in exploratory single-city studies and can serve to indicate to what extent there are systematic effects of temperature beyond that captured in the simple more easily interpretable model. Furthermore, when the objective is to control confounding by temperature, for example, if air pollution is the focus of interest, a more flexible model is less likely to leave residual confounding.

Choice of specific model may also be informed by model fit criteria such as deviance, Akaike's information criterion (AIC), or tests for white noise in residuals.^{22} These have the feature of being objective criteria, although they implement criteria (optimizing prediction of outcomes in new data in the case of AIC), which are not obviously relevant in epidemiology.

In the London data set, among a wide variety of models within the cross-basis framework with maximum lag 27, AIC favored the model with 8-df natural cubic spline for temperature and 4-df natural cubic spline for lag, and in larger data sets, AIC favored even more complex models. However, much simpler models (for example, the linear thresholds with 5 lag strata) were not far behind. Although it was useful to be aware that the simpler model had not captured all the relationship, for interpretation, the multiwiggled 8-df natural cubic spline seemed less useful than the 5-df natural cubic spline or linear-thresholds model.

Other specific modeling options may be suggested by the plots of flexible ones. For example, the cold effect appears more linear than the heat effect, suggesting fewer knots for lower temperatures. However, such ad hoc data-led model choices are not reflected in formal inference.

The choice of maximum lag similarly might draw on a priori or model fit criteria. If objectives include investigation of possible negative coefficients at longer lags consistent with harvesting, then we would choose a higher maximum lag. In lag-stratified models, strength of evidence for association of longer lag intervals with mortality can be assessed directly. For example, in the lag-stratified linear thresholds model in London (Table 2), the 14- to 27-day lag period had relatively small nonsignificant coefficients, suggesting that reducing the maximum lag to 13 would lose little. An AIC comparison of models with maximum lag 13 and 27 supported this.

When using model-fit criteria to inform model choice, we must keep in mind that relative performance of each of the choices in the bulleted list previously depends on the other choices and on choices for the confounder model. For example, although AIC favored a maximum lag of 13 over 27 with the linear thresholds model as noted previously, with the 20-df natural cubic spline model (Figs. 5 and 6), a maximum lag of 27 was preferred over 13.

As well as considering choice of a preferred model, it is also desirable to consider sensitivity of conclusions to model choice not only in relation to the temperature component, but also that for season and other specific factors.

## DISCUSSION

We have sought to make a systematic presentation of methods to model effects of temperature on mortality, which are generally simultaneously nonlinear and experienced with a variety of delays (distributed lags). In particular, we have proposed the cross-basis framework, which allows a wide variety of models across the spectrum of simplicity, including most of the models previously proposed and some new possibilities (Table 3).

Modeling temperature effects raises many other issues that we have addressed briefly or not at all: optimal control for confounding by season and other causes of fluctuations in mortality over time; model diagnostics; “harvesting”^{12}; modifiers of the temperature effect by other factors, in particular season^{5,6,16}; interactions of temperature with other weather variables (including the use of apparent temperature or synoptic indices); and cross-lag interactions (eg, sustained periods of heat or changes in temperature). Season is particularly problematic because of its close association (conceptual and empiric) with temperature. We believe, however, that the framework we proposed for multilag nonlinear models can allow these issues to be addressed with greater confidence that at least the main effects of temperature have been adequately modeled.

A limitation to the generality of the cross-basis model family is that the basis functions (and hence the number and placement of knots in splines and of thresholds in the linear-thresholds model) must be the same across lags. Relaxing this generally in a meaningful way appears difficult, but simple special cases are useful and present no problems. In particular, lag-stratified natural cubic spline or linear-threshold models could have different knots (thresholds) at different places for different lag strata without imposing problems, although estimating thresholds by maximum likelihood seems likely be tricky.

We have used the London data to illustrate the methods discussed but do not advance the results as a definitive description of the effects of temperature in London. An analysis aimed at more robust substantive results would consider the sensitivity issues discussed previously and in the previous section. Furthermore, relationships will differ across cities and calendar time and with outcome. The results we show are however broadly in line with other studies of London mortality.^{14}

In conclusion, temperature has a substantial impact on mortality in many places. To model this relationship, either for interest in its own right or for control of confounding of effects of other risk factors, we need to capture its multilag and nonlinear nature. A wide range of models are possible for this; the cross-basis family of models provides one framework to guide choice and computing.

## REFERENCES

*Epidemiol Rev*. 2002;24:190–202.

*Lancet*. 1997;349:1341–1346.

*Res Rep Health Eff Inst*. 2004;123:3–27; discussion 29–33.

*Eur J Epidemiol*. 1998;14:571–578.

*Int J Epidemiol*. 1997;26:551–561.

- Cited Here |
- PubMed | CrossRef

*Epidemiology*. 2005;16:58–66.

*Epidemiology*. 2001;12:662–667.

*J Occup Environ Med*. 2001;43:927–933.

*Environ Health Perspect*. 2002;110:859–863.

- Cited Here |
- PubMed | CrossRef

*Occup Environ Med*. 2005;62:702–710.

*Am J Epidemiol*. 2002;155:80–87.

*Epidemiology*. 2005;16:613–620.

*Environ Res*. 2001;86:209–216.

- Cited Here |
- PubMed | CrossRef

*J Epidemiol Community Health*. 2003;57:628–633.

*Epidemiology*. 2004;15:755–761.

*Am J Epidemiol*. 2005;162:80–88.

*Am J Epidemiol*. 1993;137:331–341.

- Cited Here |
- PubMed | CrossRef

*Stat Med*. 2003;22:3055–3071.

- Cited Here |
- PubMed | CrossRef

*Epidemiology*. 2000;11:320–326.

*Biostatistics*. 2000;1:279–292.

- Cited Here |
- PubMed | CrossRef

*Am J Epidemiol*. 1985;122:904–914.

- Cited Here |
- PubMed | CrossRef

*J Royal Stat Soc.*In press.