Methods: Original Article

Future Cases as Present Controls to Adjust for Exposure Trend Bias in Case-only Studies

Wang, Shirley; Linkletter, Crystal; Maclure, Malcolm; Dore, David; Mor, Vincent; Buka, Stephen; Wellenius, Gregory A.

doi: 10.1097/EDE.0b013e31821d09cd


One of the most difficult struggles in epidemiology is identifying appropriate groups for comparison. Depending on study design, the ideal comparison group could be an unexposed population who represent the experience of the exposed population if, contrary to fact, they had not been exposed, or it could be a sample from the source population that gave rise to an identified group of cases. In practice, these ideal comparison groups can be difficult to identify. However, when the exposure of interest has a transient effect on risk for an abrupt onset outcome, the solution suggested by researchers such as Maclure (case-crossover),1 Farrington (self-controlled case-series),2 and others3 has been to use cases as their own controls.

Case-only designs are attractive because risk factors that are stable over time cannot confound the association between exposure and outcome. However, the case-crossover and self-controlled case series are subject to bias from population-level and individual-level confounders that vary with time. For example, there could be systematic trends in exposure over calendar time. On the individual level, there could be a change in another risk factor for the outcome, such as smoking habits, which is also associated with exposure; or bias may come into play if early signs of an impending event led to changes in exposure probability during the time preceding the occurrence of a health outcome.1,4–8

When the exposure under investigation is not influenced by the occurrence of individual health outcomes, as is often the case in environmental epidemiology studies, bidirectional sampling of control times (ie, sampling from person-time both before and after event occurrence) is a reasonable strategy for handling time trends in exposure.3 However, pharmacoepidemiologic studies typically investigate exposures in which the pattern of exposure is likely to be censored or altered following occurrence of an investigated health outcome (eg, β-blocker use after myocardial infarction). In this setting, Suissa4 has proposed the “case-time-control” design as an alternative approach to handling bias from temporal trends in exposure. Suissa's design involves performing crossover analyses not only among the cases, but also in a sample of appropriate controls.4 However, the “case-time-control” design can reintroduce bias if an inappropriate control group is selected, that is, one in which estimates of exposure prevalence or exposure-time trends among the controls do not provide good estimates of the expected exposure prevalence or exposure-time trends among the cases under the null hypothesis of no relationship between exposure and outcome.1,2,4

We propose an alternative case-only method for handling exposure-time trends within a pharmacoepidemiologic framework. This method counters temporal trends in exposure through use of a matched time-control selection strategy. Such an approach neither samples person-time after outcome occurrence, nor does it require use of an external non-case comparison group. The proposed method requires that the effect of exposure be transient, that the onset of the outcome be acute, and that outcome occurrences be distributed across calendar time.


In the present study, we define a case as a person who develops the outcome of interest. External controls are persons sampled from the source population that gave rise to the cases, but who themselves did not develop the outcome of interest. Case-time refers to the time period at or immediately before the occurrence of the outcome of interest, and referent-time is the time period before the case-time during which the outcome does not occur. Using this terminology, the estimate of association between exposure and outcome in a case-control study can be described as the ratio of the exposure odds among cases to the exposure odds among a sample of external controls. In such a study, the exposure odds from the sample of external controls are used to represent the expected exposure odds for cases under the null hypothesis of no relationship between the exposure and the outcome.

In contrast, the case-crossover design samples person-time from cases only, such that the expected exposure odds are determined from referent-time sampled from the exposure history of the case.1 Use of conditional logistic regression to analyze matched sets of case-time and referent-time within individuals produces estimates of association between exposure and outcome that are not confounded by time-invariant characteristics.

The case-time-control design adds an adjustment for trends in exposure over calendar time by conducting concurrent crossover analyses for cases and a sample of external controls.4 This method assumes that the odds ratio calculated among cases is the product of the odds ratio for the causal effect of the exposure on outcome multiplied by the odds ratio for exposure over calendar time (ORcase = ORcausal × ORtime).9 Crossover analyses using concurrent person-time sampled from control persons are used to measure the odds ratio for exposure over calendar time (ORtime). The cases' self-controlled odds ratio (ORtime × ORcausal) is divided by the corresponding self-controlled odds ratio in a concurrent matched control group (ORtime) to obtain estimates of the odds ratio for the causal effect of exposure on outcome (ORcausal).4,9 However, when using external controls, the validity of causal effect estimates depends on how well the control population reflects the expected exposure prevalence, as well as the expected exposure-time trends that would have been observed among cases if the null hypothesis of no exposure-outcome relationship were true.4,9 Additionally, none of these methods is able to distinguish the effects of a disease that prompts exposure (ie, early manifestations of a health condition leading to treatment) from the direct effects of the exposure on the health outcome.

The “case-case-time-control” design is an extension of the case-time-control design proposed by Suissa.4 However, rather than using a sample of external controls, our proposed analysis method uses referent-time sampled from future cases as controls for current cases to counter bias arising from temporal trends in exposure. This design assumes that referent person-time sampled from the at-risk history of future cases can provide a better estimate of expected exposure prevalence or of expected exposure trends over time than person-time sampled from external controls. The case-case-time-control design additionally minimizes the risk of introducing selection bias from use of a control group whose study base does not match that of the cases.

As seen in Figure 1, the “current” period is a cross-section of calendar time during which the event occurs for subject 1 but has not yet occurred for subject 2. The “reference” period is a cross-sectional sample of exposure history from the same subjects before the “current” period. The current and referent person-time from cases and their calendar-time-matched future-case controls form unique strata; analyses use between- and within-case comparisons to obtain estimates of the exposure-outcome relationship. Dividing the exposure odds for the current case by the exposure odds found for the future-case-control provides an estimate of the exposure-outcome relationship after adjusting for potential bias from exposure time trends. As in other self-matched designs, the use of within-subject comparisons adjusts for potential confounding by measured or unmeasured time-invariant characteristics.



The relationship between a transient exposure and an acute-onset outcome was evaluated using simulated data. Results from a case-crossover and a case-case-time-control analysis were compared to demonstrate their performance in the presence or absence of a time-invariant confounder and exposure-time trends. A case-time-control analysis that sampled referent person-time from an inappropriate control group was simulated to illustrate how selection bias can influence effect estimates.

We simulated 100 cohorts of transiently exposed subjects (n = 100,000) with a follow-up of 2 years for each of the 3 confounding scenarios. The first scenario had no unmeasured confounding; the second included a binary confounder that was time-invariant within persons; the third included a time-invariant confounder and an increase in probability of exposure over time in the source population. In scenarios 2 and 3, the binary, time-invariant confounder had 50% prevalence in the source population. As seen in Figure 2A, the presence of this confounder was associated with twice the probability of exposure over the 2-year follow-up and twice the probability of the outcome independent of exposure. In the third scenario, a monotonic increase in exposure prevalence was simulated where exposure prevalence doubled over the 2-year period. We chose this temporal trend because we believed it could plausibly be encountered in pharmacoepidemiology applications. As seen in Figure 2B, exposure probability doubled over the 2-year follow-up. Among persons with the confounding characteristic, exposure prevalence began at 14% and increased to 30% by the end of follow-up, whereas among those without the time-invariant confounder, exposure prevalence started at 8% and increased to 16%.
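To illustrate why a rising exposure prevalence can bias a within-person comparison, the following is a minimal sketch rather than the simulation code used in the study. It assumes a linear rise in prevalence from 8% to 16% over 730 days, exposure that is independent between the current and referent days within a person, and hypothetical lags between those days; the function names are illustrative. Under these assumptions and the null hypothesis, the expected matched-pair odds ratio is the ratio of the exposure odds at the two time points.

```python
def exposure_prevalence(day, start=0.08, end=0.16, follow_up_days=730):
    """Exposure prevalence on a given day, assuming a linear trend (an assumption)."""
    return start + (end - start) * day / follow_up_days


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def null_time_trend_or(current_day, lag_days):
    """Expected discordant-pair odds ratio under the null, driven only by the trend.

    Assumes exposure on the current and referent days is independent within a person.
    """
    p_current = exposure_prevalence(current_day)
    p_referent = exposure_prevalence(current_day - lag_days)
    return odds(p_current) / odds(p_referent)


# Hypothetical lags between the referent and current days.
for lag in (30, 90, 365):
    print(lag, round(null_time_trend_or(730, lag), 3))
```

The longer the lag between referent and current periods, the further the expected odds ratio drifts from 1 even though exposure has no effect on the outcome.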

Figure 2. A, Time-invariant confounding. B, Time-invariant confounding plus change in exposure prevalence.

We generated outcome events according to a Poisson process in which event rates were set to be 1.0, 2.0, or 3.0 times higher during exposed than unexposed time. In our simulations, the first event censored observation. The same 100 datasets that had been simulated under each confounding scenario were used for each of the 3 case-only analysis designs being compared. Cohorts were generated using SAS (version 9.2, SAS Institute Inc., Cary, NC) and analyzed with STATA (Release 10.0, College Station, TX: Stata Corporation).
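The event-generation step can be sketched as follows. This is an illustration under stated assumptions, not the SAS code used in the study: time is discretized into days, the baseline daily rate and the exposure history are hypothetical, and the function name is invented for this example.

```python
import math
import random


def simulate_first_event(exposed_by_day, baseline_daily_rate, rate_ratio, rng):
    """Return the day index of the first event, or None if no event occurs.

    For a Poisson process with a constant rate within a day, the probability of
    at least one event that day is 1 - exp(-rate). Follow-up stops at the first
    event, mirroring the censoring rule described in the text.
    """
    for day, exposed in enumerate(exposed_by_day):
        rate = baseline_daily_rate * (rate_ratio if exposed else 1.0)
        if rng.random() < 1.0 - math.exp(-rate):
            return day
    return None


rng = random.Random(2011)  # fixed seed so the sketch is reproducible
follow_up_days = 730
# Hypothetical exposure history: each day exposed independently with 10% probability.
history = [rng.random() < 0.10 for _ in range(follow_up_days)]
event_day = simulate_first_event(history, baseline_daily_rate=0.001, rate_ratio=2.0, rng=rng)
print(event_day)
```

Setting `rate_ratio` to 1.0 corresponds to the null scenario, in which exposed and unexposed days carry the same event rate.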


In our model specification, we use the following notation. We define Y to be an indicator for the outcome (1 = event and 0 = no event); T to be an indicator for the period (1 = current, 0 = reference); E to be exposure status (1 = exposed, 0 = unexposed), and C to be case status (1 = case, 0 = future-case or control). The case-case-time-control uses 1:1 matching of sampled person-time for each case and a future-case control. The case-time-control uses 1:1 matching of person-time for each case and a control sampled from a cohort with the same prevalence of the binary time-invariant confounder, but without a trend for increasing exposure over the 2-year follow-up.

The case-crossover design is analyzed using conditional logistic regression, where current and referent-times are matched within persons. This removes the effect of confounders that do not vary within persons over time. The model takes the following form:
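The displayed equation is not reproduced in this text. A form consistent with the definitions that follow (the probability of exposure as the modeled quantity, one stratum per person, and the period indicator T as the only covariate) would be the one below; the person-specific intercept α_i is notation assumed here.

```latex
\operatorname{logit}(\pi_i) = \log\!\left(\frac{\pi_i}{1-\pi_i}\right) = \alpha_i + \beta\,T
```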

where πi denotes the probability of E for cases i = 1, …, N. Thus, only persons who are discordant in their exposure status between case and referent-time contribute to the likelihood function. The odds ratio exp {β} compares the odds of exposure during case-time with the odds of exposure during referent-time, and may reflect both causal effects and the effect of any exposure time trends that may be present.

The model for the case-case-time-control design is identical to that of the case-time-control.4 Although similar in form to the case-crossover analysis described earlier in the text, the conditional model for the case-case-time-control and case-time-control includes an interaction term to separate the effects of time and exposure.4 As in the case-crossover analysis, case-times and referent-times are matched within persons. The case-case-time-control approach samples case- and referent-times from one or more future cases matched on calendar time to the case- and referent-times for each case used in the case-crossover analysis. The case-time-control approach does the same, except that it samples person-time from a control group that may or may not include future cases. The assumption is that the trend in exposure over calendar time is the same for current and future cases, allowing the control times sampled from the future cases to provide an estimate of the time effect. The log odds of exposure are modeled using time and the interaction between time and case versus future-case (or control) status. Let q index the matched set and i the individual within the matched set. The regression model appears as follows:
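The displayed equation is not reproduced in this text. A form consistent with the coefficients interpreted in the next paragraph (a main effect of the period indicator T and its interaction with case status C) would be the one below; the stratum intercept α_qi is notation assumed here.

```latex
\operatorname{logit}(\pi_{qi}) = \alpha_{qi} + \beta_1\,T + \beta_2\,(T \times C)
```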

where πqi denotes the probability of E for individual i in matched set q, as in the case-crossover analysis. Thus, the OR exp{β1} is the odds of exposure for case-time as compared with referent-time among the future cases (or controls). This provides an estimate of the exposure-time trend. The OR exp{β1 + β2} is the same odds among cases. Assuming the exposure time trend is common across all subjects, exp{β2} provides an estimate of the odds ratio between exposure and outcome after adjusting for potential exposure time trends.
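The roles of the two coefficients can be illustrated with discordant-pair counts. The sketch below assumes the simplest setting, in which each person contributes one current and one referent period and the model contains only the period indicator and its interaction with case status; in that setting the conditional estimates reduce to ratios of discordant counts. The counts and function names are hypothetical, and the sketch does not compute standard errors or replace fitting the conditional logistic model.

```python
import math


def discordant_pair_or(exposed_current_only, exposed_referent_only):
    """Matched-pair odds ratio from counts of persons discordant on exposure."""
    return exposed_current_only / exposed_referent_only


def case_case_time_control(case_counts, future_case_counts):
    """Return (beta1, beta2) on the log scale from discordant-pair counts.

    Each argument is a tuple:
    (exposed in current period only, exposed in referent period only).
    """
    or_cases = discordant_pair_or(*case_counts)        # exp(beta1 + beta2)
    or_time = discordant_pair_or(*future_case_counts)  # exp(beta1)
    beta1 = math.log(or_time)
    beta2 = math.log(or_cases / or_time)
    return beta1, beta2


# Hypothetical counts, chosen only for illustration.
beta1, beta2 = case_case_time_control(case_counts=(150, 100), future_case_counts=(120, 100))
print(round(math.exp(beta1), 3), round(math.exp(beta2), 3))
```

With these illustrative counts, the crude case-crossover odds ratio of 1.5 is divided by a time-trend odds ratio of 1.2, leaving a trend-adjusted odds ratio of 1.25.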

Applied Example

As an applied example, we evaluated an expected null exposure (pharmacy-dispensed vitamins) as a potential trigger for stroke. Our analysis used administrative claims data from the Veterans Administration (VA), including pharmacy, laboratory, inpatient, and outpatient claims. Data were extracted for all patients admitted to a VA hospital with a diagnosis of stroke between 2003 and 2006. The date of dispensation and the number of days' supply for vitamins dispensed from the VA pharmacy were identified for each stroke case. Days that fell between the dispensation date and the end of the days' supply were considered exposed days. We defined the case index date as the date of the inpatient admission for stroke and the reference index date as 90 days before the admission. Patients were defined as exposed on an index date if they had a minimum of 3 days of exposure to vitamins within the 30 days before the index date. We used 1:1 matching of future-case controls to current cases. Future-case controls and cases were matched on age (±1 year), sex, and calendar time. We compared results of analyses using the case-crossover and the case-case-time-control.
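The exposure definition above can be sketched in code. This is an illustration only: the function names and the example patient are hypothetical, and two details are assumptions not stated in the text, namely that a supply of n days covers the dispensing date plus the following n - 1 days, and that the 30-day window excludes the index date itself.

```python
from datetime import date, timedelta


def exposed_days(dispensings):
    """Set of calendar dates covered by any dispensing.

    `dispensings` is a list of (dispense_date, days_supply) tuples. A supply of
    n days is assumed to cover the dispensing date and the n - 1 days after it.
    """
    covered = set()
    for dispense_date, days_supply in dispensings:
        for offset in range(days_supply):
            covered.add(dispense_date + timedelta(days=offset))
    return covered


def is_exposed(index_date, dispensings, window_days=30, min_exposed_days=3):
    """True if enough covered days fall in the window before the index date."""
    covered = exposed_days(dispensings)
    window = (index_date - timedelta(days=k) for k in range(1, window_days + 1))
    return sum(day in covered for day in window) >= min_exposed_days


# Hypothetical patient: one 30-day supply dispensed 20 days before admission.
admission = date(2005, 6, 30)
fills = [(admission - timedelta(days=20), 30)]
case_exposed = is_exposed(admission, fills)
referent_exposed = is_exposed(admission - timedelta(days=90), fills)
print(case_exposed, referent_exposed)
```

A patient like this one, exposed at the case index date but not at the reference index date 90 days earlier, is discordant and therefore contributes to the conditional likelihood.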



The estimates of effect (on the logarithmic scale) and the difference (Dβ) between the average simulated coefficient and the true coefficient for each investigated analysis method are presented in Table 1. A Dβ of 0.1 corresponds to an approximately 10% change in the odds ratio, whereas a Dβ of 0.2 corresponds to an approximately 20% change in the odds ratio.
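The correspondence between Dβ and the relative change in the odds ratio follows from exponentiating the difference in log odds ratios. A short sketch of the arithmetic (the function name is illustrative):

```python
import math


def percent_change_in_or(d_beta):
    """Relative change in the odds ratio implied by a log-scale difference."""
    return (math.exp(d_beta) - 1.0) * 100.0


for d_beta in (0.1, 0.2):
    print(d_beta, round(percent_change_in_or(d_beta), 1))
# prints 0.1 10.5, then 0.2 22.1
```

The approximation of "10% per 0.1" is therefore close for small Dβ and slightly understates the change as Dβ grows.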

Table 1. Average β and SD From 100 Simulated Cohorts

As expected, the case-crossover and case-case-time-control analyses both produce unbiased estimates when there is no unmeasured confounding and the exposure is perfectly measured (scenario 1: Dβ = 0.0). When a time-invariant confounder is introduced, the within-person comparisons used by the case-crossover and case-case-time-control methods eliminate the need to explicitly adjust for the confounder in regression modeling (scenario 2: Dβ = 0.0). However, when exposure prevalence increases over time (scenario 3), the effect estimates for the case-crossover analyses are biased both when the true exposure-outcome relationship was null and when an exposure-outcome relationship was present (Dβ = 0.1). Although the case-case-time-control was robust against bias induced by the simulated time trend in exposure (Dβ = 0.0), a case-time-control analysis that sampled person-time from a control population without the trend for increasing exposure produced estimates with the same magnitude of bias as the case-crossover analysis (Dβ = 0.1).

Applied Example

Inspection of vitamin exposure at the time of the stroke event revealed no consistent pattern in the prevalence of vitamin use across calendar time (Fig. 3A). However, when examining the prevalence of exposure over the year before the stroke event, we observed a steady increase in the probability of exposure over time (Fig. 3B).

Figure 3. Exposure time trends. A, Calendar time. B, Three hundred sixty-five days prior to stroke event.

Comparing the results of the case-crossover with the case-case-time-control analyses, the case-crossover analysis estimates that baseline risk of stroke is elevated by 50% after brief exposure to vitamins, whereas the case-case-time-control indicates a null effect for vitamins as a trigger for stroke (Table 2).

Table 2. Applied Example: Vitamins and Stroke (Bootstrapped SE)


Case-only methods provide an attractive analytic option for studying the effects of transient exposures on the risk of acute events. These methods are appealing because chronic risk factors that are stable over time within a person cannot confound the analyses. However, these methods remain susceptible to confounding by time-varying factors and time trends in exposure. We propose an extension to existing case-only methods that enhances the ability of researchers to address the issue of confounding from time trends in exposure, and we demonstrate its use in an applied example. The proposed case-case-time-control design uses both within- and between-case comparisons.

We evaluated the performance of this method using simulations. Conducting case-crossover analyses on simulated datasets in which the probability of exposure increased over time resulted in effect estimates that were biased upward by approximately 10%. In our example, applying case-time-control analyses that sampled control person-time from a population not experiencing an increase in exposure over time resulted in estimates that were biased to a similar degree as those from the case-crossover analyses. However, the ability of the case-time-control design to adjust for exposure time trends depends on how well the trend among the cases can be approximated by the control group. The bias observed in any specific study would depend on the suitability of the external control group chosen. In other words, depending on the control group, the magnitude of bias from a case-time-control study could be larger or smaller than that found with a case-crossover analysis. Applying our proposed case-case-time-control analysis in our simulated example resulted in unbiased effect estimates despite the presence of a strong time trend in exposure. Sampling control person-time from the at-risk period of cases that have not yet occurred minimizes the risk of sampling person-time from an inappropriate control group.

When conducting a case-case-time-control study, there are several practical considerations. First, it is important to emphasize that the case-case-time-control, like other self-controlled designs, can be used to study only short-term exposures with transient effects on the risk of events with acute onset. This is because exposures occurring during a referent period must not have residual or carry-over effects on risk during the current period. The lag between current and referent period can be based on prior biologic knowledge of the effects of the exposure (eg, drug pharmacodynamics). Sensitivity analyses should be performed using alternative lags to verify that results are not overly sensitive to these a priori assumptions.

Second, one must consider the duration of study follow-up, taking into account that some proportion of cases will not be able to be matched to controls derived from future cases. This is likely to be particularly true for health outcomes that occur toward the end of the follow-up period, as there are fewer subsequent cases from which to select potential matches. This feature of the study design has important implications for power calculations, in that cases that cannot be matched will not contribute to analyses. Even if most cases can be matched, increasing the number of future cases matched to each case will increase statistical efficiency.

Third, one must consider the permissible lag time between the outcome event for the current case and the outcome event for a matched future-case control. Person-time sampled from future cases needs to be sufficiently far removed in calendar time from the future-case event such that exposure can be reasonably assumed to be independent of the future-case event. If the exposure under investigation is indeed associated with the outcome, sampling person-time too close to the future-case event could lead to bias. In contrast, person-time should be close enough to the future-case event that the exposure time trend estimated using the sampled person-time provides a good approximation of the exposure time trend for current cases. This consideration is particularly important when time trends in exposure may be nonlinear or changing rapidly. Sensitivity analyses exploring alternative lag times between the case and future-case events are recommended.
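The second and third considerations can be made concrete with a small matching sketch. It is an illustration under stated assumptions, not the matching procedure used in the study: the function name, the gap limits, and the event days are hypothetical, matching is greedy and 1:1, and a future case may serve as the control for more than one current case.

```python
def match_future_cases(event_days, min_gap_days, max_gap_days):
    """Greedy 1:1 matching of each case to a later case.

    `event_days` maps a case identifier to its event day (days since study start).
    A future case is eligible as a control if its own event occurs between
    `min_gap_days` and `max_gap_days` after the current case's event.
    Returns {case_id: control_id}; cases with no eligible future case are omitted.
    """
    ordered = sorted(event_days.items(), key=lambda item: item[1])
    matches = {}
    for position, (case_id, case_day) in enumerate(ordered):
        for control_id, control_day in ordered[position + 1:]:
            gap = control_day - case_day
            if gap > max_gap_days:
                break  # all remaining cases occur even later
            if gap >= min_gap_days:
                matches[case_id] = control_id  # nearest eligible future case
                break
    return matches


# Hypothetical event days and gap limits.
events = {"a": 100, "b": 130, "c": 200, "d": 700}
matches = match_future_cases(events, min_gap_days=60, max_gap_days=365)
print(matches)
```

In this example, cases "c" and "d" remain unmatched because no eligible future case follows them, which illustrates the loss of late-occurring cases noted above. Widening or narrowing the gap limits is one way to carry out the recommended sensitivity analyses.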

Fourth, in addition to matching on time, future cases may be matched to current cases on other variables such as age, sex, or location. Although matching on multiple factors may enhance validity of estimates, the tradeoff is the potential loss of precision if the number of factors used reduces the number of cases that can successfully be matched to future cases.

In our applied example, the case-crossover approach produced estimates that indicated an elevated risk of stroke after brief exposure to vitamins, even though we observed no trend in exposure prevalence over calendar time. This biologically implausible result may be explained by a greater propensity for patients to seek treatment as their physical condition deteriorates or as early warning symptoms of stroke manifest. This can result in an increased probability of exposure to a variety of medical treatments in the time leading up to a stroke (ie, protopathic bias). An increasing propensity to be treated in the time before an event is an example of an exposure time trend that can be better estimated using person-time sampled from matched future cases than from an external noncase control group. In this example, after adjusting for the effect of the exposure time trend using person-time sampled from future cases, there was no evidence of an increased risk of stroke after brief exposure to vitamins.

We have demonstrated through a simulation study that, in the absence of other time-varying confounders, the case-case-time-control analysis is able to produce unbiased estimates when exposure prevalence increases monotonically over time. When other temporal or seasonal patterns are suspected, these methods may be adapted to account for additional temporal factors influencing exposure prevalence by borrowing from time-stratified control selection methods, such as those used in environmental epidemiology.3 In their case-only analyses, environmental epidemiologists often use selection strategies that involve matching control periods to case periods on day of week, season, or other temporal factors that may influence exposure.3,7,10,11

Neither a bidirectional sampling approach nor a self-controlled case-series analysis was included in the comparison of case-only methods because these methods assume that exposure is neither censored nor altered subsequent to the occurrence of the outcome. Although there are promising new methods for handling outcomes that censor or alter exposure probability, the computational intensity and assumptions required by these methods may limit their utility.5,12

In conclusion, case-only analyses can be applied in situations where exposure status during follow-up is time-varying and there is a clear time of onset for the outcome of interest. Their advantages over more traditional cohort and case-control designs become particularly evident when an appropriate comparison group is difficult to identify, or when there are strong, time-invariant confounders that cannot be measured.1 The within-subject comparisons used by case-only methods implicitly adjust for time-invariant confounding within a person, whether measured or unmeasured. Case-case methods add to previously developed case-only methods by adjusting for temporal changes in exposure prevalence without use of external controls or postevent person-time. Additionally, the case-case-time-control can reduce the impact of protopathic bias, a bias that can occur when early manifestations or warning signs of a disease lead to exposure.8


1. Maclure M. The case-crossover design: a method for studying transient effects on the risk of acute events. Am J Epidemiol. 1991;133:144–153.
2. Farrington CP. Control without separate controls: evaluation of vaccine safety using case-only methods. Vaccine. 2004;22:2064–2070.
3. Lumley T, Levy D. Bias in the case-crossover design: implications for studies of air pollution. Environmetrics. 2000;11:689–704.
4. Suissa S. The case-time-control design. Epidemiology. 1995;6:248–253.
5. Whitaker HJ, Farrington CP, Spiessens B, Musonda P. Tutorial in biostatistics: the self-controlled case series method. Stat Med. 2006;25:1768–1797.
6. Navidi W, Weinhandl E. Risk set sampling for case-crossover designs. Epidemiology. 2002;13:100–105.
7. Bateson TF, Schwartz J. Control for seasonal variation and time trend in case-crossover studies of acute effects of environmental exposures. Epidemiology. 1999;10:539–544.
8. Horwitz RI, Feinstein AR. The problem of “protopathic bias” in case-control studies. Am J Med. 1980;68:255–258.
9. Hernandez-Diaz S, Hernan MA, Meyer K, Werler MM, Mitchell AA. Case-crossover and case-time-control designs in birth defects epidemiology. Am J Epidemiol. 2003;158:385–391.
10. Janes H, Sheppard L, Lumley T. Case-crossover analyses of air pollution exposure data: referent selection strategies and their implications for bias. Epidemiology. 2005;16:717–726.
11. Navidi W. Bidirectional case-crossover designs for exposures with time trends. Biometrics. 1998;54:596–605.
12. Farrington CP, Whitaker HJ, Hocine MN. Case series analysis for censored, perturbed, or curtailed post-event exposures. Biostatistics. 2009;10:3–16.
© 2011 Lippincott Williams & Wilkins, Inc.