Clostridium difficile infection is a rapidly increasing cause of health care–associated infections. Based on discharge data from the Healthcare Cost and Utilization Project Nationwide Inpatient Sample, approximately 336,000 cases of C. difficile infection occur annually in the United States.1 This number of cases would cost approximately $500 million per year.2,3 In contrast to other health care–associated infections, C. difficile incidence has increased in the United States, Canada, and Europe, despite prevention efforts.4
The design and analysis of interventions to prevent C. difficile infection are complicated by the setting in which infection takes place. Hospitalized patients are statistically nonindependent with respect to infectious outcomes, as they share health care providers, a common environment, and a host of other factors. Infected patients act as a source of exposure for other patients in addition to having their own outcomes. Single-intervention studies are rare and difficult to conduct. Thus, hospital policy is often decided upon with scarce data and incomplete information. In this kind of environment, mathematical and cost-effectiveness models are widely used for decision making. To inform model development, there is a need for unbiased epidemiologic estimates of patient outcomes, including length of stay and all-cause mortality, to quantify the experience of a patient with C. difficile infection.
Quantifying these outcomes presents a 3-fold problem. First, infection events cannot be considered independent, necessitating analytic techniques that account for clustering within a hospital. Second, to facilitate the use of these estimates in mathematical models, cost-effectiveness research, and other applications, rates or hazards must be directly estimated. Finally, patients may experience several mutually exclusive outcomes such as death or release from the hospital. To address this final problem, competing risks approaches must be used. Conventional competing risk analysis (ie, a cause-specific survival model) estimates the time to one outcome while treating the other outcomes as censored.5 These estimates address a particular question; namely, in the case of death versus release from a hospital, they estimate the time until death if no one were ever released or the time until release if no one ever died while in the hospital. Although in some settings this approach might be acceptable, we wish to estimate the time until death given the observed levels of release, and the time until release given the observed levels of mortality, to inform future transmission modeling.
This study addresses the problems enumerated above by applying parametric mixture survival models to estimate 2 survival outcomes from a multihospital cohort of patients with C. difficile infection. The study had the following objectives: (1) to estimate the relative times to death from any cause and to release from the hospital, comparing patients in the intensive care unit (ICU) to those in the general hospital population; (2) to estimate the case-fatality rate for ICU and non-ICU patients, and the odds ratio between them; and (3) to compare these estimates to those from a conventional analysis.
We used a cohort of 609 adult (>18 years of age) incident cases of C. difficile infection admitted between 1 July 2009 and 31 December 2010, obtained from infection control surveillance data from 28 hospitals within the Duke Infection Control Outreach Network, a group of hospitals in the southeastern United States that shares infection control expertise and data.6 The maximum number of cases from a single hospital was 74, the minimum 1, and the median 13. All cases were hospital onset, health care facility-associated, as defined by the Centers for Disease Control and Prevention (CDC) surveillance guidelines.7 Specifically, cases must have arisen in the hospital more than 48 hours after admission. Institutional review boards at the Duke School of Medicine and the University of North Carolina at Chapel Hill approved the study protocol.
Survival Times and Outcomes
The study had 2 competing, mutually exclusive outcomes of interest: death from any cause and release from the hospital within 180 days, the latter being judged to be well beyond the duration of C. difficile infection. The origin of time at risk was defined as the date of a positive test for C. difficile. The event time was given as the date of release from the hospital or date of death. The single patient with an event time >180 days was censored at 180 days, and the 12 patients with an unknown event time were considered interval-censored from 12 hours after diagnosis to 180 days after diagnosis. Patients who were diagnosed and discharged on the same date were assumed to spend 12 hours in the hospital.
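The censoring rules above can be sketched in code. This is a minimal illustration, not the study's actual data preparation: the `Observation` type, the function name, and the use of `None` for an unknown event time are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MAX_FOLLOW_UP_DAYS = 180.0  # administrative censoring time stated in the text
SAME_DAY_STAY_DAYS = 0.5    # 12 hours, used for same-day diagnosis and discharge


@dataclass
class Observation:
    kind: str                      # "event", "right_censored", or "interval_censored"
    interval: Tuple[float, float]  # (lower, upper) bounds on the event time, in days


def encode_event_time(days_from_diagnosis: Optional[float]) -> Observation:
    """Apply the censoring rules described above to one patient (hypothetical helper)."""
    if days_from_diagnosis is None:
        # Unknown event time: somewhere between 12 hours and 180 days after diagnosis.
        return Observation("interval_censored", (SAME_DAY_STAY_DAYS, MAX_FOLLOW_UP_DAYS))
    if days_from_diagnosis > MAX_FOLLOW_UP_DAYS:
        # Event after 180 days: right-censored at 180 days.
        return Observation("right_censored", (MAX_FOLLOW_UP_DAYS, float("inf")))
    # Same-day diagnosis and discharge is counted as a 12-hour stay.
    t = max(float(days_from_diagnosis), SAME_DAY_STAY_DAYS)
    return Observation("event", (t, t))
```

Representing every observation as an interval keeps exact, right-censored, and interval-censored times in one structure, which is convenient when each type contributes a different term to the likelihood.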
Exposure Definition and Covariate Selection
The ICU status of a patient was determined at the time of diagnosis of C. difficile infection. ICU patients were those in the hospital’s ICU at the time of their diagnosis, and non-ICU patients were those not in the ICU at the time of diagnosis, regardless of whether their treatment subsequently involved the ICU.
Inverse probability weights were used to control for confounding by patient characteristics measured at hospital admission.8 Using such weights, rather than regression adjustment, allows estimated curves to represent the marginal survival function, rather than survival conditional on covariates.9 Variables considered for inclusion in the model were the following: patient age; whether the patient was on dialysis; if the patient had been previously hospitalized within 12 weeks before the current admission; if that prior admission had been in the same institution as the current admission; if the patient had been previously diagnosed with C. difficile; the patient’s sex and race; source of admission (where the patient had been before admission); the medical specialty primarily responsible for the patient (medicine, surgery, obstetrics/gynecology, etc); discharge from any hospital within the past year; and whether the C. difficile infection was a new episode, a recurrent episode, or a continuation, per CDC definitions.7
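As a hedged illustration of how such weights are formed, the sketch below computes stabilized inverse probability of exposure weights from a fitted propensity score. The text does not say whether stabilized or unstabilized weights were used, so the stabilized form is an assumption, as are the function and argument names.

```python
def stabilized_weights(icu, propensity):
    """Stabilized inverse probability weights for a binary exposure.

    icu        : list of 0/1 exposure indicators (1 = in ICU at diagnosis)
    propensity : list of estimated P(ICU = 1 | covariates), one per patient
    The propensity model itself is assumed to have been fitted elsewhere.
    """
    n = len(icu)
    p_icu = sum(icu) / n  # marginal probability of exposure (numerator of the weight)
    weights = []
    for a, p in zip(icu, propensity):
        if a == 1:
            weights.append(p_icu / p)
        else:
            weights.append((1.0 - p_icu) / (1.0 - p))
    return weights
```

Weighting each patient this way creates a pseudo-population in which the measured covariates are balanced across ICU status, which is why the resulting curves can be read as marginal rather than conditional.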
Potential confounders were included in the weighting model if they were moderately associated (P < 0.20) with either time until death or release using a Weibull or log-normal parametric survival model, respectively. We included higher order terms for the sole continuous variable (age at admission) and bivariate interactions that moderately improved model fit as evaluated by a likelihood ratio test (P < 0.20). Multiple imputation was used to handle missing covariate values; 119 cases had at least 1 missing variable. Thirty imputations were based on a multivariate normal model with all variables in the substantive analysis, including outcomes. Imputations were combined using Rubin’s canonical variance estimator.10
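The combining step can be illustrated with a short sketch of Rubin's rules for a single scalar parameter. The function name is hypothetical, and the inputs are assumed to be on the scale on which the estimator is approximately normal (for a ratio, typically the log scale).

```python
def rubin_combine(estimates, variances):
    """Pool m imputation-specific estimates and their variances with Rubin's rules.

    Returns the pooled point estimate and its total variance.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                                      # pooled estimate
    within = sum(variances) / m                                     # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)    # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between                      # total variance
    return q_bar, total
```

The `(1 + 1/m)` factor inflates the between-imputation variance to account for using a finite number of imputations (30 in this study).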
Parametric Mixture Model
We modeled time to death and time to release as a mixed survival function, SD(t) and SN(t), respectively, where SD(t) + SN(t) = 1 at t = ∞, indicating all patients having experienced 1 of the 2 outcomes of interest. These 2 functions, as well as the proportion of patients who died (π) and who were released (1– π), give a probability of an event time T taking place at T < t of P(T < t) = π[1 – SD(t)] + (1 – π)[1 – SN(t)]. Details on the theory and implementation of this type of model have been published.11 In brief, these functions can be estimated using maximum-likelihood methods, with the likelihood of a given individual i expressed as follows (Equation 1):
where fD(t) and fN(t) are the probability density functions for death and release, δ and θ are indicators for death = 1 and release = 1, and ζ and η are indicators for interval-censored times for death and release. For interval-censored observations, ti1 and ti2 indicate the 2 times bracketing the censored interval, where ti1 < ti2; in this study, ti1 = 0.5 and ti2 = 180 days. Weighting is incorporated by multiplying the natural log of Li by individual patient i’s weight.
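One form of the individual likelihood that is consistent with the definitions above can be written as follows. It is a sketch assembled from those definitions, not a transcription of Equation 1, and it omits any separate term for right-censored observations because no indicator for them is defined in the text. The weight symbol w_i is introduced here for illustration.

```latex
L_i = \bigl[\pi f_D(t_i)\bigr]^{\delta_i}
      \bigl[(1-\pi) f_N(t_i)\bigr]^{\theta_i}
      \bigl\{\pi \bigl[S_D(t_{i1}) - S_D(t_{i2})\bigr]\bigr\}^{\zeta_i}
      \bigl\{(1-\pi) \bigl[S_N(t_{i1}) - S_N(t_{i2})\bigr]\bigr\}^{\eta_i},
\qquad
\ell = \sum_i w_i \ln L_i
```

Here the weighted log-likelihood ℓ reflects the statement that each patient's log-likelihood contribution is multiplied by that patient's weight.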
The survival functions used in the mixture model may be any parametric functions. Previous studies have used exponential,12 log-normal,13 and generalized gamma survival functions,14 differing functions for each outcome,5 and nonparametric extensions of the Kaplan-Meier method,15 among others. In this study, we used a Weibull function for death and a log-normal function for release. This choice mirrored the best-fitting parametric models in the single-outcome models discussed below and in the confounder selection process.
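To make the chosen combination concrete, the sketch below evaluates P(T < t) = π[1 – SD(t)] + (1 – π)[1 – SN(t)] with a Weibull survival function for death and a log-normal survival function for release. The parameterizations (Weibull shape and scale; log-normal mean and standard deviation on the log scale) and all function names are assumptions for illustration, and no parameter values from the study are used.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape) for t > 0."""
    return math.exp(-((t / scale) ** shape))


def lognormal_survival(t, mu, sigma):
    """S(t) = 1 - Phi((ln t - mu) / sigma), with Phi the standard normal CDF."""
    z = (math.log(t) - mu) / sigma
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mixture_cdf(t, pi, s_death, s_release):
    """P(T < t) for a two-outcome mixture with mixing proportion pi for death."""
    return pi * (1.0 - s_death(t)) + (1.0 - pi) * (1.0 - s_release(t))


def death_incidence(t, pi, s_death):
    """Cumulative probability of having died by time t; bounded above by pi."""
    return pi * (1.0 - s_death(t))
```

Passing the survival functions as callables means a different distribution can be swapped in for either outcome without changing the mixture code.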
Robust standard errors with clustering by hospital were calculated to account for nonindependence among patients in the same hospital. From this model, we obtained 5 main estimates: the ratios of median survival times for death (RTD) and release (RTN) between the ICU cases and non-ICU cases; the proportions who died in hospital for the ICU cases and non-ICU cases (π1 and π0, respectively); and the odds ratio of the mixing proportions (ORπ), which provides a relative measure of mortality between the ICU groups.
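The ratio-of-medians estimates can be illustrated with the closed-form medians of the two distributions. This is a sketch under assumed parameterizations (Weibull shape and scale; log-normal log-scale mean), with hypothetical function names; it does not reproduce the study's fitted parameters.

```python
import math


def weibull_median(shape, scale):
    """Median of a Weibull distribution: scale * (ln 2) ** (1 / shape)."""
    return scale * math.log(2.0) ** (1.0 / shape)


def lognormal_median(mu):
    """Median of a log-normal distribution with log-scale mean mu: exp(mu)."""
    return math.exp(mu)


def relative_time(median_exposed, median_reference):
    """Ratio of median event times, exposed (ICU) over reference (non-ICU)."""
    return median_exposed / median_reference
```

A relative time above 1 means the exposed group reaches the median event time later than the reference group; below 1 means sooner.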
For comparison purposes, the cohort was also analyzed using conventional competing risks analysis, with each outcome modeled independently and patients who experienced the other event being considered censored at their event time. The distribution for each parametric model was determined by comparing models using the exponential, Weibull, log-normal, log-logistic, and generalized gamma distributions and selecting the model with the lowest Akaike’s information criterion. Based on this comparison, we selected a Weibull survival model to model time until death and a log-normal survival model to model release. As with the mixture model, robust standard errors were used to account for nonindependence arising from clustering by hospital. All analyses were performed using SAS 9.2 (SAS Institute, Cary, NC).
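The selection rule can be sketched as follows. The study's analysis was done in SAS; this Python fragment only illustrates the criterion, and the function names and the shape of the `fits` dictionary are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(fits):
    """Return the name of the candidate with the lowest AIC.

    fits maps a distribution name to a (maximized log-likelihood, number of parameters) pair.
    """
    return min(fits, key=lambda name: aic(*fits[name]))
```

Because the criterion penalizes each additional parameter by 2, a more flexible distribution such as the generalized gamma is chosen only if it improves the log-likelihood enough to offset its extra parameters.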
The characteristics of the patient populations in ICU and non-ICU are summarized in Table 1, with the age distribution of the cohort shown in eFigure 1 (http://links.lww.com/EDE/A794). There were 160 (26%) ICU patients and 449 (74%) non-ICU patients.
In the ICU population, 42 patients (26%) died, whereas in the non-ICU population 43 patients (10%) died. The remaining patients were released from the hospital. eFigure 2 (http://links.lww.com/EDE/A794) provides a graphical depiction of the distribution of exposures, outcomes, and survival times in a 20% random sample of the cohort.
The following factors were at least moderately associated with 1 of the 2 outcomes: patient’s age, sex, and race; the source of admission; whether this was a surgical patient; whether the patient was on dialysis; and whether this was a new case of C. difficile infection (in contrast to a continuing or recurrent case) (Table 2). We found the following interactions to improve model fit for the outcome-specific models: patient’s race with both sex and age; new C. difficile infection case status with dialysis status; patient’s sex with both surgical and dialysis status; and admission source with both patient age and dialysis status.
Parametric Survival Models
Using a conventional competing risks approach, the relative time to death (RTD) for the ICU versus non-ICU populations was 0.65 (95% confidence interval [CI] = 0.36–1.17), suggesting that ICU patients died marginally more swiftly than their counterparts in the general hospital population. The relative time until release (RTN) was 2.30 (1.66–3.18), reflecting longer lengths of stay within the ICU population (Figure 1).
The mixing proportion in the ICU population (π1) was 0.28, whereas the mixing proportion in the non-ICU population (π0) was 0.10. The odds ratio of the mixing proportions (ORπ) was 3.38 (95% CI = 1.84–6.19), which reflected the substantially higher burden of mortality in ICU patients compared with those in the general hospital population. Comparing the median event times between ICU and non-ICU patients, RTD was 1.97 (95% CI = 0.96–4.01) and RTN was 1.88 (1.40–2.51) (Figure 2). The robust standard errors resulted in a slight inflation of a parameter’s uncertainty and performed similarly to standard errors obtained using a nonparametric bootstrap method (not shown). Compared with the multiply imputed data used in the primary analysis, estimates using complete cases were less precise and resulted in markedly different effect estimates (eTable 1; http://links.lww.com/EDE/A794).
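As a worked check of how the odds ratio relates to the two mixing proportions, the sketch below computes it from the rounded values reported above. With π1 = 0.28 and π0 = 0.10 the result is approximately 3.5, close to the reported 3.38; the small gap presumably arises because the reported proportions are rounded, which is an assumption rather than something stated in the text.

```python
def odds(p):
    """Odds corresponding to a probability p (0 < p < 1)."""
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_reference):
    """Odds ratio comparing the exposed group to the reference group."""
    return odds(p_exposed) / odds(p_reference)


# Rounded mixing proportions from the text: ICU 0.28, non-ICU 0.10.
or_pi = odds_ratio(0.28, 0.10)
```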
Our estimates, in contrast to those from the conventional models described above, indicate that, despite the higher severity of illness that might reasonably be assumed, patients admitted to the ICU experienced longer times to both death and release compared with patients in the general hospital population. The differences in estimates between the 2 models are summarized in Table 3.
The purpose of this study was to examine the outcomes experienced by patients with C. difficile infection as a mixture of 2 simultaneously occurring survival processes rather than as 2 disjoint events. We believe that this approach more realistically captures the actual disposition of patients within the hospital. The use of a weighted, parametric mixture model allows for the estimation and prediction of survival times, produces marginal effect estimates and covariate-adjusted survival curves, and is free from the proportional hazards assumption. This assumption, however, is exchanged for the need to correctly specify the underlying distribution of event times, as well as proportional survival times.
This method illustrates the potential for incorrect estimation in conventional survival analysis when both outcomes are relevant to infection prevention efforts. The conventional competing risks model found a slightly reduced time until death for ICU patients (RTD of 0.65 [95% CI = 0.36–1.17]). In contrast, the mixture model’s estimate of RTD (1.97 [0.96–4.01]) lies on the other side of the null, and the CIs of the 2 estimates overlap only narrowly (between 0.96 and 1.17).
On the surface, the estimate from the conventional survival model feels more intuitive. The linkage between the frequency of an outcome and the speed at which the outcome occurs is a familiar one in survival analysis, but that familiarity should not be mistaken for mathematical certainty. The origin of this misestimation lies in what is being modeled. Being capable of modeling only the time until an event, the conventional approach assumes that the survival function for death at t = ∞ is equal to zero when in truth a minority of patients experience this outcome during their hospital stay. The rest are treated as censored; their event times are pushed out toward the tail of the distribution. This, in turn, causes the conventional survival model to vastly overestimate the per-time probability of death. For example, the probability of death at 90 days is 0.765 for ICU and 0.622 for non-ICU patients using a conventional approach—both unrealistically inflated values, given the known lower mortality rate in the study population.
In contrast, the mixture model decouples frequency and rate, forcing both survival functions to equal the mixing proportion of their respective outcomes at t = ∞, a less stringent requirement. By doing so, it can produce more realistic probabilities of an event occurring. In contrast to the inflated 90-day probabilities of death discussed above, the mixture model estimates a probability of death at 90 days of 0.275 for ICU and 0.014 for non-ICU patients. These differences in the estimated survival functions are the source of the disparate estimates of RTD and RTN. Such differences in estimated survival times have potentially important downstream effects on administrative decisions, mathematical or cost-effectiveness models, and so on.
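The contrast between the two approaches can be stated in terms of long-run limits. The notation below is an illustrative assumption: F with a superscript denotes the cumulative probability of death by time t under each model, and S_D is a proper survival function for time to death.

```latex
\lim_{t \to \infty} F_D^{\mathrm{conv}}(t)
  = \lim_{t \to \infty} \bigl[1 - S_D(t)\bigr] = 1,
\qquad
\lim_{t \to \infty} F_D^{\mathrm{mix}}(t)
  = \lim_{t \to \infty} \pi \bigl[1 - S_D(t)\bigr] = \pi
```

Under the conventional model every patient is eventually assigned to death, whereas under the mixture model the cumulative probability of death can never exceed the mixing proportion π.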
Focusing on the results of the parametric mixture model, we found that ICU patients infected with C. difficile experience both longer times until death and longer overall lengths of stay after infection, as well as a burden of mortality 3 times that of their non-ICU peers. These differences may arise because of the inherent clinical differences in patients requiring intensive care compared with those who do not. While non-ICU patients died faster, they also died in much lower numbers, perhaps representing those patients with acute conditions beyond the capacity of the hospital to treat—patients with acute stroke or cardiac arrest—or those patients whose palliative goals do not involve aggressive treatment and transfer to the ICU. In contrast, intensive care patients require more aggressive care and die relatively more frequently. At the same time, aggressive, specialized, ICU care may prolong their lives significantly.
The study estimates suggest that ICU populations demand additional resources and attention from an infection prevention perspective to control the hospital-wide problem of C. difficile. The ICU has a proportionately large volume of adverse outcomes, and ICU patients’ longer length of stay may contribute more to the contamination of the hospital environment and serve as a reservoir for in-hospital transmission of C. difficile. Patients have been shown to shed C. difficile into the environment continuously after infection, even after their symptoms have subsided.16 Because of their longer time in the hospital, ICU patients have increased opportunities to shed C. difficile spores. Whether this higher individual-level potential for shedding is outweighed by the considerably larger number of spore-shedding patients within the general hospital population who are hospitalized for shorter time periods warrants further examination.
This study has several limitations. It is not a study of the effect of C. difficile on patient outcomes but rather a study of the effect of the clinical settings on patients infected with C. difficile. Although the surveillance data used have information on whether a given patient died within the hospital, it cannot necessarily be assumed that these deaths were attributable to C. difficile, either solely or as part of a constellation of ailments. Understanding the effect of clinical setting on patient outcomes in turn allows for the examination of the patient’s potential impact on the hospital’s environment during their infection, until it is interrupted favorably by release from the hospital or unfavorably by the patient’s death. While less patient-centric than the results of other study methodologies, these types of estimates are crucial for the study and prevention of hospital-acquired infections, where infected patients not only represent adverse outcomes but also are sources of infection risk to other patients.
ICU exposure status was assigned at diagnosis, regardless of whether a patient passed through the ICU after their diagnosis. Therefore, some ICU-exposed patients may be misclassified. Finally, patients may experience a recurrent infection without the assessors’ knowledge of the prior infection, primarily because the previous treatment was at another hospital. Therefore, some covariate misclassification may have occurred. However, these limitations are inherent in the difficult task of conducting observational studies within a hospital setting, and they occur regardless of the analytic methods.
Our approach allows the separate estimation of the timing of an event and the frequency with which it occurs, providing a more nuanced view of the outcomes experienced by patients with C. difficile infection. As the interest in health care–associated infection prevention increases, so too does the need for more sophisticated analytic techniques to reflect the complexity surrounding patients, providers, and the environment of a healthcare facility.
We thank A. van Rie, W. A. Rutala, and N.H. Fefferman for their advice and comments on drafts of this manuscript.
1. HCUPnet. Agency for Healthcare Research and Quality; 2006. Available at: http://hcupnet.ahrq.gov/. Accessed 17 September 2012
2. Ghantoji SS, Sail K, Lairson DR, DuPont HL, Garey KW. Economic healthcare costs of Clostridium difficile infection: a systematic review. J Hosp Infect. 2010;74:309–318
3. McGlone SM, Bailey RR, Zimmer SM, et al. The economic burden of Clostridium difficile. Clin Microbiol Infect. 2012;18:282–289
4. Miller BA, Chen LF, Sexton DJ, Anderson DJ. Comparison of the burdens of hospital-onset, healthcare facility-associated Clostridium difficile infection and of healthcare-associated infection due to methicillin-resistant Staphylococcus aureus in community hospitals. Infect Control Hosp Epidemiol. 2011;32:387–390
5. Lau B, Cole SR, Gange SJ. Competing risk regression models for epidemiologic data. Am J Epidemiol. 2009;170:244–256
6. Anderson DJ, Miller BA, Chen LF, et al. The network approach for prevention of healthcare-associated infections: long-term effect of participation in the Duke Infection Control Outreach Network. Infect Control Hosp Epidemiol. 2011;32:315–322
7. McDonald LC, Coignard B, Dubberke E, Song X, Horan T, Kutty PK; Ad Hoc Clostridium difficile Surveillance Working Group. Recommendations for surveillance of Clostridium difficile-associated disease. Infect Control Hosp Epidemiol. 2007;28:140–145
8. Cole SR, Hernán MA. Constructing inverse probability weights for marginal structural models. Am J Epidemiol. 2008;168:656–664
9. Cole SR, Hernán MA. Adjusted survival curves with inverse probability weights. Comput Methods Programs Biomed. 2004;75:45–49
10. Little RJA, Rubin DB. The analysis of social science data with missing values. Sociol Method Res. 1989;18:292–326
11. Lau B, Cole SR, Gange SJ. Parametric mixture models to evaluate and summarize hazard ratios in the presence of competing risks with time-dependent hazards and delayed entry. Stat Med. 2011;30:654–665
12. Cox DR. The analysis of exponentially distributed life-times with two types of failure. J R Stat Soc Ser B Stat Methodol. 1959;21:411–421
13. Cole SR, Li R, Anastos K, et al. Accounting for leadtime in cohort studies: evaluating when to initiate HIV therapies. Stat Med. 2004;23:3351–3363
14. Checkley W, Brower RG, Muñoz A; NIH Acute Respiratory Distress Syndrome Network Investigators. Inference for mutually exclusive competing events through a mixture of generalized gamma distributions. Epidemiology. 2010;21:557–565
15. Ghani AC, Donnelly CA, Cox DR, et al. Methods for estimating the case fatality ratio for a novel, emerging infectious disease. Am J Epidemiol. 2005;162:479–486
16. Sethi AK, Al-Nassir WN, Nerandzic MM, Bobulsky GS, Donskey CJ. Persistence of skin contamination and environmental shedding of Clostridium difficile during and after treatment of C. difficile infection. Infect Control Hosp Epidemiol. 2010;31:21–27