The CSH corresponds to the rate of type j events at time t in the presence of all types of events. The presence of other events affects the “risk set,” that is, the people who have survived event-free up to time t. For example, if a person dies from a type 2 event (eg, cancer-caused death), then that person is no longer at risk for the type 1 event (eg, stroke). Thus, estimation of the CSH for the type j event removes from the “risk set” people who have experienced other types of competing events before time t, and treats only those who have neither failed nor been censored before time t as being at risk for the type j event. When X is a single categorical variable (eg, treatment indicator), we may evaluate whether λj(t) differs across categories of X. When λj differs by a constant multiplicative factor across categories (the proportional hazards assumption), we can use the standard log-rank test to test for a difference in hazards. When X is continuous or contains multiple covariates, we may use the Cox model to estimate how different variables affect the CSH.14
CSH modeling is useful for estimating the average relative risk reduction due to treatment in trials. It is important to model the CSH for both the primary and competing outcomes to obtain a comprehensive assessment of the effects of the treatment.
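The EFS, S(t | X), is defined as the probability of not experiencing any event until time t:
$$S(t \mid X) = \Pr(T > t \mid X) \tag{3}$$
EFS can be written in terms of the overall hazard function, λ(t | X):
$$S(t \mid X) = \exp\left\{-\int_0^t \lambda(u \mid X)\,du\right\} \tag{4}$$
The overall hazard function is the sum of all the CSHs:
$$\lambda(t \mid X) = \sum_{j=1}^{J} \lambda_j(t \mid X) \tag{5}$$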
EFS is commonly used, because it simplifies the competing risks problem by transforming it into a univariate survival problem, where only the time to occurrence of the first event, whatever it may be, is considered. A well-known example is the composite outcome strategy in clinical trials. Frequently, endpoints such as the time to myocardial infarction, stroke, and all-cause mortality are combined as a single outcome. Therefore, standard techniques of univariate survival analysis including the log-rank test and Cox regression model can be used (when censoring is independent).
EFS may be used to estimate the average effect of the treatment on the overall survival in a trial. However, it should only be used when the treatment alters the risk of primary and competing outcomes in the same direction, in which case it has greater power to detect a treatment effect than the CSH approach.
Cumulative Incidence Function
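The CIF of an event of type j, Fj(t | X), is the probability of experiencing the event of type j in the time interval (0, t]. It is defined as:
$$F_j(t \mid X) = \Pr(T \le t,\ \varepsilon = j \mid X) \tag{6}$$
It is related to the CSH of the various types of events as follows:
$$F_j(t \mid X) = \int_0^t \lambda_j(u \mid X)\, S(u \mid X)\,du \tag{7}$$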
The CIF is also known as the absolute cause-specific risk16 and the crude incidence function.18 Whereas the CSH denotes the instantaneous rate of event occurrence, the CIF is the probability of the event's occurrence over a meaningfully long time interval. The CIF is a subdistribution function; it is a nondecreasing function of time with Fj(t = ∞) = Pr(ε = j) ≤ 1.
In Eq. 7, λj(u | X) is the CSH (Eq. 1) and S(u | X) is the EFS (Eq. 3). Thus, Eq. 7 demonstrates the connection between the 3 estimands: CIF, CSH, and EFS. The CIF of the type j event depends upon the CSH of all types of events through its dependence on the EFS, S(u | X). Importantly, even if an intervention has no direct impact on λ1, it can still influence F1 by affecting the CSH of one of the competing outcomes. Conversely, it is possible for an intervention to have a significant impact on λ1, but have no impact on F1. This is an essential feature of the competing risks problem. Because of this dependence of F1(t | X) on all λj(t | X), testing for equality between K groups and regression estimation for F1 are not straightforward. The hypothesis of equality of λ1 between K groups is not equivalent to the hypothesis of equality of F1. Consequently, the log-rank test is not applicable for testing the equality of CIFs. Gray proposed a test for testing the equality of the CIFs of a particular type of event for K different groups.17
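The dependence of F1 on every CSH can be checked numerically. The following is a minimal Python sketch, assuming constant (exponential) cause-specific hazards with J = 2; the rate values and function names are illustrative assumptions and are not quantities from this study. It evaluates the integral in Eq. 7 by a midpoint rule and compares it with the closed form that holds under constant hazards.

```python
import math

def efs(t, rates):
    """Event-free survival S(t) = exp(-(sum of CSHs) * t) for constant hazards."""
    return math.exp(-sum(rates) * t)

def cif_closed_form(t, j, rates):
    """Closed-form CIF for constant hazards: (lambda_j / lambda) * (1 - S(t))."""
    total = sum(rates)
    return rates[j] / total * (1.0 - efs(t, rates))

def cif_numeric(t, j, rates, steps=20000):
    """Midpoint-rule approximation of the integral of lambda_j * S(u) over (0, t]."""
    h = t / steps
    return sum(rates[j] * efs((k + 0.5) * h, rates) * h for k in range(steps))

# Hypothetical rates (per year): the primary-event CSH is 0.1 in both arms,
# and only the competing-event CSH differs between arms.
control = (0.1, 0.1)
treated = (0.1, 0.3)

t = 5.0
f1_control = cif_closed_form(t, 0, control)
f1_treated = cif_closed_form(t, 0, treated)
print(f"F1({t}) control: {f1_control:.3f}, treated: {f1_treated:.3f}")
```

In this sketch the CIF of the primary event is lower in the treated arm even though λ1 is identical in both arms, because the higher competing hazard lowers S(u) at every u.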
Regression modeling for estimating the association between the CIF and covariates, X, is complicated because Fj(t | X) depends on X in a complex manner via the individual CSH regression functions (Eq. 7). If the main goal is to estimate the CIF for a given set of covariates X, we can construct Cox models for each CSH and use the definition in Eq. (7). This is the “indirect” approach. It is valid provided the proportional hazards assumptions for the individual CSH models are reasonable.18 If, however, the goal is not estimation of Fj, but estimation of the covariate effect on Fj, the indirect approach is not suitable. Fine and Gray proposed a direct regression model assuming a proportional hazards form for the “hazard rate” of the subdistribution function:
$$\lambda_j^{*}(t \mid X) = -\frac{d}{dt}\log\{1 - F_j(t \mid X)\} = \lambda_{j0}^{*}(t)\exp(\beta^{\top} X)$$
where λ*j0(t) is an unspecified baseline subdistribution hazard and β is the vector of regression coefficients.
The covariates are treated as linear on log(−log(1 − Fj(t | X))).19 This model also allows for time-varying effects of covariates on the CIF. Since Fine and Gray,19 several modeling techniques have been proposed for direct regression estimation of the CIF.20–22 These techniques can be implemented using R functions or SAS macros.
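To make the “hazard rate” of the subdistribution function concrete, the following minimal Python sketch computes −d/dt log(1 − F1(t)) under the simplifying assumption of constant cause-specific hazards with J = 2. The rates and function names are hypothetical and chosen only for illustration; this is not the Fine and Gray estimation procedure itself.

```python
import math

def efs(t, l1, l2):
    """Event-free survival for constant cause-specific hazards l1 and l2."""
    return math.exp(-(l1 + l2) * t)

def cif1(t, l1, l2):
    """CIF of the type 1 event for constant cause-specific hazards."""
    return l1 / (l1 + l2) * (1.0 - efs(t, l1, l2))

def subdistribution_hazard1(t, l1, l2):
    """-d/dt log(1 - F1(t)) = l1 * S(t) / (1 - F1(t)), since dF1/dt = l1 * S(t)."""
    return l1 * efs(t, l1, l2) / (1.0 - cif1(t, l1, l2))

l1, l2 = 0.1, 0.3  # hypothetical constant CSHs (primary, competing)
for t in (0.0, 1.0, 5.0, 10.0):
    print(t, round(subdistribution_hazard1(t, l1, l2), 4))
```

Under these assumptions the subdistribution hazard equals the CSH only at t = 0 and then declines, because people who have already failed from the competing cause remain in the denominator 1 − F1(t). A covariate effect on the subdistribution hazard is therefore not interchangeable with a covariate effect on the CSH.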
Because the CIF is a measure of the actual probability of events, it may be used to evaluate the effect of a treatment on the absolute risk of outcomes. It is better suited for risk-benefit calculations than the CSH. Fine and Gray's regression approach is useful for developing risk prediction models for the primary outcome (Table 1).
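Now, we show why the 1 − KM estimator overestimates the CIF. The 1 − KM estimate of the cumulative incidence of event type j is:
$$G_j(t \mid X) = \int_0^t \lambda_j(u \mid X)\, S_j(u \mid X)\,du = 1 - S_j(t \mid X) \tag{8}$$
where
$$S_j(t \mid X) = \exp\left\{-\int_0^t \lambda_j(u \mid X)\,du\right\} \tag{9}$$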
Comparison of Eqns. (8) with (7) and (9) with (3) shows that the 1 − KM estimator sets the CSH of all event types, except that of type j, to zero. Consequently, Sj(t | X) > S(t | X), and Gj(t | X) > Fj(t | X). Thus, the 1 − KM estimator overestimates the cumulative incidence of event type j. This estimator treats those who die of causes other than j as being at risk for type j event, or equivalently, it means that 1 − KM estimates cumulative incidence for cause j in the hypothetical scenario where other causes of failure are absent.
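The same inequality can be seen with nonparametric estimates. The sketch below is a minimal pure-Python illustration on an invented toy data set (10 subjects, event types coded 1 and 2, with 0 for censoring); the data and the function name are assumptions made for this example. It computes the cumulative incidence estimate that multiplies each cause-specific increment by the all-cause survival just before that time, alongside 1 − KM with other causes treated as censored.

```python
def incidence_curves(times, types, cause):
    """Return a list of (time, cif, one_minus_km) for the given cause.

    types: 0 = censored, positive integers = event type.
    cif uses all-cause survival just before each time (competing risks respected);
    one_minus_km treats events of other types as censored observations.
    """
    data = sorted(zip(times, types))
    at_risk = len(data)
    surv_all = 1.0   # event-free survival just before the current time
    km_cause = 1.0   # Kaplan-Meier survival for `cause` alone
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = leaving = 0
        while i < len(data) and data[i][0] == t:
            event_type = data[i][1]
            d_cause += event_type == cause
            d_any += event_type != 0
            leaving += 1
            i += 1
        cif += surv_all * d_cause / at_risk
        surv_all *= 1.0 - d_any / at_risk
        km_cause *= 1.0 - d_cause / at_risk
        curve.append((t, cif, 1.0 - km_cause))
        at_risk -= leaving
    return curve

# Invented toy data: follow-up times and event types (0 = censored).
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
types = [1, 2, 1, 2, 0, 1, 2, 1, 0, 1]

for t, cif, one_minus_km in incidence_curves(times, types, cause=1):
    print(f"t={t:2d}  CIF={cif:.3f}  1-KM={one_minus_km:.3f}")
```

The two curves agree until the first competing event occurs; afterwards 1 − KM lies above the cumulative incidence estimate, because it keeps redistributing probability mass as if those who failed from the other cause could still experience the type 1 event.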
SECTION 3: A SIMULATION STUDY OF THE 3 ANALYTIC METHODS FOR TWO-SAMPLE TESTING
Three different measures of outcome were defined in Section 2: CSH, EFS, and CIF. We illustrate the use of these estimands for detecting the effect of an intervention on the outcomes, under different scenarios of how the intervention affects the different outcomes. Of the 3, CSH is fundamental. EFS and CIF are nonlinear functions of CSH (Eqs. 1, 3, 5, and 7). The implication of this is that an intervention can have different effects on the 3 estimands. For example, an intervention may produce a significant reduction in the CSH of the primary outcome, but it may not affect the EFS or the CIF of the primary outcome to a comparable extent.
We consider 2 types of events (J = 2): the primary event of clinical interest (ε = 1) and the competing event (ε = 2). We test whether there is a difference between 2 treatment groups (A = new treatment and B = control) in terms of the EFS, the CSH for the primary event, and the CIF for the primary event.
We use the log-rank test for testing equality of the CSH for the primary event between the groups A and B:
H0: λ1A(t) = λ1B(t), for all t.
We use Gray's K-sample test for testing the equality of the CIF for the primary event between the groups A and B:
I0: F1A(t) = F1B(t), for all t.
We use the log-rank test for testing equality of EFS. This is equivalent to testing the equality of the overall hazard rate between the groups A and B:
S0: λA(t) = λB(t), for all t.
For our simulation, we generated 4 sets of uncorrelated random numbers from exponential distributions, representing the failure times for the primary and the competing events in each of the 2 treatment groups, A (treatment) and B (placebo). The rates for the 4 exponential distributions were chosen to reflect 9 possible effects that an intervention might have on the 2 events (Table 2). The sample size was 100 in each group. For the log-rank tests of H0 and S0, we used the survdiff function in the R package survival.24 For Gray's K-sample test of I0, we used the cuminc function in the R package cmprsk.24 All the tests were set up to have a nominal type-I error rate of 0.05.
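The data-generating step and the two log-rank tests can be sketched in a few lines. The following Python code is a minimal, standard-library illustration and is not the authors' R code: the log-rank statistic is hand-coded, the exponential rates, the seed, and the 200 replicates are hypothetical choices (the rates are not the values of Table 2), and Gray's test of I0 is not implemented here.

```python
import math
import random

def simulate_group(n, rate1, rate2, rng):
    """Draw latent exponential times for both event types; keep the earlier one."""
    subjects = []
    for _ in range(n):
        t1 = rng.expovariate(rate1)  # latent time to the primary event
        t2 = rng.expovariate(rate2)  # latent time to the competing event
        subjects.append((min(t1, t2), 1 if t1 <= t2 else 2))
    return subjects

def logrank_chi2(times, events, groups):
    """Two-sample log-rank statistic (1 df). events: 1 = event, 0 = censored."""
    data = sorted(zip(times, events, groups))
    at_risk = len(data)
    at_risk_a = sum(g for _, _, g in data)
    obs_minus_exp = 0.0
    variance = 0.0
    i = 0
    while i < len(data):
        t = data[i][0]
        d = d_a = left = left_a = 0
        while i < len(data) and data[i][0] == t:
            _, e, g = data[i]
            d += e
            d_a += e * g
            left += 1
            left_a += g
            i += 1
        if d > 0 and at_risk > 1:
            p = at_risk_a / at_risk
            obs_minus_exp += d_a - d * p
            variance += d * p * (1 - p) * (at_risk - d) / (at_risk - 1)
        at_risk -= left
        at_risk_a -= left_a
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0

def p_value(chi2):
    """Upper tail probability of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def rejection_rates(rates_a, rates_b, n=100, reps=200, alpha=0.05, seed=1):
    """Fraction of replicates in which H0 (CSH) and S0 (EFS) are rejected."""
    rng = random.Random(seed)
    reject_h0 = reject_s0 = 0
    for _ in range(reps):
        sample = (simulate_group(n, rates_a[0], rates_a[1], rng)
                  + simulate_group(n, rates_b[0], rates_b[1], rng))
        times = [t for t, _ in sample]
        groups = [1] * n + [0] * n
        primary = [1 if eps == 1 else 0 for _, eps in sample]  # type 2 treated as censored
        any_event = [1] * len(sample)                          # first event of any type
        reject_h0 += p_value(logrank_chi2(times, primary, groups)) < alpha
        reject_s0 += p_value(logrank_chi2(times, any_event, groups)) < alpha
    return reject_h0 / reps, reject_s0 / reps

# Hypothetical (primary, competing) rates for groups A and B.
print(rejection_rates((1.0, 1.0), (1.0, 1.0)))  # no treatment effect on either event
print(rejection_rates((0.5, 1.5), (1.0, 1.0)))  # effects in opposite directions
```

In the second call the overall hazard is the same in both groups (0.5 + 1.5 = 1.0 + 1.0), so the EFS test is expected to reject at about its nominal rate, whereas the CSH test on the primary event should reject far more often.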
Table 2 (a) shows scenarios under which there is no treatment effect on the primary event. In scenario 1, where treatment has no effect on either the primary or the competing event, all 3 tests reject the null hypothesis at the correct probability of 0.05. In scenarios 2 and 3, where only the competing event rate is affected by the treatment, the hypothesis of equality of CSH (H0) is rejected at the correct probability level of 0.05. The hypothesis of equality of the CIF of the primary event (test I0) in the 2 groups is rejected at a probability greater than 0.05. This is because the treatment affects F1 through its effect on λ2, even though it does not affect λ1. We can, in fact, detect this treatment effect on the competing event rate λ2 with greater statistical power by performing the log-rank test of λ2A(t) = λ2B(t), for all t. This underscores the need to perform testing on the CSH for all causes.
Table 2 (b) shows scenarios (4 and 5) where the treatment only affects the primary event. These scenarios illustrate that the test of equality of the CSH of the primary event (test H0) has the greatest power to detect a treatment effect. In Table 2 (c) the treatment affects both primary and competing events. When the treatment affects both event types in the same direction (scenarios 6 and 7), testing the equality of EFS (test S0) is most powerful for detecting a treatment effect. However, there are many examples of treatments whose effects on competing events were not anticipated before large trials of sufficient duration were performed.25 Therefore, results of EFS analysis should be reported along with the effects of treatment on the CSH of primary and competing events. When treatment affects both event types, but in opposite directions (scenarios 8 and 9), the EFS-based test has the least power to detect a treatment impact. In this case, the test for equality of the CIF of the primary event (test of I0) has the most power for detecting a treatment effect. This includes instances where the treatment reduces primary events (such as a beneficial reduction in the incidence of bad outcomes) but increases competing events (such as a harmful increase in the incidence of competing events), a situation that is of paramount relevance to clinical trials. Results of CIF analysis should not be reported without also reporting the effects of treatment on the CSH of primary and competing events.
SECTION 4: REGRESSION MODELING OF COMPETING RISKS—APPLICATION TO AN RCT
We now use an example of a randomized controlled trial of the effect of diethylstilbestrol for treating prostate cancer to demonstrate a comprehensive approach to the evaluation of treatment effectiveness in the presence of competing risks.26 The trial outcomes were published in 1980; the data were reanalyzed using CSH in 1986.27 Participants were allocated to 1 of 4 regimens: placebo, or 0.2, 1.0, or 5.0 mg of diethylstilbestrol daily. Following Green and Byar, we dichotomized the treatments and designated placebo and 0.2 mg as the placebo group, and 1.0 and 5.0 mg as the treated group. Death from prostate cancer was the primary outcome; deaths from cardiovascular events and other causes were competing events. We report results for the 483 patients with complete information on all relevant variables. Of these, 344 died, with 149 from prostate cancer, 139 from cardiovascular causes, and 56 from other causes.
Kay reported the results of separate Cox proportional hazards models for each cause of failure, where failures from the other causes were treated as censored observations27; we analyze these data by modeling the CIF, using the direct regression approach of Fine and Gray19 (Table 3).
An analysis of only the primary outcome, prostate cancer death, without also examining the treatment effect on the competing risks (cardiovascular and other deaths), would overestimate the benefits of the treatment. From a CSH analysis of all outcome types it is evident that the treatment is protective against prostate cancer and death from “other causes,” but raises the risk of cardiovascular deaths (Table 3, first row). A CIF analysis leads to a similar inference. Figure 2 compares the CIFs. Gray's 2-sample test was conducted to determine whether the treatment affected the CIFs (P values are reported in the figure caption).17 The cumulative incidence plots reveal that the probability of cardiovascular death is high initially in the treatment group, indicating that the treatment increased cardiovascular death within 1 year of its initiation.
An EFS analysis using all-cause mortality suggests that the treatment is beneficial since it reduced the rate of overall death. This is a dangerously misleading inference, because the data also show that the treatment is related to excess cardiovascular mortality. Furthermore, there can be no good justification for combining cancer and cardiovascular mortality. It is worth noting the lack of power to detect a significant treatment difference on all-cause mortality, because the treatment effect on overall death is diluted by the contrasting treatment effects on cancer plus other mortality and cardiovascular mortality. This is consistent with scenarios 8 and 9 in Table 2 (c), where the EFS based test S0 had the least power for detecting a treatment effect (Figure 2).
To demonstrate the potential for bias when estimating treatment-induced absolute risk reduction (ARR) without accounting for competing events, we evaluated the effectiveness of diethylstilbestrol (DES) in reducing the absolute risk of prostate cancer death. A popular but flawed approach is to estimate this as the difference between the 1 − KM estimates of the cumulative incidences of prostate cancer in the treatment and control groups. The correct approach is to estimate the ARR as the difference between the CIFs of the 2 groups. We compare these 2 approaches in Figure 3, which shows that 1 − KM overestimates the ARR due to DES treatment. The positive bias of 1 − KM is especially pronounced when the treatment has a protective effect on the competing risks. In this example, DES increased the risk of cardiovascular death during the initial 1-year period, but it was also modestly protective against the other competing event, “other” causes of death, during the later study period. Therefore, in the initial 1-year period we do not see any significant bias of the 1 − KM estimate of ARR for prostate cancer, but we see a positive bias after 1 year.
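The direction of this bias can be illustrated with a small worked calculation. The Python sketch below assumes constant cause-specific hazards in each arm; all rates are hypothetical and are not estimates from the DES trial. It compares the ARR computed from the CIFs with the ARR computed from 1 − KM when the treatment is protective against the competing event and when it is harmful.

```python
import math

def cif1(t, l1, l2):
    """CIF of the primary event under constant cause-specific hazards l1, l2."""
    total = l1 + l2
    return l1 / total * (1.0 - math.exp(-total * t))

def one_minus_km1(t, l1):
    """1 - KM for the primary event: the competing hazard is ignored."""
    return 1.0 - math.exp(-l1 * t)

def arr_comparison(t, control, treated):
    """Return (ARR from CIFs, ARR from 1 - KM, bias of the 1 - KM version)."""
    arr_cif = cif1(t, *control) - cif1(t, *treated)
    arr_km = one_minus_km1(t, control[0]) - one_minus_km1(t, treated[0])
    return arr_cif, arr_km, arr_km - arr_cif

# Hypothetical (primary, competing) hazards per year.
control = (0.2, 0.2)
treated_protective = (0.1, 0.1)  # treatment lowers both hazards
treated_harmful = (0.1, 0.4)     # treatment lowers primary, raises competing

for label, treated in (("protective", treated_protective), ("harmful", treated_harmful)):
    arr_cif, arr_km, bias = arr_comparison(5.0, control, treated)
    print(f"{label:10s} ARR(CIF)={arr_cif:.3f}  ARR(1-KM)={arr_km:.3f}  bias={bias:+.3f}")
```

Because 1 − KM depends only on λ1, it returns the same ARR in both scenarios, whereas the CIF-based ARR changes with the treatment's effect on the competing hazard. In this sketch the overestimation is clearly larger when the treatment also protects against the competing event.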
SECTION 5: STATISTICAL APPROACHES BASED ON UNOBSERVABLE VARIABLES
The 3 approaches discussed so far are aimed at estimating the effect of interventions on observable quantities: the CSH itself, and CIF and EFS, which are functions of CSH. These are called “crude” estimates of treatment effect (Section 1). The crude estimands are observable in the sense that they can be estimated using only observable information such as time-to-event and the type of event (under the assumption that censoring due to drop-out is completely at random). Furthermore, the modeling assumptions can be checked using observed data. It may be argued that the crude treatment effects do not represent the “true” biologic impacts of the intervention on outcomes, and that such evaluations should be based on the unobservable marginal distributions of the times to outcomes.23,28,29 Treatment effects on the marginal failure time distributions are called “net” treatment effects. Although it seems attractive, estimation of net effects is fraught with conceptual and methodological difficulties: (a) in what sense does the time to stroke exist for an individual who has died of breast cancer? and (b) how do we check assumptions about the dependency between the multivariate failure times when only one of them is observable on each person? Prentice et al provide a cogent critique of this approach.4 Some recent advances have attempted to address these conceptual issues. Rubin, who called this problem “truncation by death,” proposed that a valid comparison of 2 interventions, in terms of their effects on the primary outcome, can only be carried out among persons who would not experience failure from any of the other competing outcomes under either intervention.30 However, we cannot know who such people are from observed data. Further assumptions, which are usually untestable, are required to identify such people and to estimate the differences in survival times for the 2 interventions in that group.
In summary, assessing the “true” biologic impact of interventions in a competing risks problem is challenging.
SECTION 6: SUMMARY AND RECOMMENDATIONS
We have illustrated the appropriateness of different estimands for addressing different study objectives (Table 1). CSH is the fundamental and most commonly used estimand in competing risks problems. It is appropriate for investigating the effect of an intervention on the rate of occurrence of an event, allowing for the presence of all types of outcome events. For example, the CSH approach might be useful to evaluate the average treatment effect of aspirin to prevent stroke in a population with hypertension, where mortality from other causes can occur. Results of CSH analysis for primary and competing outcomes should always be reported in a competing risks problem.
EFS analysis, frequently used for evaluating the composite effect of an intervention in randomized trials, is appropriate only when combining different endpoints is clinically and biologically meaningful, and when the treatment has similar effects (both in magnitude and sign) on the individual event types. For example, EFS might appropriately be used to study the effectiveness of antiretroviral therapy in patients with HIV/AIDS. The therapy is effective at reducing both opportunistic infections and mortality, which may be combined appropriately in a composite outcome. Results of EFS analysis should always be accompanied by the effects of treatment on the CSH of primary and competing events.
CIF is useful for evaluating the effect of an intervention on the probability of occurrence of an event of a particular type over a meaningful period of time. This is best for absolute risk calculations, risk prediction, risk/benefit analyses of a treatment, and identification of subgroups most likely to benefit from a treatment. CIF might be appropriate for investigating an intervention with both substantial benefits and risks like warfarin for the management of atrial fibrillation. Warfarin reduces the risk of stroke but also raises the risk of hemorrhage. Results of CIF analysis should always be reported alongside the effects of treatment on the CSH of primary and competing events.
The 1 − KM estimator should not be used for estimating the absolute risk when competing risks are present, because (a) its validity is problematic, (b) it overestimates the absolute risk, and (c) it likely overestimates the ARR due to an intervention.
The relevance and usefulness of the evidence derived from the use of different estimands may depend on the stakeholder. The needs of the clinician may differ from those of a payor or a policymaker, and this may guide the investigator choosing an estimand. It may be informative for the investigator to explore all 3 estimands (CSH, CIF, and EFS) when analyzing data, although it may not be necessary to report all the results, especially if the study objectives clearly dictate the use of a particular estimand.
The technical challenges of implementing the competing risks methodology are diminishing. This methodology is becoming increasingly available in popular software so that investigators can readily implement the methodology for evaluating health outcomes in the presence of competing risks.
This discussion is also relevant to comparative effectiveness research where the goal is identifying effective interventions for use in usual care settings. In these settings, the patients are heterogeneous in their physiological and functional characteristics. This heterogeneity can impact treatment effectiveness by affecting the baseline risk of the primary outcome, treatment responsiveness, treatment induced harm, and the rate of competing events. Application of the methods described in this article may contribute to an improved understanding of risks and benefits of interventions under different settings.
The authors thank the project manager Dr. Christine Weston for her assistance with the project and our technical expert panelists Drs. Mitch Gail, Bob Wallace, and Louise Walter for their valuable input. The authors also thank Ms. Pamela Shepherd for assistance with manuscript preparation. The first author (R.V.) would also like to thank Ms. Laura Podewils for introducing him to the competing risks problem.
1. Seal HL. Studies in the history of probability and statistics. XXXV: multiple decrements or competing risks. Biometrika
2. Fix E, Neyman J. A simple stochastic model of recovery, relapse, death and loss of patients. Hum Biol
3. Tsiatis A. A nonidentifiability aspect of the problem of competing risks. Proc Natl Acad Sci USA
4. Prentice RL, Kalbfleisch JD, Peterson AV Jr, et al. The analysis of failure times in the presence of competing risks. Biometrics
5. Gooley TA, Leisenring W, Crowley J, et al. Estimation of failure probabilities in the presence of competing risks: new representations of old estimators. Stat Med
6. Gaynor JJ. On the use of cause-specific failure and conditional failure probabilities. Examples from clinical oncology data. J Am Stat Assoc
7. Pepe MS. Inference for events with dependent risks in multiple endpoint studies. J Am Stat Assoc
8. Kay R, Schumacher M. Unbiased assessment of treatment effects on disease recurrence and survival in clinical trials. Stat Med
9. Lentine KL, Rocca Rey LA, Kolli S, et al. Variations in the risk for cerebrovascular events after kidney transplant compared with experience on the waiting list and after graft failure. Clin J Am Soc Nephrol
10. Ridker PM, Danielson E, Fonseca FA, et al. Rosuvastatin to prevent vascular events in men and women with elevated C-reactive protein. N Engl J Med
11. Freemantle N, Calvert M, Wood J, et al. Composite outcomes in randomized trials: greater precision but with greater uncertainty? JAMA
12. Walter LC, Covinsky KE. Cancer screening in elderly patients: a framework for individualized decision making. JAMA
13. Gail MH, Costantino JP, Bryant J, et al. Weighing the risks and benefits of tamoxifen treatment for preventing breast cancer. J Natl Cancer Inst
14. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. New York, NY: Wiley; 1980.
15. Benichou J, Gail MH. Estimates of absolute cause-specific risk in cohort studies. Biometrics
16. Korn EL, Dorey FJ. Applications of crude incidence curves. Stat Med
17. Gray RJ. Class of K-sample tests for comparing the cumulative incidence of a competing risk. Ann Stat
18. Cheng SC, Fine JP, Wei LJ. Prediction of cumulative incidence function under the proportional hazards model. Biometrics
19. Fine JP, Gray RJ. Proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc
20. Fine JP. Regression modeling of competing crude failure probabilities. Biostatistics
21. Scheike TH. Direct modelling of regression effects for transition probabilities in multistate models. Scand J Stat
22. Klein JP, Andersen PK. Regression modeling of competing risks data based on pseudovalues of the cumulative incidence function. Biometrics
23. Robins JM. An analytic method for randomized trials with informative censoring: part 1. Lifetime Data Anal
24. R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2009.
25. Lauer MS, Topol EJ. Clinical trials--multiple treatments, multiple end points, and multiple lessons. JAMA
26. Byar DP, Green SB. The choice of treatment for cancer patients based on covariate information. Bull Cancer
27. Kay R. Treatment effects in competing-risks analysis of prostate cancer data. Biometrics
28. Lin DY, Robins JM, Wei LJ. Comparing two failure time distributions in the presence of dependent censoring. Biometrika
29. Peng L, Fine JP. Regression modeling of semicompeting risks data. Biometrics
30. Rubin DB. Causal inference without counterfactuals, by A.P. Dawid [comment]. J Am Stat Assoc
31. Peto R, Peto J. Asymptotically efficient rank invariant test procedures. J R Stat Soc
Keywords: cause-specific hazards; event-free survival; cumulative incidence function; semicompeting risks; comparative effectiveness
© 2010 Lippincott Williams & Wilkins, Inc.