Statistical interaction is commonly evaluated by including a product-term between two exposures of interest in the model.1,2 Whether an additive or multiplicative interaction is evaluated depends on the scale of the chosen statistical model.3 In survival analysis, the usual choice is the Cox proportional-hazard regression and statistical interaction is generally assessed as a deviation from multiplicativity, often without any mention of the scale issue.4,5
Interaction on the additive scale, however, is a more intuitive concept with a particular public health meaning, and investigating and presenting statistical interaction according to both scales has been widely recommended.1,3,6,7 To evaluate additive interaction in survival analysis, measures derived from multiplicative models, such as the relative risk due to interaction, the synergy index, or the attributable proportion due to interaction have been proposed.8,9 As a possible alternative, Rod et al.4 suggested the use of the additive hazard model.
All the available measures, whether additive or multiplicative, calculate interactions on the rate scale, estimating probabilities or hazards of the event within a fixed follow-up time. Another possible approach for the analysis of time-to-event data is the evaluation of survival percentiles, defined as the time by which a certain fraction of the population has experienced the event of interest.10 When focusing on survival percentiles, a specific probability of the event is fixed and is the time point to be estimated. While common scales for the evaluation of interaction, such as the risk differences scale, the risk ratios scale, and the odds ratio scale, have been studied extensively, measures of interaction on the survival time scale have never been investigated. This approach can be particularly relevant in the analysis of ultimate or inevitable outcomes, such as death. Statistical methods to estimate survival percentiles, and differences in survival percentiles according to exposure level, are available at the univariable and multivariable level.10 Among these, Laplace regression is a flexible approach to directly estimate the conditional percentiles of the time-to-event variable as a linear combination of the predictors.11,12
The aim of this article is to present the evaluation of statistical interaction in the context of survival percentiles, introducing Laplace regression as a possible approach to evaluate and test additive interaction in time-to-event analysis.
DEFINING ADDITIVE INTERACTION IN THE CONTEXT OF SURVIVAL PERCENTILES
The common scenario in time-to-event analysis is represented by a cohort of n individuals, free from a specific disease of interest D at time t = 0, who are observed during a follow-up period to evaluate the disease-free survival. Survival percentiles can be defined as the time points by which specific proportions of the study population have experienced the event D. For example, the time by which the first 50% of the individuals have experienced the event is defined as 50th survival percentile or median survival. The survival curve depicts a complete summary of the entire range of observed survival percentiles, presenting the proportion of events during the follow-up time. A common approach to evaluate the survival function is to fix a specific time t—usually the end of follow-up—and to estimate survival probabilities or rates of the event D in the time interval [0, t], possibly according to levels of specific exposures or risk factors. In a percentile approach, on the other hand, the incidence proportion p is fixed to a specific level and the outcome to be evaluated is the corresponding survival percentile, the time t by which the study population reaches the specific fraction of events p.
Let G and E be two binary exposures, which can take values 0 or 1, and are both risk factors for the event D. Figure 1 presents a possible survival experience for the four combinations of the two exposures (i.e., G = 0, E = 0; G = 1, E = 0; G = 0, E = 1; G = 1, E = 1). Given a fixed proportion of events p, the pth survival percentiles for each of the four groups (t00, t10, t01, t11) are displayed in the figure. The difference (t11− t00) represents the difference in the pth survival percentile between participants with both exposures and participants with neither. The quantities (t10− t00) and (t01− t00) are the differences between participants with only exposure G or E, respectively, and participants with neither exposure. Following the conventional notation introduced in terms of risk,1,7 the following difference represents an intuitive measure of interaction at the pth percentile (Ip):
This difference, calculated as an additive measure, can be rewritten as t11− t10− t01 + t00 and represents a measure of interaction on the metric of survival time. It expresses to what extent the difference in survival due to the presence of both exposures exceeds the sum of survival differences due to each specific exposure. Comparing this measure with 0, we can define the interaction as superadditive, if greater than 0, or subadditive otherwise.
Unadjusted differences in survival percentiles can be obtained from the Kaplan–Meier method, deriving the confidence intervals of the differences via bootstrap. When adjustment for other covariates is needed, other methods to evaluate survival percentiles are required. Laplace regression was introduced as a method for estimating the conditional percentiles of a potentially censored outcome, and can be used in time-to-event analysis to evaluate adjusted survival percentiles.10,11 Differently from other survival analysis techniques, Laplace regression directly models the percentiles of the time variable as a linear combination of the predictors. Coefficients estimates can be interpreted as differences in time (i.e., years, months, days) by which different subpopulations reach the same fraction of events. As any regression method, Laplace can model the outcome (i.e., survival time) as a function of multiple covariates, possibly including continuous exposures with flexible transformations. Situations of heteroskedasticity can be accommodated by letting the scale parameter depend on one or more covariates. Multiple percentiles can be estimated simultaneously, testing coefficients within and between survival percentiles. The simultaneous estimation and plotting of different percentiles might require some smoothing, depending on the variability across percentiles, and algorithms such as the lowess can be applied.
Laplace regression assumes the errors follow an asymmetric Laplace distribution. Nevertheless, this parametric assumption, shared by other methods in quantile regression,13–16 has been shown not to influence the performances of the model under different data distributions.11,12,17 Simulations studies from previous articles have documented good performances of the model in terms of computational speed, precision, robustness of standard errors, and coverage of confidence intervals that was close to the nominal value.11,18 Performance of the model was further improved after the introduction of a gradient search maximization algorithm, currently implemented for the estimation of the model.18 A flexible user-friendly program for the estimation of Laplace regression, which makes use of this algorithm, is available in Stata.12
Recent developments have increased the potential of Laplace regression in the epidemiologic context, presenting how to use the method to derive and draw adjusted survival curves,19 and to estimate percentile of attained age at the event of interest.17 This latter study presented the meaning and estimation of survival percentiles with different time scales, exploring the statistical properties of Laplace in estimating percentiles of attained age, and discussing the advantages that this application may accrue in time-to-event analysis.17
EVALUATING ADDITIVE INTERACTION WITH LAPLACE REGRESSION
To evaluate the impact of the two binary exposures G and E and their interaction on the pth survival percentile of the time variable T, we can fit the following model
The time variable T is defined as time between entry into the study and experiencing the event D. An implicit assumption to allow interpretability of Equation (2) is that parameters are constrained to keep the right-hand side of the equation positive. This should hold for all the observed, and possibly for all the potentially observable, combinations of covariate patterns. To improve interpretability, we prefer to code G and E so that their effects are in the same direction, that is, β1(p) and β2(p) have the same sign. From Equation (2), it is possible to estimate the pth survival percentiles for the four combinations of the two exposures, corresponding to the time points displayed in Figure 1.
It simply follows that the measure of additive interaction (1) is estimated by the parameter β3(p). If β3(p) > 0, we are in the presence of superadditive interaction between G and E. If β3(p) < 0, the interaction is subadditive. The statistical test associated with the parameter β3(p) can hence be viewed as a test for additive interaction. Model (2) can be extended to include additional covariates. In this situation, the interpretation of the product term coefficient as a measure of additive interaction is still valid when conditioning on the additional covariates.
The fact that in a percentile approach the incidence proportion p is fixed at a specific value, as displayed in Figure 1, implies that the measure of additive interaction depends on p. This allows the study of interaction between two risk factors according to the fraction of events considered.
We evaluated the interaction between smoking status (0 = current smoker; 1 = never smoker) and educational level (0 = primary education; 1 = high school/university education) in predicting overall mortality. We used data from the Cohort of Swedish Men and the Swedish Mammography Cohort, two large cohorts of ~80,000 men and women from central Sweden, aged 45–83 at baseline, established in 1997 and largely described elsewhere.20 After exclusions we considered in these analyses 71,238 participants who were followed up for 16 years between January 1, 1998, and December 31, 2013. During this period, an overall 23% of the study participants died (n = 16,346). Because of the different fractions of cases across strata of the exposures, we focused our main analysis on the 10th survival percentile, the time by which the first 10% of subjects have died.
First, we evaluated the following Laplace regression model on the 10th percentile, with the two binary exposures and their interaction:
Predicted values of the 10th survival percentile for each of the four subpopulations, calculated by combining the obtained coefficients estimates, are presented in the Table. The product term β3, which estimates the additive interaction presented in Equation (1), has a simple and intuitive interpretation, as it represents the excess in survival when both predictors are equal to 1. In our example, the presence of both predictors (nonsmokers and highly educated) was associated with 2.1 additional years of survival (β3 = 2.1 years, 95% CI: 1.2, 2.9). This excess is larger than 0, suggesting the presence of super additive interaction in predicting mortality between being nonsmoker and highly educated.
We next adjusted for age at baseline as a 5-year categorical variable to evaluate the age-adjusted additive interaction between smoking and education in predicting overall mortality. This adjustment, which can be done simply by including the additional covariate in model (3), strongly changed the estimate of the product-term coefficient to −0.8 years (95% CI: −1.4, −0.1). This suggests that the crude interaction was probably explained by the different distribution of age across strata of the two exposures. In particular, the median age at baseline ranged from 53 years for current smokers with high education to 66 years for never smokers with low education.
The evaluation of additive interaction can be extended to other fractions of events rather than the first 10%. For example, we fitted age-adjusted Laplace regression models to evaluate the interaction between smoking and education for percentiles 1 to 15. This was done by evaluating model (3), further adjusted for age at baseline, varying p from 1 to 15. Figure 2A presents the survival curves for the four combinations of smoking and education, further adjusted for age at baseline. The estimated coefficients of the interaction term from these models, with confidence interval, are reported in Figure 2B. When looking at the first percentiles, which represent the early cases, the interaction is positive, later decreasing to a subadditivity.
In this study, we introduced the topic of interaction in the context of survival percentiles. In a percentile approach to time-to-event outcomes, a specific fraction of events is fixed while the time point is estimated. Interaction can therefore be evaluated in the unit of measurement of time. A measure of additive interaction can be estimated using Laplace regression to model conditional survival percentiles, including a product term between two exposures of interest in the model. The regression coefficient of the product term represents the excess/decrease in survival due to the presence of both the exposures of interest.
Evaluating the possible interaction between two exposures is a common component of epidemiologic studies.21 A detailed tutorial, summarizing the wide discussion on the topic, has recently been published.7 It is important to separate the concepts of biological and statistical interaction.3,22–25 Statistical interaction, which is the focus of this article, arises from a statistical model and should not be used to draw biologic conclusions.1,24,26,27 Statistical interaction is usually assessed by including in the model a product-term between the two exposures of interest.2 This implies that the evaluation and interpretation of interaction depend on the scale of the model chosen, which can be either additive or multiplicative.1
In the context of time-to-event analysis, the multiplicative nature of the Cox proportional-hazard model implies that interaction analysis is commonly limited to the multiplicative scale.6 Various studies have underlined the important public health meaning of additive interaction, which can be used to assess which subgroups of individuals are to be treated.1,6,7,28,29 Presenting both additive and multiplicative measures of interaction has been widely recommended,7,30–32 but this practice is still very uncommon.5 Moreover, the absence of interaction on one scale implies the presence of interaction on the other scale.3,6 In survival analysis, measures of additive interaction such as the relative risk due to interaction, the synergy index, or the attributable proportion due to interaction can be calculated after fitting a Cox model.8,9 Among these, the relative risk due to interaction is the most frequently utilized approach, even if it can only be used to assess the direction of the additive interaction without any indication on the magnitude.33 As an alternative approach, the use of the additive hazard regression has been proposed.4 To the best of our knowledge, however, this method is seldom used in epidemiologic research. When dealing with time-to-event outcomes, the risk of the event of interest often varies across follow-up. All the available measures of interaction, whether additive or multiplicative, are calculated on the risk scale, and commonly assume a time-fixed effect.
Evaluating percentiles of survival represents a possible alternative for the analysis of time-to-event outcomes.10 Common approaches fix the follow-up time to a specific value and evaluate the risk of the event within that time interval. By focusing on survival percentiles, on the other hand, a specific survival probability is fixed and the time by which that proportion is achieved is estimated. This change of perspective clarifies the connection between incidence proportion and time, and allows evaluating how the joint effect of two exposures is changing over time.
In this article, we have chosen to use Laplace regression to model survival percentiles because of its stability and computational speed. Other approaches are available and worth mentioning. An option often used in epidemiologic studies is to derive the adjusted survival curves after estimating a Cox model, and calculating the quantiles. This method, however, is computationally demanding and unclear with respect to standard error derivation.34 Assumptions such as proportionality of hazards strongly influence the shape of the derived survival curve. Other methods to evaluate quantiles of censored outcomes have been proposed and are available in R and SAS.35–37 These methods make different assumptions and are valid semi-parametric alternatives to Laplace regression.11
In this article, we showed how a measure of additive interaction expressed in the metric of time can be derived by evaluating survival percentiles. An advantage of the presented approach is that the additive interaction depends on the specified fraction of events. Within a follow-up time, it is therefore possible to investigate how the interaction between two exposures changes according to the fraction of cases, or equivalently by time. Currently available methods typically provide a single product-term coefficient, implicitly assuming that main effects and their interaction are constant over the entire follow-up time. We limited our presentation to the simplest scenario of two binary covariates. Extension to interaction analysis between categorical or continuous exposures is straightforward.
In conclusion, we introduced the concept of interaction analysis in the metric of time, which can be investigated by switching the focus from survival probabilities to survival percentiles. When Laplace regression is used to model survival percentiles, a measure of additive interaction can be easily estimated without assuming a constant effect over time.
1. Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 2008 Philadelphia, PA Lippincott Williams & Wilkins
2. Greenland S.. Tests for interaction in epidemiologic studies: a review and a study of power. Stat Med. 1983;2:243–251
3. Rothman KJ, Greenland S, Walker AM.. Concepts of interaction. Am J Epidemiol. 1980;112:467–470
4. Rod NH, Lange T, Andersen I, Marott JL, Diderichsen F.. Additive interaction in survival analysis: use of the additive hazards model. Epidemiology. 2012;23:733–737
5. Knol MJ, Egger M, Scott P, Geerlings MI, Vandenbroucke JP.. When one depends on the other: reporting of interaction in case-control and cohort studies. Epidemiology. 2009;20:161–166
6. Greenland S.. Interactions in epidemiology: relevance, identification, and estimation. Epidemiology. 2009;20:14–17
7. VanderWeele TJ, Knol MJ.. A tutorial on interaction. Epidemiologic Methods. 2014;3:33–72
8. Li R, Chambless L.. Test for additive interaction in proportional hazards models. Ann Epidemiol. 2007;17:227–236
9. VanderWeele TJ.. Causal interactions in the proportional hazards model. Epidemiology. 2011;22:713–717
10. Orsini N, Wolk A, Bottai M.. Evaluating percentiles of survival. Epidemiology. 2012;23:770–771
11. Bottai M, Zhang J.. Laplace regression with censored data. Biom J. 2010;52:487–503
12. Bottai M, Orsini N.. A command for Laplace regression. Stata J. 2013;13:1–13
13. Farcomeni A.. Quantile regression for longitudinal data based on latent Markov subject-specific parameters. Stat Comput. 2012;22:141–152
14. Lee D, Neocleous T.. Bayesian quantile regression for count data with application to environmental epidemiology. J Royal Stat Soc Series C (Appl Stat). 2010;59:905–920
15. Yuan Y, Yin G.. Bayesian quantile regression for longitudinal studies with nonignorable missing data. Biometrics. 2010;66:105–114
16. Liu Y, Bottai M.. Mixed-effects models for conditional quantiles with longitudinal data. Int J Biostat. 2009;5:Article 28
17. Bellavia A, Discacciati A, Bottai M, Wolk A, Orsini N.. Using Laplace regression to model and predict percentiles of age at death, when age is the primary time-scale. Am J Epidemiol. 2015;182:271–277
18. Bottai M, Orsini N, Geraci M.. A gradient search maximization algorithm for the asymmetric Laplace likelihood. J Stat Comput Simul. 2015;85:1–7
19. Bellavia A, Bottai M, Discacciati A, Orsini N.. Adjusted survival curves with multivariable laplace regression. Epidemiology. 2015;26:e17–e18
20. Harris H, Håkansson N, Olofsson C, Julin B, Åkesson A, Wolk A.. The Swedish mammography cohort and the cohort of Swedish men: study design and characteristics of 2 population-based longitudinal cohorts. OA Epidemiol. 2013;1:16
21. Ding B, Källberg H, Klareskog L, Padyukov L, Alfredsson L.. GEIRA: gene-environment and gene-gene interaction research application. Eur J Epidemiol. 2011;26:557–561
22. Rothman KJ.. Causes. Am J Epidemiol. 1976;104:587–592
23. Rothman KJ.. Synergy and antagonism in cause-effect relationships. Am J Epidemiol. 1974;99:385–388
24. Thompson WD.. Effect modification and the limits of biological inference from epidemiologic data. J Clin Epidemiol. 1991;44:221–232
25. Greenland S.. Basic problems in interaction assessment. Environ Health Perspect. 1993;101(Suppl 4):59–66
26. Siemiatycki J, Thomas DC.. Biological models and statistical interactions: an example from multistage carcinogenesis. Int J Epidemiol. 1981;10:383–387
27. Cordell HJ.. Epistasis: what it means, what it doesn’t mean, and statistical methods to detect it in humans. Hum Mol Genet. 2002;11:2463–2468
28. Knol MJ, VanderWeele TJ, Groenwold RH, Klungel OH, Rovers MM, Grobbee DE.. Estimating measures of interaction on an additive scale for preventive exposures. Eur J Epidemiol. 2011;26:433–438
29. Saracci R.. Interaction and synergism. Am J Epidemiol. 1980;112:465–466
30. Botto LD, Khoury MJ.. Commentary: facing the challenge of gene-environment interaction: the two-by-four table and beyond. Am J Epidemiol. 2001;153:1016–1020
31. Vandenbroucke JP, von Elm E, Altman DG, et al.STROBE Initiative. Strengthening the reporting of observational studies in epidemiology (STROBE): explanation and elaboration. Epidemiology. 2007;18:805–835
32. Knol MJ, VanderWeele TJ.. Recommendations for presenting analyses of effect modification and interaction. Int J Epidemiol. 2012;41:514–520
33. Skrondal A.. Interaction as departure from additivity in case-control studies: a cautionary note. Am J Epidemiol. 2003;158:251–258
34. Lai TL, Su Z.. Confidence intervals for survival quantiles in the Cox regression model. Lifetime Data Anal. 2006;12:407–419
35. Powell JL.. Censored regression quantiles. J Econom. 1986;32:143–155
36. Portnoy S.. Censored regression quantiles. J Am Stat Assoc. 2003;98:1001–1012
37. Peng L, Huang Y.. Survival analysis with quantile regression models. J Am Stat Assoc. 2008;103:637–649