Many observational studies have reported factors that are harmful in the general population, but protective in certain subpopulations, or vice versa. Causal diagrams are often used to argue that these paradoxical associations are examples on selection bias due to conditioning on a collider.1,2 However, causal diagrams do not generally allow exploration of the direction and magnitude of the bias.
To better understand these biases, we numerically explore scenarios under which selection bias could be a satisfactory explanation. To do so, we explicitly model the unknown individual heterogeneity, often denoted frailty, which influences disease risk. Vaupel et al.3 introduced frailty models in 1979 to highlight bias in demographic studies, and frailty models are commonly used in survival analysis.4 We argue that these models also should be used to study bias in epidemiology.
In “Crossing Hazards” section, we describe a selection bias of observed hazard ratios, and we explore this bias in several real life scenarios. In “Paradoxes for Competing Risks” section, we highlight bias that arises in a competing risks setting, which may lead to both false protectivity and false increase in risk. Finally, in “Bias in Index-event Studies” section, we derive novel mathematical expressions to highlight issues in index-event studies, where subjects are recruited based on the occurrence of another disease.
Suppose we want to estimate the effect of treatment X on outcome Y, which is ascertained at the end of follow-up among individuals who survived (S = 1) until that time. Each individual has an unknown vulnerability of developing Y, that is, a frailty Z. If treatment X affects survival S, then selection bias due to collider stratification is expected (Figure 1).
We will explore real life scenarios by considering counterfactual models. As explained in Section 3 of Aalen et al.,5 we consider the causal effect of an exposure X on the outcome Y conditional on the frailty Z. The causal effect corresponds to the average causal effect of exposure among individuals with a given frailty Z. Intervening on the exposure does not causally influence the frailty Z. Let T be the time to event Y, and let the baseline hazard rate at time t be defined as α(t). Assuming a binary exposure, the counterfactual event times are
and the counterfactual hazard rates are
When an individual is exposed (X = 1), she will experience r(t) times the hazard rate of the counterfactual situation in which she is not exposed (X = 0).
These conditional hazard rates have a causal interpretation but, in real life, they are unobservable. Rather, we estimate the unconditional hazard ratio in the population, which may not have a causal interpretation. In our examples below, we explore how the unconditional hazard ratio will differ from the underlying hazard ratios conditional on Z.
Treatment Effect Is Constant Within Levels of Frailty
First, we will assess treatments with effects that gradually decrease with time. In particular, this is seen for many cancer therapeutics because the malignant cells develop resistance to the drugs.6–9
Let the frailty Z be a gamma distributed random variable, which allows for a wide range of shapes. As an example, Figure 2A shows a frailty gamma distribution with mean (µ) and variance (δ) equal to 1. The conditional hazard ratio is defined as
under the counterfactual outcomes of receiving treatment or no treatment. It declines from
to 1 as t goes from 0 to 10, and thereby resembles a treatment that gradually loses effect. Figure 2B displays the hazard ratio (dashed line) when
. The solid line shows the hazard ratio at the population level, which is directly observed and influenced by the deaths of individuals before time t Interpreting the observed population estimate as a conditional causal effect would be fatal, because the high-risk group apparently becomes the low-risk group. Intuitively, this occurs because only the patients with relatively low frailty survive in the high-risk group. Indeed, this selection of low-frailty individuals over time is analogous to the process leading to cancer therapy resistance over time; only the fittest subjects (individuals or cancer cells) survive.
Crossing hazards may also occur for treatments that do not lose effect. It may be particularly prominent when a subgroup of individuals is nonsusceptible to developing a disease.10 For example, some heavy smokers never develop chronic obstructive pulmonary disease,11 and only about 30% of heavy alcohol drinkers will develop advanced cirrhosis or fibrosis of the liver.12 In such scenarios, the frailty may be described by a compound Poisson distribution, which allows for a group of nonsusceptible individuals.4 For example, we may study a treatment that constantly doubles the hazard of disease. Assume that the frailty has a compound Poisson distribution (
) with 30% nonsusceptible individuals, and we follow the patients for 10 years. In “Crossing Hazards” section in the Appendix, we show that the observed hazard ratio declines to 0.5, even though the causal hazard ratio is constantly equal to 2.2
Treatment Effect Varies Across Levels of Frailty
The effect of hormone replacement therapy (HRT) on coronary heart disease (CHD) in postmenopausal women is debated.13 We will compare the counterfactual hazard rates of receiving HRT
and not receiving HRT
. Observational studies have suggested a protective effect of HRT on CHD,14 but an overall increased risk of CHD has been reported in randomized controlled trials (RCTs), for example, by the Women Health Initiative (WHI).15 It has been speculated whether the discrepancy is due to incomplete capture of early clinical events in the observational studies,16–18 that is, crossing of hazards. Interestingly, crossing hazards was also reported in the randomized WHI study; in the first year of follow-up, the hazard ratio was 1.81, but it was reduced to 0.71 after 6 years (red curve in Figure 3A, confidence intervals are shaded in red). We assess whether the crossing of hazards could be explained by frailty.
In the early postmenopausal years, the risk of CHD may vary substantially among individuals, and a majority will have a very low short-term risk. Hence, it may be plausible to use a compound Poisson model in which the majority is regarded as nonsusceptible. In Figure 3A, the population hazard ratio that would be observed under a compound Poisson frailty distribution with a constant hazard ratio of 1.81, 3% fraction susceptible to CHD and δ = 35 is displayed (dashed line). This frailty model is a good fit to the WHI results, suggesting that the declining hazard ratio may possibly occur simply due to skewed frailty.
Hitherto, we have assumed that the treatment effect is homogeneous on the multiplicative scale, that is, the hazard ratio is equal for each level of frailty. This assumption facilitates mathematical derivations, and makes it easy to compare the observed rates with the underlying causal effect. In real life, however, the treatment response may vary across frailty levels. We will now describe scenarios in which the hazard ratio is heterogeneous. First, let the causal hazard ratio of individual i, HRi, have a probability distribution
The distribution of
has mean equal to 1.81, and no individuals will have a hazard ratio lower than 1 for hormone therapy. We have derived results by simulating 107 patients with frailty Z in three counterfactual scenarios for 6 years. In scenario 1, they received no treatment (baseline). In scenario 2, they got hormone therapy with homogeneous HR equal to 1.81. In scenario 3, each individual had a treatment effect drawn from Equation (1). The underlying population causal effects of scenarios 2 and 3 are equal. In Figure 3B, scenarios 2 (black curve) and 3 (blue curve) are compared when Z has a compound Poisson distribution. Interestingly, the magnitude of frailty bias is larger when a heterogeneous treatment effect is included.
Furthermore, we assessed the consequences of heterogeneous treatment effects for other frailty distributions. In Figure 3C, Z is gamma distributed. Similar to the compound Poisson models in Figure 3A, the heterogeneous treatment effect leads to larger bias. We also assessed a uniform distribution of
with mean 1.81
, and the frailty bias was similar to using the HR in Equation (1).
In conclusion, a skewed compound Poisson model is a good fit to the WHI results, and including a heterogeneous treatment effect seems to increase the magnitude of frailty bias. Hence, the crossing of hazards in the WHI study could be due to very skewed frailty.
PARADOXES FOR COMPETING RISKS
To gain further insight into scenarios in which frailty biases could be important, we consider the concept of competing risks. The setting may be described by Figure 1, in which S and Y denotes two different events (the arrow between S and Y may be less relevant for competing risks). The frailty is still denoted Z. Subjects at risk of Y at time t are those who have neither experienced S nor Y. Di Serio19 has shown that false protectivity may occur in such scenarios, and we will derive numerical estimates.
Let the hazard rates for S and Y be
, respectively. The cumulative hazards are
. Let Z be gamma distributed with expectation equal to 1, and variance δ. The population hazard rate at time t for Y,
, has a simple expression (see derivation in section “False Protectivity” in Appendix)
Assume that a covariate increases the hazard rate of S, but has no influence on Y. This would increase A1(t) in Expression (2) and thereby reduce
. Hence, the covariate would falsely appear as protective for Y.
For example, smoking cigarettes is associated with lower risk of developing Alzheimer’s disease in some observational studies.20 However, the association depends on the minimum age of baseline in the study population20; higher minimum age suggests decreasing risk among the smokers.21 We assess whether this relation can be influenced by frailty. Let S and Y denote survival and Alzheimer’s disease, respectively, and the counterfactual outcomes are the hazard rate of Alzheimer’s disease in smokers
First, we use a convenient relation between the familial relative risk (FRR) and the variance, assuming identical risk within a family.22 Let P be the risk of disease in an individual.
Because we have defined the mean of the frailty distribution to be 1 (without loss of generality), we have that
where δ is the variance. We define the FRR for Alzheimer’s disease to be 3.5.23 Let the frailty Z be gamma distributed, and use Expression (3) to find that the variance is 2.5, assuming that the outcome is rare and that the frailty is shared between Alzheimer’s disease incidence and mortality. We define the hazard ratio for all cause death in smokers to be 1.8 compared with nonsmokers.24 Approximately 70% of never-smoking Americans survive until the age of 80 years.25 About 17% of Americans between 75 and 85 years suffer from Alzheimer’s disease,26 and we therefore use 17% as a crude estimate on the risk of having Alzheimer’s disease at 80. Based on these estimates, we find a cumulative baseline hazard rate of dying at 80 years (A1(80) = 0.36) and a cumulative baseline hazard rate of getting Alzheimer’s at 80 years (A2(80) = 0.19). We use Expression (2) to find an observed hazard ratio of 0.77 of developing Alzheimer’s disease in smokers compared with nonsmokers.
These calculations suggest that a spurious association due to smoking can occur in studies of elderly. Consistent with our observed estimate of 0.77, Hernán et al.21 found a relative risk of 0.72 by using data from three observational studies which included patients older than 75 years. If we considered instead individuals with mean age of 70 years, we would observe a population hazard ratio of 0.82, still suggesting a strong bias.
Here we have used numbers from real-world analyses in a clearly simplified model. However, there is no obvious reason to assume that similar phenomena are absent from the complex real life scenarios, and the model can easily be extended, for example, by replacing the common frailty Z with separate, correlated frailties for S and Y.
Not only false protectivity but also falsely increased risk might occur in a competing risks setting. Danaei et al.27 assessed the presence of frailty bias in the positive association between statins and diabetes risk. By using inverse probability weighting with variables that are known and measured, they found no evidence of bias. Their results also agree with a Mendelian randomization study.28 However, in other scenarios, using the inverse probability weights might be insufficient, because unknown factors can yield the survival bias. Defining a statistical model for the unknown frailty may yield additional insight.
BIAS IN INDEX-EVENT STUDIES
Many studies in epidemiology consider patients who have experienced a particular event, for example, the onset of a disease. We will therefore study the effect of a binary exposure X on Y2, conditioning on Y1 has occurred. The causal structure is exactly the same as in Figure 1, but S is replaced by Y1. This is an index-event design,29 which may lead to bias that is often neglected,30 for example, in the context of obesity paradoxes.31
We can explore the magnitude of index-event bias by fitting frailty models. The time to Y1 is determined by an individual hazard rate, which is the product of the frailty Z and a basic rate α(t)
Some individuals have an additional risk factor (e.g., obesity) giving an additional hazard rate of k(t), and their total hazard rate is
are the counterfactual hazard rates for the survival times
respectively. Let the basic frailty Z be gamma distributed with expectation equal to 1 and variance equal to δ, and for simplicity we assume that k(t) and α(t) are constants. Given Y1 has occurred, let Zt be the frailty variable among individuals lacking the special risk factor, and let Zt,k be the frailty among individuals characterized by the risk factor. In sections “Bias in Index-event Studies” and “Special Case: Gamma Frailty” in Appendix, we have derived that
in distribution, where U is exponentially distributed and independent of Zt,k.
Importantly, since U is always positive, Expression (4) shows that conditioning on Y1, the frailty will be larger for individuals lacking the special risk factor. Heuristically, an unmeasured and possibly more severe risk factor caused Y1 in these individuals. This phenomenon is seen in breast cancer, where younger patients tend to have a worse prognosis. Old age is probably not a real protective factor. Rather, young patients are likely to have a particular vulnerability, that is frailty, that also influences survival. Similarly, the paradoxical effect of obesity in a wide range of index-event studies may be influenced by the frailty phenomenon displayed in Expression (4).
In Figure 4, we study two scenarios in which patients are monitored for 5 years after experiencing Y1. In both scenarios, the baseline hazard of Y1, α(t), is identical to the baseline hazard of Y2, and the high-risk group is characterized by an average doubling of the causal hazard rate of Y1. In Figure 4A, the observed hazard rates of Y2 in the high-risk group (red curve) and the low-risk group (black curve) are displayed, given that Y1 has occurred. The baseline hazard of Y2 is equal in both groups, but a paradoxical association is found during the whole time period. The association in Figure 4B appears even more counter-intuitive; here, the additional risk factor increases the hazard rate of both Y1 and Y2. Nevertheless, carriers of the risk factor are observed to have lower hazard rate during the whole time period.
Counter-intuitive associations occur naturally in observational studies. We have shown how causal models may illuminate issues in epidemiology and survival analysis. Unknown heterogeneity in disease risk may generate bias when subjects are selected after an exposure is assigned. In particular, this happens when the heterogeneity, that is, the frailty, has a multiplicative effect on the disease risk. We have numerically explored such bias by using simple frailty models.
Frailty modeling may be desirable compared with conventional sensitivity analyses. First, the parameters of the frailty models are easy to interpret. Importantly, we have shown how plausible parameter values often may be derived from published estimates. Furthermore, the frailty models naturally incorporate the time dependency of the frailty biases. Obtaining a specific estimate of the frailty bias may also be useful when detailed causal inference is performed. These characteristics make frailty models suitable for real life, bias analyses.
Our examples illustrate the intimate connection between causal models and frailty. A causal point of view is required for a profound understanding of frailty, and classical survival analysis is therefore dependent on a causal framework. On the other hand, the parametric frailty models add quantitative insight into biases that are revealed by directed acyclic graphs (DAGs). Nevertheless, frailty concepts have rarely been used to explore surprising associations in epidemiology. On the contrary, frailty models have been widely recognized in related fields such as demography3 and bioarcheology.32 It is important that epidemiologists and clinicians understand the issues of frailty bias, and numeric evaluations may often be appropriate.
1. Hernán MA, Schisterman EF, Hernández-Daz S. Invited commentary: composite outcomes as an attempt to escape from selection bias and related paradoxes. Am I Epidemiol. 2014;179:368–370.
2. Efird JT, O’Neal WT, Kennedy WL, Kypson AP. Grand challenge: understanding survival paradoxes in epidemiology. Front Public Health. 2013;1:3.
3. Vaupel JW, Manton KG, Stallard E. The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography. 1979:16:439–454.
4. Aalen O, Borgan O, Gjessing H. Survival and Event History Analysis: A Process Point of View. 2008.Springer Verlag.
5. Aalen OO, Cook RJ, Røysland K. Does cox analysis of a randomized survival study yield a causal treatment effect? Lifetime Data Anal. 2015;21:579–593.
6. Gottesman MM. Mechanisms of cancer drug resistance. Annu Rev Med. 2002;53:615–627.
7. Holohan C, Van Schaeybroeck S, Longley DB, Johnston PG. Cancer drug resistance: an evolving paradigm. Nat Rev Cancer. 2013;13:714–726.
8. Haar CP, Hebbar P, Wallace GC IV, et al. Drug resistance in glioblastoma: a mini review. Neurochem Res. 2012;37:1192–1200.
9. Perez EA. Impact, mechanisms, and novel chemotherapy strategies for overcoming resistance to anthracyclines and taxanes in metastatic breast cancer. Breast Cancer Res Treat. 2009;114:195–201.
10. Aalen OO, Valberg M, Grotmol T, Tretli S. Understanding variation in disease risk: the elusive concept of frailty. Int J Epidemiol. 2015;44:1408–1421.
11. Cosio MG, Saetta M, Agusti A. Immunologic aspects of chronic obstructive pulmonary disease. N Engl J Med. 2009;360:2445–2454.
12. Gao B, Bataller R. Alcoholic liver disease: pathogenesis and new therapeutic targets. Gastroenterology. 2011;141:1572–1585.
13. Hernán MA, Alonso A, Logan R, et al. Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease. Epidemiology. 2008;19:766.
14. Grodstein F, Manson JE, Stampfer MJ. Hormone therapy and coronary heart disease: the role of time since menopause and age at hormone initiation. J Women’s Health. 2006;15:35–44.
15. Manson JE, Hsia J, Johnson KC, et al. Estrogen plus progestin and the risk of coronary heart disease. N Engl J Med. 2003;349:523–534.
16. Hernán MA. The hazards of hazard ratios. Epidemiology. 2010;21:13.
17. Grodstein F, Clarkson TB, Manson JE. Understanding the divergent data on postmenopausal hormone therapy. N Engl J Med. 2003;348:645–650.
18. Hernán MA. Counterpoint: epidemiology to guide decision-making: moving away from practice-free research. Am J Epidemiol. 2015;182:834–839.
19. Di Serio C. The protective impact of a covariate on competing failures with an example from a bone marrow transplantation study. Lifetime Data Anal. 1997;3:99–122.
20. Chang C-CH, Zhao Y, Lee C-W, Ganguli M. Smoking, death, and Alzheimer’s disease: a case of competing risks. Alzheimer Dis Assoc Disord. 2012;26:.
21. Hernán MA, Alonso A, Logroscino G. Cigarette smoking and dementia: potential selection bias in the elderly. Epidemiology. 2008;19:448–450.
22. Moger TA, Aalen OO, Heimdal K, Gjessing HK. Analysis of testicular cancer data using a frailty model with familial dependence. Stat Med. 2004;23:617–632.
23. Van Duijn CM, Clayton D, Chandra V, et al. Familial aggregation of alzheimer’s disease and related disorders: a collaborative re-analysis of case-control studies. Int J Epidemiol. 1991;20(Suppl 2):S13–S20.
24. Jacobs DR, Adachi H, Mulder I, et al. Cigarette smoking and mortality risk: twenty-five–year follow-up of the seven countries study. Arch Intern Med. 1999;159:733–740.
25. Rogers RG, Hummer RA, Krueger PM, Pampel FC. Mortality attributable to cigarette smoking in the United States. Popul Dev Rev. 2005;31:259–292.
26. Hebert LE, Weuve J, Scherr PA, Evans DA. Alzheimer disease in the united states (2010–2050) estimated using the 2010 census. Neurology. 2013;80:1778–1783.
27. Danaei G, Garca Rodrguez LA, Fernandez Cantero O, Hernán MA. Statins and risk of diabetes an analysis of electronic medical records to evaluate possible bias due to differential survival. Diabetes Care. 2013;36:1236–1240.
28. Swerdlow DI, Preiss D, Kuchenbaecker KB, et al. Hmg-coenzyme a reductase inhibition, type 2 diabetes, and bodyweight: evidence from genetic analysis and randomised trials. Lancet. 2015;385:351–361.
29. Dahabreh IJ, Kent DM. Index event bias as an explanation for the paradoxes of recurrence risk research. JAMA. 2011;305:822–823.
30. Flanders WD, Eldridge RC, McClellan W. A nearly unavoidable mechanism for collider bias with index-event studies. Epidemiology. 2014;25:762–764.
31. Banack HR, Kaufman JS. The “obesity paradox” explained. Epidemiology. 2013;24:461–462.
32. Wood JW, Milner GR, Harpending HC, et al. The osteological paradox: problems of inferring prehistoric health from skeletal samples [and comments and reply]. Curr Anthropol. 1992;24:343–370.
APPENDIX Crossing Hazards
Figure A1 shows the conditional and unconditional hazard ratios for a treatment that constantly halves the hazard of a disease. The dashed line describes the hazard ratio conditional on the frailty, which has a causal interpretation. The red curve shows the observed population hazard ratio at time t in (0,10] when the frailty has a compound Poisson distribution (µ = δ = 1) with 30% nonsusceptible individuals. Even though the causal hazard ratio is constantly 2, the observed hazard ratio declines to 0.5. The black curve displays the observed hazard ratio when the frailty is gamma distributed (µ = δ = 1). There is still a strong, time-dependent bias.
Assume that the common frailty, Z, is gamma distributed. The Laplace transform is frequently used in a wide range of quantitative disciplines. In survival analysis, it is particularly useful for expressing the survival functions in simple mathematical terms, and thereby we are able to highlight several important concepts.4 Here, the Laplace transform of Z is denoted
. The probability of surviving both risks equals
The probability of surviving to time t and then experiencing event 2 in the small time interval
[t, t + dt] equals
The rate, µ2(t) at which event 2 occurs for survivors at time t, is then found by dividing by S(t), giving
Interestingly, the hazard rate µ2 also depends on α1(t) (through its integral). Hence, any factor that influences the risk of event 1 will, on the population level, also be seen to influence the risk of event 2, even when α2(t) is independent of this risk factor.
As an explicit example, let Z be gamma distributed with expectation equal to 1, and variance δ. This yields
Bias in Index-Event Studies
Assume that a time scale is divided into two stages. The end of the first stage is defined by some event denoted Y1, for example, kidney failure. The end of the second stage is defined by some other event Y2, for example, death. The causal structure is displayed in Figure 1. A factor that increases the hazard rate of the first event is often found to decrease the hazard rate of the second event, as we discussed in the paragraph on false protectivity. For example, we can assume that obesity is an additional risk factor increasing the hazard rate of kidney failure (X in Figure 1).
This scenario can be described by a frailty model. The time to the first event, Y1, is dependent on the hazard rate, which is given as the product of an individual specific quantity Z and a basic rate α(t):
Here, Z is considered to be a random variable over the population of individuals, specifying the level of frailty.
Some individuals have an additional risk factor (e.g., obesity) giving an additional hazard rate of k(t). Hence, their total hazard rate is
The Laplace transform for individuals experiencing the first event at time t is given as follows:
The frailty variable defined by this Laplace transform is denoted Zt,k.
When k(t) = 0 for all t, the Laplace transform of individuals experiencing the first event at time t is given as follows:
The frailty variable defined by this Laplace transform is denoted Zt.
When the event Y1 has occurred, say at time t, we consider the hazard rate of the next event, Y2. We assume that for known frailties the hazard rate at time t + s equals
in the special risk group, and
in the nonrisk group. Define
. The corresponding population hazard rates are then
Now, assume a special case with constant rates: α(t) = α, k(t) = k, m(s) = m and γ(s) = γ. Then the Laplace transforms are
whereas the hazard rates are (assumed m = 0 when k = 0) as follows:
These expressions will be used in section “Special Case: Gamma Frailty.”
Special Case: Gamma Frailty
To illustrate the consequences of the derivations in Bias in index-event studies, assume that the basic frailty Laplace transform is gamma distributed:
This distribution has expectation equal to 1 and variance equal to δ. The derivative of this Laplace transform equals
The latter distribution is a gamma distribution with shape parameter 1 + 1/δ, and scale parameter
αt + 1/δ. The Laplace transform
corresponds to a mixture of this distribution and a gamma distribution with shape parameter 1/δ and scale parameter αt + 1/δ, with mixture proportions α/(α + k) and k/(α + k).
We may rewrite
This means that we can write as follows:
where U is independent of Zt,k and exponentially distributed with scale parameter 1/δ + α/(δk) + αt.
In Figure 4, hazard rates for individuals with and without the special risk factor are displayed. In Figure 4A, the parameters are t = 1, k = 1, α = 1, δ = 5, m = 0, and γ = 1. In Figure 4B, m is changed to 0.1 and the other parameters are unchanged. In both plots, carriers of the extra risk factor k appear to have lower risk of Y2, even though the causal risk is either equal to noncarriers (Figure 4A) or larger than noncarriers (Figure 4B).