# Exploring Selection Bias by Causal Frailty Models: The Magnitude Matters

Counter-intuitive associations appear frequently in epidemiology, and these results are often debated. In particular, several scenarios are characterized by a general risk factor that appears protective in particular subpopulations, for example, individuals suffering from a specific disease. However, the associations are not necessarily representing causal effects. Selection bias due to conditioning on a collider may often be involved, and causal graphs are widely used to highlight such biases. These graphs, however, are qualitative, and they do not provide information on the real life relevance of a spurious association. Quantitative estimates of such associations can be obtained from simple statistical models. In this study, we present several paradoxical associations that occur in epidemiology, and we explore these associations in a causal, frailty framework. By using frailty models, we are able to put numbers on spurious effects that often are neglected in epidemiology. We discuss several counter-intuitive findings that have been reported in real life analyses, and we present calculations that may expand the understanding of these associations. In particular, we derive novel expressions to explain the magnitude of bias in index-event studies.

From the ^{a}Department of Biostatistics, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway; and ^{b}Diakonhjemmet Hospital, Oslo, Norway.

Submitted 9 January 2016; accepted 11 January 2017.

This study was partially supported by a grant from the Norwegian Research Council (191460/V50), and by the Norwegian Cancer Society, Project/Grant Number 171851 and 4493570.

The authors report no conflicts of interest.

The computer code can be found in http://links.lww.com/EDE/B175.

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com).

Correspondence: Mats Julius Stensrud, Domus Medica, Postboks 1122 Blindern, 0317 Oslo, Norway. E-mail: m.j.stensrud@medisin.uio.no.

Many observational studies have reported factors that are harmful in the general population, but protective in certain subpopulations, or vice versa. Causal diagrams are often used to argue that these paradoxical associations are examples on selection bias due to conditioning on a collider.^{1},^{2} However, causal diagrams do not generally allow exploration of the direction and magnitude of the bias.

To better understand these biases, we numerically explore scenarios under which selection bias could be a satisfactory explanation. To do so, we explicitly model the unknown individual heterogeneity, often denoted frailty, which influences disease risk. Vaupel et al.^{3} introduced frailty models in 1979 to highlight bias in demographic studies, and frailty models are commonly used in survival analysis.^{4} We argue that these models also should be used to study bias in epidemiology.

In “Crossing Hazards” section, we describe a selection bias of observed hazard ratios, and we explore this bias in several real life scenarios. In “Paradoxes for Competing Risks” section, we highlight bias that arises in a competing risks setting, which may lead to both false protectivity and false increase in risk. Finally, in “Bias in Index-event Studies” section, we derive novel mathematical expressions to highlight issues in index-event studies, where subjects are recruited based on the occurrence of another disease.

## CROSSING HAZARDS

Suppose we want to estimate the effect of treatment *X* on outcome *Y*, which is ascertained at the end of follow-up among individuals who survived (*S* = 1) until that time. Each individual has an unknown vulnerability of developing *Y*, that is, a frailty *Z*. If treatment *X* affects survival *S*, then selection bias due to collider stratification is expected (Figure 1).

We will explore real life scenarios by considering counterfactual models. As explained in Section 3 of Aalen et al.,^{5} we consider the causal effect of an exposure *X* on the outcome *Y* conditional on the frailty *Z*. The causal effect corresponds to the average causal effect of exposure among individuals with a given frailty *Z*. Intervening on the exposure does not causally influence the frailty *Z*. Let *T* be the time to event *Y*, and let the baseline hazard rate at time *t* be defined as *α*(*t*). Assuming a binary exposure, the counterfactual event times are

and

and the counterfactual hazard rates are

When an individual is exposed (*X* = 1), she will experience *r*(*t*) times the hazard rate of the counterfactual situation in which she is not exposed (*X* = 0).

These conditional hazard rates have a causal interpretation but, in real life, they are unobservable. Rather, we estimate the unconditional hazard ratio in the population, which may not have a causal interpretation. In our examples below, we explore how the unconditional hazard ratio will differ from the underlying hazard ratios conditional on *Z*.

### Treatment Effect Is Constant Within Levels of Frailty

First, we will assess treatments with effects that gradually decrease with time. In particular, this is seen for many cancer therapeutics because the malignant cells develop resistance to the drugs.^{6–9}

Let the frailty *Z* be a gamma distributed random variable, which allows for a wide range of shapes. As an example, Figure 2A shows a frailty gamma distribution with mean (*µ*) and variance (*δ*) equal to 1. The conditional hazard ratio is defined as

for

and

under the counterfactual outcomes of receiving treatment or no treatment. It declines from

to 1 as *t* goes from 0 to 10, and thereby resembles a treatment that gradually loses effect. Figure 2B displays the hazard ratio (dashed line) when

. The solid line shows the hazard ratio at the population level, which is directly observed and influenced by the deaths of individuals before time *t* Interpreting the observed population estimate as a conditional causal effect would be fatal, because the high-risk group apparently becomes the low-risk group. Intuitively, this occurs because only the patients with relatively low frailty survive in the high-risk group. Indeed, this selection of low-frailty individuals over time is analogous to the process leading to cancer therapy resistance over time; only the fittest subjects (individuals or cancer cells) survive.

Crossing hazards may also occur for treatments that do not lose effect. It may be particularly prominent when a subgroup of individuals is nonsusceptible to developing a disease.^{10} For example, some heavy smokers never develop chronic obstructive pulmonary disease,^{11} and only about 30% of heavy alcohol drinkers will develop advanced cirrhosis or fibrosis of the liver.^{12} In such scenarios, the frailty may be described by a compound Poisson distribution, which allows for a group of nonsusceptible individuals.^{4} For example, we may study a treatment that constantly doubles the hazard of disease. Assume that the frailty has a compound Poisson distribution (

) with 30% nonsusceptible individuals, and we follow the patients for 10 years. In “Crossing Hazards” section in the Appendix, we show that the observed hazard ratio declines to 0.5, even though the causal hazard ratio is constantly equal to 2.^{2}

### Treatment Effect Varies Across Levels of Frailty

The effect of hormone replacement therapy (HRT) on coronary heart disease (CHD) in postmenopausal women is debated.^{13} We will compare the counterfactual hazard rates of receiving HRT

and not receiving HRT

. Observational studies have suggested a protective effect of HRT on CHD,^{14} but an overall increased risk of CHD has been reported in randomized controlled trials (RCTs), for example, by the Women Health Initiative (WHI).^{15} It has been speculated whether the discrepancy is due to incomplete capture of early clinical events in the observational studies,^{16–18} that is, crossing of hazards. Interestingly, crossing hazards was also reported in the randomized WHI study; in the first year of follow-up, the hazard ratio was 1.81, but it was reduced to 0.71 after 6 years (red curve in Figure 3A, confidence intervals are shaded in red). We assess whether the crossing of hazards could be explained by frailty.

In the early postmenopausal years, the risk of CHD may vary substantially among individuals, and a majority will have a very low short-term risk. Hence, it may be plausible to use a compound Poisson model in which the majority is regarded as nonsusceptible. In Figure 3A, the population hazard ratio that would be observed under a compound Poisson frailty distribution with a constant hazard ratio of 1.81, 3% fraction susceptible to CHD and *δ* = 35 is displayed (dashed line). This frailty model is a good fit to the WHI results, suggesting that the declining hazard ratio may possibly occur simply due to skewed frailty.

Hitherto, we have assumed that the treatment effect is homogeneous on the multiplicative scale, that is, the hazard ratio is equal for each level of frailty. This assumption facilitates mathematical derivations, and makes it easy to compare the observed rates with the underlying causal effect. In real life, however, the treatment response may vary across frailty levels. We will now describe scenarios in which the hazard ratio is heterogeneous. First, let the causal hazard ratio of individual *i*, HR_{i}, have a probability distribution

The distribution of

has mean equal to 1.81, and no individuals will have a hazard ratio lower than 1 for hormone therapy. We have derived results by simulating 10^{7} patients with frailty *Z* in three counterfactual scenarios for 6 years. In scenario 1, they received no treatment (baseline). In scenario 2, they got hormone therapy with homogeneous HR equal to 1.81. In scenario 3, each individual had a treatment effect drawn from Equation (1). The underlying population causal effects of scenarios 2 and 3 are equal. In Figure 3B, scenarios 2 (black curve) and 3 (blue curve) are compared when *Z* has a compound Poisson distribution. Interestingly, the magnitude of frailty bias is larger when a heterogeneous treatment effect is included.

Furthermore, we assessed the consequences of heterogeneous treatment effects for other frailty distributions. In Figure 3C, *Z* is gamma distributed. Similar to the compound Poisson models in Figure 3A, the heterogeneous treatment effect leads to larger bias. We also assessed a uniform distribution of

with mean 1.81

, and the frailty bias was similar to using the HR in Equation (1).

In conclusion, a skewed compound Poisson model is a good fit to the WHI results, and including a heterogeneous treatment effect seems to increase the magnitude of frailty bias. Hence, the crossing of hazards in the WHI study could be due to very skewed frailty.

## PARADOXES FOR COMPETING RISKS

To gain further insight into scenarios in which frailty biases could be important, we consider the concept of competing risks. The setting may be described by Figure 1, in which *S* and *Y* denotes two different events (the arrow between *S* and *Y* may be less relevant for competing risks). The frailty is still denoted *Z*. Subjects at risk of *Y* at time *t* are those who have neither experienced *S* nor *Y*. Di Serio^{19} has shown that false protectivity may occur in such scenarios, and we will derive numerical estimates.

Let the hazard rates for *S* and *Y* be

and

, respectively. The cumulative hazards are

and

. Let *Z* be gamma distributed with expectation equal to 1, and variance *δ*. The population hazard rate at time *t* for *Y*,

, has a simple expression (see derivation in section “False Protectivity” in Appendix)

Assume that a covariate increases the hazard rate of *S*, but has no influence on *Y*. This would increase *A*_{1}(*t*) in Expression (2) and thereby reduce

. Hence, the covariate would falsely appear as protective for *Y*.

For example, smoking cigarettes is associated with lower risk of developing Alzheimer’s disease in some observational studies.^{20} However, the association depends on the minimum age of baseline in the study population^{20}; higher minimum age suggests decreasing risk among the smokers.^{21} We assess whether this relation can be influenced by frailty. Let *S* and *Y* denote survival and Alzheimer’s disease, respectively, and the counterfactual outcomes are the hazard rate of Alzheimer’s disease in smokers

and nonsmokers

.

First, we use a convenient relation between the familial relative risk (FRR) and the variance, assuming identical risk within a family.^{22} Let *P* be the risk of disease in an individual.

Because we have defined the mean of the frailty distribution to be 1 (without loss of generality), we have that

where *δ* is the variance. We define the FRR for Alzheimer’s disease to be 3.5.^{23} Let the frailty *Z* be gamma distributed, and use Expression (3) to find that the variance is 2.5, assuming that the outcome is rare and that the frailty is shared between Alzheimer’s disease incidence and mortality. We define the hazard ratio for all cause death in smokers to be 1.8 compared with nonsmokers.^{24} Approximately 70% of never-smoking Americans survive until the age of 80 years.^{25} About 17% of Americans between 75 and 85 years suffer from Alzheimer’s disease,^{26} and we therefore use 17% as a crude estimate on the risk of having Alzheimer’s disease at 80. Based on these estimates, we find a cumulative baseline hazard rate of dying at 80 years (*A*_{1}(80) = 0.36) and a cumulative baseline hazard rate of getting Alzheimer’s at 80 years (*A*_{2}(80) = 0.19). We use Expression (2) to find an observed hazard ratio of 0.77 of developing Alzheimer’s disease in smokers compared with nonsmokers.

These calculations suggest that a spurious association due to smoking can occur in studies of elderly. Consistent with our observed estimate of 0.77, Hernán et al.^{21} found a relative risk of 0.72 by using data from three observational studies which included patients older than 75 years. If we considered instead individuals with mean age of 70 years, we would observe a population hazard ratio of 0.82, still suggesting a strong bias.

Here we have used numbers from real-world analyses in a clearly simplified model. However, there is no obvious reason to assume that similar phenomena are absent from the complex real life scenarios, and the model can easily be extended, for example, by replacing the common frailty *Z* with separate, correlated frailties for *S* and *Y*.

Not only false protectivity but also falsely increased risk might occur in a competing risks setting. Danaei et al.^{27} assessed the presence of frailty bias in the positive association between statins and diabetes risk. By using inverse probability weighting with variables that are known and measured, they found no evidence of bias. Their results also agree with a Mendelian randomization study.^{28} However, in other scenarios, using the inverse probability weights might be insufficient, because unknown factors can yield the survival bias. Defining a statistical model for the unknown frailty may yield additional insight.

## BIAS IN INDEX-EVENT STUDIES

Many studies in epidemiology consider patients who have experienced a particular event, for example, the onset of a disease. We will therefore study the effect of a binary exposure *X* on *Y*_{2}, conditioning on *Y*_{1} has occurred. The causal structure is exactly the same as in Figure 1, but *S* is replaced by *Y*_{1}. This is an index-event design,^{29} which may lead to bias that is often neglected,^{30} for example, in the context of obesity paradoxes.^{31}

We can explore the magnitude of index-event bias by fitting frailty models. The time to *Y*_{1} is determined by an individual hazard rate, which is the product of the frailty *Z* and a basic rate *α*(*t*)

Some individuals have an additional risk factor (e.g., obesity) giving an additional hazard rate of *k*(*t*), and their total hazard rate is

Hence,

and

are the counterfactual hazard rates for the survival times

and

respectively. Let the basic frailty *Z* be gamma distributed with expectation equal to 1 and variance equal to *δ*, and for simplicity we assume that *k*(*t*) and *α*(*t*) are constants. Given *Y*_{1} has occurred, let *Z*_{t} be the frailty variable among individuals lacking the special risk factor, and let *Z*_{t,k} be the frailty among individuals characterized by the risk factor. In sections “Bias in Index-event Studies” and “Special Case: Gamma Frailty” in Appendix, we have derived that

in distribution, where *U* is exponentially distributed and independent of *Z*_{t,k}.

Importantly, since *U* is always positive, Expression (4) shows that conditioning on *Y*_{1}, the frailty will be larger for individuals lacking the special risk factor. Heuristically, an unmeasured and possibly more severe risk factor caused *Y*_{1} in these individuals. This phenomenon is seen in breast cancer, where younger patients tend to have a worse prognosis. Old age is probably not a real protective factor. Rather, young patients are likely to have a particular vulnerability, that is frailty, that also influences survival. Similarly, the paradoxical effect of obesity in a wide range of index-event studies may be influenced by the frailty phenomenon displayed in Expression (4).

In Figure 4, we study two scenarios in which patients are monitored for 5 years after experiencing *Y*_{1}. In both scenarios, the baseline hazard of *Y*_{1}, *α*(*t*), is identical to the baseline hazard of *Y*_{2}, and the high-risk group is characterized by an average doubling of the causal hazard rate of *Y*_{1}. In Figure 4A, the observed hazard rates of *Y*_{2} in the high-risk group (red curve) and the low-risk group (black curve) are displayed, given that *Y*_{1} has occurred. The baseline hazard of *Y*_{2} is equal in both groups, but a paradoxical association is found during the whole time period. The association in Figure 4B appears even more counter-intuitive; here, the additional risk factor increases the hazard rate of both *Y*_{1} and *Y*_{2}. Nevertheless, carriers of the risk factor are observed to have lower hazard rate during the whole time period.

## DISCUSSION

Counter-intuitive associations occur naturally in observational studies. We have shown how causal models may illuminate issues in epidemiology and survival analysis. Unknown heterogeneity in disease risk may generate bias when subjects are selected after an exposure is assigned. In particular, this happens when the heterogeneity, that is, the frailty, has a multiplicative effect on the disease risk. We have numerically explored such bias by using simple frailty models.

Frailty modeling may be desirable compared with conventional sensitivity analyses. First, the parameters of the frailty models are easy to interpret. Importantly, we have shown how plausible parameter values often may be derived from published estimates. Furthermore, the frailty models naturally incorporate the time dependency of the frailty biases. Obtaining a specific estimate of the frailty bias may also be useful when detailed causal inference is performed. These characteristics make frailty models suitable for real life, bias analyses.

Our examples illustrate the intimate connection between causal models and frailty. A causal point of view is required for a profound understanding of frailty, and classical survival analysis is therefore dependent on a causal framework. On the other hand, the parametric frailty models add quantitative insight into biases that are revealed by directed acyclic graphs (DAGs). Nevertheless, frailty concepts have rarely been used to explore surprising associations in epidemiology. On the contrary, frailty models have been widely recognized in related fields such as demography^{3} and bioarcheology.^{32} It is important that epidemiologists and clinicians understand the issues of frailty bias, and numeric evaluations may often be appropriate.

## REFERENCES

### APPENDIX Crossing Hazards

Figure A1 shows the conditional and unconditional hazard ratios for a treatment that constantly halves the hazard of a disease. The dashed line describes the hazard ratio conditional on the frailty, which has a causal interpretation. The red curve shows the observed population hazard ratio at time *t* in (0,10] when the frailty has a compound Poisson distribution (*µ = δ = 1*) with 30% nonsusceptible individuals. Even though the causal hazard ratio is constantly 2, the observed hazard ratio declines to 0.5. The black curve displays the observed hazard ratio when the frailty is gamma distributed (*µ = δ = 1*). There is still a strong, time-dependent bias.

### False Protectivity

Assume that the common frailty, *Z*, is gamma distributed. The Laplace transform is frequently used in a wide range of quantitative disciplines. In survival analysis, it is particularly useful for expressing the survival functions in simple mathematical terms, and thereby we are able to highlight several important concepts.^{4} Here, the Laplace transform of *Z* is denoted

*.* The probability of surviving both risks equals

The probability of surviving to time *t* and then experiencing event 2 in the small time interval

[*t*, *t* + d*t*] equals

The rate, *µ*_{2}(*t*) at which event 2 occurs for survivors at time *t*, is then found by dividing by *S*(*t*), giving

Interestingly, the hazard rate *µ*_{2} also depends on *α*_{1}(*t*) (through its integral). Hence, any factor that influences the risk of event 1 will, on the population level, also be seen to influence the risk of event 2, even when *α*_{2}(*t*) is independent of this risk factor.

As an explicit example, let *Z* be gamma distributed with expectation equal to 1, and variance *δ*. This yields

### Bias in Index-Event Studies

Assume that a time scale is divided into two stages. The end of the first stage is defined by some event denoted *Y*_{1}, for example, kidney failure. The end of the second stage is defined by some other event *Y*_{2}, for example, death. The causal structure is displayed in Figure 1. A factor that increases the hazard rate of the first event is often found to decrease the hazard rate of the second event, as we discussed in the paragraph on false protectivity. For example, we can assume that obesity is an additional risk factor increasing the hazard rate of kidney failure (*X* in Figure 1).

This scenario can be described by a frailty model. The time to the first event, *Y*_{1}, is dependent on the hazard rate, which is given as the product of an individual specific quantity *Z* and a basic rate *α*(*t*):

Here, *Z* is considered to be a random variable over the population of individuals, specifying the level of frailty.

Some individuals have an additional risk factor (e.g., obesity) giving an additional hazard rate of *k*(*t*). Hence, their total hazard rate is

Define

and

.

The Laplace transform for individuals experiencing the first event at time *t* is given as follows:

The frailty variable defined by this Laplace transform is denoted *Z*_{t,k}.

When *k*(*t*) = 0 for all *t*, the Laplace transform of individuals experiencing the first event at time *t* is given as follows:

The frailty variable defined by this Laplace transform is denoted *Z*_{t}.

When the event *Y*_{1} has occurred, say at time *t*, we consider the hazard rate of the next event, *Y*_{2}. We assume that for known frailties the hazard rate at time *t* + *s* equals

in the special risk group, and

in the nonrisk group. Define

. The corresponding population hazard rates are then

Now, assume a special case with constant rates: α(*t*) = α, *k*(*t*) = *k*, *m*(*s*) = *m* and γ(*s*) = γ. Then the Laplace transforms are

whereas the hazard rates are (assumed *m* = 0 when *k* = 0) as follows:

These expressions will be used in section “Special Case: Gamma Frailty.”

### Special Case: Gamma Frailty

To illustrate the consequences of the derivations in Bias in index-event studies, assume that the basic frailty Laplace transform is gamma distributed:

This distribution has expectation equal to 1 and variance equal to *δ*. The derivative of this Laplace transform equals

This gives

The latter distribution is a gamma distribution with shape parameter 1 + 1/*δ*, and scale parameter

*αt* + 1/*δ*. The Laplace transform

corresponds to a mixture of this distribution and a gamma distribution with shape parameter 1/*δ* and scale parameter *αt* + 1/*δ*, with mixture proportions *α*/(*α* + *k*) and *k*/(*α* + *k*).

We may rewrite

This means that we can write as follows:

where *U* is independent of *Z*_{t,k} and exponentially distributed with scale parameter 1/*δ* + α/(*δk*) + *αt*.

In Figure 4, hazard rates for individuals with and without the special risk factor are displayed. In Figure 4A, the parameters are *t* = 1, *k* = 1, *α* = 1, *δ* = 5, *m* = 0, and *γ* = 1. In Figure 4B, *m* is changed to 0.1 and the other parameters are unchanged. In both plots, carriers of the extra risk factor *k* appear to have lower risk of *Y*_{2}, even though the causal risk is either equal to noncarriers (Figure 4A) or larger than noncarriers (Figure 4B).