Selection bias and confounding, either of which can create nonexchangeability,^{1}^{,}^{2} are two key threats to valid estimation of causal effects in observational studies. Structural selection bias, our focus, can be viewed as resulting from conditioning or selection based on a collider (a variable that is a common effect of two other variables), where one of those variables is associated with a cause of exposure and the other with a cause of disease.^{3}

Structural selection bias is important because it represents a mechanism by which bias is introduced either by selection and inclusion of subjects for study or by adjustment or stratification on a variable. Examples described by Hernán et al^{3} include Berkson’s bias, differential loss to follow-up in longitudinal studies, healthy worker bias, volunteer/self-selection bias, exclusion of subjects with missing data, and situations wherein exposures vary over time with adjustment for certain variables affected by previous exposure (time-varying confounding). Hernán et al^{3} showed how it is sometimes possible to correct for structural selection bias using inverse probability (of selection) weights.

Here, we address situations wherein correction using inverse probability weighting is not possible, e.g., if some selection probabilities are zero or unknown or certain causes of selection are unmeasured.^{3} Our first goal is to derive bounds for the extent of structural selection bias under specific assumptions detailed below. These assumptions characterize a subtype of structural selection bias called M-bias,^{4} which might suggest that the bound applies only to M-bias; however, subsequent examples show how to apply it to other subtypes. Analogous to the work of Ding and VanderWeele,^{5} who developed bounds for the magnitude of confounding, our bound depends on three risk ratios, each a proposed limit for the maximum association between variables that create the bias. Our bound can also be used in sensitivity analyses by positing different sizes for the three required risk ratios, or as the basis of an e-value.^{6}

Bounds for the magnitude of several types of bias, including M-bias, were derived by Greenland.^{4} Although correct and often cited, Greenland’s bound for M-bias

was derived assuming that all associations, as measured by odds ratios (ORs), are uniform (homogeneity). Our second goal is to explore through examples the magnitude of bias and compare the performance of our bound with

when homogeneity fails. Finally, we provide an R program to calculate directly: (1) the magnitude of M-bias using a full set of parameters; (2) a bound based on only three risk ratios; and (3) the bound assuming homogeneity and one OR.

## METHODS

We now derive a bound for M-bias.^{3}^{,}^{4} We assume that causal relationships are like those in the directed acyclic graph (DAG) of Figure 1A, wherein a collider (

) is caused by two other factors (

and

), one of which is a cause of exposure

the other a cause of disease

^{3}. (More generally, structural selection bias includes situations wherein

or associates with a cause of

and,

or associates with a cause of

.) We expect bias if the analysis is restricted to those with

, say due to selection for study, or

is controlled analytically, say in a misdirected attempt to control confounding. Hernán et al^{3} describe many examples. In analogy with results of others,^{5}^{,}^{7} we want the bound to depend only on the strength of association between factors underlying bias (

in Figure 1A). Examples 5–7 show how the bound can be used with subtypes of structural selection bias other than M-bias. However, we consider neither time-varying confounding wherein one exposure affects later exposures^{3}^{,}^{8} nor selection bias without colliders (off the null).^{9}

The observed risk ratio (*oRR*), conditional on

, can be expressed in terms of the underlying probabilities as:

To simplify expression 1, we can divide numerator and denominator by

, with

and then

. We substitute

for

and obtain:

Now, the distribution of

among the exposed who are selected, i.e., those with

is:

Using this distribution to standardize for

yields the standardized risk ratio

:

We assume no other sources of bias (consistent with Figures 1A and 2A–C), specifically that the

is unbiased and yields, for a large sample, the causal risk ratio comparing risk among the exposed in the study cohort with that among the same group had they, contrary to fact, been unexposed. We define the bias

as the ratio,

divided by

:

In the Appendix, we show that a dichotomous variable

exists with the same causal relationships as

in Figure 1A and the same magnitude of bias. Thus, we can rewrite expression 2 with

replacing

as:

In Figure 1B,

replaces

of Figure 1A, and

is added to represent known confounders; all results can be conditioned on

, but henceforth we suppress

to simplify notation (Figure 1C). Many other patterns are also consistent with structural selection bias; a few, summarized in Figures 2A–C, are discussed in subsequent examples. We prove in eAppendices 2–3 (Online Supplement; http://links.lww.com/EDE/B520) that the bias based on the expected values of each sum in equation (3) never exceeds the bound

:

where

, and

measure the strengths of association between the factors

and

, in Figure 1C:

By recoding the exposed as unexposed, as did Ding and VanderWeele,^{5} it is straightforward to show that a lower bound for the bias is given by

(Online Supplement, eAppendix 2; http://links.lww.com/EDE/B520).
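As a minimal sketch of the quantities defined in this section (not the R program of the Online Supplement), the expected bias can be computed by enumerating the causal structure of Figure 1C. The sketch assumes binary variables, writes V and N for the two independent causes of the collider C, E for exposure, and D for disease, and assumes that D depends only on E and N. The function name `m_bias` and all parameter values are hypothetical and chosen only for illustration.

```python
def m_bias(p_v, p_n, p_e_given_v, p_c_given_vn, p_d_given_en, c_sel=0):
    """Return (oRR, sRR, bias) among subjects with C == c_sel.

    Assumed structure (binary variables): V -> E, V -> C <- N, N -> D, E -> D,
    with V and N independent. All inputs are probabilities that the variable is 1.
    """
    def selected_mass(v, n, e):
        # P(V=v) * P(N=n) * P(E=e | v) * P(C=c_sel | v, n)
        pv = p_v if v else 1 - p_v
        pn = p_n if n else 1 - p_n
        pe = p_e_given_v[v] if e else 1 - p_e_given_v[v]
        pc1 = p_c_given_vn[(v, n)]
        pc = pc1 if c_sel else 1 - pc1
        return pv * pn * pe * pc

    def n_dist(e):
        # Distribution of N among selected subjects with exposure level e
        mass = [sum(selected_mass(v, n, e) for v in (0, 1)) for n in (0, 1)]
        total = mass[0] + mass[1]
        return [m / total for m in mass]

    n_exposed, n_unexposed = n_dist(1), n_dist(0)

    # Observed risks: D depends only on (E, N), so average over N given (E, C)
    risk1 = sum(p_d_given_en[(1, n)] * n_exposed[n] for n in (0, 1))
    risk0 = sum(p_d_given_en[(0, n)] * n_unexposed[n] for n in (0, 1))
    # Standardized unexposed risk: uses the N distribution of the selected exposed
    risk0_std = sum(p_d_given_en[(0, n)] * n_exposed[n] for n in (0, 1))

    orr = risk1 / risk0        # observed risk ratio, conditional on selection
    srr = risk1 / risk0_std    # risk ratio standardized to the selected exposed
    return orr, srr, orr / srr


# HYPOTHETICAL parameters (not taken from the Table); E has no effect on D here.
params = dict(
    p_v=0.2,
    p_n=0.3,
    p_e_given_v={0: 0.05, 1: 0.5},
    p_c_given_vn={(0, 0): 0.05, (0, 1): 0.15, (1, 0): 0.10, (1, 1): 0.40},
    p_d_given_en={(0, 0): 0.01, (0, 1): 0.10, (1, 0): 0.01, (1, 1): 0.10},
)
orr, srr, bias = m_bias(**params)
print(f"oRR={orr:.3f}  sRR={srr:.3f}  bias={bias:.3f}")
```

With these inputs the standardized risk ratio equals 1 by construction, so any departure of the observed risk ratio from 1 is attributable to selection on the collider. Setting the collider's risk to be the same for both levels of N, or the disease risk to be the same for both levels of N, removes the bias.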

## RESULTS

Through a series of examples we now: (1) explore the magnitude of M-bias, the performance of

and compare it with

, and illustrate application of an R program to calculate bias (examples 1–4); (2) illustrate how to use

for types of structural selection bias in addition to M-bias (examples 5–7); and (3) use

to derive a bound for confounding (example 8).

### Example 1

Liu et al^{10} quantified M-bias in two hypothetical pharmacoepidemiologic cohort studies. Their first example, considered further here, concerns the effects of selective serotonin reuptake inhibitors (SSRIs) on lung cancer risk. It involves the same pattern of effects (their DAG 1) as those in our Figure 1C: SSRI, depression, smoking, coronary artery disease, and lung cancer are E, V, N, C, D in Figure 1C. In their basic scenario using parameters from the literature, Liu et al assumed that study subjects took SSRIs for depression (OR_{VE} = 27), that depression caused coronary artery disease (OR_{VC} = 1.6), and that smoking caused coronary artery disease (OR_{NC} = 3) and lung cancer (OR_{ND} = 15). We calculated risks from frequencies and ORs, as needed. The true causal risk ratio (sRR) was one. Because coronary artery disease is a collider, adjustment for it is expected to bias the exposure-outcome association. In Monte Carlo simulations, Liu et al showed that this adjustment created little bias (

; basic scenario). Here, we focus on bias potentially caused by selecting a healthy subgroup for study (coronary artery disease = 0). The expected bias we calculated (R program, eAppendix 4; http://links.lww.com/EDE/B520) for the stratum-specific RR using equation 3 was small,

(Table, column 2), consistent with their simulations.
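The step of calculating risks from frequencies and ORs can be sketched as follows. The four ORs are those quoted above from Liu et al;^{10} the baseline frequencies and the assumption that the two ORs for coronary artery disease combine multiplicatively on the odds scale (a logistic model without interaction) are hypothetical choices made only for illustration, and are not the values in the Table.

```python
def risk_from_or(baseline_risk, odds_ratio):
    """Risk in the comparison group, given the baseline risk and an odds ratio."""
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    odds = baseline_odds * odds_ratio
    return odds / (1.0 + odds)


# Odds ratios quoted in Example 1 (E = SSRI, V = depression, N = smoking,
# C = coronary artery disease, D = lung cancer).
OR_VE, OR_VC, OR_NC, OR_ND = 27.0, 1.6, 3.0, 15.0

# HYPOTHETICAL baseline frequencies (illustration only, not from the Table).
P_E_V0 = 0.02    # P(E=1 | V=0)
P_C_00 = 0.05    # P(C=1 | V=0, N=0)
P_D_N0 = 0.001   # P(D=1 | N=0)

p_e_given_v = {v: risk_from_or(P_E_V0, OR_VE ** v) for v in (0, 1)}
# ASSUMPTION: no interaction on the odds scale, so the joint OR is the product.
p_c_given_vn = {
    (v, n): risk_from_or(P_C_00, (OR_VC ** v) * (OR_NC ** n))
    for v in (0, 1)
    for n in (0, 1)
}
p_d_given_n = {n: risk_from_or(P_D_N0, OR_ND ** n) for n in (0, 1)}

for key, risk in sorted(p_c_given_vn.items()):
    print(f"P(C=1 | V={key[0]}, N={key[1]}) = {risk:.4f}")
```

Risks obtained this way can then serve as inputs to a direct calculation of the expected bias.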

Direct calculation of bias

requires specification of up to 12 parameters (Table). If we were unsure about some, but could specify the three risk ratios in equation (5), we could calculate

to bound the causal risk ratio. Taking values compatible with the full set of parameters,

, and

, we obtain

and conclude

. We interpret this as follows: if all the parameters were as stipulated (Table, column 2), the expected bias would be 0.978. If some parameters differed, but were still consistent with equation (5) (

, bias could be as extreme as 0.813. Thus,

represents a worst case, provided

. Homogeneity is violated, so

is not directly applicable. We chose to calculate

using

, yielding

and

This example, with parameters from the literature,^{10} shows that

can provide a tighter bound than

.

### Example 2

We now modify example 1. First, we change the logistic model for effects of smoking and depression on coronary artery disease to make effects heterogeneous (multiplicative scale) and reflect the ORs reported by Ford et al.^{11} The effect of depression among nonsmokers (OR_{VC} = 1.03; Table, column 3) differs from that among smokers (OR_{VC} = 2.11); the effect of smoking among the nondepressed, not reported by Ford et al,^{11} is unchanged (OR = 3.0). Second, we assume 31% are smokers (average of prevalence in Liu et al^{10} and in Ford et al). Finally, we assume the coronary artery disease frequency is 19.7% (cumulative incidence for cardiovascular disease in Ford et al). With these modifications, the expected bias is 0.855.

, providing a tighter bound than

.

### Example 3

In examples 1 and 2, we ignored the possibility that subjects may need to volunteer. We now consider the bias expected if people without coronary artery disease at baseline were invited to participate, but some refused. Haapea et al^{12} reported that the proportion refusing among men with a mood disorder was 23% higher (absolute scale) than among other men. Similarly, refusals among smokers may be 13% higher than among nonsmokers.^{13} Thus, we assume (Table) that smoking or depression alone each increases the proportion refusing to participate from 35% to 40%, and that smoking and depression together increase the proportion to 60% (supra-additivity). This pattern is partially justified because the effect of depression averaged over smoking categories is not larger than the marginal effect reported by Haapea et al.^{12} To address participation quantitatively, define

if a subject refuses participation and 0 otherwise. Figure 1C still applies if we define

. We use:

. Accounting for refusals with the assumed pattern (Table, column 4), the expected bias is 0.758. Bounds

and

are conservative. Bias is less for some other parameters.

### Example 4

We now illustrate potential bias for two, perhaps extreme, scenarios. We assume (Table, column 5) stronger effects of smoking (OR = 4) and depression (OR = 3), and substantial heterogeneity wherein coronary artery disease risk among smokers with depression was only moderately higher (OR = 1.74) than that among nondepressed nonsmokers. We also assume more refusals, and a pattern wherein refusal was only slightly more likely among depressed smokers than among nondepressed nonsmokers. In this situation with very heterogeneous effects, the expected bias is 1.64 and

was marginally invalid. Other patterns with more heterogeneity and selected, higher prevalences can lead to potentially important negative bias (

; Table, example 5). With even greater heterogeneity, the actual bias can be more than two-fold greater than

(Online Supplement, eAppendix 4; http://links.lww.com/EDE/B520), but such extremes may typically be implausible.

### Example 5

In their classic article on smoking and mortality in British physicians,^{14} Doll and Hill raise the possibility that the observed, positive association between smoking and lung cancer mortality might be explained by bias if physicians who both knew they had lung cancer and were heavy smokers had been more likely to participate (self-select) in the study. They dismiss this possibility as improbable. Our results allow quantification of how strong the associations among smoking, knowledge of lung cancer, and participation would need to have been for such self-selection to fully account for the association. The situation is similar to that in Figure 2C, where smoking

and belief that one has lung cancer (

), affect participation

lung cancer (

) affects both belief of having it (

and death from it (

). Smoking is hypothesized to be a cause of participation, so there is not a common cause of exposure

and participation. Therefore, we use a limiting case wherein

and

of Figure 1C are highly associated

which will hold for large

. The limiting bias is then

. The association between belief that one has lung cancer and death from it was plausibly very strong among these physicians, so we use

yielding

. For this bias to fully explain the observed

of 1.78 would require:

, or

. Such bias could occur if a smoker with knowledge of his/her expected death from lung cancer were 42% more likely to participate than, say, a nonsmoker with similar knowledge, and a nonsmoker without known lung cancer were 42% more likely to participate than a nonsmoker with known lung cancer. Such a strong pattern of self-selection seems implausible, supporting the conclusions of Doll and Hill.

### Example 6

Bound

also applies to more complicated situations like that in Figure 2A. To use

, we relate the causal structure in Figure 1C to that in Figure 2A. In Figure 2A,

, like

of Figure 1C, is a cause of both

and

and,

, like

of Figure 1C, is a cause of both

and

. Thus,

and

correspond to

and

, respectively. Thus, we can use these corresponding probabilities to calculate the risk ratios in equation (5) and then apply

to situations like those in Figure 2A.

### Example 7

Bound

also applies to less complicated situations like that in Figure 2B, where exposure and disease are each a cause of selection, not merely associated with such causes. To use

, we relate the causal structure in Figure 1C to that in Figure 2B, recognizing that as the risk ratios

and

become large,

and

become highly associated, approximating the situation in which

and

are the same variable. Similar reasoning holds for

and

. Thus, the selection bias bound for this situation is given by the limit, as

and

:

, by l’Hôpital’s rule. This is a worst-case scenario, in that the bias will not exceed

.

### Example 8

We can obtain a bound for the maximum magnitude of confounding as another limiting case of bound

. We reason as follows: For the causal relationships summarized in Figure 1C, if the risk ratio

is large,

and

will be highly associated, conditional on

. In the limit as

,

. The causal relationships then closely approximate those characterizing typical confounding (Figure 2D). Applying l’Hôpital’s rule to evaluate the limiting bias for large

becomes:

.

We can compare

with a bound derived by Lee^{7} for confounding when the standard is the exposed (as here). His bound is:

, where

, is the OR measuring the strength of the

association. In our formulation, the maximum bias occurs when

(eAppendix 3; http://links.lww.com/EDE/B520,

defined in equation 5). Replacing

with

to obtain the maximum bias, we have

. Thus, our result also leads to a bound for confounding, yielding Lee’s result as a special, limiting case.

These eight examples do not fully cover the possible parameter combinations. Thus, the Online Supplement (eAppendix 4; http://links.lww.com/EDE/B520) includes: (1) a program to evaluate additional combinations using randomly chosen parameters, thereby supplementing the examples and the proof; and (2) summary results for over 1,000,000 experiments that describe the parameter values and the bound’s performance and “tightness,” measured as the difference between it and the expected M-bias (simulated bias never exceeded

). The Online Supplement (eAppendix 5; http://links.lww.com/EDE/B520) also includes summaries of comparing

to

, focusing on robustness to more extreme violations of the homogeneity assumption used to derive

.
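The idea of evaluating many randomly chosen parameter combinations can be sketched as below. This is not the program of eAppendix 4: it assumes binary variables with the structure of Figure 1C, no effect of exposure on disease (so the expected bias equals the observed risk ratio among those with C = 0), and parameters drawn uniformly from (0.01, 0.99), all of which are illustrative assumptions. It summarizes only the expected bias; comparing each draw against a bound would require supplying that bound's formula.

```python
import random


def observed_rr_null(p_v, p_n, p_e, p_c, p_d, c_sel=0):
    """Expected observed RR among C == c_sel when E has no effect on D.

    p_e[v] = P(E=1 | V=v); p_c[v][n] = P(C=1 | V=v, N=n); p_d[n] = P(D=1 | N=n).
    """
    def p_n1_given(e):
        # P(N=1 | E=e, C=c_sel), by enumeration over V and N
        mass = [0.0, 0.0]
        for v in (0, 1):
            for n in (0, 1):
                pv = p_v if v else 1 - p_v
                pn = p_n if n else 1 - p_n
                pe = p_e[v] if e else 1 - p_e[v]
                pc = p_c[v][n] if c_sel else 1 - p_c[v][n]
                mass[n] += pv * pn * pe * pc
        return mass[1] / (mass[0] + mass[1])

    q1, q0 = p_n1_given(1), p_n1_given(0)
    risk1 = p_d[1] * q1 + p_d[0] * (1 - q1)
    risk0 = p_d[1] * q0 + p_d[0] * (1 - q0)
    return risk1 / risk0


def random_experiments(n_draws, seed=0):
    """Expected bias for n_draws HYPOTHETICAL random parameter sets."""
    rng = random.Random(seed)

    def u():
        return rng.uniform(0.01, 0.99)  # assumed sampling range

    biases = []
    for _ in range(n_draws):
        biases.append(
            observed_rr_null(
                p_v=u(), p_n=u(),
                p_e=[u(), u()],
                p_c=[[u(), u()], [u(), u()]],
                p_d=[u(), u()],
            )
        )
    return biases


def summarize(biases):
    """Smallest and largest bias, and the share within 5% of the null."""
    near_null = sum(1 for b in biases if 0.95 <= b <= 1.05)
    return {"min": min(biases), "max": max(biases), "share_within_5pct": near_null / len(biases)}


summary = summarize(random_experiments(10_000))
print(summary)
```

Because the draws are unconstrained, many combinations are implausible in practice; restricting the sampling ranges to substantively reasonable values is a natural refinement.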

## DISCUSSION

We report several key results. First, we report a bound for the maximum magnitude of M-bias in cohort studies that depends only on the strengths of association between factors causing the bias. This result is analogous to previous bounds for confounding,^{5}^{,}^{7} in that it depends on the strength of association between key variables but not absolute frequencies or risks. Second, we show how to apply

to types of structural selection bias in addition to M-bias (examples 5–7), including the simpler situation wherein exposure and disease directly affect selection (

; example 7). Third, examples 1 and 2 show that, absent homogeneity,

can provide meaningfully tighter bounds than

. M-bias was relatively small in these examples with plausible values from the literature. Fourth, we derive a bound for confounding (

as a special, limiting case of our M-bias bound and show that this confounding bound, when expressed using ORs for a worst-case scenario, equals that reported by Lee.^{7} Finally, we illustrate application of the bound and use of a program (Online Supplement, eAppendix 5; http://links.lww.com/EDE/B520) to quantify M-bias using a complete set of parameters and explore the magnitude of bias.

Our bound for the extent of structural selection bias depends only on the strength of association between factors causing the bias (

, equation 5; Figure 1C). If

,

or

is 1,

and M-bias is absent. The bound provides a tool for conducting sensitivity analyses without having to specify either the frequency of the potentially unobserved variables (

in Figure 1C or absolute risks, such as

. Much like bounds for confounding^{5}^{,}^{7} and that for bias because of confounding and biased control selection in case–control studies,^{15} if some variables are unmeasured (e.g.,

in Figure 1C), the bound can be applied in sensitivity analyses by using the literature, other substantive knowledge, plausibility arguments and speculation about the strengths of association (

. If substantive knowledge points to a simpler situation wherein E and D directly affect participation or selection (Figure 2B), rather than just being associated with such causes, then tighter bounds may be obtained by using the selection probabilities more directly.^{16} The approach works best and most plausibly if the variables are recognized but perhaps unmeasured, and information about the associations

and

is available in the literature.^{5}

Bound

applies when the exposed is the standard. If we define

and

, and use

or

instead of

in equation (4), we obtain a bound with the unexposed

or full, selected population (

) as the standard (Online Supplement, eAppendix 2; http://links.lww.com/EDE/B520).

Our bound for selection bias complements that of Huang and Lee,^{15} whose bound applies to ORs from case–control studies. Their bound addresses confounding as well, and reduces to the confounding bound of Lee^{7} if selection bias is absent. However, their bound for selection bias applies only to biased selection of controls in case–control studies and to ORs. It does not apply generally to M-bias or other types of structural selection bias in cohort studies, which are our focus. Hence, our results are complementary.

The magnitude of structural selection bias tends to be smaller than that for confounding,^{4}^{,}^{5} provided the effects of the covariates causing the former are similar in size to those causing the latter (e.g., if we use the same values

and

in

and in

). However, if the bias is due to effects of exposure and disease on selection (example 7, Figure 2B), this pattern may not hold.^{4}

The magnitude of M-bias in example 1 with parameters taken from the literature supports the thesis that M-bias will often be small^{4}^{,}^{10} (e.g., <5%), and bounds

and

conservative. The magnitude was larger (10%–25%) in examples 2 and 3 with heterogeneity (

and supra-additivity of refusals), relatively common covariates and parameters partially justified by the literature. M-bias was larger (>40%) in our examples and

invalid, only with substantial, perhaps implausible, heterogeneity and relatively common covariates.

Several approaches are available to address M-bias, depending on the situation and information available. With knowledge about the possibly unmeasured variables thought to have created M-bias (V and N), one can potentially specify values for the eight parameters used in equation 3 (Table; 13 with allowance for refusals) and then calculate either the bias directly or a corrected RR using inverse-probability-of-selection weights. With less knowledge about the factors, one can specify the risk ratios representing the causal effects in Figure 1C (

,

, and

) and use

; this bound should be conservative provided effects are specified judiciously. Finally, if relevant variables are known to not have importantly heterogeneous effects, one can specify the common OR and calculate

. This bound may also be conservative, unless substantial heterogeneity exists and covariates are not rare, in which case it can be too small. All four approaches can provide insight, recognizing that

definitely provides a worst-case bound, that

is probably, but not definitely conservative, and that bias (

equation 3; eAppendix 2; http://links.lww.com/EDE/B520) or the corrected RR can be accurate but with accuracy reflecting that of the input.
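The corrected RR based on inverse-probability-of-selection weights can be illustrated with expected cell frequencies. The sketch below assumes the structure of Figure 1C with binary variables, selection of subjects with C = 0, and known selection probabilities P(C = 0 | V, N) that are all greater than zero. Parameter values and function names are hypothetical. Note that the weighted analysis targets the risk ratio in the full cohort rather than among the selected exposed; under the null used here both equal 1.

```python
from itertools import product


def joint_mass(v, n, e, d, p):
    """P(V=v, N=n, E=e, D=d) in the full cohort (before selection)."""
    pv = p["p_v"] if v else 1 - p["p_v"]
    pn = p["p_n"] if n else 1 - p["p_n"]
    pe = p["p_e"][v] if e else 1 - p["p_e"][v]
    p_dis1 = p["p_d"][(e, n)]
    p_dis = p_dis1 if d else 1 - p_dis1
    return pv * pn * pe * p_dis


def risk_ratio_among_selected(p, weighted):
    """Expected RR among those with C = 0, optionally weighted by 1 / P(C=0 | V, N)."""
    cases = {0: 0.0, 1: 0.0}
    total = {0: 0.0, 1: 0.0}
    for v, n, e, d in product((0, 1), repeat=4):
        p_sel = 1 - p["p_c"][(v, n)]            # P(selected | V=v, N=n)
        mass = joint_mass(v, n, e, d, p) * p_sel  # expected fraction observed in this cell
        if weighted:
            if p_sel <= 0:
                # Weighting is impossible when a selection probability is zero
                raise ValueError("selection probability must be positive")
            mass /= p_sel
        total[e] += mass
        cases[e] += mass * d
    return (cases[1] / total[1]) / (cases[0] / total[0])


# HYPOTHETICAL parameters; exposure has no effect on disease (true RR = 1).
hypothetical = {
    "p_v": 0.2,
    "p_n": 0.3,
    "p_e": {0: 0.05, 1: 0.5},
    "p_c": {(0, 0): 0.05, (0, 1): 0.15, (1, 0): 0.10, (1, 1): 0.40},
    "p_d": {(0, 0): 0.01, (0, 1): 0.10, (1, 0): 0.01, (1, 1): 0.10},
}
crude_rr = risk_ratio_among_selected(hypothetical, weighted=False)
weighted_rr = risk_ratio_among_selected(hypothetical, weighted=True)
print(f"unweighted RR among selected: {crude_rr:.3f}")
print(f"IP-of-selection weighted RR:  {weighted_rr:.3f}")
```

The weighted estimate is only as accurate as the selection probabilities supplied, and the approach fails when V or N is unmeasured or when some selection probability is zero, which are the situations the bounds are intended for.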

The e-value approach to sensitivity analysis described by VanderWeele and Ding^{6} for confounding can also be applied to M-bias using bound

or

. Example 5 illustrates this application for a situation in which the authors^{14} raised selection bias as a possible explanation of their reported association. Like the e-value applied to confounding, one can quantify how strongly participation would need to depend on causes of the exposure and on causes of the outcome to fully explain the association.
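For orientation, the e-value for confounding and the bounding factor on which it rests can be computed as below. The two formulas are the published confounding versions of Ding and VanderWeele^{5} and VanderWeele and Ding;^{6} they are not the M-bias bound derived here, so the numbers they produce are not comparable with those of Example 5. An e-value for M-bias would be obtained analogously by solving the M-bias bound for the required strength of association. The observed risk ratio of 2.0 is a hypothetical input.

```python
import math


def confounding_bounding_factor(rr_eu, rr_ud):
    """Ding and VanderWeele bounding factor for confounding.

    rr_eu: maximum risk ratio relating exposure to the unmeasured confounder.
    rr_ud: maximum risk ratio relating the unmeasured confounder to disease.
    """
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1.0)


def e_value(observed_rr):
    """VanderWeele and Ding e-value for an observed risk ratio (confounding)."""
    rr = observed_rr if observed_rr >= 1.0 else 1.0 / observed_rr
    return rr + math.sqrt(rr * (rr - 1.0))


observed = 2.0  # HYPOTHETICAL observed risk ratio
ev = e_value(observed)
print(f"e-value: {ev:.2f}")  # 2 + sqrt(2), about 3.41
# When both associations equal the e-value, the bounding factor equals the observed RR.
print(f"bounding factor at the e-value: {confounding_bounding_factor(ev, ev):.2f}")
```

The e-value is the smallest common value of the two risk ratios for which the bounding factor reaches the observed risk ratio, which is why the second printed quantity reproduces the input.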

We note that our results provide another link between confounding and structural selection bias, both key causes of bias in observational studies and reasons for lack of exchangeability.^{1}^{,}^{4}^{,}^{15} They show that the maximum extent of confounding corresponds to a limiting case of structural selection bias in that the bounding formula for confounding is a special case of that for M-bias. We do not address selection bias without colliders, wherein off the null, selection can lead to biased estimators for the full cohort,^{9} even if unbiased for the selected subset (e.g., with

).

We briefly address bounds for confounding, which are not our main focus. In formulating their bounds, Ding and VanderWeele^{5} and Lee^{7} measured the strength of the

association with parameters that differ from the parameter

that we used. If

and

, then Ding and VanderWeele’s parameter

equals

, Lee’s parameter

equals

and

equals

. Although Ding and VanderWeele’s confounding bound (

) is tight, Lee’s bound (

and less commonly c

can be smaller than

This can happen, for example, if

or if

is small to moderate, say less than four. This is not a discrepancy, although

and to a lesser extent

can use the additional information in

. Thus, if available information indicates

or

is small or moderate, one may sometimes be able to improve on

by using

. For confounding,

, but

in a worst-case scenario when

.

Reflecting tightness of

, we could typically construct hypothetical examples of confounding for which the magnitude of confounding approaches

as

became large. However, we could not always identify examples in which the magnitude of confounding approached

. Therefore, our bound is likely not tight for confounding, and by extension it may not be tight for selection bias. Returning to our primary focus, structural selection bias, these considerations raise the question as to whether our bound could be improved, if more detailed information such as the value of both

and

is available, by using that information in an alternative definition of

, perhaps similar to

or

. We leave those questions for future research.

In summary, we provide a new formula for the maximum extent of M-bias, dropping the homogeneity assumption. We show how the bound can be modified and used with additional types of structural selection bias, including situations wherein selection is directly caused by both exposure and disease, and confounding. Through several examples, we illustrate application of an equation and use of a provided R program for direct calculation of the bound and magnitude of bias.

## APPENDIX

We show that a dichotomous variable N suffices in place of

.

Because

is not a cause of

(e.g., Figure 1C),

Consider a dichotomous variable

with the same children as

, with no parents (in the DAG) like

, and such that

and

and

so that

With these definitions, direct substitution and evaluation into equation (2; main text) show that we can write

as:

Note that:

implies that:

, so

satisfies the same bound on the strength of association as

.

## ACKNOWLEDGMENTS

We thank Dr. Mitch Klein for his helpful comments and support. Other Sources of Support: 1. Interagency Personnel Agreement, Centers for Disease Control, Center for Environmental Health, Division of Environmental Hazards and Health Effects. Consultation, environmental epidemiology. Consultation (PI: Flanders). 2. Contract, American Cancer Society 09/01/2002–8/31/2018. Consultation, cancer epidemiology (PI: Flanders). 3. DHHS-NIH-NIAID, R01AI122266 “Prenatal, Intrapartum, and Infant Antibiotic Use and Atopic Diseases in Childhood” (PI: Dr. L. Darrow), Subrecipient through UNR-16–45 (PI: Flanders). 4. R01HD092595, Eunice Kennedy Shriver National Institute of Child Health and Human Development (PI: Dr. M. Goodman).