Limits for the Magnitude of M-bias and Certain Other Types of Structural Selection Bias

Flanders, W. Dana (a); Ye, Dongni (b)

doi: 10.1097/EDE.0000000000001031

Background: Structural selection bias and confounding are key threats to validity of causal effect estimation. Here, we consider M-bias, a type of selection bias, described by Hernán et al as a situation wherein bias is caused by selecting on a variable that is caused by two other variables, one a cause of the exposure, the other a cause of the outcome. Our goals are to derive a bound for (the maximum) M-bias, explore through examples the magnitude of M-bias, illustrate how to apply the bound for other types of selection bias, and provide a program for directly calculating M-bias and the bound.

Methods: We derive a bound for selection bias assuming specific, causal relationships that characterize M-bias and further evaluate it using simulations.

Results: Through examples, we show that, in many plausible situations, M-bias will tend to be small. In some examples, the bias is not small, but the plausibility of those examples, ultimately to be judged by the researcher, may be low. The examples also show how the M-bias bound yields bounds for other types of selection bias and also for confounding. The latter illustrates how Lee’s bound for confounding can arise as a limiting case of ours.

Conclusions: We have derived a new bound for M-bias. Examples illustrate how to apply it with other types of selection bias. They also show that it can yield tighter bounds in certain situations than a previously published bound for M-bias. Our examples suggest that M-bias may often, but not uniformly, be small.

(a) Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA.

(b) Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA.

(c) Current: Oak Ridge Institute for Science and Education, Oak Ridge, TN.

Editor’s Note: A related commentary appears on p. 517.

Submitted May 13, 2018; accepted April 8, 2019.

The authors report no conflicts of interest.

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article.

Process by which someone else could obtain the data and computing code: there are no relevant data, except published data used in an Example; the R code for simulations is included as an online supplement.

Correspondence: W. Dana Flanders, 1518 Clifton Rd, Atlanta, GA 30322. E-mail:

Selection bias and confounding, either of which can create nonexchangeability,1,2 are two key threats to valid estimation of causal effects in observational studies. Structural selection bias, our focus, can be viewed as resulting from conditioning or selection based on a collider, that is, a variable that is a common effect of two other variables, one associated with a cause of exposure and the other associated with a cause of disease.3

Structural selection bias is important because it represents a mechanism by which bias is introduced either by selection and inclusion of subjects for study or by adjustment or stratification on a variable. Examples described by Hernán et al3 include Berkson’s bias, differential loss to follow-up in longitudinal studies, healthy worker bias, volunteer/self-selection bias, exclusion of subjects with missing data, and situations wherein exposures vary over time with adjustment for certain variables affected by previous exposure (time-varying confounding). Hernán et al3 showed how it is sometimes possible to correct for structural selection bias using inverse probability (of selection) weights.

Here, we address situations wherein correction using inverse probability weighting is not possible, e.g., if some selection probabilities are zero or unknown or certain causes of selection are unmeasured.3 Our first goal is to derive bounds for the extent of structural selection bias under specific assumptions detailed below. These assumptions characterize a subtype of structural selection bias called M-bias,4 which might suggest that the bound applies only to M-bias; subsequent examples, however, show how to apply it to other subtypes. Analogous to the work of Ding and VanderWeele,5 who developed bounds for the magnitude of confounding, our bound depends on three risk ratios, each a proposed limit for the maximum association between variables that create the bias. Our bound can also be used in sensitivity analyses by positing different sizes for the required three risk ratios or as the basis of an e-value.6

Bounds for the magnitude of several types of bias, including M-bias, were derived by Greenland.4 Although correct and often cited, Greenland’s bound for M-bias was derived assuming that all associations, as measured by odds ratios (ORs), are uniform (homogeneity). Our second goal is to explore through examples the magnitude of bias and compare the performance of our bound with that of Greenland’s bound when homogeneity fails. Finally, we provide an R program to calculate directly: (1) the magnitude of M-bias using a full set of parameters; (2) a bound based on only three risk ratios; and (3) the bound assuming homogeneity and one OR.



We now derive a bound for M-bias.3,4 We assume that causal relationships are like those in the directed acyclic graph (DAG) of Figure 1A, wherein a collider is caused by two other factors, one of which is a cause of exposure and the other a cause of disease.3 (More generally, structural selection bias includes situations wherein these two factors are instead associated with causes of exposure and of disease.) We expect bias if the analysis is restricted to those with a given value of the collider, say due to selection for study, or if the collider is controlled analytically, say in a misdirected attempt to control confounding. Hernán et al3 describe many examples. In analogy with results of others,5,7 we want the bound to depend only on the strength of association between the factors underlying the bias (Figure 1A). Examples 5–7 show how the bound can be used with subtypes of structural selection bias other than M-bias. However, we consider neither time-varying confounding wherein one exposure affects later exposures3,8 nor selection bias without colliders (off the null).9
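To make the structure concrete, the following is a minimal sketch in Python (not the authors' R program) that computes, exactly and without simulation, the risk ratio observed among subjects with C = 0 for an M-structure with dichotomous variables. It uses the variable names of Figure 1C (V causes E and C, N causes C and D) and assumes that E has no effect on D, so the causal risk ratio is 1 and any departure of the observed risk ratio from 1 is bias. The function name and every probability below are hypothetical, chosen only for illustration.

```python
from itertools import product


def observed_rr_given_c0(p_v, p_n, p_e_given_v, p_c_given_vn, p_d_given_n):
    """Exact risk ratio for D (E=1 vs E=0) among subjects with C=0.

    Assumed structure: V -> E, V -> C <- N, N -> D, and E has no effect on D,
    so the causal risk ratio is 1 and any departure from 1 is collider bias.
    """
    joint_d = {0: 0.0, 1: 0.0}  # P(D=1, E=e, C=0)
    joint = {0: 0.0, 1: 0.0}    # P(E=e, C=0)
    for v, n, e in product((0, 1), repeat=3):
        p_vn = (p_v if v else 1 - p_v) * (p_n if n else 1 - p_n)
        p_e = p_e_given_v[v] if e else 1 - p_e_given_v[v]
        weight = p_vn * p_e * (1 - p_c_given_vn[(v, n)])
        joint[e] += weight
        joint_d[e] += weight * p_d_given_n[n]
    return (joint_d[1] / joint[1]) / (joint_d[0] / joint[0])


# Hypothetical inputs, chosen only for illustration.
P_E = {0: 0.1, 1: 0.5}      # P(E=1 | V=v)
P_D = {0: 0.01, 1: 0.10}    # P(D=1 | N=n)
P_C = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.6}  # P(C=1 | V, N)
rr_m = observed_rr_given_c0(0.3, 0.3, P_E, P_C, P_D)

# Same inputs, except that N no longer affects C, so C is not a collider.
P_C_NO_N = {(0, 0): 0.1, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.2}
rr_null = observed_rr_given_c0(0.3, 0.3, P_E, P_C_NO_N, P_D)

print(f"Observed RR with the M-structure: {rr_m:.3f}")
print(f"Observed RR when N does not affect C: {rr_null:.3f}")
```

With these inputs the first risk ratio falls below 1 even though exposure has no effect, while the second equals 1 because C is then no longer a common effect of V and N. Because the calculation enumerates the joint distribution rather than sampling from it, the result carries no Monte Carlo error.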



The observed risk ratio (oRR), conditional on

, can be expressed in terms of the underlying probabilities as:

To simplify expression 1, we can divide numerator and denominator by

, with

and then

. We substitute


and obtain:

Now, the distribution of

among the exposed who are selected, i.e., those with


Using this distribution to standardize for

yields the standardized risk ratio


We assume no other sources of bias (consistent with Figures 1A and 2A–C), specifically that the sRR is unbiased and yields, for a large sample, the causal risk ratio comparing risk among the exposed in the study cohort with that among the same group had they, contrary to fact, been unexposed. We define the bias as the ratio of the oRR to the sRR.
In the Appendix, we show that a dichotomous variable N exists with the same causal relationships as the corresponding factor in Figure 1A and the same magnitude of bias. Thus, we can rewrite expression 2 in terms of N. In Figure 1B, N replaces that factor of Figure 1A, and a further variable is added to represent known confounders; all results can be conditioned on the known confounders, but henceforth we suppress them to simplify notation (Figure 1C). Many other patterns are also consistent with structural selection bias; a few, summarized in Figures 2A–C, are discussed in subsequent examples. We prove in eAppendices 2–3 (Online Supplement) that the bias based on the expected values of each sum in equation (3) never exceeds the bound





, and

measure the strengths of association between the factors


, in Figure 1C:

By recoding the exposed as unexposed, as did Ding and VanderWeele,5 it is straightforward to show that a lower bound for the bias is given by

(Online Supplement, eAppendix 2).



Through a series of examples we now: (1) explore the magnitude of M-bias and the performance of our bound, compare it with Greenland’s bound, and illustrate application of an R program to calculate bias (examples 1–4); (2) illustrate how to use our bound for types of structural selection bias in addition to M-bias (examples 5–7); and (3) use our bound to derive a bound for confounding (example 8).


Example 1

Liu et al10 quantified M-bias in two hypothetical pharmacoepidemiologic cohort studies. Their first example, considered further here, concerns the effects of selective serotonin reuptake inhibitors (SSRIs) on lung cancer risk. It involves the same pattern of effects (their DAG 1) as those in our Figure 1C: SSRI use, depression, smoking, coronary artery disease, and lung cancer correspond to E, V, N, C, and D, respectively, in Figure 1C. In their basic scenario using parameters from the literature, Liu et al assumed that study subjects took SSRIs for depression (ORVE = 27), that depression caused coronary artery disease (ORVC = 1.6), and that smoking caused coronary artery disease (ORNC = 3) and lung cancer (ORND = 15). We calculated risks from frequencies and ORs, as needed. The true causal risk ratio (sRR) was one. Because coronary artery disease is a collider, adjustment for it is expected to bias the exposure-outcome association. In Monte Carlo simulations, Liu et al showed that this adjustment created little bias (basic scenario). Here, we focus on bias potentially caused by selecting a healthy subgroup for study (coronary artery disease = 0). The expected bias we calculated (R program, eAppendix 4) for the stratum-specific RR using equation 3 was small, 0.978 (Table, column 2), consistent with their simulations.
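The step of calculating risks from frequencies and ORs can be sketched as follows. This is one standard conversion, not necessarily the one used in the authors' R program: given a risk in the reference group and an OR, the implied risk in the index group is obtained by scaling the reference odds. The function names are hypothetical, and the baseline risk of 0.10 is an assumed value for illustration; only the OR of 1.6 comes from the example above.

```python
def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Risk in the index group implied by a reference-group risk and an OR."""
    if not 0 < baseline_risk < 1:
        raise ValueError("baseline_risk must lie strictly between 0 and 1")
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    index_odds = odds_ratio * baseline_risk / (1 - baseline_risk)
    return index_odds / (1 + index_odds)


def risk_ratio_from_odds_ratio(baseline_risk, odds_ratio):
    """Risk ratio implied by the same two inputs."""
    return risk_from_odds_ratio(baseline_risk, odds_ratio) / baseline_risk


# OR = 1.6 is the depression-coronary artery disease OR quoted in Example 1;
# the baseline risk of 0.10 is a hypothetical value, not taken from the article.
baseline = 0.10
risk_depressed = risk_from_odds_ratio(baseline, 1.6)
rr_depressed = risk_ratio_from_odds_ratio(baseline, 1.6)
print(f"Implied risk: {risk_depressed:.4f}, implied risk ratio: {rr_depressed:.3f}")
```

For an OR above 1, the implied risk ratio is always smaller than the OR, and the gap widens as the baseline risk grows, which is why the bound's inputs (risk ratios) cannot simply be read off published ORs when outcomes are common.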



Direct calculation of bias

requires specification of up to 12 parameters (Table). If we were unsure about some, but could specify the three risk ratios in equation (5), we could calculate

to bound the causal risk ratio. Taking values compatible with the full set of parameters,

, and

, we obtain

and conclude

. We interpret this as follows: if all the parameters were as stipulated (Table, column 2), the expected bias would be 0.978. If some parameters differed, but were still consistent with equation (5), bias could be as extreme as 0.813. Thus, our bound represents a worst case, provided the risk ratios do not exceed the values specified. Homogeneity is violated, so Greenland’s bound is not directly applicable; we nonetheless chose to calculate it. This example, with parameters from the literature,10 shows that our bound can provide a tighter bound than Greenland’s.


Example 2

We now modify example 1. First, we change the logistic model for effects of smoking and depression on coronary artery disease to make effects heterogeneous (multiplicative scale) and reflect the ORs reported by Ford et al.11 The effect of depression among nonsmokers (ORVC = 1.03; Table, column 3) differs from that among smokers (ORVC = 2.11); the effect of smoking among the nondepressed, not reported by Ford et al,11 is unchanged (OR = 3.0). Second, we assume 31% are smokers (average of prevalence in Liu et al10 and in Ford et al). Finally, we assume the coronary artery disease frequency is 19.7% (cumulative incidence for cardiovascular disease in Ford et al). With these modifications, the expected bias is 0.855, and our bound is tighter than Greenland’s.


Example 3

In examples 1 and 2, we ignored the possibility that subjects may need to volunteer. We now consider bias expected if people without coronary artery disease at baseline were invited to participate, but some refused. Haapea et al12 reported that the proportion refusing among men with a mood disorder was 23% higher (absolute scale) than among other men. Similarly, refusals among smokers may be 13% higher than among nonsmokers.13 Thus, we assume (Table) that smoking or depression alone each increase the proportion refusing to participate from 35% to 40%; smoking and depression together increase the proportion to 60% (supra-additivity), partially justified because the effect of depression averaged over smoking categories is not larger than the marginal effect reported by Haapea et al.12 To quantitatively address participation, define

if a subject refuses participation and 0 otherwise. Figure 1C still applies if we define

. We use:

. Accounting for refusals with the assumed pattern (Table, column 4), the expected bias is 0.758. Both our bound and Greenland’s bound are conservative. Bias is less for some other parameters.


Example 4

We now illustrate potential bias for two, perhaps extreme, scenarios. We assume (Table, column 5) stronger effects of smoking (OR = 4) and depression (OR = 3), and substantial heterogeneity wherein coronary artery disease risk among smokers with depression was only moderately higher (OR = 1.74) than that among nondepressed nonsmokers. We also assume more refusals, and a pattern wherein refusals among depressed smokers were only slightly more likely than among nondepressed nonsmokers. In this situation with very heterogeneous effects, the expected bias is 1.64 and Greenland’s bound was marginally invalid. Other patterns with more heterogeneity and selected, higher prevalences can lead to potentially important negative bias (Table, example 5). With even greater heterogeneity, the actual bias can be more than two-fold greater than Greenland’s bound (Online Supplement, eAppendix 4), but such extremes may typically be implausible.


Example 5

In their classic article on smoking and mortality in British physicians,14 Doll and Hill raise the possibility that the observed, positive association between smoking and lung cancer mortality might be explained by bias if some physicians who both knew they had lung cancer and were heavy smokers had been more likely to participate (self-select) in the study. They dismiss this possibility as improbable. Our results allow quantification of how strong the associations among smoking, knowledge of lung cancer, and participation would need to have been if such self-selection fully accounted for the association. The situation is similar to that in Figure 2C, where smoking and belief that one has lung cancer affect participation, and lung cancer affects both belief of having it and death from it. Smoking is hypothesized to be a cause of participation, so there is not a common cause of exposure and participation. Therefore, we use a limiting case wherein V and E of Figure 1C are highly associated, which will hold for a large risk ratio relating them
. The limiting bias is then

. The association between belief that one has lung cancer and death from it was plausibly very strong among these physicians, so we use


. For this bias to fully explain the observed

of 1.78 would require:

, or

. Such bias could occur if a smoker with knowledge of his/her expected death from lung cancer were 42% more likely to participate than, say, a nonsmoker with similar knowledge, and a nonsmoker without known lung cancer were 42% more likely to participate than a nonsmoker with known lung cancer. Such a strong pattern of self-selection seems implausible, supporting the conclusions of Doll and Hill.


Example 6


Our bound also applies to more complicated situations like that in Figure 2A. To use it, we relate the causal structure in Figure 1C to that in Figure 2A. In Figure 2A, one variable, like V of Figure 1C, is a cause of both exposure and the collider; another, like N of Figure 1C, is a cause of both the collider and disease. Thus, these two variables correspond to V and N, respectively, and we can use the corresponding probabilities to calculate the risk ratios in equation (5) and then apply the bound to situations like those in Figure 2A.


Example 7


Our bound also applies to less complicated situations like that in Figure 2B, where exposure and disease are each a cause of selection, not merely associated with such causes. To use it, we relate the causal structure in Figure 1C to that in Figure 2B, recognizing that as the risk ratios relating V and E become large, V and E become highly associated, approximating the situation in which they are the same variable. Similar reasoning holds for N and D. Thus, the selection bias bound for this situation is given by the limit of our bound as these risk ratios become large, evaluated by l’Hôpital’s rule. This is a worst-case scenario, in that the bias will not exceed this limiting bound.


Example 8

We can obtain a bound for the maximum magnitude of confounding as another limiting case of bound

. We reason as follows: For the causal relationships summarized in Figure 1C, if the risk ratio

is large,


will be highly associated, conditional on

. In the limit as


. The causal relationships then closely approximate those characterizing typical confounding (Figure 2D). Applying l’Hôpital’s rule to evaluate the limiting bias for large



We can compare

with a bound derived by Lee7 for confounding when the standard is the exposed (as here). His bound is:

, where

, is the OR measuring the strength of the

association. In our formulation, the maximum bias occurs when

(eAppendix 3;,

defined in equation 5). Replacing


to obtain the maximum bias, we have

. Thus, our result also leads to a bound for confounding, yielding Lee’s result as a special, limiting case.

These eight examples do not fully cover the possible parameter combinations. Thus, the Online Supplement (eAppendix 4) includes: (1) a program to evaluate additional combinations using randomly chosen parameters, thereby supplementing the examples and the proof; and (2) summary results for over 1,000,000 experiments that describe the parameter values and the bound’s performance and “tightness,” measured as the difference between it and the expected M-bias (simulated bias never exceeded our bound). The Online Supplement (eAppendix 5) also includes summaries comparing our bound with Greenland’s, focusing on robustness to more extreme violations of the homogeneity assumption used to derive the latter.



We report several key results. First, we report a bound for the maximum magnitude of M-bias in cohort studies that depends only on the strengths of association between factors causing the bias. This result is analogous to previous bounds for confounding,5,7 in that it depends on the strength of association between key variables but not absolute frequencies or risks. Second, we show how to apply our bound to types of structural selection bias in addition to M-bias (examples 5–7), including the simpler situation wherein exposure and disease directly affect selection (example 7). Third, examples 1 and 2 show that, absent homogeneity, our bound can provide meaningfully tighter bounds than Greenland’s. M-bias was relatively small in these examples with plausible values from the literature. Fourth, we derive a bound for confounding as a special, limiting case of our M-bias bound and show that this confounding bound, when expressed using ORs for a worst-case scenario, equals that reported by Lee.7 Finally, we illustrate application of the bound and use of a program (Online Supplement, eAppendix 5) to quantify M-bias using a complete set of parameters and explore the magnitude of bias.

Our bound for the extent of structural selection bias depends only on the strength of association between factors causing the bias (the three risk ratios of equation 5; Figure 1C). If any of these risk ratios is 1, the bound equals 1 and M-bias is absent. The bound provides a tool for conducting sensitivity analyses without having to specify either the frequency of the potentially unobserved variables (V and N in Figure 1C) or absolute risks. Much like bounds for confounding5,7 and that for bias because of confounding and biased control selection in case–control studies,15 if some variables are unmeasured (e.g., V and N in Figure 1C), the bound can be applied in sensitivity analyses by using the literature, other substantive knowledge, plausibility arguments and speculation about the strengths of association. If substantive knowledge points to a simpler situation wherein E and D directly affect participation or selection (Figure 2B), rather than just being associated with such causes, then tighter bounds may be obtained by using the selection probabilities more directly.16 The approach works best and most plausibly if the variables are recognized but perhaps unmeasured, and information about the associations is available in the literature.5


applies when the exposed is the standard. If we define


, and use


instead of

in equation (4), we obtain a bound with the unexposed

or full, selected population (

) as the standard (Online Supplement, eAppendix 2).

Our bound for selection bias complements that of Huang and Lee,15 whose bound applies to ORs from case–control studies. Their bound addresses confounding as well, and reduces to the confounding bound of Lee7 if selection bias is absent. However, their bound for selection bias applies only to biased selection of controls in case–control studies and to ORs. It does not apply generally to M-bias or other types of structural selection bias in cohort studies, which are our focus. Hence, our results are complementary.

The magnitude of structural selection bias tends to be smaller than that for confounding,4,5 provided the effects of the covariates causing the former are of similar size as those causing the latter; (e.g., if we use the same values



and in

). However, if the bias is due to effects of exposure and disease on selection (example 7, Figure 2B), this pattern may not hold.4

The magnitude of M-bias in example 1 with parameters taken from the literature supports the thesis that M-bias will often be small4,10 (e.g., <5%), and that both bounds are conservative. The magnitude was larger (10%–25%) in examples 2 and 3, which involved heterogeneity and supra-additivity of refusals, relatively common covariates, and parameters partially justified by the literature. M-bias was larger still (>40%), and Greenland’s bound invalid, only with substantial, perhaps implausible, heterogeneity and relatively common covariates.

Several approaches are available to address M-bias, depending on the situation and information available. With knowledge about the possibly unmeasured variables thought to have created M-bias (V and N), one can potentially specify values for the eight parameters used in equation 3 (Table; 13 with allowance for refusals) and then calculate either the bias directly or a corrected RR using inverse-probability-of-selection weights. With less knowledge about the factors, one can specify the three risk ratios representing the causal effects in Figure 1C and use our bound; this bound should be conservative provided effects are specified judiciously. Finally, if relevant variables are known to not have importantly heterogeneous effects, one can specify the common OR and calculate Greenland’s bound. This bound may also be conservative, unless substantial heterogeneity exists and covariates are not rare, when it can be too small. All four approaches can provide insight, recognizing that our bound definitely provides a worst-case bound, that Greenland’s bound is probably, but not definitely, conservative, and that the directly calculated bias (equation 3; eAppendix 2) or the corrected RR can be accurate, but with accuracy reflecting that of the input.
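The corrected RR based on inverse-probability-of-selection weights can be illustrated with a minimal Python sketch. It is not the authors' program: it assumes dichotomous V, N, E, C, and D arranged as in Figure 1C, no effect of E on D, selection of subjects with C = 0, and selection probabilities that are known for every combination of V and N. The function name and all numerical inputs are hypothetical.

```python
from itertools import product


def risk_ratio_selected(p_v, p_n, p_e_given_v, p_c_given_vn, p_d_given_n, weighted):
    """Risk ratio for D (E=1 vs E=0) among the selected (C=0).

    If weighted is True, each selected stratum is weighted by 1 / P(C=0 | V, N).
    Assumed structure: V -> E, V -> C <- N, N -> D, no effect of E on D.
    """
    cases = {0: 0.0, 1: 0.0}
    total = {0: 0.0, 1: 0.0}
    for v, n, e in product((0, 1), repeat=3):
        p_sel = 1 - p_c_given_vn[(v, n)]
        if p_sel == 0:
            raise ValueError("weights are undefined when a selection probability is zero")
        mass = ((p_v if v else 1 - p_v)
                * (p_n if n else 1 - p_n)
                * (p_e_given_v[v] if e else 1 - p_e_given_v[v])
                * p_sel)
        w = 1 / p_sel if weighted else 1.0
        total[e] += w * mass
        cases[e] += w * mass * p_d_given_n[n]
    return (cases[1] / total[1]) / (cases[0] / total[0])


# Hypothetical inputs, chosen only for illustration.
args = (0.3, 0.3, {0: 0.1, 1: 0.5},
        {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.6},
        {0: 0.01, 1: 0.10})
crude_rr = risk_ratio_selected(*args, weighted=False)
weighted_rr = risk_ratio_selected(*args, weighted=True)
print(f"Unweighted RR among the selected: {crude_rr:.3f}")
print(f"Weighted RR among the selected:   {weighted_rr:.3f}")
```

Under these assumptions the weighted risk ratio returns to the causal value of 1, whereas the unweighted one does not. The explicit error for a zero selection probability mirrors the limitation noted in the introduction: the weights, and hence this correction, are unavailable when some selection probabilities are zero or unknown.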

The e-value approach to sensitivity analysis described by VanderWeele and Ding6 for confounding can also be applied to M-bias using our bound. Example 5 illustrates this application for a situation in which the authors14 raised selection bias as a possible explanation of their reported association. As with the e-value applied to confounding, one can quantify how strongly participation would need to depend on causes of the exposure and on causes of the outcome to fully explain the association.
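For orientation, the confounding versions of these quantities can be sketched in a few lines of Python. The sketch implements the bounding factor of Ding and VanderWeele5 and the e-value of VanderWeele and Ding6 for confounding; it does not implement the M-bias bound of this article, whose formula is not reproduced here. The function names and the observed risk ratio of 2.0 are hypothetical.

```python
import math


def confounding_bound(rr_eu, rr_ud):
    """Ding and VanderWeele's bounding factor for confounding.

    rr_eu: maximum risk ratio relating exposure and the unmeasured confounder.
    rr_ud: maximum risk ratio relating the unmeasured confounder and disease.
    Both are assumed to be at least 1.
    """
    if rr_eu < 1 or rr_ud < 1:
        raise ValueError("both risk ratios must be at least 1")
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def e_value(rr):
    """E-value of VanderWeele and Ding for an observed risk ratio."""
    if rr <= 0:
        raise ValueError("rr must be positive")
    rr = max(rr, 1 / rr)  # risk ratios below 1 are inverted first
    return rr + math.sqrt(rr * (rr - 1))


observed_rr = 2.0  # hypothetical observed risk ratio
ev = e_value(observed_rr)
print(f"E-value for an observed RR of {observed_rr}: {ev:.3f}")
print(f"Bounding factor when both RRs equal the e-value: {confounding_bound(ev, ev):.3f}")
```

The second printed value equals the observed risk ratio, which is the defining property of the e-value: it is the common strength of the two associations at which the bounding factor just reaches the observed estimate. Applying the same logic to M-bias would require substituting this article's bound for the confounding bounding factor.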

We note that our results provide another link between confounding and structural selection bias, both key causes of bias in observational studies and reasons for lack of exchangeability.1,4,15 They show that the maximum extent of confounding corresponds to a limiting case of structural selection bias in that the bounding formula for confounding is a special case of that for M-bias. We do not address selection bias without colliders, wherein off the null, selection can lead to biased estimators for the full cohort,9 even if unbiased for the selected subset (e.g., with


We briefly address bounds for confounding, which are not our main focus. In formulating their bounds, Ding and VanderWeele5 and Lee7 measured the strength of the

association with parameters that differ from the parameter

that we used. If


, then Ding and VanderWeele’s parameter


, Lee’s parameter




. Although Ding and VanderWeele’s confounding bound (

) is tight, Lee’s bound (

and less commonly c

can be smaller than

This can happen, for example, if

or if

is small to moderate, say less than four. This is not a discrepancy, although

and to a lesser extent

can use the additional information in

. Thus, if available information indicates


is small or moderate, one may sometimes be able to improve on

by using

. For confounding,

, but

in a worst-case scenario when


Reflecting tightness of

, we could typically construct hypothetical examples of confounding for which the magnitude of confounding approaches


became large. However, we could not always identify examples in which the magnitude of confounding approached

. Therefore, our bound is likely not tight for confounding, and by extension it may not be tight for selection bias. Returning to our primary focus, structural selection bias, these considerations raise the question as to whether our bound could be improved, if more detailed information such as the value of both


is available, by using that information in an alternative definition of

, perhaps similar to


. We leave those questions for future research.

In summary, we provide a new formula for the maximum extent of M-bias, dropping the homogeneity assumption. We show how the bound can be modified and used with additional types of structural selection bias, including situations wherein selection is directly caused by both exposure and disease, and confounding. Through several examples, we illustrate application of an equation and use of a provided R program for direct calculation of the bound and magnitude of bias.



We show that a dichotomous variable N suffices in place of



is not a cause of

(e.g., Figure 1C),

Consider a dichotomous variable

with the same children as

, with no parents (in the DAG) like

, and such that



so that

With these definitions, direct substitution and evaluation into equation (2; main text) show that we can write


Note that:

implies that:

, so

satisfies the same bound on the strength of association as




We thank Dr. Mitch Klein for his helpful comments and support. Other Sources of Support: 1. Interagency Personnel Agreement, Centers for Disease Control, Center for Environmental Health, Division of Environmental Hazards and Health Effects. Consultation, environmental epidemiology. Consultation (PI: Flanders). 2. Contract, American Cancer Society 09/01/2002–8/31/2018. Consultation, cancer epidemiology (PI: Flanders). 3. DHHS-NIH-NIAID, R01AI122266 “Prenatal, Intrapartum, and Infant Antibiotic Use and Atopic Diseases in Childhood” (PI: Dr. L. Darrow), Subrecipient through UNR-16–45 (PI: Flanders). 4. R01HD092595, Eunice Kennedy Shriver National Institute of Child Health and Human Development (PI: Dr. M. Goodman).



1. Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15:413–419.
2. Flanders WD, Eldridge RC. Summary of relationships between exchangeability, biasing paths and bias. Eur J Epidemiol. 2015;30:1089–1099.
3. Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology. 2004;15:615–625.
4. Greenland S. Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology. 2003;14:300–306.
5. Ding P, VanderWeele TJ. Sensitivity analysis without assumptions. Epidemiology. 2016;27:368–377.
6. VanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the E-value. Ann Intern Med. 2017;167:268–274.
7. Lee WC. Bounding the bias of unmeasured factors with confounding and effect-modifying potentials. Stat Med. 2011;30:1007–1017.
8. Robins J. A new approach to causal inference in mortality studies with sustained exposure periods—application to control of the health worker survivor effect. Math Model. 1986;7:1393–1515.
9. Hernán MA. Invited commentary: selection bias without colliders. Am J Epidemiol. 2017;185:1048–1050.
10. Liu W, Brookhart MA, Schneeweiss S, Mi X, Setoguchi S. Implications of M bias in epidemiologic studies: a simulation study. Am J Epidemiol. 2012;176:938–948.
11. Ford DE, Mead LA, Chang PP, Cooper-Patrick L, Wang NY, Klag MJ. Depression is a risk factor for coronary artery disease in men: the precursors study. Arch Intern Med. 1998;158:1422–1426.
12. Haapea M, Miettunen J, Läärä E, et al. Non-participation in a field survey with respect to psychiatric disorders. Scand J Public Health. 2008;36:728–736.
13. Cunradi CB, Moore R, Killoran M, Ames G. Survey nonresponse bias among young adults: the role of alcohol, tobacco, and drugs. Subst Use Misuse. 2005;40:171–185.
14. Doll R, Hill AB. The mortality of doctors in relation to their smoking habits; a preliminary report. Br Med J. 1954;1:1451–1455.
15. Huang TH, Lee WC. Bounding formulas for selection bias. Am J Epidemiol. 2015;182:868–872.
16. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic Research: Principles and Quantitative Methods. Belmont, CA: John Wiley & Sons; 1982.

Keywords: Bias; Bound; Confounding; e-Value; Exchangeability; Selection bias; Sensitivity analysis; Structural selection bias

Copyright © 2019 Wolters Kluwer Health, Inc. All rights reserved.