The Case for Magnitude-based Inference

Batterham, Alan M.; Hopkins, William G.

Medicine & Science in Sports & Exercise: April 2015 - Volume 47 - Issue 4 - p 885
doi: 10.1249/MSS.0000000000000551
SPECIAL COMMUNICATIONS: Letters to the Editor-in-Chief

Health and Social Care Institute, Teesside University, Middlesbrough, England, UNITED KINGDOM

College of Sport and Exercise Science, Victoria University, Melbourne, AUSTRALIA

Dear Editor-in-Chief,

We respond here to a critique of magnitude-based inference (MBI) by Welsh and Knight (8).

  1. They asserted that MBI is just another form of null hypothesis significance testing (NHST), whereas it is philosophically and statistically distinct. Type 1 and Type 2 errors of MBI are conceptually quite unlike those of NHST.
  2. MBI forces researchers to define and justify clinically, practically, or mechanistically meaningful values of an effect. MBI then provides a framework for interpreting uncertainty in the effect in relation to these values, making it attractive to end users.
  3. MBI is actually a “reference Bayes” inferential method, combining the usual confidence interval with a noninformative prior belief (7). Noninformative prior beliefs are justified because prior knowledge is often vague (3). Moreover, MBI has the practical advantage of using standard sampling theory to obtain an intuitive (Bayesian) interpretation of the confidence interval as the likely range for the true effect (2,3,7). As such, MBI is quite possibly the ideal frequentist–Bayesian hybrid.
  4. Utilizing the logical fallacy of the straw man, Welsh and Knight (8) implied that MBI ignores data structure, multiple covariates, distribution and scale of the outcome variable, and presentation of effect size—all issues that we have attended to. They also practiced “mathematistry”—statistical theorizing that redefines a real-world problem (here, making robust inferences) without solving it (1,5).
  5. Using analytical formulas and simulation, we have verified that MBI Type 1 error rates (false discoveries of clear substantial effects when the true effect is null) are much less than Welsh and Knight (8) presented. The rates are acceptable given that the “errors” are accompanied by probabilistic terms representing level of evidence. For example, in a controlled trial such as the one analyzed by Welsh and Knight (8) (typical error of three times the smallest important effect), a true null produces clear possibly substantial effects only 21% of the time with nonclinical MBI, whereas the rates for likely substantial, very likely substantial, and most likely substantial are only 14%, 1.6%, and 0.1%, respectively. With conservative clinical MBI, the corresponding rates are only 0.0%, 2.1%, 0.9%, and 0.1%.
  6. Publication bias will decline if the criterion for manuscript acceptance is clear rather than significant, tempering any concerns about overall Type 1 error rates with MBI. In our simulations of the above controlled trial, trivial true effects produce trivial to small clear effects on average, but significant effects (P < 0.05 or P < 0.01) are mostly small to moderate. The differences arise from gratifyingly higher rates of potentially publishable clear effects with MBI (36%–81%) than of significant effects with NHST (1%–11%).
  7. The sample size spreadsheet at Sportscience produces correct estimates for NHST (verified by G*Power 3 software [4]) and MBI (verified by the Type 1 and Type 2 error rates shown).
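The mechanics behind points 3 and 5 can be sketched in a few lines of Python (a minimal illustration, not the authors' published spreadsheet code). Treating the sampling distribution of the observed effect as a flat-prior posterior for the true effect, MBI converts a confidence interval into probabilities that the effect is substantially positive, trivial, or substantially negative, then grades the evidence with the standard 25%/75%/95%/99.5% scale and the 5% "unclear" rule. The `se_over_swc` design ratio in the simulation is an illustrative assumption, not the exact trial analyzed above.

```python
import math
import random

def mbi_probabilities(effect, se, swc):
    """Probabilities that the true effect is substantially positive,
    trivial, or substantially negative, treating the sampling
    distribution N(effect, se) as a flat-prior posterior.
    swc = smallest worthwhile change (smallest important effect)."""
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    p_pos = 1.0 - cdf((swc - effect) / se)   # P(true effect > +swc)
    p_neg = cdf((-swc - effect) / se)        # P(true effect < -swc)
    return p_pos, 1.0 - p_pos - p_neg, p_neg

def nonclinical_mbi(effect, se, swc):
    """Nonclinical MBI: 'unclear' if the effect could plausibly be
    substantial in both directions (>5% chance each); otherwise grade
    the evidence for the better-supported substantial direction."""
    p_pos, p_triv, p_neg = mbi_probabilities(effect, se, swc)
    if p_pos > 0.05 and p_neg > 0.05:
        return "unclear"
    p = max(p_pos, p_neg)
    for cut, label in [(0.995, "most likely"), (0.95, "very likely"),
                       (0.75, "likely"), (0.25, "possibly")]:
        if p >= cut:
            return label + " substantial"
    return "clear, not substantial"

def null_rates(se_over_swc, trials=20000, seed=1):
    """Monte Carlo check in the spirit of point 5: with a true null
    effect, how often does nonclinical MBI declare a clear substantial
    effect, and at what evidence level?  Effects are in SWC units."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        obs = rng.gauss(0.0, se_over_swc)          # sampled observed effect
        label = nonclinical_mbi(obs, se_over_swc, 1.0)
        counts[label] = counts.get(label, 0) + 1
    return {k: v / trials for k, v in sorted(counts.items())}
```

For example, `null_rates(0.7)` tabulates how often a true null is declared possibly, likely, very likely, or most likely substantial; consistent with point 5, the rates fall steeply as the required level of evidence rises.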

In conclusion, Welsh and Knight (8) proposed that we either present confidence intervals or use “a fully Bayesian analysis” in place of MBI. However, they were silent on how to make inferences with confidence intervals, and full Bayesian analyses are not about to become universal: “we will all be Bayesians in 2020” (6) seems most unlikely. Meanwhile, the robust hybrid approach of MBI is genuinely progressive and becoming popular.



1. Box GEP. Science and statistics. J Am Stat Assoc. 1976; 71: 791–9.
2. Burton PR. Helping doctors to draw appropriate inferences from the analysis of medical studies. Stat Med. 1994; 13: 1699–713.
3. Burton PR, Gurrin LC, Campbell MJ. Clinical significance not statistical significance: a simple Bayesian alternative to P values. J Epidemiol Community Health. 1998; 52: 318–23.
4. Faul F, Erdfelder E, Lang AG, Buchner A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods. 2007; 39: 175–91.
5. Little RJ. In praise of simplicity not mathematistry! Ten simple powerful ideas for the statistical scientist. J Am Stat Assoc. 2013; 108: 359–69.
6. Smith AFM. A conversation with Dennis Lindley. Stat Sci. 1995; 10: 305–19.
7. Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Chichester, United Kingdom: Wiley; 2004. p. 112.
8. Welsh AH, Knight EJ. “Magnitude-based inference”: a statistical review. Med Sci Sports Exerc. 2015; 47 (4): 874–84.
© 2015 American College of Sports Medicine