“Take nothing on its looks; take everything on evidence. There’s no better rule.”
─ Charles Dickens, Great Expectations
In this issue of the journal, The Open Mind article authored by Drs. Prielipp and Coursin1 provides 3 examples in perioperative medicine wherein initial high-profile research studies have been used to justify changes in public policy, yet, over time, these studies could not be replicated. It is speculated that the evidence-based changes in policy or performance measures, originally based on these studies, have led to deleterious effects, which these authors colorfully compare with the rise in organized crime in response to prohibition. This is a well-written, entertaining, and thought-provoking history of our specialty. We should heed their message while remaining aware that, although we have great expectations for clinical investigation and evidence-based medicine, the enterprise, by its nature, has numerous important limitations.
High-profile studies are used to formulate guideline-directed medical therapy (and public policy). However, high profile does not necessarily equate to high quality. There are numerous examples of studies other than those discussed in this Open Mind article.2–5 Although the need to generate practice guidelines and policy is rooted in the will to improve patient safety, the quality of the evidence on which many of our current guidelines are formulated may be overrated. In a somewhat controversial article, Kavanagh6 pointed out that the practice of grading evidence was, paradoxically, not evidence-based. Most readers, and reviewers for that matter, of perioperative research rely too heavily on the size of the P value to judge the veracity of a research result,7 whereas credibility should, instead, rely on the design, the effect size, and the biases inherent to the trial.
At the American Society of Anesthesiologists’ Annual Meeting in San Francisco in 2013, John Ioannidis gave a guest lecture on medical evidence to the editorial board members of both journals Anesthesia & Analgesia and Anesthesiology. Dr. Ioannidis is an epidemiologist and meta-analyst, whose publications have made him a world-renowned and sought-after lecturer on the credibility of medical evidence.
In his article entitled “Why Most Published Research Findings Are False,” published in PLoS Medicine,8 which, parenthetically, is the most highly cited article in that journal’s history, Dr. Ioannidis showed, using mathematical modeling, that the probability that the results of a pristinely conducted research study are true is related to 3 factors:
- the plausibility of the hypothesis being tested (here he uses the term pretest probability);
- the statistical power of the study (which he points out is related to 2 factors: the sample size and the effect size); and
- the level of statistical significance.
His models suggest that if a study is being conducted on a biologically plausible premise, has a large effect size, and a large sample size (think thousands), then the results of such a study have only an 85% probability of being confirmed as being a true result. This is a classic exercise in logic: by assuming all factors in favor of a theory, the results reveal any inherent weaknesses of the postulate. The article goes on to enumerate several factors, common to medical science, which will decrease the probability of a research finding being true. He notes that the probability of finding a statistically significant result increases by random chance as the number of different teams working on similar problems increases. This may be problematic because it is more likely that positive studies will be published than negative studies.9 In addition, Ioannidis points out that the outcome measured is critically important; although death is an unequivocal event, rating scales and composite outcomes can be manipulated to cast interventions in a better light.10 Surrogate outcomes are common, and we forget that deep vein thrombosis and cardiac ischemia are surrogates for harder, less frequent outcomes. Notably, for harder outcomes, the effect size of an intervention is usually smaller.
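The dependence on these 3 factors can be made concrete with a short calculation. The sketch below is illustrative only. It assumes the positive predictive value relationship as we recall it from reference 8, PPV = (1 − β)R / (R − βR + α), where R is the pre-study odds that the tested relationship is true, 1 − β is the statistical power, and α is the significance threshold, together with its extension to n independent teams testing the same question. The function and parameter names are ours, chosen for illustration.

```python
# Illustrative sketch only: formulas as recalled from reference 8;
# function and parameter names are hypothetical.

def ppv(prior_odds, power, alpha=0.05):
    """Probability that a statistically significant finding is true (no bias)."""
    beta = 1.0 - power  # type II error rate
    return (power * prior_odds) / (prior_odds - beta * prior_odds + alpha)


def ppv_multiple_teams(prior_odds, power, alpha=0.05, n_teams=1):
    """Same probability when n_teams independent, equally powered studies
    address the question and at least one reports a significant result."""
    beta = 1.0 - power
    true_hits = prior_odds * (1.0 - beta ** n_teams)
    any_hits = (prior_odds + 1.0
                - (1.0 - alpha) ** n_teams
                - prior_odds * beta ** n_teams)
    return true_hits / any_hits


if __name__ == "__main__":
    # Assumed inputs: 80% power, alpha of 0.05, varying pre-study odds.
    for odds in (1.0, 0.25, 0.1):
        print(f"pre-study odds {odds:>4}: PPV = {ppv(odds, 0.80):.2f}")
    # Assumed inputs: even pre-study odds, 1 to 10 competing teams.
    for n in (1, 2, 5, 10):
        value = ppv_multiple_teams(1.0, 0.80, n_teams=n)
        print(f"{n:>2} team(s): PPV = {value:.2f}")
```

Under these assumptions, the probability falls as the pre-study odds shrink and, for an adequately powered study, as more teams pursue the same question.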
Ioannidis was also careful to show that his idealized equations used for modeling did not account for the possibility of bias. Bias is always present, at least subtly, in even the best quality blinded randomized clinical trial. In a related article, his team was able to identify >200 types of bias germane to medical science.11 Bias reduces the likelihood that study results are in fact true. Examples of these biases include, but are not limited to:
- the control group may have an inflated event rate—e.g., Dutch Echocardiographic Cardiac Risk Evaluation Applying Stress Echocardiography (DECREASE 1)12;
- failure of randomization to create equitable cohorts—e.g., TRICC13;
- academic conflicts of interest or outright fraud—e.g., Dr. Boldt,14 Dr. Reuben,a and Dr. Fujii15;
- selective reporting of outcomes and/or suppression of competing evidence via the peer review process: for instance, Metoprolol after Vascular Surgery (MaVS)16 and POISE117 were both rejected by The New England Journal of Medicine despite the publication of the 2 original high-profile β-blocker studies; and
- financial incentives.18
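How quickly bias erodes the probability that a finding is true can be illustrated numerically. The sketch below assumes the single-parameter bias extension as we recall it from reference 8, in which u is the proportion of analyses that would not have been positive findings but are reported as such because of bias. The function name, parameter names, and the chosen input values are ours and are illustrative only.

```python
# Illustrative sketch only: bias-adjusted formula as recalled from
# reference 8; names and input values are hypothetical.

def ppv_with_bias(prior_odds, power, alpha=0.05, bias=0.0):
    """Probability that a significant finding is true, given a bias
    fraction `bias` (u) between 0 and 1."""
    beta = 1.0 - power  # type II error rate
    numerator = power * prior_odds + bias * beta * prior_odds
    denominator = (prior_odds + alpha - beta * prior_odds
                   + bias - bias * alpha + bias * beta * prior_odds)
    return numerator / denominator


if __name__ == "__main__":
    # Assumed inputs: even pre-study odds, 80% power, alpha of 0.05.
    for u in (0.0, 0.1, 0.3, 0.5):
        value = ppv_with_bias(1.0, 0.80, bias=u)
        print(f"bias u = {u:.1f}: PPV = {value:.2f}")
```

With even pre-study odds, 80% power, an α of 0.05, and u of 0.10, the value is roughly 0.85, and it declines further as u grows.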
The theory espoused in “Why Most Published Research Findings Are False” was then followed by a study entitled “Contradicted and Initially Stronger Effects in Highly Cited Clinical Research,” a real-life confirmation of his previously presented mathematical framework.19 In this study, a search of major medical journals found 49 articles with >1000 citations. A secondary search was then performed for each of the 49 studies to identify subsequent studies on the same subject. Four of the 49 studies were eliminated because they showed no efficacy and also refuted claims of efficacy based on observational studies. For 14 of the remaining 45 studies (31%), subsequent research either contradicted the initial efficacy claim or found a smaller effect size. These subsequent and contradictory studies were found to have both superior design and larger sample sizes, where the median sample size was >3 times larger (2165 subjects). At this point, one should note that the great majority of perioperative medicine research studies, including those discussed in this Open Mind article, fall well short of these standards.
Ioannidis’ evidence of evidence clearly shows that claims of efficacy (and hence public policy) should be based solely on replicated, minimally biased studies. It also gives us clear direction on how this goal will be achieved: high-quality evidence suitable for decision-making can only come from large, well-designed, prospective trials pursuing plausible theses. Unfortunately, and restated for emphasis, there is a paucity of these examples in perioperative medicine.
It is our hope that Drs. Prielipp and Coursin are documenting a process where our specialty is evolving into a high-quality, evidence-based practice. The myriad of pressing perioperative issues such as obstructive sleep apnea, perioperative fluid therapy, appropriate transfusion policies, postoperative renal failure, and the toxicity of anesthetics at extremes of life, to name a few, requires quality evidence for decision-making.8 This evidence can only be attained through the arduous, time-consuming, expensive, and frequently unsuccessful process of clinical investigation. These are indeed great expectations for clinical investigation, and our pursuit of patient safety is a clear justification for its expanded presence.
It will take seismic shifts in our behavior to overcome the numerous obstacles that stand in the way of this progress. We must overcome the paucity of research networks, adopt new models for recruiting high volumes of patients, overcome funding issues with new entrepreneurial designs, and shift our focus away from individual academic achievement to reward collaboration.
Although many readers may be concerned by the vignettes portrayed in The Open Mind as representing steps in the wrong direction, to my mind they actually highlight the great strength of the evidence-based medicine process: the ability to improve with change.20
Name: W. Scott Beattie, MD, PhD, FRCPC.
Contribution: This author wrote the manuscript.
Attestation: W. Scott Beattie approved the final manuscript.
This manuscript was handled by: Sorin J. Brull, MD, FCARCSI (Hon.).
a Available at: www.aaeditor.org/HWP/Retraction.notice. Accessed April 7, 2015.
1. Prielipp RC, Coursin DB. All that glitters is not a golden recommendation. Anesth Analg. 2015;121:727–33
2. Koch CG, Li L, Sessler DI, Figueroa P, Hoeltge GA, Mihaljevic T, Blackstone EH. Duration of red-cell storage and complications after cardiac surgery. N Engl J Med. 2008;358:1229–39
3. Brunkhorst FM, Engel C, Bloos F, Meier-Hellmann A, Ragaller M, Weiler N, Moerer O, Gruendling M, Oppert M, Grond S, Olthoff D, Jaschinski U, John S, Rossaint R, Welte T, Schaefer M, Kern P, Kuhnt E, Kiehntopf M, Hartog C, Natanson C, Loeffler M, Reinhart K; German Competence Network Sepsis (SepNet). Intensive insulin therapy and pentastarch resuscitation in severe sepsis. N Engl J Med. 2008;358:125–39
4. Rivers E, Nguyen B, Havstad S, Ressler J, Muzzin A, Knoblich B, Peterson E, Tomlanovich M; Early Goal-Directed Therapy Collaborative Group. Early goal-directed therapy in the treatment of severe sepsis and septic shock. N Engl J Med. 2001;345:1368–77
5. Fergusson DA, Hébert PC, Mazer CD, Fremes S, MacAdams C, Murkin JM, Teoh K, Duke PC, Arellano R, Blajchman MA, Bussières JS, Côté D, Karski J, Martineau R, Robblee JA, Rodger M, Wells G, Clinch J, Pretorius R; BART Investigators. A comparison of aprotinin and lysine analogues in high-risk cardiac surgery. N Engl J Med. 2008;358:2319–31
6. Kavanagh BP. The GRADE system for rating clinical guidelines. PLoS Med. 2009;6:e1000094
7. Johnson VE. Revised standards for statistical evidence. Proc Natl Acad Sci U S A. 2013;110:19313–7
8. Ioannidis JP. Why most published research findings are false. PLoS Med. 2005;2:e124
9. Ioannidis JP. Effect of the statistical significance of results on the time to completion and publication of randomized efficacy trials. JAMA. 1998;279:281–6
10. Ferreira-González I, Busse JW, Heels-Ansdell D, Montori VM, Akl EA, Bryant DM, Alonso-Coello P, Alonso J, Worster A, Upadhye S, Jaeschke R, Schünemann HJ, Permanyer-Miralda G, Pacheco-Huergo V, Domingo-Salvany A, Wu P, Mills EJ, Guyatt GH. Problems with use of composite end points in cardiovascular trials: systematic review of randomised controlled trials. BMJ. 2007;334:786
11. Chavalarias D, Ioannidis JP. Science mapping analysis characterizes 235 biases in biomedical research. J Clin Epidemiol. 2010;63:1205–15
12. Poldermans D, Boersma E, Bax JJ, Thomson IR, Blankensteijn JD, Baars HF, Yo TI, Trocino G, Vigna C, Roelandt JR. The effect of bisoprolol on perioperative mortality and myocardial infarction in high-risk patients undergoing vascular surgery. Dutch Echocardiographic Cardiac Risk Evaluation Applying Stress Echocardiography Study Group. N Engl J Med. 1999;341:1789–94
13. Hébert PC, Wells G, Blajchman MA, Marshall J, Martin C, Pagliarello G, Tweeddale M, Schweitzer I, Yetisir E. A multicenter, randomized, controlled clinical trial of transfusion requirements in critical care. Transfusion Requirements in Critical Care Investigators, Canadian Critical Care Trials Group. N Engl J Med. 1999;340:409–17
14. Miller DR. Update to readers and authors on ethical and scientific misconduct: retraction of the Boldt articles. Can J Anaesth. 2011;58:777–9, 779–81
15. Miller DR. Retraction of articles written by Dr. Yoshitaka Fujii. Can J Anaesth. 2012;59:1081–8
16. Yang H, Raymer K, Butler R, Parlow J, Roberts R. The effects of perioperative beta-blockade: results of the Metoprolol after Vascular Surgery (MaVS) study, a randomized controlled trial. Am Heart J. 2006;152:983–90
17. Devereaux PJ, Yang H, Yusuf S, Guyatt G, Leslie K, Villar JC, Xavier D, Chrolavicius S, Greenspan L, Pogue J, Pais P, Liu L, Xu S, Malaga G, Avezum A, Chan M, Montori VM, Jacka M, Choi P. Effects of extended-release metoprolol succinate in patients undergoing non-cardiac surgery (POISE trial): a randomised controlled trial. Lancet. 2008;371:1839–47
18. Lexchin J, Bero LA, Djulbegovic B, Clark O. Pharmaceutical industry sponsorship and research outcome and quality: systematic review. BMJ. 2003;326:1167–70
19. Ioannidis JP. Contradicted and initially stronger effects in highly cited clinical research. JAMA. 2005;294:218–28
20. Wijeysundera DN, Duncan D, Nkonde-Price C, Virani SS, Washam JB, Fleischmann KE, Fleisher LA. Perioperative beta blockade in noncardiac surgery: a systematic review for the 2014 ACC/AHA guideline on perioperative cardiovascular evaluation and management of patients undergoing noncardiac surgery: a report of the American College of Cardiology/American Heart Association Task Force on practice guidelines. J Am Coll Cardiol. 2014;64:2406–25