“Medical reversal” describes the publication of clinical data that contradict previously published data and establish a new “evidence-based” standard. Worrisome at best (and harmful at worst), reversal is surprisingly common and occurs in nearly all medical specialties. Recent high-profile examples include vertebroplasty for spinal fractures1 and hormone therapy in postmenopausal women.2 The failure of initial reports of tight glucose control3 and daily sedative interruption4 to withstand subsequent scrutiny suggests that the evidence base underlying perioperative critical care may have a similarly unstable foundation.
Although the reasons for reversal are incompletely understood, potential mechanisms include fraud, error, inadequate statistical analysis, mechanistic plausibility,5 and cognitive biases. Underlying many of these possibilities is a tacit understanding that the later article is “true,” and the earlier, “reversed” findings must have been incorrect. But 3 other frames are possible: the earlier studies were correct (and the later studies were wrong), both earlier and later findings were correct (implying that the phenomenon being studied has changed over time), or neither finding is correct, and the truth lies elsewhere. In seeking an explanation for reversal, the choice of frame is not always easy because an adequately powered randomized prospective study design may not protect the finding against the subsequent reversal.6
In this issue of Anesthesia & Analgesia, Kalra et al7 publish the results of a meta-analysis of targeted temperature management (TTM) (cooling) after cardiac arrest. The authors identified 11 trials from 1966 to 2016 containing 4782 patients. Individual trials varied in rates of bystander cardiopulmonary resuscitation, cooling protocol, and method. After multivariate random-effects modeling, the authors concluded that TTM had no effect on mortality rates or cerebral performance status.
Anesthesia & Analgesia readers who are familiar with TTM research may wonder why another meta-analysis of TTM is needed. The history of TTM fits a classic reversal scenario: multiple small clinical trials with unclear consensus, followed by high-profile randomized clinical trials establishing a clinical “truth.” In the case of TTM, 2 small but high-profile trials were published in the same 2002 issue of the New England Journal of Medicine,8,9 with an accompanying editorial10 citing as evidence of correctness that the 2 studies were performed on different continents, produced similar effect sizes, and that a plausible mechanism existed for the protective effect. By 2010, cooling to 32°C–34°C was enshrined in American Heart Association guidelines for cardiopulmonary resuscitation11 and strongly recommended by 5 professional critical care societies.12
Then came the reversal. A 2013 study of 950 patients (>4 times the combined number of 213 in the initial 2002 trials) randomized patients to 33°C vs 36°C after cardiac arrest, and found no difference in survival or neurological outcome.13 Also published in the New England Journal of Medicine, this study was intensely discussed, garnering an Altmetric score of 470 by January 201414 and 664 as of August 22, 2017, placing it in the top 5% of all research outputs scored by Altmetric.
Despite this apparent reversal, little has changed regarding the recommended use of temperature management after cardiac arrest in 2017. Both the 2016 Cochrane15 and the 2015 American Heart Association updates16 continue to recommend postarrest cooling. Critical care and emergency medicine specialists argue over cooling method, duration, timing, target, and applicability. With TTM, the net result of literature reversal is more, not less, cognitive diversity.
In principle, meta-analysis is an ideal solution. By combining the results of several studies, meta-analysis increases statistical power, facilitates understanding of variability between studies, and allows generalization of results to a wider population. The TTM landscape, dotted with multiple small studies and divergent results, is tailor-made for meta-analysis, and multiple groups have obliged with both negative17,18 and positive results.16 However, meta-analysis also presumes that a single absolute truth underlies all studies and that each study merely detects it with varying accuracy and precision. What if that assumption is wrong, and the story of TTM really is one in which one (or more) of the dueling studies is incorrect? Does it then make sense to lump the results of correct and incorrect studies together? In such cases, meta-analyses may result in less, not more, truth.
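To make the pooling step concrete, the following is a minimal sketch of random-effects meta-analysis using the DerSimonian-Laird estimator, one common approach. It is not the model used by Kalra et al7, whose exact specification is not described here. The function name and the 3 per-study effects and variances are hypothetical and serve only as illustration.

```python
import math


def pool_random_effects(effects, variances):
    """Pool per-study effects (e.g. log risk ratios) with DerSimonian-Laird weights.

    Returns (pooled effect, standard error, tau^2, I^2).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures how far studies scatter around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return pooled, se, tau2, i2


# Hypothetical log risk ratios and variances for 3 trials (illustration only).
example_effects = [-0.30, -0.25, 0.02]
example_variances = [0.04, 0.09, 0.01]
pooled, se, tau2, i2 = pool_random_effects(example_effects, example_variances)
print(f"pooled={pooled:.3f} se={se:.3f} tau2={tau2:.3f} I2={i2:.2f}")
```

Note what the model does with disagreement: a large between-study variance widens the confidence interval around the pooled estimate, but the estimate is still an average of all inputs. Nothing in the arithmetic distinguishes a correct study from a flawed one.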
Even when all studies are correctly performed, repeated meta-analyses raise an issue akin to that encountered during interim analyses of a randomized controlled trial. Stopping early when data are few and the likelihood of an outlier effect is high risks a type 1 error, whereas the addition of multiple negative trials with high heterogeneity risks never reaching clear significance. Aware of this challenge, meta-analysts have responded with trial sequential analysis (TSA), a modification of standard meta-analysis, borrowed from the sequential monitoring of clinical trials, in which thresholds for significance or futility are adjusted based on available data. With TSA, thresholds for significance are raised when few data exist (and the likelihood of a false positive is high) and lowered when a wealth of data are available (and the likelihood of identifying a real effect with further studies is low). A 2017 treatise on TSA19 includes a sample TSA-based analysis of TTM explaining why early meta-analyses may have found a benefit, why later efforts (including Kalra et al7) are more inconclusive, and why future trials or meta-analyses of TTM are likely to be futile.
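One common frequentist way to raise the significance threshold when data are few is an alpha-spending function. The sketch below assumes a Lan-DeMets O'Brien-Fleming-type spending function and a conventional 2-sided alpha of .05; the function name is hypothetical. It illustrates the idea only and is not a reproduction of the analysis in the cited treatise.19 A full analysis would also need the required information size and a numerical computation of each boundary.

```python
from statistics import NormalDist


def obf_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative 2-sided type 1 error that may be 'spent' by a given
    information fraction (accrued data / required data), using the
    Lan-DeMets O'Brien-Fleming-type spending function.
    """
    if not 0 < info_fraction <= 1:
        raise ValueError("info_fraction must be in (0, 1]")
    nd = NormalDist()
    z_final = nd.inv_cdf(1 - alpha / 2)  # conventional critical value
    return 2 * (1 - nd.cdf(z_final / info_fraction ** 0.5))


# Allowed cumulative type 1 error as evidence accrues (illustration only).
for fraction in (0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"information fraction {fraction:.2f}: "
          f"alpha spent {obf_alpha_spent(fraction):.6f}")
```

Under this function, almost none of the allowed error can be spent while the information fraction is small, so an early meta-analysis must show a far more extreme result to count as significant. The full alpha becomes available only once the required information size is reached.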
Deciphering whether the TTM story represents reversal or fertile ground for meta-analysis may not be easy. Kalra et al7 found considerable variation among studies in age, time to return of spontaneous circulation, percentage of bystander cardiopulmonary resuscitation, cooling duration (ranging from 12 to 28 hours), and cooling technique (including helmet, ice packs, and intravenous administration of cold fluid).11 Which of these elements affect TTM results (if any do) is unknown. In a 2016 analysis of Cochrane reports, outcomes graded as having a high quality of evidence had no better concordance among individual trials than did outcomes graded as having a low quality of evidence.20 Observations such as these are disquieting because they suggest that only time can reveal the true answer.
So what should readers of Anesthesia & Analgesia make of repeated meta-analyses of TTM (or any other topic)? First, just as readers are critical of individual studies, so should they be appropriately thoughtful about meta-analysis results. Questions of applicability, generalizability, and actual benefit (as opposed to control group harm) are germane to both types of data. To the above, we would add the possibility that meta-analyses in fields in which reversal has occurred may be incorporating the results of not just underpowered, but also flawed research.
Absent a need to clarify 1 or more of the above questions, repeated meta-analyses, especially for low-certainty data, may be an exercise similar to “p hacking,” in which analyses are repeatedly run to extract a statistically significant finding, rather than to test a hypothesis.21 Alternatively, repeat meta-analyses may serve a useful role in highlighting overlooked gaps in the data and pointing out questions for future investigation. Regardless, the role for repeated meta-analyses remains unclear, and such studies may confuse as much as they clarify.
Name: Mark E. Nunnally, MD, FCCM.
Contribution: This author helped prepare the manuscript.
Conflicts of Interest: None.
Name: Avery Tung, MD, FCCM.
Contribution: This author helped prepare the manuscript.
Conflicts of Interest: A. Tung is an executive editor, Critical Care and Resuscitation, for Anesthesia & Analgesia.
This manuscript was handled by: W. Scott Beattie, PhD, MD, FRCPC.
1. Kallmes DF, Comstock BA, Heagerty PJ, et al. A randomized trial of vertebroplasty for osteoporotic spinal fractures. N Engl J Med. 2009;361:569–579.
2. Hays J, Ockene JK, Brunner RL, et al.; Women’s Health Initiative Investigators. Effects of estrogen plus progestin on health-related quality of life. N Engl J Med. 2003;348:1839–1854.
3. Finfer S, Chittock DR, Su SY, et al.; NICE-SUGAR Study Investigators. Intensive versus conventional glucose control in critically ill patients. N Engl J Med. 2009;360:1283–1297.
4. Mehta S, Burry L, Cook D, et al.; SLEAP Investigators; Canadian Critical Care Trials Group. Daily sedation interruption in mechanically ventilated critically ill patients cared for with a sedation protocol: a randomized controlled trial. JAMA. 2012;308:1985–1992.
5. Cardiac Arrhythmia Suppression Trial (CAST) Investigators. Preliminary report: effect of encainide and flecainide on mortality in a randomized trial of arrhythmia suppression after myocardial infarction. N Engl J Med. 1989;321:406–412.
6. van den Berghe G, Wouters P, Weekers F, et al. Intensive insulin therapy in critically ill patients. N Engl J Med. 2001;345:1359–1367.
7. Kalra S, Arora G, Patel N, et al. Targeted temperature management after cardiac arrest: systematic review and meta-analyses. Anesth Analg. 2018;126:867–875.
8. Hypothermia After Cardiac Arrest Study Group. Mild therapeutic hypothermia to improve the neurologic outcome after cardiac arrest. N Engl J Med. 2002;346:549–556.
9. Bernard SA, Gray TW, Buist MD, et al. Treatment of comatose survivors of out-of-hospital cardiac arrest with induced hypothermia. N Engl J Med. 2002;346:557–563.
10. Curfman GD. Hypothermia to protect the brain. N Engl J Med. 2002;346:546.
11. Peberdy MA, Callaway CW, Neumar RW, et al.; American Heart Association. Part 9: post-cardiac arrest care: 2010 American Heart Association Guidelines for cardiopulmonary resuscitation and emergency cardiovascular care. Circulation. 2010;122:S768–S786.
12. Nunnally ME, Jaeschke R, Bellingan GJ, et al. Targeted temperature management in critical care: a report and recommendations from five professional societies. Crit Care Med. 2011;39:1113–1125.
13. Nielsen N, Wetterslev J, Cronberg T, et al.; TTM Trial Investigators. Targeted temperature management at 33°C versus 36°C after cardiac arrest. N Engl J Med. 2013;369:2197–2206.
14. Thoma B, Rolston D, Lin M. Global emergency medicine journal club: social media responses to the March 2014 Annals of Emergency Medicine journal club on targeted temperature management. Ann Emerg Med. 2014;64:207–212.
15. Arrich J, Holzer M, Havel C, Müllner M, Herkner H. Hypothermia for neuroprotection in adults after cardiopulmonary resuscitation. Cochrane Database Syst Rev. 2016;2:CD004128.
16. Callaway CW, Donnino MW, Fink EL, et al. Part 8: post-cardiac arrest care: 2015 American Heart Association guidelines update for cardiopulmonary resuscitation and emergency cardiovascular care. Circulation. 2015;132:S465–S482.
17. Bhattacharjee S, Baidya DK, Maitra S. Therapeutic hypothermia after cardiac arrest is not associated with favorable neurological outcome: a meta-analysis. J Clin Anesth. 2016;33:225–232.
18. Villablanca PA, Makkiya M, Einsenberg E, et al. Mild therapeutic hypothermia in patients resuscitated from out-of-hospital cardiac arrest: a meta-analysis of randomized controlled trials. Ann Card Anaesth. 2016;19:4–14.
19. Wetterslev J, Jakobsen JC, Gluud C. Trial sequential analysis in systematic reviews with meta-analysis. BMC Med Res Methodol. 2017;17:39.
20. Gartlehner G, Dobrescu A, Evans TS, et al. The predictive validity of quality of evidence grades for the stability of effect estimates was low: a meta-epidemiological study. J Clin Epidemiol. 2016;70:52–60.
21. Gadbury GL, Allison DB. Inappropriate fiddling with statistical analyses to obtain a desirable p-value: tests to detect its presence in published literature. PLoS One. 2012;7:e46363.