From the *Department of Anaesthesia and Pain Management, Royal Melbourne Hospital, Melbourne, Australia; †Anaesthesia, Perioperative Medicine and Pain Medicine Unit, Melbourne Medical School and ‡Department of Pharmacology, University of Melbourne, Melbourne, Australia; §Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia; and ‖Department of Anaesthesiology and Perioperative Medicine, Auckland City Hospital, Auckland, New Zealand.
Accepted for publication January 29, 2014.
Funding: This work was funded by Australian National Health and Medical Research Council Project Grant APP104272 and Health Research Council of New Zealand project grant 12-308 to Dr. Short.
The authors declare no conflicts of interest.
Reprints will not be available from the authors.
Address correspondence to Kate Leslie, MBBS, MD, M Epi, FANZCA, Department of Anaesthesia and Pain Management, Royal Melbourne Hospital, Parkville, VIC, 3050, Australia. Address e-mail to email@example.com
A causality dilemma has hitherto existed in relation to low Bispectral Index (BIS) values and poor outcomes after general anesthesia in older patients. Does low BIS result in poor health or does poor health result in low BIS?1 Large observational studies have failed to answer this question because of their inherent inability to account for patients who are sensitive to anesthesia self-selecting into the deep anesthesia group,2–7 and a recent large randomized trial intending to recruit 970 patients was stopped after an interim analysis of 381 patients for futility (Table 1).8 Calls for evidence from randomized trials9,10 therefore remain unanswered.
In this issue of the Journal, Brown et al.11 report on a follow-up survival analysis of 114 patients originally enrolled in a randomized trial to assess postoperative delirium after hip fracture repair.12 These elderly patients (mean ± SD age 81.7 ± 7.2 years) received spinal anesthesia supplemented by either light (mean BIS 85.7 ± 11.3) or deep (49.9 ± 13.5) sedation using propofol and/or midazolam.11,12 Overall, 1-year mortality was similar in the light and deep groups (19.3% vs 29.8%; P = 0.21). However, when only sicker patients were considered (Charlson comorbidity index >4), light sedation was associated with lower 1-year mortality than deep sedation (22.2% vs 43.6%; P = 0.04).13
In this study, 2 reasons for an association between sedation depth and survival were investigated.11 First, the effect of postoperative delirium was considered, as patients in the light group were less prone to this complication than patients in the deep group (19% vs 40%; P = 0.02). Delirium is more likely with deep sedation/anesthesia12,14 and potentially may be a marker of anesthetic toxicity (either directly15 or via electroencephalographic burst suppression16). However, no interaction between delirium and sedation depth in mediating mortality was demonstrated. Second, the effect of arterial hypotension (defined as an intraoperative systolic blood pressure decrease >30% from preoperative values and/or systolic blood pressure <90 mm Hg) was explored, as hypotension was associated with increased mortality in recent studies.7 No significant difference in the duration of hypotension between the light and deep groups was demonstrated in all patients12 (9 ± 14 vs 13 ± 22 minutes; P = 0.28) or in those with Charlson comorbidity indices >4 (median [interquartile range] 0 [0–15] vs 5 [0–12] minutes; P = 0.76).11 Appropriately, Brown et al.11 advocated further research rather than speculating further on etiology or recommending an immediate change in practice. Our editorial will expand on the need for caution in immediately extrapolating the results of this interesting study to clinical practice.
Our first note of caution relates to sample size. The original sample size calculation for this study was based on the assumption of postoperative delirium incidences of 12% and 36% in the light and deep groups, respectively (power = 86%; α = 0.05).12 For the follow-up survival analysis, 80% power was estimated to detect a hazard ratio of 0.58 for survival in lightly sedated compared with deeply sedated patients.11 Although these powers are commonly accepted in the medical literature, it is instructive to recall that such studies will miss 14% and 20% of “true” results, respectively.
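The delirium power calculation described above can be roughly reproduced with the standard normal-approximation formula for comparing two proportions. The sketch below is illustrative only: the function name `patients_per_group` is hypothetical, the unpooled-variance formula is an assumption, and the method actually used by the original investigators is not stated here.

```python
import math
from statistics import NormalDist


def patients_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients per group for a two-sided comparison of two proportions.

    Uses the unpooled normal approximation (an assumption; other formulas exist).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile corresponding to the power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two binomial variances
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Assumed delirium incidences of 12% vs 36%, alpha = 0.05, power = 86%
n = patients_per_group(0.12, 0.36, alpha=0.05, power=0.86)
print(n, "patients per group")
```

Under these assumptions the calculation gives 54 patients per group, which is of the same order as the 114 patients enrolled. Other formulas (pooled variance, continuity correction) can give somewhat different numbers.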
The combination of small sample size and low power may not only increase the likelihood of failing to demonstrate a true effect, but also may increase the risk of showing a spurious effect, due to mathematical inevitabilities or the co-occurrence of biases.17 Even in the presence of excellent study design, small trials with low power can produce unreliable findings due to low prior probabilities of finding true effects, low positive predictive values for claimed effects, and exaggerated estimates of effect size for true effects,17 the so-called “winner’s curse.”18 In the anesthesia literature, this curse is illustrated by spectacular risk reductions for myocardial infarction in early β-blocker trials19,20 that were not replicated by large trials21 or meta-analyses.22 Ioannidis18 urges caution in interpretation of large effects from early small trials and encourages larger trials in the discovery phase as a means to avoid being misled.
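The link between power and the positive predictive value (PPV) of a claimed effect can be made concrete with the formula given by Button et al.17: PPV = (1 − β)R / [(1 − β)R + α], where R is the pre-study odds that the effect is real. The sketch below is illustrative; the function name is hypothetical and the pre-study odds of 0.25 are an arbitrary assumption, not a value taken from any trial discussed here.

```python
def positive_predictive_value(power, alpha, prior_odds):
    """PPV of a 'significant' finding: (1 - beta) * R / ((1 - beta) * R + alpha)."""
    true_positive_rate = power * prior_odds  # (1 - beta) * R
    return true_positive_rate / (true_positive_rate + alpha)


# Assumed pre-study odds R = 0.25 (1 true effect for every 4 null effects tested)
for power in (0.80, 0.50, 0.20):
    ppv = positive_predictive_value(power, alpha=0.05, prior_odds=0.25)
    print(f"power = {power:.2f}  PPV = {ppv:.2f}")
```

With these assumed odds, the PPV falls from 0.80 at 80% power to 0.50 at 20% power, so a significant result from a low-powered study is no more likely to be true than false.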
A more intuitive way of looking at sample size is to consider the fragility of a study.23 Fragility is the number of patients who would need to have a different outcome to change the result and provides a useful measure of the robustness of a study. It is akin to the concept of reproducibility, which is the likelihood that an identical study would produce the same result if done again.24 In Brown et al.,11 in the subgroup of patients with a Charlson comorbidity index >4, 10 of 45 patients died within a year in the light group and 17 of 39 patients died within a year in the deep group. Just 1 more patient dying in the light group, or 1 fewer patient dying in the deep group, would lead to a nonsignificant result (χ2 test, P = 0.07). Such fragile results can only be regarded as hypothesis generating, or requiring confirmation, as correctly identified by the authors. According to traditional power calculations, for a 20% absolute difference in mortality as found in this study, with α = 0.05 and power = 0.8, a confirmatory study would need 197 patients. Such a study would require 7 more patients to die in the light group, or 9 fewer patients to die in the deep group, to change the significance of the result and thus would be more robust.
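The fragility of the subgroup result can be checked with a few lines of code. The sketch below assumes an uncorrected Pearson χ2 test on the 2 × 2 table of deaths and survivors; the function names `chi2_p_value` and `fragility` are hypothetical, and the test variant used in the original analysis may differ.

```python
import math


def chi2_p_value(deaths_a, n_a, deaths_b, n_b):
    """Two-sided P value from an uncorrected Pearson chi-squared test (2x2 table)."""
    a, b = deaths_a, n_a - deaths_a  # group A: died, survived
    c, d = deaths_b, n_b - deaths_b  # group B: died, survived
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # a margin is empty, so no difference can be tested
    chi2 = (n_a + n_b) * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    return math.erfc(math.sqrt(chi2 / 2))


def fragility(deaths_low, n_low, deaths_high, n_high, alpha=0.05):
    """Number of extra deaths in the lower-mortality group needed to lose significance."""
    switches = 0
    while (deaths_low + switches <= n_low
           and chi2_p_value(deaths_low + switches, n_low, deaths_high, n_high) < alpha):
        switches += 1
    return switches


# Subgroup with Charlson comorbidity index > 4: 10/45 (light) vs 17/39 (deep)
print("observed:            P =", round(chi2_p_value(10, 45, 17, 39), 3))
print("1 more light death:  P =", round(chi2_p_value(11, 45, 17, 39), 3))
print("1 fewer deep death:  P =", round(chi2_p_value(10, 45, 16, 39), 3))
print("fragility:", fragility(10, 45, 17, 39))
```

Under this test the observed result is significant at α = 0.05, and changing the outcome of a single patient in either group removes that significance. Exact P values depend on the test variant (for example, with or without continuity correction), so they need not match the quoted figures to the second decimal place.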
Some problems with study design become worse in small low-powered trials.17 Small sample size increases the risk of imbalance at baseline, as evidenced in Brown et al.11 by a higher proportion of patients living independently in the light group than the deep group (74% vs 56%; P = 0.08).12 This measured imbalance is a signal for potential imbalance in unmeasured but important prognostic variables that may be alternative explanations for the result. Small sample size also decreases the precision of risk estimates, as evidenced by wide 95% confidence intervals (approaching 1.0 at their upper end) around the hazard ratios for survival (0.28–1.33, 0.19–0.97, and 0.12–0.94 for all patients and those with Charlson comorbidity indices of >4 and >6, respectively). Finally, small sample size decreases the probability of demonstrating effects across all subgroup analyses (assumptions of proportionality for Cox hazard modeling were not supported for survival beyond 1 year). A small study may thus also fail to detect an important effect worthy of further investigation.
Our second note of caution relates to the generalizability of these results. When conducting a randomized trial, it is prudent to recruit patients who are at high risk of the primary outcome and ensure wide separation in the intensity of the intervention if both groups are to receive it. In previous studies, long-term survival has varied markedly (5.5%–24.3% mortality2–6; Table 1), but no study has included patients with the risk profile of these elderly hip fracture patients (long-term mortality 45%). Furthermore, previous studies observed patients having general anesthesia (BIS <60), with or without neuraxial blockade.2–8 In the current study, all patients received spinal anesthesia, some in combination with general anesthesia (i.e., BIS near 50) and some without significant hypnotic administration (i.e., BIS around 85). Although a protective effect of neuraxial blockade is strongly supported,25 the combination of neuraxial blockade with general anesthesia has been associated with poorer outcomes than general anesthesia alone in a recent propensity score–adjusted post hoc analysis of Perioperative Ischemic Evaluation (POISE) study patients.26 These factors make generalization away from hip fracture patients under spinal anesthesia injudicious. Finally, Brown et al.’s patients11 were randomized to dramatically different BIS values.12 In previous studies, depth of sedation was all in the general anesthesia range (i.e., BIS <60) and the differences in anesthetic depth among patients were smaller.2–8
We are currently recruiting to a 6500-patient international randomized controlled trial of volatile-based general anesthesia titrated to a BIS of 50 or 35 (Australian and New Zealand Clinical Trial Registry number 12612000632897). Eligible patients are aged ≥60 years, have significant comorbidities, and present for surgery lasting more than 2 hours. Our pilot study demonstrated the feasibility of BIS-guided titration and maintenance of similar arterial blood pressures, as well as 10% 1-year mortality in the index population.27 We hope that this large trial will definitively answer the question of whether low BIS values are truly associated with poor outcomes in elderly patients.
Name: Kate Leslie, MBBS, MD, M Epi, FANZCA.
Contribution: This author helped prepare the manuscript.
Attestation: Kate Leslie approved the final manuscript.
Name: Timothy G. Short, MBChB, MD, FANZCA.
Contribution: This author helped prepare the manuscript.
Attestation: Timothy G. Short approved the final manuscript.
This manuscript was handled by: Sorin J. Brull, MD, FCARCSI (Hon).
1. Leslie K, Short TG. Low bispectral index values and death: the unresolved causality dilemma. Anesth Analg. 2011;113:660–3
2. Monk TG, Saini V, Weldon BC, Sigl JC. Anesthetic management and one-year mortality after noncardiac surgery. Anesth Analg. 2005;100:4–10
3. Lindholm ML, Träff S, Granath F, Greenwald SD, Ekbom A, Lennmarken C, Sandin RH. Mortality within 2 years after surgery in relation to low intraoperative bispectral index values and preexisting malignant disease. Anesth Analg. 2009;108:508–12
4. Leslie K, Myles PS, Forbes A, Chan MT. The effect of bispectral index monitoring on long-term survival in the B-aware trial. Anesth Analg. 2010;110:816–22
5. Kertai MD, Pal N, Palanca BJ, Lin N, Searleman SA, Zhang L, Burnside BA, Finkel KJ, Avidan MS; B-Unaware Study Group. Association of perioperative risk factors and cumulative duration of low bispectral index with intermediate-term mortality after cardiac surgery in the B-Unaware Trial. Anesthesiology. 2010;112:1116–27
6. Kertai MD, Palanca BJ, Pal N, Burnside BA, Zhang L, Sadiq F, Finkel KJ, Avidan MS; B-Unaware Study Group. Bispectral index monitoring, duration of bispectral index below 45, patient risk factors, and intermediate-term mortality after noncardiac surgery in the B-Unaware Trial. Anesthesiology. 2011;114:545–56
7. Sessler DI, Sigl JC, Kelley SD, Chamoun NG, Manberg PJ, Saager L, Kurz A, Greenwald S. Hospital stay and mortality are increased in patients having a “triple low” of low blood pressure, low bispectral index, and low minimum alveolar concentration of volatile anesthesia. Anesthesiology. 2012;116:1195–203
8. Abdelmalak BB, Bonilla A, Mascha EJ, Maheshwari A, Tang WH, You J, Ramachandran M, Kirkova Y, Clair D, Walsh RM, Kurz A, Sessler DI. Dexamethasone, light anaesthesia, and tight glucose control (DeLiT) randomized controlled trial. Br J Anaesth. 2013;111:209–21
9. Monk TG, Weldon BC. Anesthetic depth is a predictor of mortality: it’s time to take the next step. Anesthesiology. 2010;112:1070–2
10. Kheterpal S, Avidan MS. “Triple low”: murderer, mediator, or mirror. Anesthesiology. 2012;116:1176–8
11. Brown CH, Azman AS, Gottschalk A, Mears SC, Sieber FE. Sedation depth during spinal anesthesia and survival in elderly patients undergoing hip fracture repair. Anesth Analg. 2014
12. Sieber FE, Zakriya KJ, Gottschalk A, Blute MR, Lee HB, Rosenberg PB, Mears SC. Sedation depth during spinal anesthesia and the development of postoperative delirium in elderly patients undergoing hip fracture repair. Mayo Clin Proc. 2010;85:18–26
13. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40:373–83
14. Chan MT, Cheng BC, Lee TM, Gin T; CODA Trial Group. BIS-guided anesthesia decreases postoperative delirium and cognitive decline. J Neurosurg Anesthesiol. 2013;25:33–42
15. Perouansky M, Hemmings HC Jr. Neurotoxicity of general anesthetics: cause for concern? Anesthesiology. 2009;111:1365–71
16. Watson PL, Shintani AK, Tyson R, Pandharipande PP, Pun BT, Ely EW. Presence of electroencephalogram burst suppression in sedated, critically ill patients is associated with increased mortality. Crit Care Med. 2008;36:3171–7
17. Button KS, Ioannidis JP, Mokrysz C, Nosek BA, Flint J, Robinson ES, Munafò MR. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013;14:365–76
18. Ioannidis JP. Why most discovered true associations are inflated. Epidemiology. 2008;19:640–8
19. Mangano DT, Layug EL, Wallace A, Tateo I. Effect of atenolol on mortality and cardiovascular morbidity after noncardiac surgery. Multicenter Study of Perioperative Ischemia Research Group. N Engl J Med. 1996;335:1713–20
20. Poldermans D, Boersma E, Bax JJ, Thomson IR, van de Ven LL, Blankensteijn JD, Baars HF, Yo TI, Trocino G, Vigna C, Roelandt JR, van Urk H. The effect of bisoprolol on perioperative mortality and myocardial infarction in high-risk patients undergoing vascular surgery. Dutch Echocardiographic Cardiac Risk Evaluation Applying Stress Echocardiography Study Group. N Engl J Med. 1999;341:1789–94
21. Devereaux PJ, Yang H, Yusuf S, Guyatt G, Leslie K, Villar JC, Xavier D, Chrolavicius S, Greenspan L, Pogue J, Pais P, Liu L, Xu S, Málaga G, Avezum A, Chan M, Montori VM, Jacka M, Choi P; POISE Study Group. Effects of extended-release metoprolol succinate in patients undergoing non-cardiac surgery (POISE trial): a randomised controlled trial. Lancet. 2008;371:1839–47
22. Bangalore S, Wetterslev J, Pranesh S, Sawhney S, Gluud C, Messerli FH. Perioperative beta blockers in patients having non-cardiac surgery: a meta-analysis. Lancet. 2008;372:1962–76
23. Sessler DI, Devereaux PJ. Emerging trends in clinical trial design. Anesth Analg. 2013;116:258–61
24. Shafer SL, Dexter F. Publication bias, retrospective bias, and reproducibility of significant results in observational studies. Anesth Analg. 2012;114:931–2
25. Pöpping DM, Elia N, Marret E, Remy C, Tramèr MR. Protective effects of epidural analgesia on pulmonary complications after abdominal and thoracic surgery: a meta-analysis. Arch Surg. 2008;143:990–9
26. Leslie K, Myles P, Devereaux P, Williamson E, Rao-Melancini P, Forbes A, Xu S, Foex P, Pogue J, Arrieta M, Bryson G, Paul J, Paech M, Merchant R, Choi P, Badner N, Peyton P, Sear J, Yang H. Neuraxial block, death and serious cardiovascular morbidity in the POISE trial. Br J Anaesth. 2013;111:382–90
27. Short T, Leslie K, Campbell D, Chan M, Corcoran T, O’Loughlin E, Myles P. A pilot study for a prospective, randomized, double-blind trial of the influence of anesthetic depth on long-term outcome. Anesth Analg. 2014;118:981–6