Twenty-five clinician participants (mean age, 37 years; 64% female) and 28 undergraduate students (mean age, 21 years; 68% female) completed the study. Results for target transition detection and range identification are shown in Figures 1 and 2. Results for all Spo2 outcome measures are in Table 2 and for subjective judgments in Table 3.
Results for target transition detection accuracy and range identification accuracy are shown in Table 2. The odds ratio of participants correctly detecting a target transition with the enhanced display compared with the standard display was 7.4 (95% CI, 4.4–12.3; P < .001). From the generalized linear mixed model, the estimated percentage correct (standard error [SE]) for the enhanced display was 87% (1.3%), and for the standard display, it was 57% (2.1%).
The odds ratio of participants correctly identifying range with the enhanced display compared with the standard display was 2.7 (95% CI, 1.6–4.6; P < .001). The estimated percentage correct for the enhanced display was 86% (1.3%), and for the standard display, it was 76% (1.7%).
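To make the relation between an odds ratio and a pair of percentages concrete, the following minimal sketch computes a crude odds ratio directly from the estimated percentages correct reported above. The function names are illustrative and not part of the study's analysis. The crude values (about 5.0 for target transition detection and about 1.9 for range identification) need not match the model-based odds ratios of 7.4 and 2.7, which come from the generalized linear mixed model and depend on its other terms, such as order and participant effects.

```python
def odds(p: float) -> float:
    """Convert a proportion strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)


def crude_odds_ratio(p_enhanced: float, p_standard: float) -> float:
    """Ratio of the odds of a correct response, enhanced vs standard display."""
    return odds(p_enhanced) / odds(p_standard)


# Estimated percentages correct taken from the text above.
transition_or = crude_odds_ratio(0.87, 0.57)  # target transition detection
range_or = crude_odds_ratio(0.86, 0.76)       # range identification

print(f"Crude OR, target transition detection: {transition_or:.2f}")
print(f"Crude OR, range identification: {range_or:.2f}")
```

An odds ratio above 1 indicates higher odds of a correct response with the enhanced display. The crude calculation is shown only for intuition about the scale of the effect.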
In addition to the significant effect of display on target transition detection noted above, display interacted with order: the superiority of the enhanced display over the standard display was less pronounced when participants experienced the enhanced display first rather than second in the crossover design, with an odds ratio of 0.4 (95% CI, 0.2–0.9; P = .006). Specifically, when the enhanced display was presented first, accuracy was 84% (2.1%) compared with 59% (3.0%) for the standard display; when it was presented second, accuracy was 90% (1.6%) compared with 56% (3.0%) for the standard display. No other effects were significant for target transition detection accuracy. For range identification accuracy, there were no significant effects apart from the effect of display.
For participants’ target transition detection latency, there was a significant effect of display with the enhanced display supporting faster detection, 2.4 seconds (0.5 seconds), than the standard display, 8.7 seconds (0.5 seconds), with a mean difference of 6.3 seconds (95% CI, 5.1–7.5 seconds; P < .001). In addition, there was an interaction of display with order (P < .001). The superiority of the enhanced display over the standard display was less extreme when it was presented first, 3.0 seconds (0.7 seconds) compared with 7.6 seconds (0.9 seconds) for the standard display, than when it was presented second, 1.8 seconds (0.7 seconds) compared with 9.8 seconds (0.9 seconds) for the standard display. There were no other significant differences for target transition detection latency.
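As a quick consistency check on the latency figures above, the sketch below recomputes the mean difference from the two display means and backs out the standard error implied by the reported 95% CI. It assumes a symmetric interval based on the normal approximation (z = 1.96). The fitted model may have used a different critical value, so the implied standard error is approximate.

```python
Z_95 = 1.96  # assumed critical value (normal approximation)


def implied_se(ci_low: float, ci_high: float, z: float = Z_95) -> float:
    """Standard error implied by a symmetric confidence interval."""
    return (ci_high - ci_low) / (2.0 * z)


# Values reported in the text above (seconds).
mean_enhanced = 2.4
mean_standard = 8.7
ci_low, ci_high = 5.1, 7.5

difference = mean_standard - mean_enhanced  # reported as 6.3 seconds
midpoint = (ci_low + ci_high) / 2.0         # centre of the reported interval
se = implied_se(ci_low, ci_high)

print(f"Mean difference: {difference:.1f} s")
print(f"CI midpoint: {midpoint:.1f} s")
print(f"Implied SE of the difference: {se:.2f} s")
```

The difference of the means and the midpoint of the interval agree at 6.3 seconds, and the implied standard error of the difference is roughly 0.6 seconds under the stated assumption.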
For participants’ identification of absolute Spo2 value, the odds ratio of participants making a correct identification with the enhanced display compared with the standard display was 2.4 (95% CI, 1.6–3.5; P < .001). The estimated percentage correct for the enhanced display was 67% (1.8%), and for the standard display, it was 46% (1.9%). We found no other significant effects for participants’ identification of absolute Spo2 value.
For the distractor tasks (arithmetic verification and keyword detection), we assessed participants’ performance as a manipulation check. Participants answered arithmetic questions more accurately in the enhanced condition, 79% (1.2%), than in the standard condition, 76% (1.3%), with an odds ratio of 1.3 (95% CI, 1.2–1.5; P < .001). In addition, participants responded to a larger percentage of the arithmetic questions presented when in the enhanced condition, 94% (0.7%), than when in the standard condition, 93% (0.9%), with an odds ratio of 2.4 (95% CI, 2.0–3.0; P < .001). When accuracy was analyzed for only the arithmetic questions attempted, participants were more accurate in the enhanced condition, 86% (1.0%), than in the standard condition, 84% (1.1%), with an odds ratio of 1.2 (95% CI, 1.0–1.3; P < .001).
Clinicians answered fewer arithmetic questions than nonclinicians (for clinicians: 92% [1.5%]; for nonclinicians: 95% [0.8%]), with an odds ratio of 0.6 (95% CI, 0.3–1.2; P = .016). Clinicians also responded more slowly than nonclinicians (for clinicians: 2.6 seconds [0.1 second]; for nonclinicians: 2.2 seconds [0.1 second]), with a mean difference of 0.3 seconds (95% CI, 0.2–0.5 seconds; P < .001). However, clinicians responded more accurately than nonclinicians for the arithmetic expressions attempted (for clinicians: 88% [1.2%]; for nonclinicians: 82% [1.6%]), with an odds ratio of 1.8 (95% CI, 1.1–2.8; P = .003).
For the keyword detection task, participants were slightly more accurate in the enhanced condition, 98% (0.3%), than in the standard condition, 97% (0.4%), with an odds ratio of 2.4 (95% CI, 1.6–3.8; P < .001). Participants also responded slightly more quickly to the keywords in the enhanced condition, 2.5 seconds (0.05 seconds), than in the standard condition, 2.6 seconds (0.05 seconds), with a mean difference of 0.1 second (95% CI, 0.04–0.2 seconds; P = .031).
For the subjective judgments, participants reported that it was easier to judge Spo2 parameters and they were more confident in their judgments when using the enhanced display than when using the standard display (Table 3). There was no difference between clinicians’ and nonclinicians’ subjective opinions for either of the displays, except that clinicians found it easier to detect target transitions than nonclinicians did.
Participants detected target transitions more accurately using an auditory display enhanced with tremolo and brightness (87%) than when using a standard auditory display supplemented with an alarm in the critical range (57%). Such a substantial difference of 30 percentage points may be clinically relevant. Participants more accurately identified Spo2 range when using the enhanced display (86%) than when using the standard display (76%). This study confirms previous findings that improvements occurred while participants performed a visually presented task and in the presence of background noise.13 The study also adds new information. First, the superiority of the enhanced display held even in the presence of an additional distractor task of the same perceptual modality as the monitoring task. Second, the performance of clinicians and nonclinicians on the primary outcomes was not statistically distinguishable.
Secondary outcomes reinforce that performance is generally better with the enhanced display. Participants using the enhanced display were faster at detecting target transitions, and more accurate at identifying absolute Spo2 values, than when using the standard display. For example, participants’ ability to identify absolute Spo2 values improved by 20 percentage points, which may be a clinically relevant change.
We attribute these improvements to the additional acoustic properties of the enhanced display. Auditory displays comprising varied acoustic dimensions are more discriminable than those comprising fewer dimensions because they provide cues that are more easily perceived and recognized.19,20 In the target and low ranges of the standard display, participants had only varying pitch to detect whether Spo2 transitioned ranges and to judge Spo2 range. The enhanced display added tremolo to the varying pitch in the low range and both tremolo and brightness in the critical range. These additional properties informed the listener when range transitions occurred.21 If anesthesiologists are able to identify when Spo2 is trending toward a critical threshold, they may take remedial action before the threshold is breached and alarms sound. This may have implications for patient safety and may reduce reliance on alarms.
We found no difference in the performance of clinician and nonclinician participants for any outcomes related to identification of Spo2 parameters. This may seem surprising given that clinicians are familiar with auditory PO displays, whereas many nonclinician participants had never heard a PO display before. It also contrasts with research showing that anesthesiologists were more accurate than nonanesthesiologists at judging abnormal patient parameters using continuous auditory displays.3 However, in that study, participants monitored 5 vital signs simultaneously. Anesthesiologists’ greater domain expertise may have enabled them to integrate the information and perform better than nonanesthesiologists. In the present study, participants monitored only 1 vital sign, Spo2, and performed identification tasks rather than clinical judgment tasks, making it less surprising that nonclinician participants performed similarly to clinicians.
As visual stimuli in the OR increase in number and impose greater cognitive demands (eg, data entry into electronic patient record systems), visual attentional resources may become overloaded and vital cues about patient information missed. The auditory modality not only has greater capacity for processing perceptual cues22 but also is always available, unlike the visual mode, which has a narrower spatial focus. Thus, if accurate patient information is provided via the auditory modality, anesthesiologists may be able to allocate visual and auditory attention more efficiently and manage patient treatment more effectively.
Participants’ accuracies for the arithmetic task and the keyword detection task were greater in the enhanced condition than in the standard condition. Participants reported that it was easier to distinguish auditory changes with the enhanced display than with the standard display plus alarms. As a result, participants may have had more cognitive resources to direct toward the distractor tasks.
We found small differences between clinician and nonclinician participants for performance of distractor tasks, which may reflect biases in prioritization of tasks. Specifically, clinicians were more accurate than nonclinicians at verifying the arithmetic expressions to which they responded but slower at responding than nonclinicians, irrespective of display. Clinicians may focus on performing a task well because of a greater general concern with error and risk. Nonclinician participants may have been less motivated to avoid error and less sensitized to risk.
Participants reported that it was easier to identify Spo2 parameters, and that they were more confident in their judgments, with the enhanced display than with the standard display. Users’ subjective judgments of displays are important for display design. If users find a display intolerable or unreliable, it will not be accepted, despite possible benefits.23
Limitations and Future Research
This study had a number of limitations. First, it was conducted in an undisturbed room, whereas the OR is a noisy, dynamic environment. Nonetheless, testing the effectiveness of our displays was important before taking them to more realistic settings. Second, we had only 2 distractor tasks, whereas anesthesiologists perform numerous tasks.1 Third, we conducted multiple tests on secondary outcomes, raising the possibility of type I errors, but the effect of display was strong for all judgments of Spo2. Fourth, the keyword detection task may not have been sufficiently demanding, as indicated by high detection accuracies (>96% for both displays). Intervals of 3–7 seconds between phrases may have helped participants allocate their auditory attention. Future research may investigate a keyword detection task with continuous vocal dialogue. Fifth, the study was conducted in a room considerably smaller than the OR and with no standard OR equipment present. Speech intelligibility worsens with increased OR size and increases with OR content (eg, anesthetic machine, trolleys).24 Sixth, our clinicians were younger on average than the age at which presbycusis sets in,25 so older clinicians may not hear higher frequencies in the enhanced critical range. Seventh, there was only 1 researcher present during experimental trials. In the OR, anesthesiologists are often interrupted or distracted by a number of personnel, directly or indirectly.26 To address some of these limitations, we plan to compare anesthesiologists’ ability to identify Spo2 parameters using the enhanced and standard displays in a simulator with scripted anesthetic scenarios that include interruptions, distractions, and continual verbal conversations.
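The concern about type I errors under multiple testing can be illustrated with a short sketch. Assuming independent tests, each at a nominal significance level of .05, the probability of at least one false positive among m tests is 1 − (1 − α)^m. The values of m below are hypothetical and chosen only for illustration; they are not a count of the tests in this study. The Bonferroni threshold is shown as one common correction, not as a method used here.

```python
ALPHA = 0.05  # nominal per-test significance level (assumed)


def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one type I error across m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test threshold that keeps the familywise error rate at or below alpha."""
    return alpha / m


# Hypothetical numbers of tests, for illustration only.
for m in (1, 5, 10, 20):
    fwer = familywise_error_rate(ALPHA, m)
    threshold = bonferroni_threshold(ALPHA, m)
    print(f"m = {m:2d}: familywise error rate = {fwer:.3f}, "
          f"Bonferroni threshold = {threshold:.4f}")
```

Under these assumptions, results reported as P < .001 would remain below the Bonferroni threshold for any of the illustrated values of m, which is in keeping with the observation that the effect of display was strong for all judgments of Spo2.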
This study shows that a PO auditory display augmented with additional acoustic dimensions to signal low and critical ranges is more effective than a standard pitch plus alarm display for Spo2 parameter identification. Benefits are apparent in a setting with distractors and background noise. In addition, the effect held for both clinician and nonclinician participants. The enhanced display has the potential to improve clinicians’ perception and comprehension of Spo2 auditory stimuli when access to visual displays is compromised. This may allow for more efficient allocation of attention and potentially more timely and effective decision-making in the high-pressure environment of the OR. Systems that enhance anesthesiologists’ vigilance and monitoring performance are important in the ongoing endeavor to enhance patient safety.
The authors thank Dr Peter Moran (MBBS), Director of Anaesthesia, Princess Alexandra Hospital, Woolloongabba, Queensland, Brisbane, Australia, who provided support and a testing location at the Princess Alexandra Hospital and helped with recruitment and liaison with anesthetic staff members. The authors also thank T-Lok Tang (BSc[Hons]), Felicity Burgmann (BPsySc[Hons]), and Lachlan Peterson (UQ undergraduate student), who recorded spoken phrases. The authors thank our statistical advisors for input.
Name: Estrella Paterson, BPsySc(Hons).
Contribution: This author helped design the study, design and test the stimuli, design the experimental materials, test the software application, prepare ethics and governance applications, liaise with personnel at the hospital site, conduct the experiment, analyze the data, and write the report.
Conflicts of Interest: None.
Name: Penelope M. Sanderson, PhD.
Contribution: This author helped design the study, stimuli, and experimental materials, prepare ethics and governance applications, liaise with personnel at the hospital site, revise data analysis, and write the report.
Conflicts of Interest: P. M. Sanderson is a coinventor of a respiratory sonification (Sanderson and Watson, US Patent 7070570).
Name: Birgit Brecknell, PhD.
Contribution: This author helped design the stimuli, write the software application for the experiment, test the software application, and revise the report.
Conflicts of Interest: None.
Name: Neil A. B. Paterson, MBChB.
Contribution: This author helped design the study and stimuli, revise the report, and provide expert knowledge of anesthesia.
Conflicts of Interest: None.
Name: Robert G. Loeb, MD.
Contribution: This author helped design the study and stimuli, revise the report, and provide expert knowledge of anesthesia.
Conflicts of Interest: R. G. Loeb has received $1000 per year to be on the Masimo, Inc, Scientific Advisory Board.
This manuscript was handled by: Maxime Cannesson, MD, PhD.
1. Phipps D, Meakin GH, Beatty PC, Nsoedo C, Parker D. Human factors in anaesthetic practice: insights from a task analysis. Br J Anaesth. 2008;100:333–343.
2. Ansermino JM. Intelligent patient monitoring and clinical decision making. In: Ehrenfeld JM, Cannesson M, eds. Monitoring Technologies in Acute Care Environments. New York, NY: Springer; 2014:401–407.
3. Watson M, Sanderson P. Sonification supports eyes-free respiratory monitoring and task time-sharing. Hum Factors. 2004;46:497–517.
4. Ford S, Birmingham E, King A, Lim J, Ansermino JM. At-a-glance monitoring: covert observations of anesthesiologists in the operating room. Anesth Analg. 2010;111:653–658.
5. Schulz CM, Schneider E, Fritz L, et al. Visual attention of anaesthetists during simulated critical incidents. Br J Anaesth. 2011;106:807–813.
6. Morris RW, Mohacsi PJ. How well can anaesthetists discriminate pulse oximeter tones? Anaesth Intensive Care. 2005;33:497–500.
7. Schulze K, Gaab N, Schlaug G. Perceiving pitch absolutely: comparing absolute and relative pitch possessors in a pitch memory task. BMC Neurosci. 2009;10:106.
8. Stevenson RA, Schlesinger JJ, Wallace MT. Effects of divided attention and operating room noise on perception of pulse oximeter pitch changes: a laboratory study. Anesthesiology. 2013;118:376–381.
9. Loeb RG, Brecknell B, Sanderson PM. The sounds of desaturation: a survey of commercial pulse oximeter sonifications. Anesth Analg. 2016;122:1395–1403.
10. Deschamps ML, Sanderson P, Hinckfuss K, et al. Improving the detectability of oxygen saturation level targets for preterm neonates: a laboratory test of tremolo and beacon sonifications. Appl Ergon. 2016;56:160–169.
11. Hinckfuss K, Sanderson P, Loeb RG, Liley HG, Liu D. Novel pulse oximetry sonifications for neonatal oxygen saturation monitoring: a laboratory study. Hum Factors. 2016;58:344–359.
12. Paterson E, Sanderson PM, Paterson NA, Liu D, Loeb RG. The effectiveness of pulse oximetry sonification enhanced with tremolo and brightness for distinguishing clinically important oxygen saturation ranges: a laboratory study. Anaesthesia. 2016;71:565–572.
13. Paterson E, Sanderson PM, Paterson NAB, Loeb RG. Effectiveness of enhanced pulse oximetry sonifications for conveying oxygen saturation ranges: a laboratory comparison of five auditory displays. Br J Anaesth. 2017;119:1224–1230.
14. Sevdalis N, Healey AN, Vincent CA. Distracting communications in the operating theatre. J Eval Clin Pract. 2007;13:390–394.
15. International Electrotechnical Commission. IEC 60601-1-8:2006 - Medical electrical equipment – Part 1-8: General requirements, tests and guidance for alarm systems in medical electrical equipment and medical electrical systems. Geneva, Switzerland: International Electrotechnical Commission; 2006. Available at: https://webstore.iec.ch/publication/2599. Accessed August 17, 2018.
16. Sanderson PM, Wee A, Lacherez P. Learnability and discriminability of melodic medical equipment alarms. Anaesthesia. 2006;61:142–147.
17. Casals M, Girabent-Farrés M, Carrasco JL. Methodological quality and reporting of generalized linear mixed models in clinical medicine (2000–2012): a systematic review. PLoS One. 2014;9:e112653.
18. Faul F, Erdfelder E, Lang AG, Buchner A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods. 2007;39:175–191.
19. Atyeo J, Sanderson PM. Comparison of the identification and ease of use of two alarm sound sets by critical and acute care nurses with little or no music training: a laboratory study. Anaesthesia. 2015;70:818–827.
20. Edworthy J, Hellier E, Titchener K, Naweed A, Roels R. Heterogeneity in auditory alarm sets makes them easier to learn. Int J Ind Ergon. 2011;41:136–146.
21. Watson MO, Sanderson P. Designing for attention with sound: challenges and extensions to ecological interface design. Hum Factors. 2007;49:331–346.
22. Murphy S, Fraenkel N, Dalton P. Perceptual load does not modulate auditory distractor processing. Cognition. 2013;129:345–355.
23. Edworthy J. Medical audible alarms: a review. J Am Med Inform Assoc. 2013;20:584–589.
24. McNeer RR, Bennett CL, Horn DB, Dudaryk R. Factors affecting acoustics and speech intelligibility in the operating room: size matters. Anesth Analg. 2017;124:1978–1985.
25. Baxter AD, Boet S, Reid D, Skidmore G. The aging anesthesiologist: a narrative review and suggested strategies. Can J Anaesth. 2014;61:865–875.
26. van Pelt M, Weinger MB. Distractions in the anesthesia work environment: impact on patient safety? Report of a meeting sponsored by the Anesthesia Patient Safety Foundation. Anesth Analg. 2017;125:347–350.
Copyright © 2019 International Anesthesia Research Society