Comparison of Standard and Enhanced Pulse Oximeter Auditory Displays of Oxygen Saturation: A Laboratory Study With Clinician and Nonclinician Participants

Paterson, Estrella BPsySc(Hons)*; Sanderson, Penelope M. PhD*,†,‡; Brecknell, Birgit PhD*; Paterson, Neil A. B. MBChB‡,§; Loeb, Robert G. MD*,‖

doi: 10.1213/ANE.0000000000004267
Technology, Computing, and Simulation: Original Laboratory Research Report

BACKGROUND: When engaged in visually demanding tasks, anesthesiologists depend on the auditory display of the pulse oximeter (PO) to provide information about patients’ oxygen saturation (Spo2). Current auditory displays are not always effective at providing Spo2 information. In this laboratory study, clinician and nonclinician participants identified Spo2 parameters using either a standard auditory display or an auditory display enhanced with additional acoustic properties while performing distractor tasks and in the presence of background noise.

METHODS: In a counterbalanced crossover design, specialist or trainee anesthesiologists (n = 25) and nonclinician participants (n = 28) identified Spo2 parameters using standard and enhanced PO auditory displays. Participants performed 2 distractor tasks: (1) arithmetic verification and (2) keyword detection. Simulated background operating room noise played throughout the experiment. Primary outcomes were accuracies to (1) detect transitions to and from an Spo2 target range and (2) identify Spo2 range (target, low, or critical). Secondary outcomes included participants’ latency to detect target transitions, accuracy to identify absolute Spo2 values, accuracy and latency of distractor tasks, and subjective judgments about tasks.

RESULTS: Participants were more accurate at detecting target transitions using the enhanced display (87%) than the standard display (57%; odds ratio, 7.3 [95% confidence interval {CI}, 4.4–12.3]; P < .001). Participants were also more accurate at identifying Spo2 range using the enhanced display (86%) than the standard display (76%; odds ratio, 2.7 [95% CI, 1.6–4.6]; P < .001). Secondary outcome analyses indicated that there were no differences in performance between clinicians and nonclinicians for target transition detection accuracy and latency, Spo2 range identification accuracy, or absolute Spo2 value identification.

CONCLUSIONS: The enhanced auditory display supports more accurate detection of target transitions and identification of Spo2 range for both clinicians and nonclinicians. Despite their previous experience using PO auditory displays, clinicians in this laboratory study were no more accurate in any Spo2 outcomes than nonclinician participants.

From the *School of Psychology, The University of Queensland, St Lucia, Queensland, Australia

†School of Information Technology and Electrical Engineering (ITEE), The University of Queensland, Brisbane, Queensland, Australia

‡School of Clinical Medicine, The University of Queensland, St Lucia, Queensland, Australia

§Anaesthesia and Pain Management Services, Queensland Children’s Hospital, South Brisbane, Queensland, Australia

‖Department of Anesthesiology, University of Florida College of Medicine, Gainesville, Florida.

Published ahead of print 5 February 2019.

Accepted for publication May 2, 2019.

Funding: This work was supported by an Australian Research Council Discovery Project Grant DP140101822 to P.M.S., R.G.L., and D. Liu (grant investigator) and by an Australian Postgraduate Award to the first author.

Conflicts of Interest: See Disclosures at the end of the article.

Further analysis of data for nonclinician participants was presented at the International Ergonomics Association Congress, August 26–30, 2018, Florence, Italy.

Reprints will not be available from the authors.

Address correspondence to Penelope M. Sanderson, PhD, School of Psychology, The University of Queensland, McElwain Bldg, St Lucia, QLD 4072, Australia. Address e-mail to p.sanderson@uq.edu.au.

See Editorial, p

Anesthesiologists maintain high levels of vigilance of monitored patient variables while managing numerous other tasks during surgical procedures.1 An important monitor in the operating room (OR) is the pulse oximeter (PO), which provides continuous visual and auditory displays of heart rate and oxygen saturation (Spo2). When engaged in other visually demanding tasks or when visual overload occurs,2,3 anesthesiologists depend on the auditory display for Spo2 information.4,5 However, current PO auditory displays do not always provide Spo2 information effectively.6–8 Furthermore, acoustic properties of auditory displays are not standardized across different PO manufacturers,9 which may hinder anesthesiologists’ ability to identify Spo2 parameters using different PO models.

Listeners can distinguish Spo2 ranges and transitions to and from a target range more accurately using displays enhanced with acoustic dimensions of tremolo and brightness than when using standard displays comprising varying pitch plus alarms set at clinically relevant thresholds.10–12 Paterson et al13 confirmed these findings when subjects performed a distractor task, and in the presence of background noise.

A limitation of the study by Paterson et al13 was that the single distractor task was presented visually. Many anesthetic tasks require auditory perception, which may interfere with perception of the PO auditory signal. Verbal communication, an auditory task, is essential for effective team performance in the OR; however, much of the communication in the OR is not relevant to the case14 and adds to general noise. While maintaining awareness of patient variables and performing anesthetic tasks, anesthesiologists monitor dialogue between OR team members to assess relevance to anesthetic management.

A further limitation of the study by Paterson et al13 was that participants were nonclinicians. Anesthesiologists’ greater familiarity with standard PO auditory displays could mean that they perceive the signal differently from nonclinicians. To address these limitations, we extended the research of Paterson et al13 by (1) adding an aurally presented distractor task (thereby in the same modality as the monitoring task) and (2) testing clinician as well as nonclinician participants.

Primary outcomes were participants’ accuracy at (1) detecting transitions into and out of the target Spo2 range and (2) identifying Spo2 range. We predicted that participants would more accurately detect target transitions and more accurately identify Spo2 range when using the enhanced display than when using the standard display. Secondary outcomes included the effect of display and expertise on participants’ latency to detect target transitions, accuracy to identify absolute Spo2 values, distractor task performance, and subjective judgments of tasks.

METHODS

This study was conducted at the Princess Alexandra Hospital in Brisbane, Australia, and at The University of Queensland (UQ), Brisbane, Australia. Ethics approval was granted by Metro South Human Research Ethics Committee (2017001254/HREC/17/QPAH/405). Written informed consent was obtained from all participants.

Study Design

In a crossover design, we tested the effect of 2 auditory displays (standard versus enhanced) on several outcomes with 2 types of participants (nonclinician versus clinician). Participants performed the experimental tasks over 2 blocks, using the standard display in one and using the enhanced display in the other. The standard auditory display was based on commercial PO auditory displays and comprised varying pitch pulse tones plus an alarm.9 The enhanced auditory display comprised varying pitch pulse tones enhanced with tremolo and brightness when outside a target range and no alarm.13 The order of presentation of the 2 displays was counterbalanced by the first author (E.P.) using MS Excel’s (Microsoft, Redmond, WA) RAND() function to control for practice and sequence effects.

Before each block of experimental trials, participants completed a training block of 10 trials with feedback provided. Each experimental block included 15 trials (each trial was a short scenario of 60 seconds duration). Participants experienced 1 block of 15 trials in the standard condition and a further block of 15 trials in the enhanced condition, making 30 trials overall. Within each display condition, the order of trials was randomized differently for each participant by the Java software (Oracle, Redwood Shores, CA) running the experiment.

At the start of each trial, the Spo2 value was visible on the computer screen for 3 seconds to provide a reference and then disappeared. During each trial, participants monitored Spo2, which ranged over 1, 2, or 3 predefined ranges: target (100%–97%), low (96%–90%), and critical (89%–80%). Heart rate was maintained at 72 beats/min. Participants performed 2 distractor tasks (arithmetic verification and keyword detection) for the duration of each trial. Simulated background OR noise was played throughout the experiment.
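
The range boundaries above can be written as a small classifier. The following Python sketch is illustrative only: the function names are hypothetical and are not taken from the study software, which was written in Java. It also shows how a transition between the target and low ranges could be located in a sequence of Spo2 values.

```python
def spo2_range(spo2: int) -> str:
    """Classify an SpO2 percentage into the three ranges used in the study."""
    if not 80 <= spo2 <= 100:
        raise ValueError("the study only used SpO2 values from 80% to 100%")
    if spo2 >= 97:
        return "target"    # 100%-97%
    if spo2 >= 90:
        return "low"       # 96%-90%
    return "critical"      # 89%-80%


def target_transitions(trace: list[int]) -> list[int]:
    """Return the indices at which the trace crosses between target and low."""
    ranges = [spo2_range(value) for value in trace]
    return [
        i
        for i in range(1, len(ranges))
        if {ranges[i - 1], ranges[i]} == {"target", "low"}
    ]
```

For example, `target_transitions([98, 97, 96, 95, 97])` flags the step from 97% to 96% and the later return from 95% to 97%.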

Study Outcomes

Primary outcome measures were the effect of display on participants’ (1) accuracy at detecting transitions between target and low in either direction during trials and (2) accuracy at identifying Spo2 range (target, low, or critical) at the end of each trial. Secondary outcomes were the effect of expertise on the above measures, and also the effect of both display and expertise on (1) latency of detecting transitions between target and low in either direction; (2) accuracy of identifying absolute Spo2 percentage values (a value from the 80%–100% range, plus or minus 1%) at the end of each trial; (3) accuracy of classifying arithmetic expressions as true or false; (4) latency of classifying arithmetic expressions; (5) response rate of classification of arithmetic expressions; (6) accuracy of identifying keywords in spoken phrases; (7) latency of identifying keywords in spoken phrases; and (8) subjective judgments.

Participants

Clinician participants were anesthesiologists and residents from the anesthetic department of a large metropolitan hospital recruited via email, presentations, and posters on department notice boards. They chose a small gift (<$10 in Australian dollars) as a reward for completing the study and were able to apply for continuing education points from their governing college (Australian and New Zealand College of Anaesthetists) if they qualified. Inclusion criteria were that they had to be qualified anesthesiologists or residents who were ≥2 years into their anesthetic training. Nonclinician participants were undergraduate students from UQ and were rewarded with a gift voucher ($30 in Australian dollars). The inclusion criterion was that they had to be undergraduate students at UQ. Any participant who reported hearing abnormalities was excluded.

Apparatus and Stimuli

Trials were presented using custom software created in Java (Java SE Development Kit, Version jdk1.8.0_40.jdk; Oracle, Redwood Shores, CA) on a MacBook Air laptop computer with a 13-inch screen (Apple Computer, Cupertino, CA).13 Participants responded to experimental tasks using a mouse and keyboard. Spoken phrases were recorded using a digital voice recorder (Olympus WS-833, Tokyo, Japan). Auditory displays and spoken phrases were presented through 2 speakers (Behringer MS40, Kirchardt, Germany). Background noise was played through an iPad Mini (Apple Computer, Cupertino, CA) connected to 2 external speakers (Edirol MA-7A; Roland Corporation, Osaka, Japan).

Auditory Displays

Acoustic characteristics of the auditory displays are provided in Table 1. In the standard display,13 pulse tones were pure sine wave functions ranging in a logarithmic series from 656 Hz at 80% Spo2 to 950 Hz at 100% Spo2: each 1% change in Spo2 corresponded to a 1.84% change in frequency. Tone duration was 150 milliseconds with 10 milliseconds fade-in and 10 milliseconds fade-out to eliminate acoustic artifacts. Whenever Spo2 crossed from the low to the critical range, an alarm sounded (IEC-Medium-General alarm [IEC-60601-1-8])15 and continued sounding once every 15 seconds while Spo2 remained in the critical range. If the trial started with Spo2 in the critical range, the alarm first sounded on either the second or third pulse tone.
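
The logarithmic pitch mapping can be illustrated with a short Python sketch. It assumes that the series is a constant-ratio interpolation between the two stated endpoints (656 Hz at 80% and 950 Hz at 100%); the constant and function names are hypothetical. Under that assumption the per-1% step works out to roughly 1.87%, slightly above the 1.84% quoted in the text, which may reflect rounding of the reported values, so the sketch should be read as an approximation of the mapping rather than the exact one used.

```python
F_MIN_HZ, F_MAX_HZ = 656.0, 950.0   # endpoint frequencies from the text
SPO2_MIN, SPO2_MAX = 80, 100        # SpO2 values at those endpoints


def pulse_tone_frequency(spo2: int) -> float:
    """Map SpO2 (%) to a pulse tone frequency (Hz) on a constant-ratio scale."""
    if not SPO2_MIN <= spo2 <= SPO2_MAX:
        raise ValueError("SpO2 must lie between 80% and 100%")
    fraction = (spo2 - SPO2_MIN) / (SPO2_MAX - SPO2_MIN)
    return F_MIN_HZ * (F_MAX_HZ / F_MIN_HZ) ** fraction


# Frequency ratio between adjacent SpO2 values implied by the endpoints.
step_ratio = (F_MAX_HZ / F_MIN_HZ) ** (1 / (SPO2_MAX - SPO2_MIN))
```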

Table 1.

In the enhanced display, the mapping of pitch to Spo2 levels was as for the standard display, but excluded the alarm.13 Tremolo was added to the pulse tones in the low and critical Spo2 ranges from 96% to 80%, and brightness was added in the critical Spo2 range from 89% to 80%. Tremolo was produced by modulating the amplitude of the pulse tone using sinusoids with 4 cycles of tremolo and 90% “wet” or depth, resulting in a vibrating effect. Brightness was produced by adding the first 3 odd harmonics of the fundamental pitch of the pulse tone (ie, third, fifth, and seventh harmonic) to produce a sharper sound quality. Volume was held constant across all ranges.
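
A minimal synthesis sketch of the enhanced pulse tone follows, written in plain Python. Several details are assumptions rather than facts from the study: the sample rate, the reading of "4 cycles of tremolo" as 4 modulation cycles per 150-millisecond tone, the relative amplitudes of the added harmonics (here 1/h), the linear fade shape, and the use of peak normalization as a stand-in for constant volume. All names are hypothetical.

```python
import math

SAMPLE_RATE = 44_100     # Hz; assumed, not stated in the text
TONE_S = 0.150           # tone duration from the text
FADE_S = 0.010           # fade-in and fade-out duration from the text
TREMOLO_CYCLES = 4       # assumed to mean 4 cycles per tone
TREMOLO_DEPTH = 0.90     # 90% "wet" from the text


def enhanced_flags(spo2: int) -> tuple[bool, bool]:
    """Return (tremolo, brightness) for an SpO2 value in the enhanced display."""
    return spo2 <= 96, spo2 <= 89


def pulse_tone(freq_hz: float, tremolo: bool = False, bright: bool = False) -> list[float]:
    """Synthesize one pulse tone as a list of samples scaled to a peak of 1.0."""
    n = round(SAMPLE_RATE * TONE_S)
    fade_n = round(SAMPLE_RATE * FADE_S)
    # Brightness adds the third, fifth, and seventh harmonics to the fundamental.
    harmonics = (1, 3, 5, 7) if bright else (1,)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        value = sum(math.sin(2 * math.pi * h * freq_hz * t) / h for h in harmonics)
        if tremolo:
            # Sinusoidal gain that swings between 1.0 and (1 - depth).
            lfo = 0.5 * (1 + math.cos(2 * math.pi * TREMOLO_CYCLES * t / TONE_S))
            value *= 1 - TREMOLO_DEPTH * (1 - lfo)
        # Linear fade-in and fade-out to avoid clicks at the tone edges.
        envelope = min(1.0, i / fade_n, (n - 1 - i) / fade_n)
        samples.append(value * envelope)
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]
```

A tone for a given Spo2 value would then be produced by passing the flags from `enhanced_flags` to `pulse_tone`, together with a frequency chosen by the pitch mapping of the standard display.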

Distractor Tasks

Participants performed an arithmetic verification task and a keyword distractor task which ran alongside each other throughout each trial of the study. In the arithmetic verification task,13 participants classified arithmetic expressions as “true” or “false” (eg, 64–15 = 49 [correct response “True”]; 23 + 7 = 35 [correct response “False”]). On-screen feedback was provided as “Correct” or “Wrong.” Expressions were presented for the duration of each trial with equal proportions of true and false cases. The arithmetic task was forced paced, with a new expression presented every 5 seconds, regardless of whether the participant responded or not. Expressions were presented in the same order for each participant.

For the keyword detection task, we designed 30 linguistic scenarios, 1 per experimental trial. Within each trial, there were 7 spoken phrases, with each phrase lasting no more than 3 seconds and separated by an interstimulus interval of 5 ± 2 seconds. There were 210 phrases in total. In any 1 trial, there were 0–4 keyword phrases containing 1 of 3 keywords: BLOOD, PATIENT, or TABLE. Nonkeyword phrases never contained keywords, and their content was medically related or general in nature. Participants acknowledged detection of keywords by clicking a label on the screen denoting the keyword. Examples of keyword phrases were as follows:

“When did this patient last get antibiotics?”

“Let’s turn the table ninety degrees”

Examples of nonkeyword phrases were as follows:

“Did you see the final of the tennis on TV last night?”

“We have a Wilm’s tumor resection scheduled for tomorrow”

Background Noise

We used Audacity (Audacity Team [2012], Version 2.02; CMU, Pittsburgh, PA) to generate a sound file representing background noise in the OR. It included OR noises such as suction, clanging of instruments, a ventilator, and pop music with vocals. The file was played at an average sound level of 59.1 dB(A) ranging from 54.0 to 81.8 dB(A) for the duration of the experiment.

Questionnaires

In a preexperiment questionnaire, participants provided information about their age, gender, musical training, and whether their hearing was normal. Music training was defined as having >1 year of formal music training.16 Clinician participants also indicated whether they were qualified or still in training and their years of experience in the anesthetic field. At the end of each experimental block, participants completed a questionnaire probing their reactions to the auditory display that they had just heard as well as the distractor tasks.

Procedure

We conducted the experiment in a quiet room. Each participant was tested individually. The participant sat at a desk in front of the laptop computer with the speakers set approximately 15 cm either side of the screen. The background noise file was played on 2 speakers located approximately 100 cm behind the participant.

We gave each participant an information sheet and explained the main points of the experiment using a standard protocol. Once consent to participate was obtained, we trained the participant to identify Spo2 parameters using 1 of the 2 displays and to perform the distractor tasks. After participants completed the first experimental block, they filled out a questionnaire about the display and tasks. Then they were trained on the other auditory display, completed the second block, and completed a questionnaire about that display and tasks.

Statistical Analysis

Given the repeated measurements over the 2 displays, we used a mixed-effects model approach. For measures with binary outcomes on each test trial (target transition detection accuracy, range identification accuracy, absolute Spo2 identification, arithmetic tasks attempted, arithmetic accuracy, and keyword task accuracy), we used a generalized linear mixed model with a binary distribution and logit link. This approach accommodates (1) nonnormal distribution of data and (2) both random and fixed effects.17 We selected a diagonal covariance structure type: this is the default model for generalized linear mixed models.

For the primary outcomes (target transition detection accuracy and range identification accuracy), we tested the main effects and interactions of display type (standard versus enhanced), group (clinician versus nonclinician), and order of display presentation. One within-subjects factor (display) and 2 between-subjects factors (group and order) and their interactions were defined as fixed effects and the subject as a random effect. Post hoc comparisons were made for primary outcome measures using type III tests of fixed effects with Bonferroni corrections (significance level: P = .05/2 = .025). For secondary measures with binary outcomes (absolute Spo2 identification, arithmetic tasks attempted, arithmetic accuracy, and keyword task accuracy), we tested the main effects and interactions of display type and group as fixed effects and subject as a random effect. Secondary outcomes were reported on an exploratory basis, so we did not adjust for multiple comparisons and tested with P < .05. A disadvantage of the generalized linear mixed model is that it does not produce familiar measures of effect size such as Cohen d and η². Therefore, for binary responses, we report the odds ratio with 95% confidence interval (CI).
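
For readers who prefer a formula, the primary-outcome model described above can be sketched as a random-intercept logistic model. The notation is introduced here for illustration and is not the authors': y_{ij} is the binary response of participant i on trial j, D, G, and O are indicator variables for display, group, and order, and u_i is the participant-level random intercept.

```latex
\operatorname{logit}\Pr(y_{ij} = 1)
  = \beta_0 + \beta_1 D_{ij} + \beta_2 G_i + \beta_3 O_i
  + \beta_4 D_{ij} G_i + \beta_5 D_{ij} O_i + \beta_6 G_i O_i
  + \beta_7 D_{ij} G_i O_i + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2)
```

Under this sketch, an odds ratio for a fixed effect corresponds to the exponentiated coefficient, for example \exp(\beta_1) for display at the reference levels of group and order, and the Bonferroni-adjusted threshold for the 2 primary outcomes is \alpha = .05/2 = .025.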

We used a linear mixed-model approach to analyze target transition detection latency, arithmetic response latency, keyword detection latency, and questionnaire data. For the latency measures, each participant’s average latency for each of the 2 displays was used. For the questionnaire data, responses were on a 1- to 9-point Likert scale. We tested the main effects and interactions of display type and group as fixed effects and subject as a random effect. We specified a fixed intercept and a random intercept with a diagonal covariance structure to account for repeated measures. Effect sizes were reported as difference between means with 95% CI.

Assumptions of normality for distributions of residuals were evaluated using Q-Q plots. All statistical tests were 2-tailed. We used SPSS version 25 (IBM, Armonk, NY) to analyze data.

Target transition detection latency data residuals were not distributed normally, but after a natural logarithm transformation the residuals were approximately normally distributed. For arithmetic task latency and keyword detection latency, we used the untransformed average latency for each participant for each display. For all latency response and questionnaire data, we used a scaled identity covariance structure with a fixed intercept and random intercept.

Sample Size

A power analysis was performed in G*Power18 using the t test family for dependent means. Range identification accuracies from a recent laboratory study13 with a between-subjects design with 20 participants in each display group produced for the standard display a mean of 77% and standard deviation (SD) of 11%, and for the enhanced condition a mean of 86% and SD of 11%. Adopting those values, and assuming a correlation between groups of 0.5, 2-sided testing, P = .025 to test the effect of display on (a) target transition detection accuracy and (b) range identification accuracy, and power = 0.9, we calculated that we needed a minimum of 22 participants in each group to detect a difference.
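
The sample size calculation can be approximated by hand. The Python sketch below is not the G*Power computation, which is based on the noncentral t distribution; it uses a normal approximation with a common small-sample correction term (adding half the squared critical z value), and the function name is hypothetical. With the inputs stated above, this approximation also arrives at 22 participants.

```python
import math
from statistics import NormalDist


def paired_sample_size(mean_a, mean_b, sd_a, sd_b, corr, alpha, power):
    """Approximate n for a two-sided paired t test (normal approximation)."""
    # SD of the within-participant differences, given the assumed correlation.
    sd_diff = math.sqrt(sd_a**2 + sd_b**2 - 2 * corr * sd_a * sd_b)
    dz = abs(mean_b - mean_a) / sd_diff          # standardized effect size
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n_normal = ((z_alpha + z_power) / dz) ** 2
    # Correction term that compensates for using z instead of t quantiles.
    n_corrected = n_normal + z_alpha**2 / 2
    return dz, math.ceil(n_corrected)


dz, n = paired_sample_size(77, 86, 11, 11, corr=0.5, alpha=0.025, power=0.9)
```

Here the SD of the differences equals 11 percentage points, so the standardized effect size is 9/11, or about 0.82.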

RESULTS

Figure 1.

Figure 2.

Table 2.

Table 3.

Twenty-five clinician participants (mean age, 37 years; 64% female) and 28 undergraduate students (mean age, 21 years; 68% female) completed the study. Results for target transition detection and range identification are shown in Figures 1 and 2. Results for all Spo2 outcome measures are in Table 2 and for subjective judgments in Table 3.

Primary Outcomes

Results for target transition detection accuracy and range identification accuracy are shown in Table 2. The odds ratio of participants correctly detecting a target transition with the enhanced display compared with the standard display was 7.4 (95% CI, 4.4–12.3; P < .001). From the generalized linear mixed model, the estimated percentage correct (standard error [SE]) for the enhanced display was 87% (1.3%), and for the standard display, it was 57% (2.1%).

The odds ratio of participants correctly identifying range with the enhanced display compared to the standard display was 2.7 (95% CI, 1.6–4.6; P < .001). The estimated percentage correct for the enhanced display was 86% (1.3%), and for the standard display, it was 76% (1.7%).

Secondary Outcomes

In addition to the significant effect of display for detecting target transitions, noted above, the superiority of the enhanced display over the standard display for detecting target transitions was less pronounced when participants experienced it first rather than second in the crossover design, with an odds ratio of 0.4 (95% CI, 0.2–0.9; P = .006). Specifically, the superiority of the enhanced display over the standard display was less extreme when the enhanced display was presented first, 84% (2.1%) compared with 59% (3.0%) for the standard display, than when the enhanced display was presented second, 90% (1.6%) compared with 56% (3.0%) for the standard display. No other effects were significant for target transition detection accuracy. For range identification accuracy, there were no significant effects apart from the effect of display.

For participants’ target transition detection latency, there was a significant effect of display with the enhanced display supporting faster detection, 2.4 seconds (0.5 seconds), than the standard display, 8.7 seconds (0.5 seconds), with a mean difference of 6.3 seconds (95% CI, 5.1–7.5 seconds; P < .001). In addition, there was an interaction of display with order (P < .001). The superiority of the enhanced display over the standard display was less extreme when it was presented first, 3.0 seconds (0.7 seconds) compared with 7.6 seconds (0.9 seconds) for the standard display, than when it was presented second, 1.8 seconds (0.7 seconds) compared with 9.8 seconds (0.9 seconds) for the standard display. There were no other significant differences for target transition detection latency.

For participants’ identification of absolute Spo2 value, the odds ratio of participants making a correct identification with the enhanced display compared with the standard display was 2.4 (95% CI, 1.6–3.5; P < .001). The estimated percentage correct for the enhanced display was 67% (1.8%), and for the standard display, it was 46% (1.9%). We found no other significant effects for participants’ identification of absolute Spo2 value.

For the distractor tasks (arithmetic verification and keyword detection), we assessed participants’ performance as a manipulation check. Participants answered arithmetic questions more accurately in the enhanced condition, 79% (1.2%), than in the standard condition, 76% (1.3%), with an odds ratio of 1.3 (95% CI, 1.2–1.5; P < .001). In addition, participants responded to a larger percentage of the arithmetic questions presented when in the enhanced condition, 94% (0.7%), than when in the standard condition, 93% (0.9%), with an odds ratio of 2.4 (95% CI, 2.0–3.0; P < .001). When accuracy was analyzed for only the arithmetic questions attempted, participants were more accurate in the enhanced condition, 86% (1.0%), than in the standard condition, 84% (1.1%), with an odds ratio of 1.2 (95% CI, 1.0–1.3; P < .001).

Clinicians answered fewer arithmetic questions than nonclinicians (for clinicians: 92% [1.5%]; for nonclinicians: 95% [0.8%]), with an odds ratio of 0.6 (95% CI, 0.3–1.2; P = .016). Clinicians also responded more slowly than nonclinicians (for clinicians: 2.6 seconds [0.1 second]; for nonclinicians: 2.2 seconds [0.1 second]), with a mean difference of 0.3 seconds (95% CI, 0.2–0.5 seconds; P < .001). However, clinicians responded more accurately than nonclinicians for the arithmetic expressions attempted (for clinicians: 88% [1.2%]; for nonclinicians: 82% [1.6%]) with an odds ratio of 1.8 (95% CI, 1.1–2.8; P = .003).

For the keyword detection task, participants were slightly more accurate in the enhanced condition, 98% (0.3%), than in the standard condition, 97% (0.4%), with an odds ratio of 2.4 (95% CI, 1.6–3.8; P < .001). Participants also responded slightly more quickly to the keywords in the enhanced condition, 2.5 seconds (0.05 seconds), than in the standard condition, 2.6 seconds (0.05 seconds), with a mean difference of 0.1 second (95% CI, 0.04–0.2 seconds; P = .031).

For the subjective judgments, participants reported that it was easier to judge Spo2 parameters and they were more confident in their judgments when using the enhanced display than when using the standard display (Table 3). There was no difference between clinicians’ and nonclinicians’ subjective opinions for either of the displays, except that clinicians found it easier to detect target transitions than nonclinicians did.

DISCUSSION

Participants detected target transitions more accurately using an auditory display enhanced with tremolo and brightness (87%) than when using a standard auditory display supplemented with an alarm in the critical range (57%). Such a substantial difference of 30 percentage points may be clinically relevant. Participants more accurately identified Spo2 range when using the enhanced display (86%) than when using the standard display (76%). This study confirms previous findings that improvements occurred while participants performed a visually presented task and in the presence of background noise.13 The study also adds new information. First, the superiority of the enhanced display held even in the presence of an additional distractor task of the same perceptual modality as the monitoring task. Second, the performance of clinicians and nonclinicians on the primary outcomes was not statistically distinguishable.

Secondary outcomes reinforce that performance is generally better with the enhanced display. Participants using the enhanced display were faster at detecting target transitions, and more accurate at identifying absolute Spo2 values, than when using the standard display. For example, participants’ ability to identify absolute Spo2 values improved by 20 percentage points, which may be a clinically relevant change.

We attribute these improvements to the additional acoustic properties of the enhanced display. Auditory displays comprising varied acoustic dimensions are more discriminable than those comprising fewer dimensions because they provide cues that are more easily perceived and recognized.19,20 In the target and low ranges of the standard display, participants had only varying pitch to detect whether Spo2 transitioned ranges and to judge Spo2 range. The enhanced display added tremolo to the varying pitch in the low range and both tremolo and brightness in the critical range. These additional properties informed the listener when range transitions occurred.21 If anesthesiologists are able to identify when Spo2 is trending toward a critical threshold, they may take remedial action before the threshold is breached and alarms sound. This may have implications for patient safety and may reduce reliance on alarms.

We found no difference in the performance of clinicians and nonclinician participants for any outcomes related to identification of Spo2 parameters. This may seem surprising given clinicians’ greater familiarity with auditory PO displays, whereas many nonclinician participants had never heard the PO display before. This also contrasts with research showing that anesthesiologists were more accurate than nonanesthesiologists at judging abnormal patient parameters using continuous auditory displays.3 However, in that study, participants monitored 5 vital signs simultaneously. Anesthesiologists’ increased domain expertise may have enabled them to integrate the information and perform better than nonanesthesiologists. In the present study, participants monitored only 1 vital sign, Spo2, and performed identification tasks rather than clinical judgment tasks, making it less surprising that nonclinician participants performed similarly to clinicians.

As visual stimuli in the OR increase in number and impose greater cognitive demands (eg, data entry into electronic patient record systems), visual attentional resources may become overloaded and vital cues about patient information missed. The auditory modality not only has greater capacity for processing perceptual cues22 but also is always available, unlike the visual mode, which has a narrower spatial focus. Thus, if accurate patient information is provided via the auditory modality, anesthesiologists may be able to allocate visual and auditory attention more efficiently and manage patient treatment more effectively.

Participants’ accuracies for the arithmetic task and the keyword detection task were greater in the enhanced condition than in the standard condition. Participants reported that it was easier to distinguish auditory changes with the enhanced display than with the standard display plus alarms. As a result, participants may have had more cognitive resources to direct toward the distractor tasks.

We found small differences between clinician and nonclinician participants for performance of distractor tasks, which may reflect biases in prioritization of tasks. Specifically, clinicians were more accurate than nonclinicians at verifying the arithmetic expressions to which they responded but slower at responding than nonclinicians, irrespective of display. Clinicians may focus on performing a task well because of a greater general concern with error and risk. Nonclinician participants may have been less motivated to avoid error and less sensitized to risk.

Participants reported that it was easier to identify Spo2 parameters, and that they were more confident in their judgments, with the enhanced display than with the standard display. Users’ subjective judgments of displays are important for display design. If users find a display intolerable or unreliable, it will not be accepted, despite possible benefits.23

Limitations and Future Research

This study had a number of limitations. First, it was conducted in an undisturbed room, whereas the OR is a noisy, dynamic environment. Nonetheless, testing the effectiveness of our displays was important before taking them to more realistic settings. Second, we had only 2 distractor tasks, whereas anesthesiologists perform numerous tasks.1 Third, we conducted multiple tests on secondary outcomes, raising the possibility of type I errors, but the effect of display was strong for all judgments of Spo2. Fourth, the keyword detection task may not have been sufficiently demanding, as indicated by high detection accuracies (>96% for both displays). Intervals of 3–7 seconds between phrases may have helped participants allocate their auditory attention. Future research may investigate a keyword detection task with continuous vocal dialogue. Fifth, the study was conducted in a room considerably smaller than the OR and with no standard OR equipment present. Speech intelligibility worsens with increased OR size and increases with OR content (eg, anesthetic machine, trolleys).24 Sixth, our clinicians were younger on average than the age at which presbycusis sets in,25 so older clinicians may not hear higher frequencies in the enhanced critical range. Seventh, there was only 1 researcher present during experimental trials. In the OR, anesthesiologists are often interrupted or distracted by a number of personnel, directly or indirectly.26 To address some of these limitations, we plan to compare anesthesiologists’ ability to identify Spo2 parameters using the enhanced and standard displays in a simulator with scripted anesthetic scenarios that include interruptions, distractions, and continual verbal conversations.

CONCLUSIONS

This study shows that a PO auditory display augmented with additional acoustic dimensions to signal low and critical ranges is more effective than a standard pitch plus alarm display for Spo2 parameter identification. Benefits are apparent in a setting with distractors and background noise. In addition, the effect held for both clinician and nonclinician participants. The enhanced display has the potential to improve clinicians’ perception and comprehension of Spo2 auditory stimuli when access to visual displays is compromised. This may allow for more efficient allocation of attention and potentially more timely and effective decision-making in the high-pressure environment of the OR. Systems that enhance anesthesiologists’ vigilance and monitoring performance are important in the ongoing endeavor to enhance patient safety.

ACKNOWLEDGMENTS

The authors thank Dr Peter Moran (MBBS), Director of Anaesthesia, Princess Alexandra Hospital, Woolloongabba, Brisbane, Queensland, Australia, who provided support and a testing location at the hospital and helped with recruitment and liaison with anesthetic staff members. The authors also thank T-Lok Tang (BSc(Hons)), Felicity Burgmann (BPsySc(Hons)), and Lachlan Peterson (UQ undergraduate student), who recorded spoken phrases, and their statistical advisors for their input.

DISCLOSURES

Name: Estrella Paterson, BPsySc(Hons).

Contribution: This author helped design the study, design and test the stimuli, design the experimental materials, test the software application, prepare ethics and governance applications, liaise with personnel at the hospital site, conduct the experiment, analyze the data, and write the report.

Conflicts of Interest: None.

Name: Penelope M. Sanderson, PhD.

Contribution: This author helped design the study, stimuli, and experimental materials, prepare ethics and governance applications, liaise with personnel at the hospital site, revise data analysis, and write the report.

Conflicts of Interest: P. M. Sanderson is a coinventor of a respiratory sonification (Sanderson and Watson, US Patent 7070570).

Name: Birgit Brecknell, PhD.

Contribution: This author helped design the stimuli, write the software application for the experiment, test the software application, and revise the report.

Conflicts of Interest: None.

Name: Neil A. B. Paterson, MBChB.

Contribution: This author helped design the study and stimuli, revise the report, and provide expert knowledge of anesthesia.

Conflicts of Interest: None.

Name: Robert G. Loeb, MD.

Contribution: This author helped design the study and stimuli, revise the report, and provide expert knowledge of anesthesia.

Conflicts of Interest: R. G. Loeb has received $1000 per year to be on the Masimo, Inc, Scientific Advisory Board.

This manuscript was handled by: Maxime Cannesson, MD, PhD.

REFERENCES

1. Phipps D, Meakin GH, Beatty PC, Nsoedo C, Parker D. Human factors in anaesthetic practice: insights from a task analysis. Br J Anaesth. 2008;100:333–343.
2. Ansermino JM. Intelligent patient monitoring and clinical decision making. In: Ehrenfeld JM, Cannesson M, eds. Monitoring Technologies in Acute Care Environments. New York, NY: Springer; 2014:401–407.
3. Watson M, Sanderson P. Sonification supports eyes-free respiratory monitoring and task time-sharing. Hum Factors. 2004;46:497–517.
4. Ford S, Birmingham E, King A, Lim J, Ansermino JM. At-a-glance monitoring: covert observations of anesthesiologists in the operating room. Anesth Analg. 2010;111:653–658.
5. Schulz CM, Schneider E, Fritz L, et al. Visual attention of anaesthetists during simulated critical incidents. Br J Anaesth. 2011;106:807–813.
6. Morris RW, Mohacsi PJ. How well can anaesthetists discriminate pulse oximeter tones? Anaesth Intensive Care. 2005;33:497–500.
7. Schulze K, Gaab N, Schlaug G. Perceiving pitch absolutely: comparing absolute and relative pitch possessors in a pitch memory task. BMC Neurosci. 2009;10:106.
8. Stevenson RA, Schlesinger JJ, Wallace MT. Effects of divided attention and operating room noise on perception of pulse oximeter pitch changes: a laboratory study. Anesthesiology. 2013;118:376–381.
9. Loeb RG, Brecknell B, Sanderson PM. The sounds of desaturation: a survey of commercial pulse oximeter sonifications. Anesth Analg. 2016;122:1395–1403.
10. Deschamps ML, Sanderson P, Hinckfuss K, et al. Improving the detectability of oxygen saturation level targets for preterm neonates: a laboratory test of tremolo and beacon sonifications. Appl Ergon. 2016;56:160–169.
11. Hinckfuss K, Sanderson P, Loeb RG, Liley HG, Liu D. Novel pulse oximetry sonifications for neonatal oxygen saturation monitoring: a laboratory study. Hum Factors. 2016;58:344–359.
12. Paterson E, Sanderson PM, Paterson NA, Liu D, Loeb RG. The effectiveness of pulse oximetry sonification enhanced with tremolo and brightness for distinguishing clinically important oxygen saturation ranges: a laboratory study. Anaesthesia. 2016;71:565–572.
13. Paterson E, Sanderson PM, Paterson NAB, Loeb RG. Effectiveness of enhanced pulse oximetry sonifications for conveying oxygen saturation ranges: a laboratory comparison of five auditory displays. Br J Anaesth. 2017;119:1224–1230.
14. Sevdalis N, Healey AN, Vincent CA. Distracting communications in the operating theatre. J Eval Clin Pract. 2007;13:390–394.
15. International Electrotechnical Commission. IEC 60601-1-8:2006 - Medical electrical equipment – Part 1-8: General requirements, tests and guidance for alarm systems in medical electrical equipment and medical electrical systems. Geneva, Switzerland: International Electrotechnical Commission; 2006. Available at: https://webstore.iec.ch/publication/2599. Accessed August 17, 2018.
16. Sanderson PM, Wee A, Lacherez P. Learnability and discriminability of melodic medical equipment alarms. Anaesthesia. 2006;61:142–147.
17. Casals M, Girabent-Farrés M, Carrasco JL. Methodological quality and reporting of generalized linear mixed models in clinical medicine (2000–2012): a systematic review. PLoS One. 2014;9:e112653.
18. Faul F, Erdfelder E, Lang AG, Buchner A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods. 2007;39:175–191.
19. Atyeo J, Sanderson PM. Comparison of the identification and ease of use of two alarm sound sets by critical and acute care nurses with little or no music training: a laboratory study. Anaesthesia. 2015;70:818–827.
20. Edworthy J, Hellier E, Titchener K, Naweed A, Roels R. Heterogeneity in auditory alarm sets makes them easier to learn. Int J Ind Ergon. 2011;41:136–146.
21. Watson MO, Sanderson P. Designing for attention with sound: challenges and extensions to ecological interface design. Hum Factors. 2007;49:331–346.
22. Murphy S, Fraenkel N, Dalton P. Perceptual load does not modulate auditory distractor processing. Cognition. 2013;129:345–355.
23. Edworthy J. Medical audible alarms: a review. J Am Med Inform Assoc. 2013;20:584–589.
24. McNeer RR, Bennett CL, Horn DB, Dudaryk R. Factors affecting acoustics and speech intelligibility in the operating room: size matters. Anesth Analg. 2017;124:1978–1985.
25. Baxter AD, Boet S, Reid D, Skidmore G. The aging anesthesiologist: a narrative review and suggested strategies. Can J Anaesth. 2014;61:865–875.
26. van Pelt M, Weinger MB. Distractions in the anesthesia work environment: impact on patient safety? Report of a meeting sponsored by the Anesthesia Patient Safety Foundation. Anesth Analg. 2017;125:347–350.
Copyright © 2019 International Anesthesia Research Society