Objective Assessment of Checklist Fidelity Using Digital Audio Recording and a Standardized Scoring System Audit

Salgado, Douglas BSc*,†; Barber, Kimberly R. PhD; Danic, Michael DO

doi: 10.1097/PTS.0000000000000306
Original Article

Objectives The use of the World Health Organization Surgical Safety Checklist (SSC) has been reported to significantly reduce operative morbidity and mortality rates. Recent findings have cast doubt on the efficacy of such checklists in improving patient safety. The effectiveness of surgical safety checklists cannot be fully measured or understood without an accurate assessment of implementation fidelity, most effectively through direct observations of the checklist process. Here, we describe the use of a secure audio recording protocol in conjunction with a novel standardized scoring system to assess checklist compliance rates.

Methods We used a black box digital audio recording protocol to observe the execution of SSCs in real time. A novel checklist scoring system was used to quantify the implementation fidelity of a modified version of the SSC. Physician and staff perception of patient safety was also surveyed before and after implementation.

Results Audio-recorded audits revealed a precisely executed checklist 73.6% of the time compared with a previously reported compliance rate of 97.6%. Implementation fidelity was highest during the preanesthesia and preincision checklist sections, whereas postprocedure checklist compliance and fidelity were consistently the lowest. Positive attitudes toward patient safety among surgical staff increased by 11% from baseline.

Conclusions The use of a secure digital audio recording protocol is a simple yet effective tool for observing checklist performance. Moreover, the implementation of a standardized scoring system allows for the objective evaluation of checklist fidelity. Together, they provide a powerful auditing tool for identifying improvement.

From the *American University of Antigua College of Medicine

Genesys Office of Research, Genesys Regional Medical Center

American Anesthesiology of Grand Blanc, Genesys Regional Medical Center, Grand Blanc, MI.

Correspondence: Kimberly Barber, PhD, Genesys Office of Research, Ste 2442, One Genesys Pkwy, Grand Blanc, MI (e-mail:

The authors have no conflicts of interest to disclose.

Source of funding: Cardinal Health E3 Patient Safety Grant Program.

Online date: November 3, 2016

This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.

In 2008, the World Health Organization (WHO) published its first set of comprehensive guidelines for improving the safety of all surgical patients. Since then, these guidelines have been incorporated into a wide range of Surgical Safety Checklists (SSCs) designed to prevent avoidable operative errors, and SSCs have been implemented as a mandatory safety component in thousands of hospitals worldwide.1 This relatively rapid and widespread adoption of SSCs is due in large part to the findings of an early WHO study, which showed that the use of SSCs was associated with a significant reduction in the number of surgery-related complications and deaths.2

The effectiveness of SSCs in reducing adverse surgical events has been attributed to a variety of underlying factors including but not limited to improved communication between surgical team members, enhanced identification of potentially harmful errors (ie, near misses), decreased distractions in the operating theater, and the fostering of positive attitudes related to safety culture.1,3,4

However, recent findings have questioned the true value of mandatory SSC implementation. In their study, Urbach et al compared patient data from more than 100,000 surgical procedures performed with and without the use of an SSC and found no significant reduction in the rates of surgical morbidity or mortality between the 2 groups.3 These findings cast doubts on the effectiveness of SSCs as a safety intervention and have served to reignite a longstanding debate between detractors and proponents of SSC implementation.

Advocates for the use of SSCs note that the contradictory findings of Urbach et al, and those of numerous other studies that have also failed to reproduce the positive SSC effects of the original WHO study, cannot be validated. These results are based largely on self-reported checklist compliance rates, which neither reflect actual levels of checklist adherence nor account for individual variances in checklist fidelity.

These claims are supported by the findings of multiple studies indicating that, despite many hospitals reporting high to perfect rates of SSC compliance (eg, 90%–100%), a large proportion of these reports often include incomplete and/or inaccurately executed checklists.1,5 Furthermore, the success of any given intervention or program is dependent on the degree of implementation fidelity, defined as the degree of accuracy with which an intervention is implemented as it was intended by the developer/designer.6 The level of implementation fidelity has been shown to affect the degree of success a given intervention has in achieving its intended outcomes, with high levels of fidelity associated with greater success.6 Implementation fidelity should be properly evaluated before drawing conclusions about the effectiveness of any given intervention or program.

Despite the growing body of evidence on both sides of the SSC debate, the true effectiveness of such an intervention remains unclear without a clear method for measuring the degree of implementation fidelity of SSCs. Here, we describe a checklist performance auditing process composed of the following: 1) a secure “black box” audio recording protocol, and 2) a novel SSC fidelity scoring system (or SSC compliance scoring system).

The SSC auditing process described in the following sections was developed as an adjunct to a previously approved safety initiative program implemented at a 410-bed community teaching hospital with 22,000 discharges annually, including an average of 5000 inpatient and 10,000 outpatient surgical procedures. The hospital is the flagship organization of the Health System, a regionally integrated health-care delivery system, which provides medical education programs and patient care services throughout Genesee County, Michigan.

This hospital was selected to participate in a system-wide patient safety initiative examining the feasibility of using audio “black box” recording devices inside the operating room (OR) to record SSC performance and the effect such an approach might have on WHO SSC compliance rates. Surgical procedures were recorded using a digital recorder to observe the performance of the checklists. The digital recorder was placed in a secure location within the operating room before the start of the first operative procedure on any given recording day and was removed after the completion of the last scheduled operative procedure. Recorders remained turned on for up to 12 hours, recording the audio as an MP3 digital audio file.

The hospital uses a 24-item modified version of the WHO SSC. The checklist is organized into 3 sections of questions that must be read out loud using the provided hard copy (Fig. 1), so as to avoid the potential for “skipping” individual checklist items that may result from rote memorization. Questions are designed to be read in sequential order by an assigned member of the surgical team designated as the checklist recorder at 3 specific perioperative periods: preanesthesia, preincision, and postprocedure. Checklist recorders are required to ask all surgical staff members for a “timeout” and ensure all staff members have paused before the beginning of each section. Individual questions are addressed to specific members of the surgical team as outlined on the checklist, and the recorder must allow time for an answer to be given before moving on to the next checklist item.



After the first 5 full-day recordings were collected, it quickly became apparent that listening to the SSC process for a single surgical procedure would require filtering through hours of nonpertinent and potentially sensitive audio data. Because the entire SSC process is only minutes long and is executed during 3 distinct time frames for a given procedure, we needed a far more efficient and secure method to identify only the relevant SSC information. We thus developed the protocol described in the following methods section (Fig. 2) for the secure transfer, analysis, and storage of collected audio data.



For the improvement initiative, the only measure of SSC performance evaluated and reported was the number of checklist items heard to be performed during audio playback. Only those checklist items that were audibly “observed” to have been carried out during the course of a given SSC could be counted as complete. A numerical tally of all completed checklist items was recorded, and a checklist was deemed 100% compliant only if all required checklist items were marked as completed. Self-reported checklist completion was compared with observed checklists to determine the overall SSC compliance rate during the specified period of the initiative.
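The all-or-nothing rule described above can be illustrated with a minimal Python sketch. The item names and the two example checklists are hypothetical and are not taken from the study data; the only rule taken from the text is that a checklist counts as compliant only when every required item was heard.

```python
from typing import Dict, List


def is_fully_compliant(items_heard: Dict[str, bool]) -> bool:
    """A checklist is 100% compliant only if every required item was heard."""
    return all(items_heard.values())


def compliance_rate(checklists: List[Dict[str, bool]]) -> float:
    """Fraction of audited checklists in which every item was audibly completed."""
    if not checklists:
        raise ValueError("no checklists to audit")
    compliant = sum(is_fully_compliant(c) for c in checklists)
    return compliant / len(checklists)


# Hypothetical audit of two checklists with three items each.
audited = [
    {"item_1": True, "item_2": True, "item_3": True},
    {"item_1": True, "item_2": False, "item_3": True},  # one missed item
]
print(compliance_rate(audited))
```

Under this rule, the second checklist is noncompliant even though 2 of its 3 items were heard.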

As more recordings were evaluated for compliance using these predetermined criteria, it became clear that they were far too narrow to account accurately for the large variations in how different individuals executed different checklist points and for variations in overall adherence to required tasks and conditions (eg, requesting timeouts, background noise). A single missed checklist item could render a checklist incomplete and thus noncompliant, whereas in other cases, a checklist recorded as 100% compliant was sometimes still poorly executed. It was imperative that we develop a more comprehensive method that would go beyond the simple tallying of complete versus incomplete checklist items, better account for these individual variations, and provide a more accurate assessment of SSC implementation fidelity.

Methods

Audio Recording Protocol

A quality improvement protocol using the WHO Surgical Safety Checklist was implemented from 2014 through 2015. It included an assessment protocol based on black box recording of checklist completion as it was conducted in the operating suite. Audio recording was conducted using Olympus VN-702PC digital voice recorders (Center Valley, PA) fitted with Sony ECM-DS70P (Minato, Tokyo, Japan) condenser microphones. All digital recordings were transferred, analyzed, and stored according to the protocol outlined in Figure 2.

Each full-length audio MP3 file was transferred from the digital recorder to a desktop computer for analysis. The MP3 file was then analyzed using Audacity, a free, open-source, cross-platform software program for recording and editing sounds. Individual surgical procedures were identified within the original full-length MP3 recording, spliced out, and exported as single Audacity sound files. Once all operative procedures had been identified and exported, the original full-length MP3 recording was immediately deleted from the desktop. Audacity was then used to identify the individual sections of the SSC for each recorded procedure, and each section was subsequently spliced out and exported as a single Audacity sound file. After each individual section was spliced and exported, the original full-length recording of the given procedure was also immediately deleted. Each individual checklist section was then analyzed and given a score based on the standardized scoring system described below. Scores for each section were recorded using a standard spreadsheet.
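The splice, export, and delete sequence was carried out manually in Audacity on MP3 files. Purely as an illustration of the same idea, the sketch below uses Python's standard `wave` module on an uncompressed WAV file, because the standard library cannot decode MP3. The file names, section names, and timestamps are hypothetical, and a plain file deletion such as `os.remove` should not be assumed to meet a secure-erase requirement.

```python
import os
import tempfile
import wave


def export_segment(src_path, dst_path, start_s, end_s):
    """Copy the audio between start_s and end_s (in seconds) into a new WAV file."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        src.setpos(int(start_s * rate))
        frames = src.readframes(int((end_s - start_s) * rate))
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(frames)  # wave rewrites the header with the actual frame count


def splice_and_delete(src_path, segments, out_dir):
    """Export every named segment, then delete the full-length source file."""
    exported = []
    for name, (start_s, end_s) in segments.items():
        dst_path = os.path.join(out_dir, name + ".wav")
        export_segment(src_path, dst_path, start_s, end_s)
        exported.append(dst_path)
    os.remove(src_path)
    return exported


def duration_seconds(path):
    """Length of a WAV file in seconds."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()


with tempfile.TemporaryDirectory() as tmp:
    full = os.path.join(tmp, "session.wav")
    with wave.open(full, "wb") as w:  # 10 s of synthetic silence stands in for a session
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(b"\x00\x00" * 8000 * 10)

    # Hypothetical timestamps (seconds) for two checklist sections.
    sections = {"preanesthesia": (1.0, 2.0), "preincision": (4.0, 5.5)}
    files = splice_and_delete(full, sections, tmp)
    durations = [duration_seconds(f) for f in files]
    source_still_exists = os.path.exists(full)

print(durations, source_still_exists)
```

Only the short exported segments survive the run; the full-length source file is removed as soon as the segments have been written.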

All .aup and .au files were stored on separate Western Digital (WDC, Irvine, CA) external hard drives. Each WD external storage device was encrypted with a software key known only to authorized personnel. The WD external drives were kept in 2 separate locations inside locked cabinets.

Although the original recording downloaded from the digital recorder contained the entire length of a recorded session, only a very small portion of an entire session remained as retrievable, stored information. A 12-hour recording session that captures 3 surgical procedures might yield less than 3 total minutes of audio per procedure. The protocol outlined above thus contains multiple safeguards to prevent unauthorized access to the final recorded data.
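To put the 12-hour example in proportion, the short Python calculation below estimates an upper bound on the share of a session that is retained. It assumes exactly 3 minutes per procedure, which is the ceiling given in the text, so the true share would be smaller.

```python
def retained_fraction(session_hours: float, procedures: int,
                      minutes_per_procedure: float) -> float:
    """Share of a recording session that remains after splicing."""
    session_minutes = session_hours * 60
    return procedures * minutes_per_procedure / session_minutes


# 12-hour session, 3 procedures, at most 3 minutes of retained audio each.
upper_bound = retained_fraction(12, 3, 3)
print(f"{upper_bound:.2%}")  # 1.25%
```

In other words, at most 9 of the 720 recorded minutes would be kept in this example.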

Standardized Compliance Scoring System

A comprehensive method for recording the variations in checklist performance that goes beyond the simple tallying of complete versus incomplete checklist items is outlined in Figure 3. We assigned points for “requesting quiet” in each of the sections of the checklist to reward teams for “speaking up.” Checklist items that required a follow-up question were classified as 2-part questions. The “original” and “follow-up” questions in a 2-part question were assigned the same number along with the letters “a” or “b,” respectively (eg, 1a and 1b). To differentiate between instances in which a given question was not asked (ie, incomplete), was asked but not asked accurately (ie, inaccurate), or was asked as intended (ie, complete), we assigned the question an accuracy score of 0, 1, or 2 points, respectively. Thus, each individual question had a maximum value of 2 points if performed correctly. If the answer to part “a” of a designated 2-part question did not require part “b” to be asked, then that question was treated as a 1-part question with a maximum possible value of 2 points. However, to account for those cases in which part “b” of a given 2-part question was required, we assigned each individual part of a 2-part question a maximum value of 1 point, making the combined maximum value of a 2-part question equal to 2 points (not 4).
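The question-level rules can be sketched in Python as follows. The function and class names are illustrative only. The text does not say how an inaccurately asked part of a required 2-part question is scored, so the sketch assumes that each part earns its single point only when performed as intended.

```python
from enum import IntEnum


class Accuracy(IntEnum):
    INCOMPLETE = 0  # question was not asked
    INACCURATE = 1  # question was asked, but not accurately
    COMPLETE = 2    # question was asked as intended


def score_question(accuracy: Accuracy) -> int:
    """One-part question: 0, 1, or 2 points."""
    return int(accuracy)


def score_two_part(part_a: Accuracy, part_b_required: bool,
                   part_b_done: bool = False) -> int:
    """Two-part question: the combined maximum is always 2 points."""
    if not part_b_required:
        # Treated as a one-part question worth up to 2 points.
        return score_question(part_a)
    # Assumption: each part is worth 1 point only if performed as intended.
    a_points = 1 if part_a == Accuracy.COMPLETE else 0
    b_points = 1 if part_b_done else 0
    return a_points + b_points


print(score_question(Accuracy.INACCURATE))            # 1
print(score_two_part(Accuracy.COMPLETE, False))       # 2
print(score_two_part(Accuracy.COMPLETE, True, True))  # 2
print(score_two_part(Accuracy.COMPLETE, True, False)) # 1
```

Capping a 2-part question at 2 points keeps every numbered item on the same scale, whether or not its follow-up was needed.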



We also assessed the compliance of the session recorder in requesting a “timeout” before the start of each checklist section, as well as the level of background noise heard during each section. If the session recorder was heard to have “spoken up,” a score of 1 point was given, whereas if the request was not made at all, no points were given. If, during the completion of the checklist, the level of background noise (ie, instrument clatter, conversation, or radio) was deemed disruptive, it received a “background noise” score of 1 point; a moderate level of noise was assigned 2 points, and a low level or quiet was assigned the maximum score of 3 points.

By adding the accuracy scores of the individual questions, the “timeout request” score, and the “background noise” score, and then dividing the sum by the maximum possible total of these 3 components in a given section, we were able to better assess the overall compliance within each checklist section. Moreover, we were able to compare compliance between sections of the checklist and compare overall compliance between entirely separate checklists.
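As a worked illustration of the section-level calculation, the Python sketch below combines the three components into a single ratio. The point values follow the text (2 points per question, 1 point for the timeout request, 3 points for background noise); the class name, field names, and example scores are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SectionAudit:
    question_scores: List[int]  # each 0, 1, or 2
    timeout_requested: bool     # worth 1 point if heard
    noise_score: int            # 1 disruptive, 2 moderate, 3 low or quiet


def section_compliance(audit: SectionAudit) -> float:
    """Earned points divided by the maximum possible points for one section."""
    if any(q not in (0, 1, 2) for q in audit.question_scores):
        raise ValueError("question scores must be 0, 1, or 2")
    if audit.noise_score not in (1, 2, 3):
        raise ValueError("noise score must be 1, 2, or 3")
    earned = (sum(audit.question_scores)
              + int(audit.timeout_requested)
              + audit.noise_score)
    possible = 2 * len(audit.question_scores) + 1 + 3
    return earned / possible


# Hypothetical section: five questions, timeout requested, moderate noise.
example = SectionAudit(question_scores=[2, 2, 1, 0, 2],
                       timeout_requested=True, noise_score=2)
score = section_compliance(example)
print(f"{score:.1%}")  # 71.4%
```

Here the section earns 10 of 14 possible points. Because the result is a ratio, sections with different numbers of questions can be compared directly.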

Results

A total of 128 surgical procedures were recorded to observe the performance of SSCs using the digital recording protocol. Of these, 100 cases (78.1%) had viable audio that could be reported. As measured by the audio audit, checklists were properly communicated and correctly performed 74% of the time. This was significantly lower than the 96.8% found by the self-completed paper checklist audit (P < 0.001). The highest point totals were achieved during the preanesthesia and preincision components (62.9% and 49.2%, respectively).

Overall, section B demonstrated the highest rate of completion, followed by sections A and C (89%, 81%, and 39%; P = 0.01). Compliance for requesting “quiet” in the OR was highest for section B (76%) compared with actually achieving quiet (per audio identification) across all sections (42%) (Table 1).



Individual items within each section of the checklist demonstrated varying rates of completion (Table 2). The items with the lowest completion rates included question 9 in section A on risk of hypothermia (52%), question 10 in section B on need for venous thromboembolism prophylaxis, and question 4 in section C on addressing equipment problems.



Major distractors identified included personal conversations, staff changes, and room setup noise. We found that nearly all staff complete the checklist. Points were most frequently missed in the preincision checklist when the nurse answered her own question regarding the need for venous thromboembolism prophylaxis, as she was the individual who performed the task. The quietest portion of the SSC was the preincision checklist, and the most disruptive was the preanesthesia checklist because of instrument noise. The most common disruption in all other SSC sections was radio noise. Importantly, no sentinel events occurred during the study period, and awareness regarding patient safety increased by 11% from baseline.

An attitude survey regarding the use of the WHO SSC was conducted before and after the audit intervention. Surveys were distributed to the following health-care providers: nurses (n = 67), physicians (n = 17), residents/students (n = 21), and surgical techs (n = 38). A total of 143 survey responses were collected (pre = 86 and post = 57). There was an overall significant improvement of 27.4% in positive attitudes about the WHO checklist (pre = 39.5% vs post = 54.4%; P = 0.05). Survey responses demonstrated an improvement in attitude for every survey area (Table 3).



Our scoring algorithm facilitated the following: 1) accurately assessing and comparing individual performance/fidelity scores based on a comparison between the maximum score and the calculated score, 2) comparing and contrasting the degree of accuracy/fidelity between SSCs and within SSC sections or individual checklist points, 3) identifying areas in need of performance improvement and providing specific feedback/praise, and 4) ultimately determining the true efficacy of safety initiatives by comparing programs with high and low compliance and their subsequent morbidity and mortality rates.

Discussion

To date, studies looking at the effectiveness of SSCs have relied mainly on the retrospective analyses of self-reported checklist compliance data, rather than on any type of prospective analysis based on verifiable or validated compliance data.4,7 Given the major role that implementation fidelity has in a program’s overall efficacy, we contend that the first step in determining the effectiveness of SSCs must therefore be to observe the execution of the SSC process itself in real time. As such, we recommend that all SSCs being audited be recorded whenever possible.

The real-time recording of SSCs can be achieved using a variety of methods ranging from the most basic (paper record) to the most high tech (remote digital devices). Different recording protocols have inherent advantages and disadvantages that should be weighed carefully before deciding on a preferred recording method (Table 1). Ultimately, the method chosen should be the most appropriate to the setting and the resources available so as to ensure that the acquisition of the required equipment, if any, is feasible; that there is adequate staffing for the collection and analysis of the recorded data; and that access to and storage of the collected data meet all privacy and security requirements. Below are a few examples of different recording methods that may be used as part of the recording phase of an SSC auditing process, along with some of their associated considerations, as summarized in Table 4.



Third-Person Recording

Third-person recording refers to the evaluation of checklist execution by a dedicated person (ie, checklist scorer) who is separate from the surgical team. This method requires no special equipment beyond a simple paper record. A major drawback is that data collection is limited to the number of procedures an individual scorer can observe firsthand. This method also does not allow for reviewing or double-checking the accuracy of assigned checklist scores because no true recording of the actual SSC process exists once it is completed.

Audio Recordings

Similar to the use of “black box” recordings of cockpit activities in the aeronautics industry, recording the audio of surgical activities inside the operating theater can be a fairly straightforward method for capturing a large amount of checklist performance data. Compared with other recording protocols, “black box” audio recordings require the least expensive technical equipment. However, black box recordings may also present the greatest challenge to recording high-quality audio because of their physical enclosure and inconspicuous placement inside the operating theater. A major drawback is that a single black box recording session captures hours of nonrelevant audio, which requires extensive postrecording processing to isolate the few minutes needed to evaluate a single SSC (Fig. 1). All relevant audio data require some degree of processing and must ultimately be transcribed into a standardized scoring record, which can then be used for statistical analyses. The nature of the audio data captured during this auditing process raises both Health Insurance Portability and Accountability Act privacy concerns and liability concerns, which must be fully accounted for before implementation and which require strict handling and storage protocols to be in place.

Handheld Device Recording

The use of digital handheld devices, such as tablets and smartphones, has become increasingly commonplace in all aspects of health-care delivery systems. These devices are capable of recording both audio and video simultaneously, providing a convenient method for recording the SSC process. This method could allow for the automated transcription, scoring, and statistical analysis of the recorded checklist data. The data can easily be secured on the device itself using common encryption and password restrictions. It also allows for the selective recording of only the SSC execution, albeit in a nondiscreet manner. This type of selective recording would help to mitigate the privacy concerns previously mentioned, although it neither eliminates them completely nor reduces the need for the same robust security measures required by other recording methods. Despite the growing popularity of smart technology, we recognize that the relative cost of such devices and the cost of software development may still make this method unfeasible for widespread implementation.

Video Recording

The use of closed-circuit video cameras within the operating room can be an effective and discreet way to observe the execution of SSCs in real time. It allows for the collection of a relatively large amount of SSC data. Unlike audio recordings, video makes it possible to visualize the start and end of each SSC section, which requires far less postrecording processing time and can help to increase the scoring accuracy of an SSC performance because it allows the checklist scorer to better observe the SSC process. Video recording observations still require the same number of personnel as audio recordings and raise similar privacy and liability concerns, given the nature of the recorded data. A fully integrated video recording system is considerably more expensive than the other methods. As with other audio/visual recording methods, the use of video cameras allows for an anonymous and reviewable scoring process, but unlike audio or even handheld device recordings, video cameras are capable of capturing the entire operating theater and thus provide a clearer assessment of all the surgical activities surrounding the execution of any given SSC.

Regardless of the recording method chosen to observe SSC performance, the second and perhaps most critical step in being able to quantify SSC implementation fidelity is to make use of a standardized and objective assessment tool. A standardized scoring system, such as the one we have developed, provides a clear and concise view of the degree of implementation fidelity for any given safety initiative.

Feedback to physicians and staff came directly from the audio audit, and the need for attention to postprocedure sections was communicated to all staff. The final results on actual completion versus simply checking boxes led to a re-education effort emphasizing that patient safety requires more than mere documentation. Rather, it depends on mindfulness and communication among all members in the operating room.

Conclusions

We recommend implementation fidelity be thoroughly evaluated using real-time observations and a standardized compliance scoring system. In doing so, institutions can have an effective assessment of their surgical checklist process and thus enhance their patient safety environment.

References

1. Levy SM, Senter CE, Hawkins RB, et al. Implementing a surgical checklist: more than checking a box. Surgery. 2012;152:331–336.
2. Haynes AB, Weiser TG, Berry WR, et al. A surgical safety checklist to reduce morbidity and mortality in a global population. N Engl J Med. 2009;360:491–499.
3. Urbach DR, Govindarajan A, Saskin R, et al. Introduction of surgical safety checklists in Ontario, Canada. N Engl J Med. 2014;370:1029–1038.
4. Robblee JA. Surgical safety checklists in Ontario, Canada. N Engl J Med. 2014;370:2349.
5. Carroll C, Patterson M, Wood S, et al. A conceptual framework for implementation fidelity. Implement Sci. 2007;2:40.
6. Winawer NH. Surgical checklist and patient safety: more than just checking a box. NEJM Journal Watch Hospital Medicine. 2014.
7. Perry W, Kelley E. Checklists, global health and surgery: a five-year checkup of the WHO surgical safety checklist program. Clinical Risk. 2014;20:59–63.

safety checklist; surgery safety; WHO surgical safety checklist

Copyright © 2019 Wolters Kluwer Health, Inc. All rights reserved