Surgical techniques and technologies are continually evolving. Transitions such as those from open to laparoscopic and then to robotic techniques require learning, processing of new data inputs, and substantial adaptation by the surgeons.1 Such transformational shifts of human–system interactions2,3 in healthcare are often driven by the availability of new technologies that provide meaningful patient-derived benefits.4–10 Surgeons want to deliver cutting-edge, quality care using advanced technology and techniques to achieve patient benefits; however, there is little consideration for the mental and physical costs resultant from new practices and technologies. Closure of this knowledge gap is critical for researchers, surgeons, and administrators to better understand the implications of various procedural, technology, and process advancements on human–system interactions and to the day-to-day operating room (OR) work for surgeons and their surgical team.
Innovative integration of advanced technology is well established in other fields. In the automotive industry, cars are now designed to parallel park automatically; however, the technologies that preceded this advancement, such as the rearview camera and back-up sensors, were intermediate steps that allowed for forward-facing attention and increased safety.11,12 With automotive advancements such as navigation, it was imperative to reduce the time that drivers’ eyes were off the road, as this could create performance and safety issues.13,14 This shift in human–system interaction required considerable human factors engineering research to ensure optimal performance and driver safety.15–17 As in automotive design, there is opportunity for further human factors research to develop and implement advanced assistive technologies that better support surgical performance, improve patient safety, and reduce the burden on the provider.
The field of human factors research seeks to understand how humans interact with elements of their work-system (e.g., equipment, environment, tasks, people) to optimize system performance without sacrificing human wellbeing.18 As a result, understanding how the work-system affects surgical workload has gained momentum19 and can be considered a part of human–system integration efforts.20 Workload is a broad, multifaceted term that encompasses the human “cost” of performing a task.21 Despite concern with high OR workload22,23 and its links to surgical performance degradation,24–26 to date few efforts have been made to design surgical technologies with surgeon and team workload in mind.27,28
To understand how technology and the changes in human–system interactions impact workload, performance, and patient safety, it is necessary to first measure workload broadly across surgical specialties. The NASA-Task Load Index (TLX) consists of multiple factors and subjectively evaluates user demand for a variety of tasks.29 This validated tool has been used across professions, including healthcare, to quantify the demand on users21,30 and even adapted for surgery.31 This study utilized NASA-TLX and a Surg-TLX question to describe the current status of operative demand and identify opportunities for the greatest potential impact across various procedures. Therefore, this study aimed to (1) compare self-reported workloads across surgical procedures and (2) identify potential patient and/or procedural influences on high OR workload for the surgeon.
In this prospective descriptive study, surgeons at a large, quaternary academic hospital located in the Midwest were recruited to participate. The Institutional Review Board approved this study with the following inclusion criteria for interested participants: (1) hold a position as a currently practicing attending surgeon within the department of surgery at the institution and (2) are able to volunteer the time to complete the required questionnaires. Surgeons were instructed that participation was optional and that they could opt out at any point of the study. They were also informed that individual results would be anonymous and would not affect employment.
After enrollment, participants completed a prestudy survey on surgical specialty, general demographics (i.e., age, sex, height, and weight) and surgical experience in open, laparoscopic, and robotic surgery. Following this questionnaire, study coordinators monitored participant surgical schedules to identify the type and order of procedures for the day and input this information into Qualtrics software, Version 2017 (Qualtrics, Provo, UT). An automated email was then sent to each participant the morning of their surgeries with a questionnaire via a secure link for each procedure. The questionnaire included the case number for their surgical date and procedure type to indicate which procedure they were evaluating. Participants were asked to complete each questionnaire as soon as possible after the corresponding procedure. Between May 2017 and July 2017, surgeon participants were included in the study until they completed at least 20 questionnaires or they chose to discontinue the study.
Participants were expected to complete one modified NASA-TLX questionnaire for each procedure. This modified NASA-TLX questionnaire consisted of the NASA-TLX,29 1 question on distraction from the Surg-TLX,31 and a question to indicate variation from expected difficulty, reported as “expected,” “less difficult than expected,” or “more difficult than expected.” The NASA-TLX subscales include questions on mental demand, physical demand, temporal demand, performance, effort, and frustration. Both the NASA-TLX subscales and the Surg-TLX subscale item were rated on a 20-point scale (0 = low, 20 = high). Mental demand, physical demand, temporal demand, performance, effort, and frustration subscales were combined to create a composite NASA-TLX workload score (scaled to 0 = low, 100 = high). Surveys were collected for 662 surgeries.
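As an illustration of the scoring described above, the following minimal sketch computes a composite score from the six NASA-TLX subscale ratings. It assumes an unweighted composite (the mean of the six ratings, rescaled from 0–20 to 0–100); the weighting scheme is not stated in the text, and the function name and example ratings are hypothetical. The Surg-TLX distraction item is left out because it is not part of the composite.

```python
from statistics import mean

# The six NASA-TLX subscales that feed the composite score.
SUBSCALES = ("mental", "physical", "temporal",
             "performance", "effort", "frustration")


def composite_tlx(ratings):
    """Return an unweighted composite (0-100) from 0-20 subscale ratings.

    Assumption: the composite is the plain mean of the six subscales,
    rescaled by 100/20. The study's exact weighting is not stated.
    """
    values = []
    for name in SUBSCALES:
        value = ratings[name]
        if not 0 <= value <= 20:
            raise ValueError(f"{name} rating {value} is outside 0-20")
        values.append(value)
    return mean(values) * 100 / 20


# Hypothetical ratings for one procedure (not study data).
example = {"mental": 8, "physical": 7, "temporal": 4,
           "performance": 2, "effort": 8, "frustration": 1}
print(composite_tlx(example))  # mean rating of 5 on 0-20 -> 25.0 on 0-100
```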
Patient and Procedural Factors
For the retrospective portion of the analysis, patient and procedural factors were acquired from the medical record for the patients in 506 of the 662 procedures with completed surveys; the remaining 156 patients did not consent to retrospective research participation. Patients were able to opt out of participating in any research prior to their surgery; therefore, the data for those who opted out were not obtained for this portion of the study. The following patient and procedural factors were included for analysis: body mass index (BMI), age, sex, American Society of Anesthesiologists (ASA) Category, procedure type, and procedural duration (i.e., skin-to-skin duration). The surgical specialties were based on American College of Surgeons (ACS) categorization, and any specialties with fewer than 3 participants were aggregated into “other.”
Data analyses were conducted in SPSS Version X (IBM, Armonk, NY) and Minitab version 17 (Minitab Inc, State College, PA). Descriptive statistics were performed on surgeon, patient, and procedural characteristics. Normally distributed variables were reported as means and standard deviations. Non-normally distributed variables were reported as medians with interquartile ranges. Questionnaire items were compared using t tests, Chi-square tests, Mann–Whitney U tests, Kruskal–Wallis H tests, and correlations, as appropriate. Comparisons by specialty were conducted using analysis of variance, with Tukey post-hoc tests to identify group differences. P values < 0.05 were considered significant for all comparisons.
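The Kruskal–Wallis H test named above compares a measure across three or more independent groups using ranks. The following is a minimal pure-Python sketch, not the SPSS or Minitab procedure used in the study: it omits the tie correction that statistical packages apply, the p-value helper is valid only for three groups (2 degrees of freedom, where the chi-square upper tail equals exp(-H/2)) and relies on the large-sample approximation, and all function names and duration values are hypothetical.

```python
from math import exp


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return the H statistic (no tie correction) for a list of groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


def p_value_three_groups(h):
    """Chi-square upper-tail probability for df = 2 (three groups only)."""
    return exp(-h / 2)


# Hypothetical skin-to-skin durations in minutes (not study data).
less_than_expected = [62, 75, 88, 90, 101]
as_expected = [70, 80, 86, 95, 110]
more_than_expected = [150, 165, 174, 190, 240]

h = kruskal_wallis_h([less_than_expected, as_expected, more_than_expected])
print(f"H = {h:.2f}, approximate P = {p_value_three_groups(h):.4f}")
```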
Thirty-four surgeons (41% female) participated in this study, each rating a mean of 14.85 (SD = 7.94) surgeries with the modified NASA-TLX, for a total of 662 surgeries. Approximately 80% of surgeons’ survey responses (528/662) were completed within 48 hours of being sent. Patient and procedural data were gathered through the electronic health record for 506 consented patients. Approximately 60% of the procedures were performed as open cases, with the remaining 40% as either laparoscopic or robotic. Descriptive statistics for the surgeons, patients, and procedures are provided in Tables 1 and 2. Surgeons were stratified according to surgical experience post residency into early career (0–5 yrs; n = 9), mid-career (6–15 yrs; n = 13), and advanced career (16+ yrs; n = 12). While there was no significant difference in perceived composite workload across the career levels, when analyzed by subscales, early career surgeons perceived significantly lower temporal demand (M = 3.67, SD = 3.67; P = 0.016) and distractions (M = 2.44, SD = 3.15; P = 0.026) than established career surgeons (M = 4.89, SD = 4.48; M = 3.45, SD = 3.81, respectively).
Surgeons reported procedural difficulty as expected or lower than expected for 78% of the procedures, with the remaining 22% reported as more difficult than expected. When surgeons reported procedures as more difficult than expected, procedural durations (Mdn = 174 min) were significantly longer than for cases rated less difficult than expected (Mdn = 90 min) or as difficult as expected (Mdn = 86 min; χ2(2) = 48.05, P < 0.001). When the cases were less difficult than expected, the actual duration was significantly shorter than the estimated duration (χ2(2) = 59.51, P < 0.001); when the cases were rated more difficult than expected, the actual duration was significantly longer than the estimated duration (χ2(2) = 59.51, P < 0.001). Additionally, a significantly smaller percentage of open procedures (24%) than of laparoscopic and robotic procedures (27%) had a procedural difficulty level that was higher than expected (χ2(2) = 11.635, P = 0.003). Surgeons reported poorer perceived performance during cases with unexpectedly high difficulty (P < 0.001). Frustration differed significantly across the three difficulty expectation levels and was highest when the difficulty level was higher than expected (P < 0.001) (Table 3).
Patient and Procedural Factors
The estimated case duration was highly correlated with procedural duration (r = 0.774, P < 0.001). While none of the NASA-TLX subscales were highly correlated with patient factors or case durations, there were correlations among all NASA-TLX subscales. Strong correlation (i.e., r > 0.707) was demonstrated between mental and physical demand (r = 0.798, P < 0.001). Effort was also strongly correlated with mental demand (r = 0.82, P < 0.001) and physical demand (r = 0.834, P < 0.01).
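One way to read the r > 0.707 cutoff is that it corresponds to r² > 0.5, meaning that more than half of the variance in one measure is shared with the other. The following minimal sketch computes a Pearson correlation and applies that cutoff. The function names and ratings are hypothetical, and the check uses the absolute value of r so that strong negative correlations would also be flagged, which goes slightly beyond the wording above.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def is_strong(r, threshold=0.707):
    """True when |r| exceeds the threshold (0.707 squared is about 0.5)."""
    return abs(r) > threshold


# Hypothetical subscale ratings on the 0-20 scale (not study data).
mental_demand = [2, 5, 8, 12, 15]
effort = [3, 4, 9, 11, 16]

r = pearson_r(mental_demand, effort)
print(f"r = {r:.3f}, r squared = {r * r:.3f}, strong: {is_strong(r)}")
```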
Mental demand (M = 7.7, SD = 5.56), physical demand (M = 7.0, SD = 5.66), and effort (M = 7.8, SD = 5.77) were the highest workload subscales. Performance (measured from 0 = perfect to 20 = failure) was the lowest subscale (M = 2.2, SD = 3.38; Figure 1). Surgeons surpassed the midpoint on the NASA-TLX scale (indicating an unsustainably high workload) during 40% of cases for mental demand and effort and exceeded the midpoint for physical workload in 34% of cases (Figure 2). There was no significant difference across surgical specialties for the 3 highest NASA-TLX subscales, and averages were all below the midpoint (P > 0.05; Figure 3). Figure 4 demonstrates that when the surgery was reported as less difficult than expected, most NASA-TLX subscales, except physical demand (P = 0.048) and frustration (P = 0.021), were not significantly lower than for procedures that met the surgeons’ expectations. However, when the procedure was rated as more difficult than expected, all NASA-TLX subscales except distraction were significantly higher than the demand at the expected difficulty level (P < 0.05). Most subscales were about twice the demand reported during surgeries that met the surgeons’ expected difficulty.
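The share of cases that surpassed the midpoint can be computed directly from the subscale ratings. The following minimal sketch assumes that the midpoint of the 0–20 scale is 10 and that "surpassed" means strictly greater than the midpoint; the function name and ratings are hypothetical.

```python
def share_above_midpoint(ratings, midpoint=10):
    """Return the fraction of ratings strictly above the scale midpoint."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(1 for rating in ratings if rating > midpoint) / len(ratings)


# Hypothetical mental demand ratings for ten cases (not study data).
mental_demand = [4, 12, 10, 15, 7, 11, 3, 9, 14, 6]
print(share_above_midpoint(mental_demand))  # 4 of 10 ratings exceed 10 -> 0.4
```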
In this descriptive study, a modified NASA-TLX workload assessment tool was administered electronically to 34 surgeons following surgical procedures to measure self-reported surgeon workload on 662 cases. The surgeon group as a whole consisted of highly experienced surgeons in open and laparoscopic procedures, with less experience in robotic cases. The patient population generally was middle-aged, with the majority of patients classified as ASA Class II or III. All 10 surgical specialties in the Department of Surgery at the institution were represented in the dataset, and General Surgery accounted for the largest number of cases. Procedure times ranged widely, likely due to the institutional classification as a quaternary care center where surgical cases included uncommon and specialized procedures as well as straightforward, conventional cases. Therefore, a large range of procedure times and patient health levels was expected. As this study aimed to identify areas of workload improvement, the data collected provide broad variation in the type of patients and cases performed among a cross-section of the surgical staff. This is also mirrored in the NASA-TLX summary results, where surgeon-reported values ranged from minimum workload values for distraction to maximum values for mental demand.
Surgeons reported mental demand among the highest subscales. During 40% of cases, surgeon-reported mental demand exceeded the midpoint threshold on the NASA-TLX subscale. While effort was slightly higher than mental demand, the same percentage of cases exceeded the midpoint for effort as with mental demand. Physical demand in surgery has been well established;22,23 however, the burden of mental demand is less clear in the surgical literature. Studies have determined that high mental demand in the OR can pose a risk to surgical performance and increase adverse events.24–26 The midpoint threshold has been used to indicate an unsustainable demand in other industries.32,33 However, these specific sustainability thresholds have not been established within surgery. Based on the results in this study, future research should focus on minimizing mental demand and effort—the highest reported NASA-TLX subscales. These findings suggest that providing resources and technology for decision support (e.g., advanced imaging and 3-dimensional modeling), which could be potentially analogous to the rear-view camera for the automotive field, may aid in surgical decision-making and planning. If the resources and technology deliver supportive information with positive human–systems interactions, it may reduce mental demand for surgeons.
Expectation and Workload
To prepare for surgical cases with high workload, it is imperative to understand which cases have high workload. A pattern emerged from the data between expected surgical difficulty and both workload and reported performance. Reported workload and performance differed significantly for the surgeons when there was a deviation from the expected procedural difficulty. Specifically, when the surgical difficulty was higher than expected, all NASA-TLX subscales but distraction were significantly higher than for cases that were rated at or below the expected difficulty level. While goal expectation has been studied in education and training34 and acknowledged as contributing to workload demand and workload variability,29 the impact of difficulty expectation on task demand has not been quantified. Deviation from expected difficulty is also related to the procedural duration. Procedures that were more difficult than expected lasted significantly longer than cases that were less difficult than or matched surgeon difficulty expectations. There was a statistically significant difference across all three expectation categories: procedures were an average of 46 minutes longer than estimated when more difficult than anticipated and an average of 21 minutes shorter when less difficult than anticipated. While the expected duration included anticipated turnover times and the actual duration was only the time from initial skin incision to skin closing, this time difference across expected difficulty levels demonstrates a component that could lead to poor estimation of procedural difficulty and workload. However, surgical duration was not highly correlated with any of the NASA-TLX subscales. This was not expected since longer cases have been associated with a higher work demand,30 and the lack of a strong correlation could be due to the variability across surgical specialties and procedure types.
This relationship between deviation from expectation and procedural duration demonstrates that unanticipated higher difficulty is associated with increased surgeon demand, poorer perceived performance, and longer surgical durations. Longer surgical duration can impact both patients, due to extended time under anesthesia, and the hospital system, by causing considerable scheduling and caseload problems.35,36 When performance is compromised, an increase in patient safety risk can arise.37,38 Procedures with unexpectedly high difficulty may be a potential area of high impact for human–system integration advancements to improve workload. By identifying case characteristics that may contribute to unexpected difficulty, surgical team members can prepare equipment and coordination efforts prior to surgery in anticipation.
Procedural and Patient Factors
To aid in procedural difficulty and duration planning, considerable research efforts have been dedicated toward understanding procedural and patient factors that impact surgical duration39–41 for scheduling. Duration may not be universally or equally critical to workload across surgical specialties. In this very diverse set of data, there was at most a moderate correlation between procedural duration (skin-to-skin or wheels-in to wheels-out) and each of the NASA-TLX subscales. This lack of a strong relationship may be due to the variability across surgical specialties as well as confounding by other specialty-specific patient factors or technologies. Regardless of duration variation, surgeons frequently (in up to 40% of the cases) surpassed the midpoint threshold, indicating a high workload. Patient BMI was also not correlated with workload measures or duration in this study. This was not expected based on previous findings, in which almost half of the reference models for predicting the difficulty level of laparoscopic cholecystectomies included BMI as a significant predictor and patient BMI was a significant predictor of procedural duration.42 More complex modeling with a subset of the data (e.g., by specialty, procedure, or difficulty expectation category) may be necessary to better understand the patient factors that may influence workload and procedural duration39,43 and to improve the accuracy of procedure time estimations. Currently, there are standard practices for estimating surgical duration;44 however, the process is not without fault and has not been well-defined for all specialties or surgical complexities.45 To refine this process, accessing information that is available before the procedure (e.g., imaging results) could allow administrators and providers to prepare differently for these cases.
For example, by including imaging results (specifically gallbladder thickness and impacted gallstones) in procedure prediction models, the model will likely achieve a better prediction of the surgical duration for laparoscopic cholecystectomies.39,43 Streamlining these types of imaging results could deliver concise information to surgeons and aid in the estimation of procedure time, potentially increasing surgical scheduling system efficiency. Additionally, surgeons and their surgical team may be able to prepare for a more realistic procedural difficulty, and patients may receive improved outcomes.
This descriptive study captured surgical demand across a wide variety of procedures. Therefore, patient variables cannot be adequately addressed due to the amount of heterogeneity in the patients across this variety of procedures. Next steps will include collecting a dataset with homogeneous patient and procedural variables. Residents and fellows were not included in this study; however, these roles assume primary lead on portions of procedures at the participating institution. The diversity in surgeons and procedures limited the study of patient and procedural factors in depth due to the variability across surgical specialties. Further, the NASA-TLX workload survey may have limited translation across surgical specialties and may not appropriately measure the emotional component of workload involved. Workload results were unbalanced across providers, as 20 surgeons completed surveys for fewer than the requested 20 procedures and 14 completed 20 or more. As surveys were completed following the procedure, results may be subject to recall bias; however, approximately 80% of surveys were completed within 48 hours of being sent out. No other surgical team members were included in the surveyed personnel.
Future research on human factors in surgery will seek to further identify procedural, technological, and patient factors to better understand, predict, and mitigate high surgical workload. One way to reduce workload could be through the provision of decision support with technology. Additionally, refinement of difficulty expectation should be incorporated into surgical preparation and education. Improving both surgeon expectation and procedural time estimation will lead to enhanced provider workload outcomes and could improve teamwork as part of the human–systems integration. Exploration of individual NASA-TLX subscales, Surg-TLX subscales, or a combination of NASA-TLX and Surg-TLX subscales may identify what is best suited for measuring surgical workload. Finally, surgeon identification of surgical areas suited to early impact, analogous to rearview cameras in driving, could advance surgical practice and open its next chapter.
This study was the first of its kind to record high self-reported mental demand, physical demand, and effort for surgical procedures across a variety of specialties and identify an increase in workload when procedures were more difficult than expected. More objective and specialty-specific measures are needed to better understand surgical workload differences and implications of procedural and patient factors.
When procedural difficulty is greater than expected, there are negative implications for mental demand, physical demand, and performance. Work is underway to examine patient and case-related variables. Investigating procedure types where difficulty is higher than expected and surgeons reported increased workload will allow human factors engineers and physicians to identify areas of initial priority for informing more realistic expectations of procedural difficulty. Improvement of human–system interactions in the OR will focus on teamwork, technology, and the OR environment to reduce workload and improve performance.
1. Berguer R, Smith W, Chung Y. Performing laparoscopic surgery is significantly more stressful for the surgeon than open surgery. Surg Endosc
2. Rasmussen J. Information processing and human-machine interaction: an approach to cognitive engineering. New York, NY: Elsevier Science Inc; 1986.
3. Hoc JM. From human–machine interaction to human–machine cooperation. Ergonomics
4. Canes D, Berger A, Aron M, et al. Laparo-endoscopic single site (LESS) versus standard laparoscopic left donor nephrectomy: matched-pair comparison. Eur Urol
5. Tsimoyiannis EC, Tsimogiannis KE, Pappas-Gogos G, et al. Different pain scores in single transumbilical incision laparoscopic cholecystectomy versus classic laparoscopic cholecystectomy: a randomized controlled trial. Surg Endosc
6. Pisanu A, Reccia I, Porceddu G, et al. Meta-analysis of prospective randomized studies comparing single-incision laparoscopic cholecystectomy (SILC) and conventional multiport laparoscopic cholecystectomy (CMLC). J Gastrointest Surg
7. Kunkala M, Bingener J, Park M, et al. Single-port and four-port laparoscopic cholecystectomy: difference in outcomes. Minerva Chir
8. Gangl O, Hofer W, Tomaselli F, et al. Single incision laparoscopic cholecystectomy (SILC) versus laparoscopic cholecystectomy (LC)-a matched pair analysis. Langenbecks Arch Surg
9. Kroh M, Chalikonda S, Chand B, et al. Laparoscopic completion cholecystectomy and common bile duct exploration for retained gallbladder after single-incision cholecystectomy. JSLS
10. Bingener J, Buck L, Richards M, et al. Long-term outcomes in laparoscopic vs open ventral hernia repair. Arch Surg
11. Neale VL, Dingus TA, Klauer SG, et al. An overview of the 100-car naturalistic study and findings. National Highway Traffic Safety Administration, Paper
12. Adell E, Varhelyi A, Alonso M, et al. Developing human–machine interaction components for a driver assistance system for safe speed and safe distance. IET Intell Transport Syst
13. Green P. The 15-second rule for driver information systems. In: Proceedings of the ITS America Ninth Annual Meeting. Washington, DC: Intelligent Transportation Society of America; 1999.
14. Dingus TA, Hulse MC, Antin JF, et al. Attentional demand requirements of an automobile moving-map navigation system. Transport Res Part A Gen
15. Green P. Preliminary human factors design guidelines for driver information systems. Final report
16. Bengler K, Dietmayer K, Farber B, et al. Three decades of driver assistance systems: review and future perspectives. IEEE Intell Transport Syst Mag
17. Llaneras RE, Salinger J, Green CA. Human factors issues associated with limited ability autonomous driving systems: drivers’ allocation of visual attention to the forward roadway. In: Proceedings of the 7th International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design. University of Iowa, Iowa City: Public Policy Center; 2013.
18. Carayon P. Handbook of Human Factors and Ergonomics in Health Care and Patient Safety. Boca Raton, FL: CRC Press; 2016.
19. Carayon P, Schoofs Hundt A, Karsh BT, et al. Work system design for patient safety: the SEIPS model. Qual Saf Health Care 2006; 15 (suppl 1): i50–i58.
20. Fass D. Rationale for a model of human systems integration: the need of a theoretical framework. J Integr Neurosci
21. Hart SG. NASA-Task Load Index (NASA-TLX); 20 years later. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 2006. 5: 904–908.
22. Sari V, Nieboer TE, Vierhout ME, et al. The operation room as a hostile environment for surgeons: physical complaints during and after laparoscopy. Minim Invasive Ther Allied Technol
23. Park A, Lee G, Seagull FJ, et al. Patients benefit while surgeons suffer: an impending epidemic. J Am Coll Surg
24. McCrory B, LaGrange CA, Hallbeck M. Quality and safety of minimally invasive surgery: past, present, and future. Biomed Eng Comput Biol
25. Yurko YY, Scerbo MW, Prabhu AS, et al. Higher mental workload is associated with poorer laparoscopic performance as measured by the NASA-TLX tool. Simul Healthc
26. Gallagher TH, Studdert D, Levinson W. Disclosing harmful medical errors to patients. N Engl J Med
27. Lowndes BR, Hallbeck MS. Overview of human factors and ergonomics in the OR, with an emphasis on minimally invasive surgeries. Hum Factors Ergonomics Manuf Service Ind
28. Van Veelen MA, Nederlof EA, Goossens RH, et al. Ergonomic problems encountered by the medical team related to products used for minimally invasive surgery. Surg Endosc
29. Hart SG, Staveland LE. Development of NASA-TLX (Task Load Index): results of empirical and theoretical research. In: Hancock PA, Meshkati N, eds. Human Mental Workload. North Holland Press
30. Yu D, Lowndes B, Thiels C, et al. Quantifying intraoperative workloads across the surgical team roles: room for better balance? World J Surg
31. Wilson MR, Poolton JM, Malhotra N, et al. Development and validation of a surgical workload measure: the surgery task load index (SURG-TLX). World J Surg
32. Mazur LM, Mosaly PR, Hoyle LM, et al. Relating physician's workload with errors during radiation therapy planning. Pract Radiat Oncol
33. Mazur LM, Mosaly PR, Hoyle LM, et al. Subjective and objective quantification of physician's workload and performance during radiation therapy planning tasks. Pract Radiat Oncol
34. Scaduto A, Lindsay D, Chiaburu DS. Leader influences on training effectiveness: motivation and outcome expectation processes. Int J Training Dev
35. Gupta N, Ranjan G, Arora MP, et al. Validation of a scoring system to predict difficult laparoscopic cholecystectomy. Int J Surg
36. Rosen M, Brody F, Ponsky J. Predictive factors for conversion of laparoscopic cholecystectomy. Am J Surg
37. Christian CK, Gustafson ML, Roth EM, et al. A prospective study of patient safety in the operating room. Surgery
38. Karsh B, Holden RJ, Alper SJ, et al. A human factors engineering paradigm for patient safety: designing to support the performance of the healthcare professional. Qual Saf Health Care 2006; 15 (suppl 1): i59–i65.
39. Lowndes B, Thiels CA, Habermann EB, et al. Impact of patient factors on operative duration during laparoscopic cholecystectomy: evaluation from the National Surgical Quality Improvement Program database. Am J Surg
40. Zhou J, Dexter F, Macario A, et al. Relying solely on historical surgical times to estimate accurately future surgical times is unlikely to reduce the average length of time cases finish late. J Clin Anesth
41. Wright IH, Kooperberg C, Bonar BA, et al. Statistical modeling to predict elective surgery time: comparison with a computer scheduling system and surgeon-provided estimates. Survey Anesthesiol
42. Abdelrahman AM, Bingener J, Yu D, et al. Impact of single-incision laparoscopic cholecystectomy (SILC) versus conventional laparoscopic cholecystectomy (CLC) procedures on surgeon stress and workload: a randomized controlled trial. Surg Endosc
43. Thiels CA, Yu D, Abdelrahman AM, et al. The use of patient factors to improve the prediction of operative duration using laparoscopic cholecystectomy. Surg Endosc
44. Hosseini N, Sir MY, Jankowski CJ, et al. Surgical duration estimation via data mining and predictive modeling: a case study. In: AMIA Annual Symposium Proceedings. American Medical Informatics Association; 2015.
45. May JH, Spangler WE, Strum DP, et al. The surgical scheduling problem: current research and future opportunities. Prod Operations Manage