Implementation of new surgical technologies is an ever-present challenge to practicing surgeons and their patients. While offering improvements over older, established techniques, these advantages are often unrealized until adequate experience with the novel approach is attained. Unfortunately, improvement in skill is often simplistically and infrequently scrutinized when these techniques are implemented.1
The da Vinci Surgical System has been used increasingly in several surgical specialties because of its enhanced minimally invasive surgical capabilities derived from three-dimensional, surgeon-controlled imaging and ergonomic instrumentation mimicking the human wrist. Gynecology and its surgical subspecialties have adopted the system for use in many complex procedures previously performed only through laparotomy. Although many studies show advantages of the robotic techniques over their open or laparoscopic counterparts,2–7 few studies have evaluated the learning curve associated with attaining proficiency with this approach.
Existing studies investigating the learning curve of robotic hysterectomy are limited by small sample size and inconsistent reporting of complication rates.8–10 These studies also focus on stabilization of operative time as defining proficiency, which has been called into question as the optimal method of constructing learning curves because it does not take patient outcomes into account.11 Our goals in this study were to characterize changes in perioperative parameters and complications of robotic hysterectomy with increasing surgeon experience. We also aimed to better elucidate the learning curve of robotic surgery by applying patient-centered outcome measures and analytic methods proposed in the literature for learning curve assessment.
MATERIALS AND METHODS
After Mayo Clinic Institutional Review Board approval, a database was constructed of all robotically assisted gynecologic procedures performed by the Division of Gynecologic Surgery at Mayo Clinic in Rochester. Patients who underwent robotically assisted procedures were identified through a search of all gynecologic operative notes using Surgical Operative Note Explorer for the terms robotic and da Vinci since the introduction of the da Vinci system at Mayo Clinic on January 1, 2007, through December 31, 2009. To ensure accuracy of the database and inclusion of all procedures, identified operative notes were manually reviewed and then cross-referenced with a separate search using the surgical information recording system at Mayo Clinic.
A review was then conducted of all robotically assisted hysterectomies with or without bilateral salpingo-oophorectomy. Data were abstracted from all office, nursing, anesthesia, operative, and phone notes in the electronic medical record using a standardized data collection form. Data collected included basic demographic information; body mass index (BMI, calculated as weight (kg)/[height (m)]²); obstetric, medical, and surgical histories; operative time; intraoperative injury; uterine weight; conversion to laparotomy; length of stay; and postoperative complications within 6 weeks of surgery. Operative time was defined as the time from first incision to closure of the last incision and included robot docking time. Intraoperative injury was defined as inadvertent injury to any organ or structure, including bowel, bladder, ureter, and large vessels, as noted in the operative report. Comorbid conditions were classified according to the Charlson comorbidity index,12 and postoperative complications were graded I through V, as described by Dindo et al.13 Specific outcomes investigated for change with operator experience were 1) operative time; 2) length of stay longer than 1 day, defined as calendar days from admission to discharge; 3) intraoperative complications, including incidental injury to bowel (excluding serosal injuries resulting from lysis of adhesions), bladder, ureter, or vessel, or conversion to laparotomy; and 4) any of the aforementioned intraoperative complications plus postoperative complications with a Dindo grade of II or higher (Appendix 1, available online at http://links.lww.com/AOG/A338). Of note, routine cystoscopy is not performed in conjunction with robotic hysterectomy at our institution.
The analysis focused on robotically assisted hysterectomies without lymphadenectomy performed by eight surgeons during the 36-month period. Because each surgeon started performing the procedure at a different time during the study period, “time” was defined as days from the first robotically assisted hysterectomy each surgeon performed in the cohort. To descriptively summarize the results over the days of experience, the procedures were divided into six subgroups according to whether a procedure was performed in the first 6 months of a surgeon's experience, the second 6 months, and so forth. The Spearman rank correlation coefficient was used to evaluate the correlation between days of experience and the continuously scaled patient characteristics (eg, BMI). The Cochran-Armitage trend test was used to evaluate changes in dichotomous patient characteristics (eg, Charlson comorbidity index greater than 0) across the six subgroups.
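As an illustration of the trend test named above, the following is a minimal Python sketch of the Cochran-Armitage statistic for a dichotomous characteristic across ordered subgroups. The counts are hypothetical and are not the study data, equally spaced subgroup scores are assumed, and the function name `cochran_armitage` is introduced here for illustration; it is not the SAS procedure used in the study.

```python
import math
from statistics import NormalDist


def cochran_armitage(events, totals, scores=None):
    """Return (z, two-sided P) for a linear trend in proportions."""
    if scores is None:
        # Assumption: equally spaced scores 0, 1, 2, ... for the ordered subgroups
        scores = list(range(len(totals)))
    n = sum(totals)
    p_bar = sum(events) / n  # pooled proportion under the null hypothesis
    # Score-weighted sum of observed minus expected event counts
    t_stat = sum(t * (r - m * p_bar) for t, r, m in zip(scores, events, totals))
    sum_nt = sum(m * t for t, m in zip(scores, totals))
    sum_nt2 = sum(m * t * t for t, m in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_nt2 - sum_nt ** 2 / n)
    z = t_stat / math.sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: patients with the characteristic / patients per subgroup
z, p = cochran_armitage(events=[5, 10, 15], totals=[20, 20, 20])
```

A positive `z` indicates that the proportion rises across the ordered subgroups; a negative `z` indicates that it falls.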
Univariable logistic regression models were fit to evaluate the relationships among each of the potential confounders and each of the binary outcomes (length of stay longer than 1 day, intraoperative complications, and any intraoperative or postoperative complications, respectively). Similarly, linear regression models were fit for the continuous outcome (operative time). The potential confounders were age, BMI, Charlson index (categorized as none, 1–2, 3–4, 5 or greater), uterine weight (after applying a logarithmic transformation), number of prior abdominal surgical procedures (categorized as none, one, two or more), and surgeon. Multivariable models were then fit to evaluate the relationship between time (ie, days of experience) and each of the outcomes after adjusting for the significant confounders. All calculated P values were two-sided, and P values <.05 were considered statistically significant.
To assess the learning curve, the cumulative summation chart methodology was used based on the occurrence of intraoperative complications (outcome 1) and any intraoperative or postoperative complications (outcome 2). Standard and risk-adjusted cumulative summation curves were constructed separately for the two learning curve outcomes. Proficiency was defined as the point at which each surgeon's cumulative summation curve crossed the acceptable control limit (H0) as derived from published complication rates of abdominal hysterectomy.14,15 (See Appendix 2, available online at http://links.lww.com/AOG/A339, for further description of the cumulative summation analysis.) Analyses were performed using the SAS 9.2 software package.
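The standard cumulative summation chart can be sketched as follows. This is a minimal Python illustration of one commonly used formulation based on the sequential probability ratio test, not a reproduction of the analysis in Appendix 2. The values of p0, p1, and the type I and II error rates below are hypothetical placeholders rather than the parameters used in this study, and all function names are introduced here for illustration.

```python
import math


def cusum_parameters(p0, p1, alpha, beta):
    """Return (s, h0, h1): per-success decrement and lower/upper control limits."""
    a = math.log((1 - beta) / alpha)
    b = math.log((1 - alpha) / beta)
    log_p = math.log(p1 / p0)
    log_q = math.log((1 - p0) / (1 - p1))
    s = log_q / (log_p + log_q)  # amount subtracted for each uncomplicated case
    h0 = -b / (log_p + log_q)    # acceptable (lower) control limit
    h1 = a / (log_p + log_q)     # unacceptable (upper) control limit
    return s, h0, h1


def cusum_curve(outcomes, s):
    """Cumulative sum: +(1 - s) for a complication, -s for an uncomplicated case."""
    total = 0.0
    curve = []
    for complicated in outcomes:
        total += (1 - s) if complicated else -s
        curve.append(total)
    return curve


def first_crossing_h0(curve, h0):
    """Return the 1-based attempt at which the curve first reaches h0, or None."""
    for attempt, value in enumerate(curve, start=1):
        if value <= h0:
            return attempt
    return None


# Hypothetical parameters, chosen only to make the example run
s, h0, h1 = cusum_parameters(p0=0.05, p1=0.10, alpha=0.10, beta=0.10)
curve = cusum_curve([False] * 60, s)  # 60 consecutive uncomplicated cases
attempts = first_crossing_h0(curve, h0)
```

Under this convention each complication moves the curve up and each uncomplicated case moves it down, so a curve that falls to H0 indicates a complication rate no worse than the acceptable rate, whereas a curve that rises to H1 indicates an unacceptable rate.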
RESULTS
A total of 566 patients were identified as having a robotically assisted gynecologic procedure performed during the 36-month period. Surgeons' characteristics are summarized in Table 1. As depicted in Figure 1, robotic hysterectomy without lymphadenectomy represented the majority of the procedures (n=325 [57.4%]) with the addition of more complex procedures over time.
Table 2 summarizes the surgeons' experience with robotic hysterectomies and the patient characteristics within each 6-month period. During the first 6 months of the surgeons' experience, 63 hysterectomies were performed. Comparing across the six subgroups, the patients' mean age, median BMI, median uterine weight, and number of prior abdominal surgical procedures did not vary significantly. However, the proportion of patients with a Charlson comorbidity index score higher than 0 increased over the study period (P=.01), from 39.7% in the initial 6 months to 64.7% in the last 6 months.
A total of 110 patients (33.8%) had a length of stay of longer than 1 day, as depicted in Table 3. During the first 6 months of the surgeons' experience, the proportion of patients with length of stay longer than 1 day was 49.2% compared with 14.7% in the last 6 months. Of the factors listed in Table 2, age, uterine weight, and Charlson index were each identified as significantly associated with having a length of stay longer than 1 day (P<.05). After adjusting for these factors and surgeon, the odds of a length of stay longer than 1 day still decreased significantly with experience (odds ratio 0.58 for each additional 6 months of experience, 95% confidence interval [CI] 0.47–0.72).
The mean operative time continued to decrease with experience as well (Table 3; Fig. 2). During the first 6 months, the mean operative time was 3.5 hours compared with 2.7 hours in the last 6 months. Uterine weight and BMI were each identified as significantly associated with having a longer operative time (P<.05). After adjusting for surgeon and these two factors, the inverse relationship between operative time and surgical experience remained statistically significant (P=.005); mean operative time decreased by 0.11 hours (95% CI −0.18 to −0.03) for each additional 6 months of experience.
Intraoperative injury or conversion to laparotomy occurred in 18 patients (5.5%, 95% CI 3.1–8.0%); however, four of these were serosal bowel injuries in the setting of lysis of adhesions and one was a conversion secondary to an incidental finding of colorectal cancer. The 13 remaining complications were considered relevant and thus were considered in the analysis (Table 3). Of these 13, five injuries were repaired robotically (incidental cystotomy in three patients, colon injury in two), and eight were not repaired robotically (colon injury with conversion to laparotomy in one patient, ureteral injury with conversion to laparotomy in one, and six additional conversions). Of these six additional conversions, one was to retrieve a lost needle, and the other five were for completion of a difficult hysterectomy.
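The interval reported above for 18 of 325 procedures is consistent with a normal-approximation (Wald) confidence interval for a proportion. The short Python check below assumes that method; the source does not state which interval method was used, and the function name `wald_interval` is introduced here for illustration.

```python
from statistics import NormalDist


def wald_interval(events, n, level=0.95):
    """Return (proportion, lower, upper) using the normal approximation."""
    p = events / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * (p * (1 - p) / n) ** 0.5
    return p, p - half_width, p + half_width


p, low, high = wald_interval(18, 325)
print(f"{100 * p:.1f}% (95% CI {100 * low:.1f}-{100 * high:.1f}%)")
# prints "5.5% (95% CI 3.1-8.0%)"
```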
Of the factors listed in Table 2, the only one identified as significantly associated with having a relevant intraoperative complication was uterine weight; the median weight was 685 g compared with 140 g for the patients with and without an intraoperative complication, respectively (P<.001). After adjusting for uterine weight, there was a tendency for the odds of having an intraoperative complication to decrease with experience, although this did not reach statistical significance; for every additional 6-month period of surgeon experience, the odds ratio for a relevant intraoperative complication was 0.74 (95% CI 0.50–1.10).
Standard and risk-adjusted cumulative summation charts for intraoperative complications are presented in Figure 3 for the two surgeons with 36 months of experience. The other six surgeons' cumulative summation charts are not included; none had 36 months' experience at that point, none crossed H0, and all closely paralleled the two most experienced surgeons' charts. According to the standard cumulative summation chart (Fig. 3A), neither of the curves for the two surgeons crossed the unacceptable limit (upper control limit, H1). Only surgeon A performed enough procedures for the curve to cross the acceptable limit (lower control limit, H0) after 96 attempts. On the basis of the parameters specified a priori for the cumulative summation analysis (p0, p1, type I and II error rates), the average number of attempts needed to cross the acceptable limit was 91. The predicted risk of intraoperative complication was determined from a logistic model that included uterine weight. Using these estimates, a risk-adjusted cumulative summation curve was constructed for surgeons A and B (Fig. 3B). If the intraoperative complications were occurring as expected from the risk model, the curves in Figure 3B would be relatively flat around the reference line of 0. For both surgeons, the curves initially trended upward, indicating that the actual number of complications was initially more than expected. Surgeon A performed enough procedures to consistently have a lower than expected complication rate, and the expected complications continued to decrease. Surgeon B was beginning to trend downward toward the expected rate as predicted by the model.
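The behavior described for the risk-adjusted curve (flat around 0 when complications occur as predicted, rising when more occur than expected) matches a running sum of observed minus expected complications. The Python sketch below assumes that simple form; the exact risk-adjusted formulation is given in Appendix 2 and may differ. The logistic coefficients and uterine weights are hypothetical placeholders, not the fitted model or the study data.

```python
import math


def predicted_risk(uterine_weight_g, intercept=-7.0, slope=0.8):
    """Logistic-model risk from log uterine weight (hypothetical coefficients)."""
    linear = intercept + slope * math.log(uterine_weight_g)
    return 1 / (1 + math.exp(-linear))


def risk_adjusted_cusum(outcomes, risks):
    """Running sum of observed (0 or 1) minus expected complication risk."""
    total = 0.0
    curve = []
    for observed, expected in zip(outcomes, risks):
        total += observed - expected
        curve.append(total)
    return curve


# Hypothetical sequence of cases for one surgeon
weights = [140, 685, 210, 150, 300]   # uterine weight in grams
outcomes = [0, 1, 0, 0, 0]            # 1 = complication occurred
risks = [predicted_risk(w) for w in weights]
curve = risk_adjusted_cusum(outcomes, risks)
```

With this construction, a complication in a low-risk patient raises the curve more than one in a high-risk patient, and a long run of uncomplicated cases pulls the curve below the reference line of 0.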
Among the 325 patients, 53 (16.3%, 95% CI 12.3–20.3%) had a relevant intraoperative or postoperative complication with Dindo grade II or higher within the first 6 weeks (Table 3). As with intraoperative complications, the only factor identified as significantly associated with any intraoperative or postoperative complication was uterine weight; the median weight was 210 g compared with 140 g for the patients with and without any complication, respectively (P=.01). The proportion with a complication did not change significantly over time, even after adjusting for uterine weight (P=.40 and P=.49, respectively).
Standard and risk-adjusted cumulative summation charts for any intraoperative or postoperative complications are presented in Figure 4. According to the standard cumulative summation chart (Fig. 4A), neither of the curves for surgeons A and B crossed the unacceptable limit (upper control limit, H1), and both surgeons performed enough procedures for their curves to cross the acceptable limit (lower control limit, H0). Surgeon A crossed this limit after 21 attempts, and surgeon B crossed this limit after 14. Based on the parameters specified a priori for the cumulative summation analysis (p0, p1, type I and II error rates), the average number of attempts needed to cross the acceptable limit was 44. The predicted risk of any intraoperative or postoperative complications was determined from a logistic model that included uterine weight. The risk-adjusted cumulative summation curves constructed using these estimates for each surgeon are shown in Figure 4B and show both surgeons to have complications occurring near the rate predicted by the model.
DISCUSSION
Cumulative summation analysis has been used recently in other fields for evaluation of surgical learning curves16–18 and has been found to be a more sensitive technique to evaluate learning curves than standard statistical approaches.19 Originally introduced to evaluate industrial quality control,20 the cumulative summation method determines whether a process or procedure is acceptable or unacceptable using binary outcomes, such as, in this study, the presence or absence of a complication.
Prior studies investigating the learning curve of robotic hysterectomy use stabilization of operative times to determine competence and conclude that 20–50 cases are needed.8–10 This method is rather arbitrary and does not consider patient outcomes as the primary measure. Patient morbidity is the true measure of a surgeon's skill rather than an ability to perform a procedure in a consistent amount of time. Our cumulative summation analysis compared the learning curve with a benchmark complication rate, that of abdominal hysterectomy, to determine surgical expertise. Although we describe trends in operative time, complications, and length of stay with experience, these are included to elucidate our experience over time in more familiar terms.
The cumulative summation analyses resulted in two differing proficiency points depending on which complications are considered. Proficiency is reached much later, after 91 procedures, if only intraoperative injuries are used compared with 44 if any complications are considered. This difference is attributable to the infrequency of intraoperative injury. It may also suggest that laparotomy and laparoscopy differ more in postoperative complications than in intraoperative complications, which is consistent with many of the benefits of laparoscopy.21 Because postoperative complications depend more on surgical approach than on skill, intraoperative morbidity is more representative of surgical proficiency. Therefore, we conclude that proficiency is reached after 91 procedures, considerably later than in prior studies,8–10 because this is the point at which robotic hysterectomy causes less intraoperative morbidity than its alternative, abdominal hysterectomy.
The calculated number of attempts needed to cross the lower control limit did not always match the number observed for surgeons A and B. With all complications considered, we observed the surgeons to cross H0 after 21 and 14 attempts, sooner than the 44 attempts calculated from p0, p1, and type I and type II errors. In contrast, with only intraoperative complications considered, surgeon A crossed this limit after 96 attempts compared with 91 on the basis of predefined parameters. The similarity between the observed and calculated number of attempts to cross H0 for intraoperative complications lends further support to our decision to base the learning curve on intraoperative complications only.
Limitations of this study include reliance on the electronic medical record, the small number of surgeons with sufficient experience for learning curve analysis, and the inability to quantify trainee contributions to procedures. During the study period, the proportion of robotic procedures categorized as teaching cases increased from 20% to 60% (data not shown). The effect of operator experience on operative time may have been partially obscured by trainee education, although our data provide evidence for increasing patient safety with added experience, even as learners were increasingly incorporated into robotic procedures.
Our data provide a valuable description of the implementation of robotic hysterectomy at a large academic center as well as an objective, patient-centered evaluation of the learning curve. Robotic gynecologic surgery is an area in which such measures are greatly needed because this technology is being used by an increasing number of gynecologists. Furthermore, because of its original intended use as a quality control measure, the cumulative summation method provides a means to evaluate surgical quality prospectively, which may be of great use in individual or hospital practice settings with both new and established procedures. We suggest that standard cumulative summation charting be used to determine surgical proficiency and that both standard and risk-adjusted methods be used for surgical quality monitoring. Cumulative summation measurements should raise concern if the curve crosses the unacceptable control limit using standard cumulative summation charting or if complications trend upward, above the expected complication rate, using risk-adjusted methods. Although we conclude that proficiency is reached at approximately 91 robotic hysterectomies, an important implication of this analysis is that proficiency may not be attained at the same point by other gynecologic surgeons in different settings (ie, higher surgical volumes are likely to result in shorter learning curves and vice versa). True evaluation of the learning curve must be individualized and compared with standard outcome measures so that the possible benefits of robotic hysterectomy as well as new surgical techniques across other fields may be translated into improved patient outcomes.
REFERENCES
1. Ramsay CR, Grant AM, Wallace SA, Garthwaite PH, Monk AF, Russell IT. Assessment of the learning curve in health technologies: a systematic review. Int J Technol Assess Health Care 2000;16:1095–108.
2. Elliott DS, Frank I, Dimarco DS, Chow GK. Gynecologic use of robotically assisted laparoscopy: sacrocolpopexy for the treatment of high-grade vaginal vault prolapse. Am J Surg 2004;188:52S–6S.
3. DeNardis SA, Holloway RW, Bigsby GE 4th, Pikaart DP, Ahmad S, Finkler NJ. Robotically assisted laparoscopic hysterectomy versus total abdominal hysterectomy and lymphadenectomy for endometrial cancer. Gynecol Oncol 2008;111:412–7.
4. Geller EJ, Siddiqui NY, Wu JM, Visco AG. Short-term outcomes of robotic sacrocolpopexy compared with abdominal sacrocolpopexy. Obstet Gynecol 2008;112:1201–6.
5. Magrina JF, Kho RM, Weaver AL, Montero RP, Magtibay PM. Robotic radical hysterectomy: comparison with laparoscopy and laparotomy. Gynecol Oncol 2008;109:86–91.
6. Boggess JF, Gehrig PA, Cantrell L, Shafer A, Mendivil A, Rossi E, et al. Perioperative outcomes of robotically assisted hysterectomy for benign cases with complex pathology. Obstet Gynecol 2009;114:585–93.
7. Estape R, Lambrou N, Diaz R, Estape E, Dunkin N, Rivera A. A case matched analysis of robotic radical hysterectomy with lymphadenectomy compared with laparoscopy and laparotomy. Gynecol Oncol 2009;113:357–61.
8. Lenihan JP Jr, Kovanda C, Seshadri-Kreaden U. What is the learning curve for robotic assisted gynecologic surgery? J Minim Invasive Gynecol 2008;15:589–94.
9. Pitter MC, Anderson P, Blissett A, Pemberton N. Robotic-assisted gynaecological surgery-establishing training criteria: minimizing operative time and blood loss. Int J Med Robot 2008;4:114–20.
10. Bell MC, Torgerson JL, Kreaden U. The first 100 da Vinci hysterectomies: an analysis of the learning curve for a single surgeon. S D Med 2009;62:91, 93–5.
11. Chen W, Sailhamer E, Berger DL, Rattner DW. Operative time is a poor surrogate for the learning curve in laparoscopic colorectal surgery. Surg Endosc 2007;21:238–43.
12. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis 1987;40:373–83.
13. Dindo D, Demartines N, Clavien PA. Classification of surgical complications: a new proposal with evaluation in a cohort of 6336 patients and results of a survey. Ann Surg 2004;240:205–13.
14. Dicker RC, Greenspan JR, Strauss LT, Cowart MR, Scally MJ, Peterson HB, et al. Complications of abdominal and vaginal hysterectomy among women of reproductive age in the United States: the Collaborative Review of Sterilization. Am J Obstet Gynecol 1982;144:841–8.
15. Harris WJ. Early complications of abdominal and vaginal hysterectomy. Obstet Gynecol Surv 1995;50:795–805.
16. Komatsu R, Kasuya Y, Yogo H, Sessler DI, Mascha E, Yang D, et al. Learning curves for bag-and-mask ventilation and orotracheal intubation: an application of the cumulative sum method. Anesthesiology 2010;112:1525–31.
17. Okrainec A, Ferri LE, Feldman LS, Fried GM. Defining the learning curve in laparoscopic paraesophageal hernia repair: a CUSUM analysis. Surg Endosc 2011;25:1083–7.
18. Novick RJ, Stitt LW. The learning curve of an academic cardiac surgeon: use of the CUSUM method. J Card Surg 1999;14:312–20; discussion 321–2.
19. Novick RJ, Fox SA, Kiaii BB, Stitt LW, Rayman R, Kodera K, et al. Analysis of the learning curve in telerobotic, beating heart coronary artery bypass grafting: a 90 patient experience. Ann Thorac Surg 2003;76:749–53.
20. Page ES. Continuous inspection schemes. Biometrika 1954;41:100–15.
21. Johnson N, Barlow D, Lethaby A, Tavender E, Curr L, Garry R. Methods of hysterectomy: systematic review and meta-analysis of randomised controlled trials. BMJ 2005;330:1478.