In no single assessment moment were all of the surgeons in the division in agreement about the performance depicted in the video. More importantly, even in moments in which many of the surgeons were united in their critique of the procedural choices made in the video, their agreement about principles dissolved once they were asked to speak about the individual steps of the procedure and what they would do differently. The full depiction of this variation across all 8 clips has been included in Appendix 1, http://links.lww.com/SLA/B225, with quotations from the data arranged from the most disagreement to the least. In a condensed example from clip 4, we have described the disagreement among the surgeons in our sample as it pertained to a mid-procedure dissection performed between the kidney and the spleen (Table 6).
The surgeons in our sample explained how they would approach teaching a particular variation and why their threshold of principle and preference might land differently than another surgeon's. Some described taking time to justify to residents why their own variation works because they “truly believe that memorization [of each surgeon's variations] isn’t the way to go” (URS06). Other surgeons felt strongly that the learner's job is to memorize and apply the surgeon's variations: “I don’t always repeat (my variations to them). So I will tell them, ‘you’re a big boy, you’ve got to learn and remember…’ You’re in my OR, you do it my way” (URS03). Still other surgeons insisted that teaching should not increase the length of the procedure: “I would do something in about fifteen seconds that they would’ve taken two minutes to do or four minutes or five minutes or fifteen minutes to do. So I’d take (the tools) away, show them how it's done… If we don’t get to the next part quickly your next patient gets cancelled… we’re always battling the clock” (URS05). From detailed explanation to memorization to pacing expectations, the surgeons in our study approached teaching procedural variations to residents with broadly different attitudes. Our findings suggest that surgeons themselves place thresholds of principle and preference differently, and that surgeons’ beliefs about the learner's role regarding variations influence where their thresholds fall for a given procedure.
The surgeons in our sample disagreed about both how to apply principles and the consequences of failing to do so. Similarly, the surgeons in the sample were split on the value of the video clips for assessing competence. The video was enough for some surgeons to determine competence, whereas others felt that making any decision whatsoever about the competence of the learner from the video would be irresponsible. Based on what they had seen in the video, 4 of 11 surgeons (36%) assessed the learner as competent to perform the procedure independently. Another 4 (36%) stated that the learner was not yet competent, and the remaining 3 surgeons (27%) wanted more information about the learner before passing judgment. This variation persisted across surgeons who routinely perform this common procedure and those who do not.
The surgeons’ attitudes toward variations played an important role in the decisions they made about the learner's competence. A surgeon who used a different tool to isolate the renal artery than the one initially chosen by the learner decided that the “management of the hilum” (URS03), and thus the procedure as a whole, had been incompetently performed. A different surgeon endorsed the learner as competent because “it's the same sort of technique I see myself using” (URS02). Another used a different approach to placing clips around the hilum but deemed the learner competent and the procedure to have been performed “very safely and very well” (URS07). This variation in threshold placement had implications for how the surgeons perceived they would proceed if this was their resident. What one surgeon would stop a resident from doing (thus indicating a principle) another would allow to proceed (thus indicating a preference), and yet another would praise as a permissible mistake or even call an excellent performance.
The surgeons in our study agreed on the language of generic principles but appeared to disagree on how to apply them. This finding may help to explain some of the difficulties faced by intraoperative assessment in surgical education,34 particularly around the poor reliability of surgeon assessors11,12 and a lack of evidence-based benchmarking.13–15
The 5 generic principles endorsed by surgeons in the present study align with the generic principles proposed by current approaches to intraoperative assessment of technical skill. The Objective Structured Assessment of Technical Skill35—the most frequently employed tool for intraoperative technical skills assessment in studies using global ratings15—asks surgeons to assess learners’ respect for tissue; time and motion; instrument handling; knowledge of instruments; use of assistants; flow of operation and forward planning; and knowledge of the specific procedure on a 5-point Likert scale.35 Other approaches to surgical technical skill assessment operate according to the same principles-based logic. The Ottawa Surgical Competency Operating Room Evaluation (O-SCORE)36 prompts surgeons to assess learners’ knowledge of procedural steps, and procedure-based assessment37 instructs surgeons to assess learners’ appropriate and safe use of instruments and the learner's ability to work at an appropriate pace with economy of motion. The language of principles remains remarkably consistent across the surgical assessment literature.
Surgical principles may seem clear cut, but important questions about the application of these principles are beginning to emerge. Our previous research claimed that the way surgeons and residents speak of procedural variations raises the question: is a principle always a principle?7–9 The present study proposes an answer to that question. We found the same guiding principles were embedded within surgeons’ statements about safety and competence, but how they applied those principles in practice appeared to vary. What is a principle for one surgeon may well be a preference for another.
Supporting evidence for the persistence of variations in surgical principles continues to emerge from the surgical practice literature. Birkmeyer et al38 demonstrated that the procedural approaches of practicing surgeons vary significantly and that their rates of complications vary alongside them. Asch et al39 showed that imprinting of procedural variations and their associated complication rates may follow surgeons from their postgraduate training programs to their jobs in independent practice. And, most convincingly, Davidson et al40 used the example of pancreaticoduodenectomy to demonstrate that although surgical procedures may in theory have relatively few variations, in practice they are significantly more complex. Davidson et al40 closely examined the pancreaticoduodenectomies of 5 surgeons and found they differed in “steps and techniques employed” (p. 2) for 7 major areas of the procedure and 21 minor areas. They concluded that each of the 5 surgeons in the study essentially performed 5 different procedures but “all called (it) a Whipple's procedure” (p. 2).40 Although surgical procedures may commonly be thought to be as uniform as the textbook, close examination reveals meaningful heterogeneity in practice.
We found that surgeons’ individual thresholds between what they considered a non-negotiable principle and an inconsequential preference were loosely coupled with their assessments of the learner's performance. By “loosely coupled” we mean that surgeons invoke their personal variations when assessing learners, but do not strictly apply them to the point of expecting mimicry. Instead, a surgeon's attitude toward what constitutes a principle versus a preference appears to partially shape their decision about the competence of a learner. Ultimately, therefore, the degree to which a surgeon assesses a learner to be competent may be shaped by the surgeon's own threshold of principle and preference.
The loose coupling of thresholds and assessment could help to explain some of the challenges encountered by researchers working on the intraoperative assessment of technical competence. Recent research suggests that assessment by expert surgeons may not differ significantly from crowd-sourced assessment by the lay public.16–20 And each of the 5 most recent reviews of technical skill assessment concluded that none of the currently available assessment tools are adequate for credentialing or licensure due to low intersurgeon reliability11,12 and the absence of evidence-based benchmarks of competence.13–15 Even when assessment tools have demonstrated impressive intersurgeon reliability they continue to find limited application due to poor usability10 and difficulty scaling up from the research setting to the clinical environment.34,43,44 Our findings can help explain why surgeons may look for their own variations and make competence judgments based on their individual thresholds of principle and preference.
Investment in assessment tools that harness the subjectivity of surgeons may prove crucially important for the future of intraoperative assessment. Competency-based reforms in surgical education have attempted, on the one hand, to seek out consensus between surgeons about standardized competence of learners34,45 and, on the other hand, to longitudinally aggregate individual surgeons’ subjective intraoperative entrustment decisions.46,47 For example, current large-scale innovations in intraoperative assessment have focused on standardized assessment of technical skill using a high-stakes simulation-based practical examination48 or on low-stakes intraoperative assessment using mobile applications such as SIMPL to record shifts in surgeons’ subjective intraoperative entrustment over time.47 Validating assessment against standardized surgical principles drawn from clinical evidence15 remains a laudable aspiration for an evidence-based surgical world. Until then, our findings suggest that surgical educators consider a “programmatic” approach to assessment using tools that collect large samples of multiple low-stakes observations.49,50
Implementation of intraoperative assessment for the purpose of licensure continues to evade surgical education. The findings of the present study provide the first empirical evidence to suggest that the surgeons’ attitudes toward their own procedural variations may be the most important factor shaping the seemingly inescapable subjectivity in intraoperative assessment. Researchers and educators should consider the formative influence of surgeons’ thresholds of principle and preference on assessment of learner competence when designing, testing, and implementing approaches to competency-based surgical education.
We urge caution in interpreting our findings, as some of the variation in assessment we uncovered may be an artifact of the grounded theory design. We acknowledge that asking surgeons to make judgments based on decontextualized clips from procedures within their own specialty may exaggerate the actual variation in practice.55 To address this potential over-representation, we negotiated closely over which moments of variation should be reported in the data. Our goal was to emergently build a dataset and theoretical framework that could inform further research in the area of procedural variations.
We chose to recruit an entire surgical division. Including the 4 surgeons in the division who do not routinely perform laparoscopic nephrectomies may also over-represent procedural variations in the data; however, we chose this approach to sampling because it reflects the likely composition of clinical competence committees,30 which are organized by surgical specialty rather than by subspecialty. Keeping this question of variability in mind, we encourage future researchers to use our qualitatively derived findings to deductively analyze how tightly coupled thresholds and assessment may be.
1. Holmboe ES, Sherbino J, Long DM, et al. The role of assessment in competency-based medical education. Med Teach
2. Gruppen LD, Burkhardt JC, Fitzgerald JT, et al. Competency-based education: programme design and challenges to implementation. Med Educ
3. Frank J, Mungroo R, Ahmad Y, et al. Toward a definition of competency-based education in medicine: a systematic review of published definitions. Med Teach
4. Tekian A, Hodges BD, Roberts TE, et al. Assessing competencies using milestones along the way. Med Teach
5. Ten Cate O. Trust, competence, and the supervisor's role in postgraduate training. BMJ
6. Sklar DP. Competencies, milestones, and entrustable professional activities: what they are, what they could be. Acad Med
7. Apramian T, Cristancho S, Watling C, et al. Thresholds of principle and preference: exploring procedural variation in postgraduate surgical education. Acad Med
8. Apramian T, Cristancho SM, Watling CJ, et al. ‘They have to adapt to learn’: surgeons’ perspectives on the role of procedural variations in surgical education. J Surg Educ
9. Apramian T, Cristancho SM, Watling CJ, et al. ‘Staying in the game’: how procedural variations shape competence judgments in surgical education. Acad Med
10. Pereira EAC, Dean BJF. British surgeons’ experiences of a mandatory online workplace based assessment portfolio resurveyed three years on. J Surg Educ
11. Van Hove P, Tuijthof G, Verdaasdonk E, et al. Objective assessment of technical surgical skills. Br J Surg
12. Ahmed K, Miskovic D, Darzi A, et al. Observational tools for assessment of procedural skills: a systematic review. Am J Surg
13. Ghaderi I, Manji F, Park YS, et al. Technical skills assessment toolbox: a review using the unitary framework of validity. Ann Surg
14. Shalhoub J, Vesey AT, Fitzgerald JE. What evidence is there for the use of workplace-based assessment in surgical training? J Surg Educ
15. Szasz P, Louridas M, Harris KA, et al. Assessing technical competence in surgical trainees: a systematic review. Ann Surg
16. Powers MK, Boonjindasup A, Pinsky M, et al. Crowdsourcing assessment of surgeon dissection of renal artery and vein during robotic partial nephrectomy: a novel approach for quantitative assessment of surgical performance. J Endourol
17. Kowalewski TM, Comstock B, Sweet R, et al. Crowd-sourced assessment of technical skills for validation of basic laparoscopic urologic skills tasks. J Urol
18. Deal SB, Lendvay TS, Haque MI, et al. Crowd-sourced assessment of technical skills: an opportunity for improvement in the assessment of laparoscopic surgical skills. Am J Surg
19. Holst D, Kowalewski TM, White LW, et al. Crowd-sourced assessment of technical skills: differentiating animate surgical skill through the wisdom of crowds. J Endourol
20. Ranard BL, Ha YP, Meisel ZF, et al. Crowdsourcing—harnessing the masses to advance health and medicine, a systematic review. J Gen Intern Med
21. Bernstein M, Khu KJ. Is there too much variability in technical neurosurgery decision-making? Virtual Tumour Board of a challenging case. Acta Neurochir
22. Kerver A, Leliveld M, Theeuwes H, et al. Inter-surgeon variation in skin incisions for tibial nailing in relation to the infrapatellar nerve. Injury Extra
23. Walter A, Buller J. Variability of reported techniques for performance of the pubovaginal sling procedure. Int Urogynecol J
24. Van der Vleuten C, Norman G, Graaff E. Pitfalls in the pursuit of objectivity: issues of reliability. Med Educ
25. Govaerts M, Van der Vleuten CPM. Validity in work-based assessment: expanding our horizons. Med Educ
26. Gingerich A, Van der Vleuten CPM, Eva KW, et al. More consensus than idiosyncrasy: categorizing social judgments to examine variability in mini-CEX ratings. Acad Med
27. Van der Vleuten C, Schuwirth LW, Scheele F, et al. The assessment of professional competence: building blocks for theory development. Best Pract Res Clin Obstet Gynaecol
28. Hogan T, Hinrichs U, Hornecker E. The elicitation interview technique: capturing people's experiences of data representations. IEEE Trans Vis Comput Graph
29. Geertz C. Thick description: toward an interpretive theory of culture. In: Martin M, McIntyre L, eds. Readings in the Philosophy of Social Science. Cambridge, MA: The MIT Press; 1994:311–323.
30. French JC, Dannefer EF, Colbert CY. A systematic approach toward building a fully operational clinical competency committee. J Surg Educ
31. Gardner AK, Scott DJ, Hebert JC, et al. Gearing up for milestones in surgery: will simulation play a role? Surgery
32. Charmaz K. Constructing Grounded Theory: A Practical Guide Through Qualitative Analysis. 2nd ed. Thousand Oaks, CA: SAGE Publications; 2014.
33. Glaser BG. The constant comparative method of qualitative analysis. Soc Probl
34. Arora S, Darzi A. Introducing technical skills assessment into certification: closing the implementation gap. Ann Surg
35. Martin J, Regehr G, Reznick R, et al. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg
36. Gofton WT, Dudek NL, Wood TJ, et al. The Ottawa Surgical Competency Operating Room Evaluation (O-SCORE): a tool to assess surgical competence. Acad Med
37. Marriott J, Purdie H, Crossley J, et al. Evaluation of procedure-based assessment for assessing trainees’ skills in the operating theatre. Br J Surg
38. Birkmeyer JD, Finks JF, O’Reilly A, et al. Surgical skill and complication rates after bariatric surgery. N Engl J Med
39. Asch DA, Nicholson S, Srinivas SK, et al. How do you deliver a good obstetrician? Outcome-based evaluation of medical education. Acad Med
40. Davidson SJ, Rojnica M, Matthews JB, et al. Variation and acquisition of complex techniques: pancreaticoduodenectomy. Surg Innov
41. Reames BN, Shubeck SP, Birkmeyer JD. Strategies for reducing regional variation in the use of surgery: a systematic review. Ann Surg
42. Urbach DR. Closing in on surgical practice variations. Ann Surg
43. Lendvay TS, White L, Kowalewski T. Crowdsourcing to assess surgical skill. JAMA Surg
44. Szasz P, Louridas M, Harris KA, et al. Strategies for increasing the feasibility of performance assessments during competency-based education: subjective and objective evaluations correlate in the operating room. Am J Surg 2016; E-pub ahead of print.
45. Szasz P, Louridas M, De Montbrun S, et al. Consensus-based training and assessment model for general surgery. Br J Surg
46. Teman NR, Gauger PG, Mullan PB, et al. Entrustment of general surgery residents in the operating room: factors contributing to provision of resident autonomy. J Am Coll Surg
47. Bohnen JD, George BC, Williams RG, et al. The feasibility of real-time intraoperative performance assessment with SIMPL (System for Improving and Measuring Procedural Learning): early experience from a multi-institutional trial. J Surg Educ
48. De Montbrun S, Roberts PL, Satterthwaite L, et al. Implementing and evaluating a national certification technical skills examination: the colorectal objective structured assessment of technical skill. Ann Surg
49. Van der Vleuten C, Schuwirth LW. Assessing professional competence: from methods to programmes. Med Educ
50. Cook DA, Brydges R, Ginsburg S, et al. A contemporary approach to validity arguments: a practical guide to Kane's framework. Med Educ
51. George BC, Teitelbaum EN, Meyerson SL, et al. Reliability, validity, and feasibility of the Zwisch scale for the assessment of intraoperative performance. J Surg Educ
52. Englander R, Flynn T, Call S, et al. Toward defining the foundation of the MD degree: core entrustable professional activities for entering residency. Acad Med
53. Warm EJ, Held JD, Hellmann M, et al. Entrusting observable practice activities and milestones over the 36 months of an internal medicine residency. Acad Med
54. Van Loon KA, Teunissen PW, Driessen EW, et al. The role of generic competencies in the entrustment of professional activities: a nationwide competency-based curriculum assessed. J Grad Med Educ
55. Gingerich A, Ramlo S, Van der Vleuten C, et al. Inter-rater variability as mutual disagreement: identifying raters’ divergent points of view. Adv Health Sci Educ 2016; E-pub ahead of print.