Supervision of anesthesia residents and nurse anesthetists is a major responsibility of faculty anesthesiologists in the United States. By the term “supervision,” we refer to all clinical oversight functions directed toward assuring the quality of clinical care whenever the anesthesiologist is not the sole anesthesia care provider. Adequate supervision is a requirement for billing compliance and for residents’ postgraduate medical education, as mandated by the Centers for Medicare and Medicaid Servicesa and the Accreditation Council for Graduate Medical Education,b respectively.
We have shown previously that the de Oliveira Filho clinical supervision scale is reliable, valid, and useful for evaluating both individual anesthesiologists’ quality of supervision and group performance of supervision.1–11 At the University of Iowa, the evaluation process has consisted of daily, automated e-mail requests12 to anesthesia residents and nurse anesthetists to evaluate the supervision provided by each faculty anesthesiologist with whom they worked the previous day in operating rooms (ORs). The supervision scores provided by anesthesia residents and nurse anesthetists serve as an independent measure of the contribution of the faculty anesthesiologist to the care of the patient.10 When supervision quality was monitored and feedback provided to the faculty anesthesiologists, supervision quality increased for both residents and nurse anesthetists (multiple comparisons, all P ≤ .0011).10 Among nurse anesthetists, the increase was a mean of 0.28 units on the 4-point scale (SE 0.02), due principally to the questions associated with teaching (eg, “stimulate my clinical reasoning, critical thinking, and theoretical learning”; Table 1).10
There is limited information about the association between clinical subspecialization of faculty anesthesiologists and the effectiveness or quality of their supervision. In a previous national survey of anesthesia residents, supervision scores did not differ among specific rotations (eg, anesthesia clinical specialties).8 Rather, supervision scores were associated with teamwork during the rotation.8 Thus, nationally, there was no difference in the quality of supervision provided during cardiac anesthesia versus pediatric anesthesia, etc. However, that study did not address the question of whether individual faculty who tend to specialize in their clinical practice (“specialists”) provided better clinical supervision than faculty who specialized less (ie, “generalists”). Therefore, in the current investigation, we used retrospective observational data to evaluate the slope of the association between the level of specialization of individual anesthesiologists and the quality of their supervision.
In a prior study, we observed that cases’ physiological complexity and duration had no meaningful associations with daily faculty supervision scores.5 Therefore, we hypothesized that there would be no significant association between faculty anesthesiologists’ specialization and their supervision scores. We performed the study to test the hypothesis because (1) quality supervision is associated with less patient injury2,3,8; (2) anesthesiologist specialization is expensive and, based on our recent national study, will be associated with greater cost at many hospitals (see Discussion)13; and (3) specialization has at most limited operational benefits to defray the costs.14
The University of Iowa IRB declared that this investigation did not meet the regulatory definition of human subject research. All analyses were performed with deidentified data.
Evaluation requests were sent automatically to the providers by e-mail the following day when either a resident or a nurse anesthetist worked with a (faculty) anesthesiologist for at least 1 hour in an operative setting, as determined from the Epic anesthesia information management system (Epic Systems, Madison, WI).12 The locations from which cases were evaluated were the University of Iowa’s main surgical suite (32 ORs), ambulatory surgery center (8 ORs), urology suite, labor and delivery suite, electroconvulsive therapy suite, and non-OR locations such as the pediatric catheterization suite (Table 2). Residents and nurse anesthetists evaluated anesthesiologists’ supervision by logging in to a secure Web page.5 They answered the 9 questions developed by de Oliveira Filho et al1 using the corresponding 4-point Likert scale (Table 1). The evaluation could only be submitted when all 9 questions were answered and could not be revised after submission.5 The supervision score is the mean of the 9 answers.1
Supervision scores provided by residents and nurse anesthetists were considered separately.5,9 Supervision scores provided by residents are greater than those of nurse anesthetists both unpaired (ie, incorporating heterogeneities in case assignments) and pairwise by anesthesiologist (both P < .0001).5,9 For each anesthesiologist, the mean supervision score was calculated by first averaging each rater’s scores of the anesthesiologist and then averaging those rater means.5,6 Thus, each rater (ie, not each evaluation) received equal weight.15,c For the calculations in Tables 3 and 4, there were 228 different values for the dependent variable. The standard error of the mean for each anesthesiologist was calculated from the standard deviation, among raters, of the rater means for that anesthesiologist.
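This per-anesthesiologist summary can be sketched in a few lines of Python; the function name and toy scores below are illustrative, not from the study’s data:

```python
from statistics import mean, stdev
from math import sqrt

def supervision_summary(scores_by_rater):
    """Mean supervision score and its standard error for one
    anesthesiologist, giving each rater (not each evaluation)
    equal weight.

    scores_by_rater: dict mapping rater id -> list of that rater's
    supervision scores (each the 1-4 scale mean of the 9 questions).
    """
    rater_means = [mean(scores) for scores in scores_by_rater.values()]
    overall = mean(rater_means)                       # each rater weighted equally
    se = stdev(rater_means) / sqrt(len(rater_means))  # SD among rater means
    return overall, se

# Toy example: rater A submitted 3 evaluations, rater B submitted 1.
overall, se = supervision_summary({"A": [4, 4, 4], "B": [2]})
# overall = 3.0 (mean of the rater means 4.0 and 2.0), se = 1.0
```

Because the overall score is the mean of rater means, the rater who submitted 3 evaluations counts no more than the rater who submitted 1.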
For the current study, we used 2 periods studied previously. We used one 6-month period (July 1, 2013, to December 31, 2013) because it provided conditions when there was no feedback on supervision scores provided to anesthesiologists. We used another 6-month period because it provided different conditions, with feedback to anesthesiologists for both residents’ and nurse anesthetists’ evaluations (July 1, 2014, to December 31, 2014).10 These 2 periods included the same 6 months of the year (ie, levels of resident training).10d We used 6-month periods because we knew from our previous studies that 6 months was a sufficient duration in our department for there to be an adequate number of unique raters (for both residents and nurse anesthetists) to differentiate individual anesthesiologists’ mean supervision scores from those of other anesthesiologists in the department.5,6,9,10
Independent Variable—Diversity (eg, “Specialization”)
For each supervising anesthesiologist and 6-month period, we calculated the proportion of all anesthetic cases attributable to each anesthetic Current Procedural Terminology® (CPT) code. The sum of the squares of the proportions is the Herfindahl index, a measure of diversity.16 The inverse of the Herfindahl index represents the effective number of common procedures.16 The fewer the number of common procedures, the greater the level of clinical specialization, regardless of the specific type of case (eg, cardiac or neurological).
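As a sketch of this calculation (assuming a list of per-case CPT codes; the codes in the example are arbitrary placeholders):

```python
from collections import Counter

def effective_number_of_procedures(cpt_codes):
    """Inverse Herfindahl index: the effective number of common
    procedures in one anesthesiologist's 6-month case list.

    cpt_codes: iterable of the anesthetic CPT code of each case.
    """
    counts = Counter(cpt_codes)
    n = sum(counts.values())
    herfindahl = sum((c / n) ** 2 for c in counts.values())  # sum of squared proportions
    return 1.0 / herfindahl

# Two codes in equal proportion -> 2.0 effective common procedures;
# a single code -> 1.0 (maximal specialization).
print(effective_number_of_procedures(["00790"] * 50 + ["00840"] * 50))  # 2.0
```

Unequal proportions pull the effective number below the raw count of distinct codes; for example, a 90/10 split over 2 codes yields about 1.2 effective procedures, not 2.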
Among the 32,126 records from the two 6-month periods, there were 491 records of labor analgesia, CPT 01967. These were excluded because a process for reliable entry of timing data for individual anesthesiologists had not yet been added to the electronic anesthesia information management system for these cases, which involved discontinuous anesthesia presence. We also deleted the 43 records with no valid anesthesia CPT code assigned by the anesthesia billing office, leaving 31,592 cases.
The diversity (degree of specialization) of each faculty anesthesiologist’s practice was measured in 3 different ways, attributing each case to: (1) the anesthesiologist who supervised for the longest total period of time, (2) the anesthesiologist starting the case (regardless of date or time), or (3) the anesthesiologist starting the case during “regular hours” (defined as a non-holiday Monday to Friday between 07:00 am and 02:59 pm). For the latter, we included only the corresponding 27,468 cases. After calculations were performed, we examined the anesthesia codes and anesthetic locations of the faculty anesthesiologists who had the least and greatest diversity of practice (Table 2; Figure).17–20 Each pattern of codes was reasonable.e
Our inferential objective was to test the slope of the association between the diversity of practice of the supervising faculty anesthesiologist and the quality of the anesthesiologist’s supervision. Complicating the linear regression, both the dependent variable (supervision) and the independent variable (specialization) were subject to measurement error (Table 2). The standard error of the estimated number of common procedures of each anesthesiologist was estimated using the first- and second-order Taylor series expansions (ie, Delta method; see equations 10 to 12 and footnote m in Reference 16).16,21 The standard errors differed among anesthesiologists (Figure). The anesthesiologists differed in numbers of cases and numbers of raters based, in part, on differences in specialties’ typical case durations and relative assignments of anesthesia residents versus nurse anesthetists.
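A first-order-only sketch of this standard error, using the standard multinomial (delta-method) variance of Simpson’s index, might look like the following. Note this is a simplification: the study also included second-order Taylor terms (equations 10 to 12 of Reference 16), which this sketch omits.

```python
from math import sqrt

def effective_number_with_se(counts):
    """Effective number of common procedures (1/Herfindahl) with a
    first-order delta-method standard error, treating the case mix
    as a multinomial sample.

    counts: list of case counts, one entry per CPT code.
    """
    n = sum(counts)
    p = [c / n for c in counts]
    h = sum(pi ** 2 for pi in p)                 # Herfindahl (Simpson) index
    # First-order multinomial variance of the Simpson index estimate.
    var_h = (4.0 / n) * (sum(pi ** 3 for pi in p) - h ** 2)
    var_h = max(var_h, 0.0)                      # guard against rounding
    d = 1.0 / h                                  # effective number of procedures
    se_d = sqrt(var_h) / h ** 2                  # delta method: |d(1/h)/dh| = 1/h^2
    return d, se_d

# Skewed case mix (70/10/10/10 over 4 codes): fewer effective procedures,
# with a nonzero standard error.
d, se = effective_number_with_se([70, 10, 10, 10])
```

The first-order term vanishes for a perfectly uniform case mix, which is one reason the study's fuller expansion is preferable in practice.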
Bivariate-weighted least-squares regression allowed for inverse weighting based on the squares of both the standard errors of the mean supervision scores and the estimated number of common procedures. The bivariate-weighted least-squares regression was performed using the iterative method described by York22 and Williamson.23 The equations are summarized by Cantrell24 (equation 5). Tellinghuisen25 performed Monte-Carlo simulations of the method with magnitudes of error similar to ours (Figure; f = 0.50 in his Table 1). The resulting parameter estimates were unbiased to the third significant digit.25
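A minimal sketch of the York iteration for uncorrelated errors in x and y (the setting here), with illustrative data rather than the study’s, is:

```python
def york_fit(x, y, sx, sy, tol=1e-12, max_iter=100):
    """Bivariate weighted least-squares line (York 1969; Williamson 1968)
    with uncorrelated per-point standard errors sx and sy.
    Returns (intercept, slope)."""
    n = len(x)
    # Start from the ordinary least-squares slope.
    xbar, ybar = sum(x) / n, sum(y) / n
    b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))
    for _ in range(max_iter):
        # Combined weights depend on the current slope estimate.
        w = [1.0 / (syi ** 2 + b ** 2 * sxi ** 2) for sxi, syi in zip(sx, sy)]
        sw = sum(w)
        xw = sum(wi * xi for wi, xi in zip(w, x)) / sw
        yw = sum(wi * yi for wi, yi in zip(w, y)) / sw
        u = [xi - xw for xi in x]
        v = [yi - yw for yi in y]
        beta = [wi * (ui * syi ** 2 + b * vi * sxi ** 2)
                for wi, ui, vi, sxi, syi in zip(w, u, v, sx, sy)]
        b_new = (sum(wi * bi * vi for wi, bi, vi in zip(w, beta, v))
                 / sum(wi * bi * ui for wi, bi, ui in zip(w, beta, u)))
        converged = abs(b_new - b) < tol
        b = b_new
        if converged:
            break
    a = yw - b * xw
    return a, b

# Collinear toy data (y = 2x + 1): the fit recovers the line exactly,
# whatever the assigned standard errors.
a, b = york_fit([0, 1, 2, 3], [1, 3, 5, 7], [0.1] * 4, [0.2] * 4)
```

When all sx are zero this reduces to ordinary weighted least squares, which is one way to sanity-check an implementation.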
Typical unweighted least-squares linear regression was not used in Table 3 because unweighted regression assumes that there is equal measurement error of the dependent variable (supervision) among faculty anesthesiologists and no error in the independent variable (specialization). Ignoring the measurement error in the independent variable (1) attenuates the regression estimate, making the effect smaller than it actually is, and (2) increases the residual error variance, reducing the power to detect a slope different from zero.26 This can be seen by comparing the results of Table 3 with the first rows of each of the 2 sections of Table 4 (ie, the rows without added confounders). In the subsequent rows of Table 4, unweighted linear regression was used for follow-up exploratory analyses that controlled for potential covariates.
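The attenuation can be demonstrated with a small simulation (the parameters below are illustrative only). With the measurement-error variance equal to the true predictor’s variance, the reliability ratio is 0.5, so ordinary least squares recovers roughly half the true slope:

```python
import random

random.seed(0)
n, true_slope = 20_000, 1.0
x_true = [random.gauss(0, 1) for _ in range(n)]            # true predictor
y = [true_slope * xi + random.gauss(0, 0.5) for xi in x_true]
x_obs = [xi + random.gauss(0, 1) for xi in x_true]         # observed with error

def ols_slope(x, y):
    """Unweighted ordinary least-squares slope."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

# Reliability ratio = var(x_true) / (var(x_true) + var(error)) = 0.5,
# so the fitted slope is attenuated toward about 0.5 instead of 1.0.
print(round(ols_slope(x_obs, y), 2))
```

Regressing y on the error-free x_true instead recovers a slope near 1.0, illustrating why ignoring error in the independent variable biases the estimate toward zero.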
P < .001 was treated as significant because 12 comparisons were made (2 rater types × 2 periods × 3 attributions of each case to anesthesiologists) and because we had no prior knowledge of the distribution, among our faculty anesthesiologists, of their numbers of common anesthesia procedures.
Characteristics of the anesthesiologists’ supervision scores and numbers of common anesthesia procedures are provided in Table 2.
The point estimates of all 28 slopes listed in Tables 3 and 4 are in the direction of greater specialization of practice of the evaluated faculty anesthesiologist being associated with slightly lower supervision scores. Based on 99.9% confidence limits, the largest potential positive association would be a change of 10 common procedures associated with a 0.027 greater supervision score on the 4-point scale (Table 3 legend); this is negligible.1–9 The difference of 10 procedures that we used as the denominator for the slope of the regression line (Table 3) is typical for the total diversity for a US facility (ie, entire horizontal axis of Figure; see Discussion).13
Among supervision scores provided by nurse anesthetists, during the second of the two 6-month periods, the association was statistically significant whether the case was attributed to: (1) the anesthesiologist supervising the case for the longest total period of time or (2) the anesthesiologist starting the case (Table 3). However, the slopes of the relationship were small (eg, 0.109 ± 0.025 [SE] units on the 4-point scale for a change of 10 common procedures). The associations were not statistically significant by unweighted linear regression when controlling for potential confounders, except for the average weekly hours of supervision in the operative settings studied (Table 4).
Among supervision scores provided by anesthesia residents, the associations were significant for the first of the two 6-month periods. However, again, the associations were small (eg, 0.127 ± 0.027 units for a change of 10 common procedures). The associations were not statistically significant by unweighted linear regression when controlling for potential confounders.
Our department’s faculty anesthesiologists’ heterogeneity of practice was sufficiently large that some of the associations we found between greater degrees of clinical specialization (ie, fewer common procedures) and slightly lower supervision scores were statistically significant (eg, Figure; uncorrected P = .0001). However, the magnitudes of the associations were small.1–9 We interpret our findings as providing evidence that greater clinical specialization does not result in better quality of clinical supervision.
Previously, Lubarsky and Reves27 described how “formal groupings of faculty [anesthesiologists] practicing exclusively in one area occur most commonly in cardiac anesthesia … Pediatrics often [has] dedicated faculty, but they might routinely provide care outside their field of expertise. In many obstetric anesthesia practices, dedicated individuals provide day-time care, and faculty [anesthesiologists] focusing in another area often provides nighttime coverage.” They surveyed their faculty before and after “reorganization” that “limited practice almost entirely within one or two surgical subspecialty areas.”27 They found a “significant reduction” in the perceived “excellent relationship with the residents” (mean 0.66 less on a 1–5 scale, P = .0051) and in the perceived “excellent relationship with our nurse anesthetists” (mean 0.61 less on the 1–5 scale, P = .0127).27 Both findings were relatively small reductions in the quality of the relationships and were not significant when corrected for the false discovery rate among the 29 other questions.28 They concluded: “subspecialty reorganization of the faculty [does] not substantially improve the faculty’s sense of satisfaction with their workplace.”27 Our findings suggest that, likewise, clinical specialization does not improve anesthesia residents’ and nurse anesthetists’ perceptions of the faculty anesthesiologists’ contribution to patient care.10 Anesthesia subspecialists do not appear to have an advantage over generalists when it comes to providing effective clinical supervision from the perspective of residents and nurse anesthetists.
Our results are important, in part, because quality supervision matters. From a recent editorial in Anesthesia & Analgesia: “Don’t we all recognize in our own institution the clinical superstars, the clinical workhorses, and the clinical dullards we never want to anesthetize us?”29 The anesthesia residents who work daily with the anesthesiologists make those distinctions and reliably so when using the supervision scale.3 Resident choices of anesthesiologists to care for their families are correlated with the supervision score (Kendall’s τb = +0.77, P < .0001).3 Supervision quality is positively correlated with all the dimensions of safety culture (all P < .0001).8 Anesthesia residents who report mean supervision scores for their department that are <3.00 (ie, less than “frequent”) report making more “mistakes that have negative consequences for the patient,” with an accuracy (area under the curve) of 89% (99% confidence interval, 77%–95%).2 Supervision less than “frequent” (ie, <3.00) predicts “medication errors (dose or incorrect drug) in the last year” with an accuracy of 93% (99% confidence interval, 77%–98%).2 Among residents reporting overall supervision during their current rotation that is less than frequent (ie, <3.00) versus frequent, the 10th, 25th, 50th, 75th, 90th, and 95th percentiles of errors are 1 vs 1, 1 vs 1, 2 vs 2, 3 vs 2, 4 vs 3, and 6 vs 4, respectively (P < .0001).8 There is no detected effect of resident burnout on numbers of reported errors while controlling for supervision (all P > .138 by different types of analyses).8
Our results are important, also, because clinical specialization is expensive.13 We recently analyzed the American Society of Anesthesiologists’ National Anesthesia Clinical Outcomes Registry (NACOR).13 Diversity of procedures was measured the same way as in our current article.13,16 For most facilities nationwide in the United States, the numbers of different procedures commonly performed by anesthesiologists during regular hours and off hours (ie, nights and weekends) were essentially the same, but there was only moderate similarity in the procedures between these periods.13 Thus, anesthesiologists who work principally within a single specialty during regular work hours likely do not have substantial contemporary experience with many procedures performed during off hours.13 Because patient outcomes are generally30 found to be worse for surgery during nights and weekends,31–34 there would be a logical inconsistency for a facility to choose to use subspecialty anesthesia teams to improve patient outcomes during regular hours, but not to do so off hours.13 Thus, when an anesthesia department considers specialization and the implications of the current article’s results, it must consider not only regular work hours but also the large costs of maintaining specialization during nights and weekends.13 Our findings suggest that clinical specialization beyond the major separately board-certified subspecialties might be an unnecessary organizational cost.
The statistical significance that we observed (Table 3) likely was enhanced by the wide spread of anesthesiologists’ diversity of practice in our department (ie, horizontal axis; Table 2 and Figure). Excluding the few anesthesiologists who had relatively lesser diversity (eg, ≤15 common types of anesthesia procedures) eliminates the statistical significance of the relationship. However, clinical diversity at this level (eg, ≤15 common types of anesthesia procedures) is typical for many anesthesia practices nationally. For the average facility in the United States, case diversity values for “regular hours,” “evenings,” and “weekends” are 13.59 ± 0.12, 13.12 ± 0.13, and 9.43 ± 0.13 procedures, respectively.13 In other words, the difference of 10 procedures that we used as the denominator for the slope of the regression line (Table 3) is typical for the total diversity for a US facility. Thus, what makes our department unusual is its large diversity of clinical practice within a single hospital, a focus of several previous studies.16,19,20f Consequently, we doubt that the significant correlation that we detected (Table 3; Figure) will be generalizable to most US hospitals because most hospitals have much less clinical diversity. At facilities with less clinical diversity, it is doubtful that there would be significant correlation between individual anesthesiologists’ specialization and supervision scores. This “limitation” is “good” operationally, because most hospitals are much smaller than the University of Iowa. At smaller hospitals, specialization among anesthesiologists can be difficult to implement because, to have staff scheduling done by service, generally there needs to be different numbers of specially trained anesthesiologists scheduled among days of the week.35
In conclusion, Lubarsky and Reves27 previously found that when faculty specialized, the faculty reported significantly reduced interaction with anesthesia residents and nurse anesthetists. We similarly found that specialization was associated with the faculty receiving statistically significant but small reductions in the quality of supervision that they provided to residents and nurse anesthetists.
Name: Franklin Dexter, MD, PhD.
Contribution: This author helped design the study, conduct the study, analyze the data, and write the manuscript.
Conflicts of Interest: The Division of Management Consulting performs some of the analyses described in this paper. Dr Dexter receives no funds personally other than his salary and allowable expense reimbursements from the University of Iowa and has tenure with no incentive program. He and his family have no financial holdings in any company related to his work, other than indirectly through mutual funds for retirement. Income from the Division’s consulting work is used to fund Division research.
Name: Johannes Ledolter, PhD.
Contribution: This author helped analyze the data.
Conflicts of Interest: None.
Name: Richard H. Epstein, MD.
Contribution: This author helped design the study and write the manuscript.
Conflicts of Interest: None.
Name: Bradley J. Hindman, MD.
Contribution: This author helped design the study, conduct the study, and write the manuscript.
Conflicts of Interest: None.
This manuscript was handled by: Nancy Borkowski, DBA, CPA, FACHE, FHFMA.
1. de Oliveira Filho GR, Dal Mago AJ, Garcia JH, Goldschmidt R. An instrument designed for faculty supervision evaluation by anesthesia residents and its psychometric properties. Anesth Analg. 2008;107:1316–1322.
2. De Oliveira GS Jr, Rahmani R, Fitzgerald PC, Chang R, McCarthy RJ. The association between frequency of self-reported medical errors and anesthesia trainee supervision: a survey of United States anesthesiology residents-in-training. Anesth Analg. 2013;116:892–897.
3. Hindman BJ, Dexter F, Kreiter CD, Wachtel RE. Determinants, associations, and psychometric properties of resident assessments of anesthesiologist operating room supervision. Anesth Analg. 2013;116:1342–1351.
4. Dexter F, Logvinov II, Brull SJ. Anesthesiology residents’ and nurse anesthetists’ perceptions of effective clinical faculty supervision by anesthesiologists. Anesth Analg. 2013;116:1352–1355.
5. Dexter F, Ledolter J, Smith TC, Griffiths D, Hindman BJ. Influence of provider type (nurse anesthetist or resident physician), staff assignments, and other covariates on daily evaluations of anesthesiologists’ quality of supervision. Anesth Analg. 2014;119:670–678.
6. Dexter F, Ledolter J, Hindman BJ. Bernoulli Cumulative Sum (CUSUM) control charts for monitoring of anesthesiologists’ performance in supervising anesthesia residents and nurse anesthetists. Anesth Analg. 2014;119:679–685.
7. Hindman BJ, Dexter F, Smith TC. Anesthesia residents’ global (departmental) evaluation of faculty anesthesiologists’ supervision can be less than their average evaluations of individual anesthesiologists. Anesth Analg. 2015;120:204–208.
8. De Oliveira GS Jr, Dexter F, Bialek JM, McCarthy RJ. Reliability and validity of assessing subspecialty level of faculty anesthesiologists’ supervision of anesthesiology residents. Anesth Analg. 2015;120:209–213.
9. Dexter F, Masursky D, Hindman BJ. Reliability and validity of the anesthesiologist supervision instrument when certified registered nurse anesthetists provide scores. Anesth Analg. 2015;120:214–219.
10. Dexter F, Hindman BJ. Quality of supervision as an independent contributor to an anesthesiologist’s individual clinical value. Anesth Analg. 2015;121:507–513.
11. Dexter F, Szeluga D, Masursky D, Hindman BJ. Written comments made by anesthesia residents when providing below average scores for the supervision provided by the faculty anesthesiologist. Anesth Analg. 2016;122:1999–2005.
12. Epstein RH, Dexter F, Patel N. Influencing anesthesia provider behavior using anesthesia information management system data for near real-time alerts and post hoc reports. Anesth Analg. 2015;121:678–692.
13. Dexter F, Epstein RH, Dutton RP, et al. Diversity and similarity of anesthesia procedures in the United States during and among regular work hours, evenings, and weekends. Anesth Analg. 2016;123:1567–1573.
14. Dexter F, Wachtel RE, Epstein RH. Decreasing the hours that anesthesiologists and nurse anesthetists work late by making decisions to reduce the hours of over-utilized operating room time. Anesth Analg. 2016;122:831–842.
15. Kreiter CD, Wilson AB, Humbert AJ, Wade PA. Examining rater and occasion influences in observational assessments obtained from within the clinical environment. Med Educ Online. 2016;21:29279.
16. Dexter F, Ledolter J, Hindman BJ. Quantifying the diversity and similarity of surgical procedures among hospitals and anesthesia providers. Anesth Analg. 2016;122:251–263.
17. Dexter F, Thompson E. Relative value guide basic units in operating room scheduling to ensure compliance with anesthesia group policies for surgical procedures performed at each anesthetizing location. AANA J. 2001;69:120–123.
18. Dexter F, Macario A, Penning DH, Chung P. Development of an appropriate list of surgical procedures of a specified maximum anesthetic complexity to be performed at a new ambulatory surgery facility. Anesth Analg. 2002;95:78–82.
19. Dexter F, Wachtel RE, Yue JC. Use of discharge abstract databases to differentiate among pediatric hospitals based on operative procedures: surgery in infants and young children in the state of Iowa. Anesthesiology. 2003;99:480–487.
20. Wachtel RE, Dexter F. Differentiating among hospitals performing physiologically complex operative procedures in the elderly. Anesthesiology. 2004;100:1552–1561.
21. Simpson EH. Measurement of diversity. Nature. 1949;163:688.
22. York D. Least squares fitting of a straight line with correlated errors. Earth Planet Sci Lett. 1969;5:320–324.
23. Williamson JH. Least-squares fitting of a straight line. Can J Phys. 1968;46:1845–1847.
24. Cantrell CA. Review of methods for linear least-squares fitting of data and application to atmospheric chemistry problems. Atmos Chem Phys. 2008;8:5477–5487.
25. Tellinghuisen J. Least-squares analysis of data with uncertainty in x and y: a Monte Carlo methods comparison. Chemom Intell Lab Syst. 2010;103:160–169.
26. Fuller WA. Measurement Error Models. New York, NY: John Wiley & Sons; 1987:1–99.
27. Lubarsky DA, Reves JG. Effect of subspecialty organization of an academic department of anesthesiology on faculty perceptions of the workplace. J Am Coll Surg. 2005;201:434–437.
28. Vasilopoulos T, Morey TE, Dhatariya K, Rice MJ. Limitations of significance testing in clinical research: a review of multiple comparison corrections and effect size calculations with correlated measures. Anesth Analg. 2016;122:825–830.
29. Shafer SL. Anesthesiologists make a difference. Anesth Analg. 2015;120:497–498.
30. Sessler DI, Kurz A, Saager L, Dalton JE. Operation timing and 30-day mortality after elective general surgery. Anesth Analg. 2011;113:1423–1428.
31. Singla AA, Guy GS, Field JB, Ma N, Babidge WJ, Maddern GJ. No weak days? Impact of day in the week on surgical mortality. ANZ J Surg. 2016;86:15–20.
32. Aylin P, Alexandrescu R, Jen MH, Mayer EK, Bottle A. Day of week of procedure and 30 day mortality for elective surgery: retrospective analysis of hospital episode statistics. BMJ. 2013;346:f2424.
33. Whitlock EL, Feiner JR, Chen LL. Perioperative Mortality, 2010 to 2014: a retrospective cohort study using the national anesthesia clinical outcomes registry. Anesthesiology. 2015;123:1312–1321.
34. Glance LG, Osler T, Li Y, et al. Outcomes are worse in US patients undergoing surgery on weekends compared with weekdays. Med Care. 2016;54:608–615.
35. Dexter F, Wachtel RE, Epstein RH, Ledolter J, Todd MM. Analysis of operating room allocations to optimize scheduling of specialty rotations for anesthesia trainees. Anesth Analg. 2010;111:520–524.