Economics, Education, And Policy: Research Report

Anesthesia Residents’ Global (Departmental) Evaluation of Faculty Anesthesiologists’ Supervision Can Be Less Than Their Average Evaluations of Individual Anesthesiologists

Hindman, Bradley J. MD*; Dexter, Franklin MD, PhD; Smith, Thomas C. BS*

doi: 10.1213/ANE.0000000000000444

Supervision of anesthesiology residents is a major daily responsibility of faculty (academic) anesthesiologists. The word “supervision” is not used here as a U.S. billing term. Rather, we use the term to include all clinical oversight functions directed toward assuring the quality of clinical care whenever the anesthesiologist is not the sole anesthesia care provider. Supervision of residents is required for both postgraduate medical educationa and billing compliance.b

We and others have been studying how the quality of anesthesiologists’ supervision in operating rooms can be evaluated, both individually1–4 and departmentally.5–8 For example, mean supervision scores of individual faculty were highly correlated with residents’ choice of the anesthesiologist to care for their families (Kendall τb = +0.77, P < 0.0001).2 Using a validated 4-point (1–4) supervision scale (Table 1), 44 of 47 (94%) anesthesiology residents reported the minimum level of supervision that they expect from individual faculty anesthesiologists to be at least 3.00 (i.e., at least “frequent”).3 The mean individual faculty supervision score to meet minimum resident expectations was 3.40 ± 0.30 (SD).3 In a different study, anesthesiology residents who reported mean supervision scores for their entire department that were <3.00 reported making significantly more “mistakes that have negative consequences for the patient” and more “medication errors (dose or incorrect drug).”7 Therefore, prior studies suggest that resident supervision by faculty anesthesiologists, both at the individual faculty level and at the departmental level, may be related to the quality of care patients receive.

Table 1: de Oliveira Filho et al.’s Instrument1,3 for Measuring Faculty Anesthesiologists’ Supervision of Anesthesiology Residents During Clinical Operating Room Care

Although it is logical for there to be a relationship between residents’ supervision scores for individual faculty anesthesiologists and their supervision scores for the entire department (composed of these individual faculty), the relationship is not known. The supervision score of the entire department may or may not be the mean of all the individual faculty scores. Therefore, from each resident in our program, we obtained the mean of all individual faculty evaluations over a 36-week period and compared that mean with the resident’s corresponding global evaluation of all faculty anesthesiologists with whom they worked over the same period.


Methods

The University of Iowa IRB approved this survey study and waived the requirement to obtain a subject signature as documentation of consent.

We studied all anesthesiology residents in the clinical years 1, 2, and 3 (i.e., neither in the “base year” nor in fellowship). Consequently, there was a finite population. Our power analysis was relevant only for confirming that we could detect an important difference. A 5% difference seemed the smallest that would be managerially relevant. When we designed this study, we had 3 months of evaluations, and the mean of individual faculty scores was 3.71 ± 0.30 (mean ± SD of each resident’s scores for all faculty anesthesiologists each had evaluated). A 5% difference was an effect size of 0.63, where 0.63 = 5% × 3.71/0.30. At α = 0.05, and with N = 35, 30, or 25 respondents, we would have 95%, 92%, or 86% statistical power. Since there were N = 39 residents, we considered it reasonable to proceed with the study provided we could achieve at least a 75% response rate (see below).
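The power calculation can be approximated with a short sketch. The source does not state which test the power analysis assumed, so the code below is an illustration only: it uses a normal approximation to a two-sided one-sample (paired) test, and the function name `approx_power` is hypothetical. With the rounded inputs shown, 5% × 3.71/0.30 evaluates to about 0.62; the reported 0.63 was presumably computed from unrounded values. The normal approximation gives power slightly higher than the reported 95%, 92%, and 86%, as expected when compared with a t-based calculation.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n, alpha=0.05):
    """Normal-approximation power of a two-sided one-sample (paired) test.

    Assumption: the test statistic is treated as normal, not Student t,
    so the result slightly overstates power for small n.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n)  # expected value of the test statistic
    # Probability of rejecting in either tail when the true shift is `shift`.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Effect size from the rounded summary statistics quoted in the text.
mean_score, sd_score = 3.71, 0.30
effect_size = 0.05 * mean_score / sd_score
print(f"effect size from rounded inputs: {effect_size:.2f}")

# Power at the three candidate sample sizes, using the reported 0.63.
for n in (35, 30, 25):
    print(n, round(approx_power(0.63, n), 2))
```

A 75% response rate among 39 residents corresponds to about 29 respondents, which falls between the N = 30 and N = 25 cases evaluated above.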

Daily Evaluation of Individual Faculty Supervision

Starting July 1, 2013, our department has sent daily e-mail requests to anesthesiology residents to evaluate the supervision provided by each anesthesiologist with whom they worked the previous day.2,3,7,8 Evaluation requests are faculty specific (derived from the electronic medical record) and are requested when, on any given day, a resident and faculty work together for at least 1 hour in either our hospital’s main surgical suite, ambulatory surgery center, urology suite, labor and delivery suite, electroconvulsive therapy suite, or pediatric catheterization laboratory. For details, see the Appendix of our recent article.4

Residents evaluate faculty supervision utilizing a secure Web page by answering 9 questions developed by de Oliveira Filho et al.1 In prior studies, we observed that individual faculty supervision scores were: (1) negligibly greater when, on the day being evaluated, a resident had more hours and intensity of care with the rated anesthesiologist (Kendall τb = +0.083 ± 0.014 [SE]) and (2) negligibly less when, on the day being evaluated, the rated anesthesiologist had more work with other (nonfaculty) anesthesia providers (τb = −0.057 ± 0.014).4 Similarly, individual faculty supervision scores were negligibly correlated with the number of days residents and faculty worked together (τb = −0.069 ± 0.023).4 Because faculty assignments differed dramatically among days (P < 0.0001), knowing of this lack of association was a necessary condition for the current study.

Evaluation of Global (Departmental) Faculty Supervision

Each resident (clinical anesthesia year [CA]-1, -2, -3) received 3 e-mails 1 week apart, to announce the study, to serve as a reminder and, finally, as a formal invitation to participate. Residents who chose to participate clicked a hyperlink in the third e-mail, taking them to a secure Web page where consent was verified before the survey (Table 1) was presented. The study survey asked each resident to evaluate the overall supervision received from all faculty anesthesiologists with whom they had worked in operating rooms from Monday, July 1, 2013, to Sunday, March 9, 2014. Participants had 3 days to complete the survey, between 6:00 AM on Friday, March 14 and 11:59 PM on Sunday, March 16. Each resident who completed the evaluation received a check for $25 mailed to his or her home address. Reliability and validity of the instrument when applied departmentally are shown in our companion article.9

Statistical Analysis

Resident and faculty identities were coded such that analysis was performed on a blinded basis. For each resident, comparison was made between the mean departmental supervision score provided on the study survey and the mean of all of their previously completed routine daily individual faculty evaluations, completed between July 1, 2013, and Thursday, March 13, 2014. The mean faculty supervision score was used because the mean is proportional to the cumulative global (departmental) score of interest multiplied by a constant, the total number of resident–faculty interactions.
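As an illustration of the comparison described above, the following sketch computes one resident's global/individual ratio. It assumes that each evaluation is scored as the mean of its 9 item responses and that the individual score is the mean of those per-evaluation scores; the function names and the example responses are hypothetical and are not study data.

```python
from statistics import mean


def evaluation_score(answers):
    """Score of one evaluation: mean of the 9 item responses (each 1 to 4)."""
    if len(answers) != 9:
        raise ValueError("expected 9 item responses")
    return mean(answers)


def global_to_individual_ratio(global_answers, daily_evaluations):
    """Ratio of the single global score to the mean of the daily scores."""
    individual_mean = mean(evaluation_score(e) for e in daily_evaluations)
    return evaluation_score(global_answers) / individual_mean


# Hypothetical responses for one resident (illustrative values only).
global_answers = [3, 3, 4, 3, 3, 3, 4, 3, 3]
daily_evaluations = [
    [4, 4, 4, 4, 4, 4, 4, 4, 4],
    [4, 4, 3, 4, 4, 4, 3, 4, 4],
    [3, 4, 4, 4, 4, 3, 4, 4, 4],
]
ratio = global_to_individual_ratio(global_answers, daily_evaluations)
print(f"global/individual ratio: {ratio:.3f}")
```

A ratio below 1.0 means that the resident rated the department as a whole lower than the average of his or her ratings of individual faculty.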

All P values were calculated using Monte Carlo simulation to 4 decimal places (StatXact-10, Cytel Inc., Cambridge, MA). Uniform distribution of the ratios between the observed minimum and maximum values was evaluated using the 1-sample Kolmogorov-Smirnov test. The 95% confidence interval (CI) for the median of the ratios was calculated using the Hodges-Lehmann method. The sign test compared the ratios to 1.0. Associations of ratios with continuous independent variables (numbers of evaluations and means of individual faculty scores) were evaluated using Kendall τb and reported ±SE. Associations of ratios with anesthesia resident class (e.g., CA-1, CA-2, CA-3) were evaluated using the Kruskal-Wallis test.
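Two of the procedures named above can be sketched with the Python standard library. This is an assumption-laden illustration and not the StatXact analysis: the sign test below uses an exact binomial tail rather than Monte Carlo simulation, the Hodges-Lehmann function returns only the point estimate (the median of pairwise averages) and not the confidence interval, and the example ratios are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb
from statistics import median


def sign_test_p(values, reference=1.0):
    """Exact two-sided sign test against `reference`; ties are dropped."""
    below = sum(v < reference for v in values)
    above = sum(v > reference for v in values)
    n = below + above
    k = min(below, above)
    # Binomial(n, 0.5) probability of a split at least as extreme as observed.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def hodges_lehmann(values):
    """Hodges-Lehmann point estimate: median of all pairwise (Walsh) averages."""
    return median(
        (a + b) / 2 for a, b in combinations_with_replacement(values, 2)
    )


# Hypothetical global/individual ratios (illustrative values only).
ratios = [0.78, 0.82, 0.85, 0.86, 0.88, 0.91, 0.95]
print(sign_test_p(ratios))
print(hodges_lehmann(ratios))
```

If all 39 ratios were strictly below 1.0, the exact two-sided sign test would give P = 2 × 0.5^39, far below 0.0001.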


Results

All 39 of the program residents chose to participate. Figure 1 shows the ratio of each resident’s global (departmental) evaluation score of faculty supervision (i.e., mean among 9 questions × 1 evaluation) to the same resident’s evaluations of individual faculty (i.e., mean among 9 questions × many daily evaluations). The mean global supervision score (3.22 ± 0.34, N = 39)c was significantly less (P < 0.0001) than the mean of individual faculty scores (3.75 ± 0.24, Table 2). The correlation between global and (mean) individual faculty scores was τb = 0.34 ± 0.11 (P = 0.0032). Using each resident’s (mean) individual faculty score, and segmenting the residents into 2 groups, residents who provided a global (departmental) score of ≤3.0 had lesser (mean) individual faculty scores than residents who provided global scores of >3.0 (Mann-Whitney P = 0.015). The median global/individual ratio was 86.2% (95% CI, 83.4%–89.0%). Using the most recent half of data, the median ratio and CI were 85.8% (95% CI, 83.0%–88.6%). Using the most recent 19 evaluations from each resident, the median ratio and CI were 85.9% (95% CI, 83.1%–88.7%). All ratios were ≤1.0 (P < 0.0001, Fig. 2).

Figure 1: Ratio of each resident’s global (departmental) evaluation score of faculty supervision (i.e., mean among 9 questions × 1 evaluation) to the resident’s evaluations of individual faculty (i.e., mean among 9 questions × many evaluations). The median was 86% (95% confidence interval, 83%–89%). The corresponding summary statistics are in Table 2. The figure shows the uniform distribution (P = 0.64) described in the second paragraph of the Results and in the Discussion.

Table 2: Summary of the N = 39 Observational Data

Figure 2: Data from Figure 1 plotted as each resident’s global (departmental) evaluation score of faculty supervision (i.e., mean among 9 questions × 1 evaluation) versus the resident’s evaluations of individual faculty (i.e., mean among 9 questions × many evaluations). The ratios are all ≤1.0 (sign test P < 0.0001).

The median was an appropriate summary statistic based on multiple analyses. The global/individual ratios were uniformly distributed (P = 0.64) between the observed minimum and maximum (Fig. 1). The ratios were not correlated with the mean value of individual faculty scores previously provided by each resident (P = 0.64, τb = −0.055 ± 0.12). The ratios were not correlated with the number of individual faculty evaluations previously provided by each resident (P = 0.49, τb = 0.08 ± 0.10). The ratios did not differ among the 3 resident classes (P = 0.37).d


Discussion

Using the de Oliveira Filho et al. supervision question set,1 de Oliveira et al.7,9 found that residents who reported mean department-wide supervision scores <3.0 (“frequent”) reported significantly more frequent occurrences of mistakes with negative consequences to patients, as well as medication errors. In the accompanying Editorial, the recommendation for program administrators was to “measure … residents’ perception of overall faculty supervision and set the expectation that the faculty overall provide a minimum of frequent supervision to residents.”8 Figure 1 shows that our residents’ perceptions of overall (departmental) faculty supervision were not the same as the overall average of their perceptions of individual faculty supervision, but the two were significantly correlated. The ratios of departmental to individual evaluations of supervision did not depend on the characteristics of the residents or on the most recent faculty experiences of the residents.

Anesthesiology residents’ evaluations of the quality of overall (departmental) faculty supervision were a median of 86% (95% CI, 83%–89%) of their corresponding scores of individual faculty anesthesiologists. Previously, (different) anesthesiology residents reported that the supervision score of individual anesthesiologists who met minimum expectations was 3.40 ± 0.30.3 Multiplying 3.40 ± 0.30 by 86% gives a minimum expected departmental supervision score of 2.92 ± 0.26, no different from 3.0, the critical departmental value associated with increased resident self-reports of clinical errors.7 Therefore, 3 different studies regarding resident evaluations of supervision have given comparable results.
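The scaling in the preceding paragraph can be checked with a few lines of arithmetic. The sketch treats the median ratio of 86% as a fixed constant applied to both the mean and the SD, which is a simplifying assumption; the function name `scale_score` is hypothetical.

```python
def scale_score(mean_score, sd_score, ratio):
    """Scale a mean and SD by a constant ratio (assumes the ratio is fixed)."""
    return mean_score * ratio, sd_score * ratio


# Individual-level minimum expectation (3.40 ± 0.30) scaled by the median ratio.
dept_mean, dept_sd = scale_score(3.40, 0.30, 0.86)
print(f"{dept_mean:.2f} ± {dept_sd:.2f}")  # 2.92 ± 0.26

# Distance of the critical departmental value 3.0 from the scaled mean, in SDs.
gap_in_sd = (3.0 - dept_mean) / dept_sd
print(f"3.0 lies {gap_in_sd:.2f} SD above the scaled mean")
```

The critical value of 3.0 lies well within 1 SD of the scaled mean, consistent with the statement that the two are not different.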

Our study is limited in being from a single department, but this also provides an advantage for understanding the relationship between residents’ global (departmental) scores and their reports of patients’ outcomes.7 When residents are surveyed nationally, reports of errors are associated principally with the department’s supervision, but also with resident class and resident average work hours ≥70 hours per week.7 Yet, there were no differences among our residents’ classes in global (departmental) scores (P = 0.41) or ratios (P = 0.37) and none of our residents had reported duty hours exceeding 60 hours per week. In national surveys, reports of errors also are associated with burnout and depression.10 However, that is a small minority of residents, and yet, the ratios were distributed uniformly among our residents (i.e., not bimodal; Fig. 1). Consequently, when providing evaluations of individual faculty, residents may be biased to provide favorable scores because of appreciation for or loyalty to the individual faculty anesthesiologist with whom they just cared for a patient. Alternatively, even though the evaluation process is highly confidential,2,11 it is not anonymous, and residents may tend not to provide unfavorable individual scores out of fear that, despite safeguards, faculty may learn of their identity.

Regardless, in conclusion, departments should consider that their residents’ perceptions of program-wide (departmental) supervision may be significantly less than their residents’ evaluations of individual faculty anesthesiologists (Figs. 1 and 2). This should be considered when interpreting national survey results (e.g., of patient safety), residency program evaluations, and individual faculty anesthesiologist performance.


Franklin Dexter is the Statistical Editor and Section Editor for Economics, Education, and Policy for Anesthesia & Analgesia. This manuscript was handled by Dr. Steven L. Shafer, Editor-in-Chief, and Dr. Dexter was not involved in any way with the editorial process or decision.


Name: Bradley J. Hindman, MD.

Contribution: This author helped design the study, conduct the study, and write the manuscript. This author is the archival author.

Attestation: Bradley J. Hindman has approved the final manuscript.

Name: Franklin Dexter, MD, PhD.

Contribution: This author helped design the study, conduct the study, analyze the data, and write the manuscript.

Attestation: Franklin Dexter has approved the final manuscript.

Name: Thomas C. Smith, BS.

Contribution: This author helped conduct the study.

Attestation: Thomas C. Smith has approved the final manuscript.


a ACGME Program Requirements for Graduate Medical Education in Anesthesiology. See Section II.B.2.a. Available at: Accessed September 8, 2014.

b Department of Health and Human Services, Centers for Medicare and Medicaid Services. CMS Manual System. Pub 100–04 Medicare Claims Processing, Transmittal 1859, November 20, 2009. Subject: MIPPA Section 139 Teaching Anesthesiologists. Available at: Accessed November 28, 2013.

c De Oliveira et al.7 reported median supervision scores for each of the 3 resident classes. The weighted mean of the medians among the 3 classes nationwide equaled 3.25, essentially indistinguishable from the value obtained among the residents in our department.

d The global (departmental) evaluation scores did not differ among the 3 resident classes (see Discussion), P = 0.41.


1. de Oliveira Filho GR, Dal Mago AJ, Garcia JH, Goldschmidt R. An instrument designed for faculty supervision evaluation by anesthesia residents and its psychometric properties. Anesth Analg. 2008;107:1316–22
2. Hindman BJ, Dexter F, Kreiter CD, Wachtel RE. Determinants, associations, and psychometric properties of resident evaluations of faculty operating room supervision in a US anesthesia residency program. Anesth Analg. 2013;116:1342–51
3. Dexter F, Logvinov II, Brull SJ. Anesthesiology residents’ and nurse anesthetists’ perceptions of effective clinical faculty supervision by anesthesiologists. Anesth Analg. 2013;116:1352–5
4. Dexter F, Ledolter J, Smith TC, Griffiths D, Hindman BJ. Influence of provider type (nurse anesthetist or resident physician), staff assignments, and other covariates on daily evaluations of anesthesiologists’ quality of supervision. Anesth Analg. 2014;119:670–8
5. Paoletti X, Marty J. Consequences of running more operating theatres than anaesthetists to staff them: a stochastic simulation study. Br J Anaesth. 2007;98:462–9
6. Epstein RH, Dexter F. Influence of supervision ratios by anesthesiologists on first-case starts and critical portions of anesthetics. Anesthesiology. 2012;116:683–91
7. De Oliveira GS Jr, Rahmani R, Fitzgerald PC, Chang R, McCarthy RJ. The association between frequency of self-reported medical errors and anesthesia trainee supervision: a survey of United States anesthesiology residents-in-training. Anesth Analg. 2013;116:892–97
8. de Oliveira Filho GR, Dexter F. Interpretation of the association between frequency of self-reported medical errors and faculty supervision of anesthesiology residents. Anesth Analg. 2013;116:752–3
9. De Oliveira GS Jr, Dexter F, Bialek JM, McCarthy RJ. Reliability and validity of assessing subspecialty level of faculty anesthesiologists’ supervision of anesthesiology residents. Anesth Analg. 2015;120:209–143
10. de Oliveira GS Jr, Chang R, Fitzgerald PC, Almeida MD, Castro-Alves LS, Ahmad S, McCarthy RJ. The prevalence of burnout and depression and their association with adherence to safety and practice standards: a survey of United States anesthesiology trainees. Anesth Analg. 2013;117:182–93
11. Dexter F, Ledolter J, Hindman BJ. Bernoulli cumulative sum (CUSUM) control charts for monitoring of anesthesiologists’ performance in supervising anesthesia residents and nurse anesthetists. Anesth Analg. 2014;119:679–85
© 2015 International Anesthesia Research Society