As part of the Next Accreditation System, the Accreditation Council for Graduate Medical Education (ACGME) requires training programs to provide semiannual reports of progress along defined milestones for individual resident trainees.1 For internal medicine residency programs, these milestones measure progress in 22 subcompetency areas. Providing such granular progress reports may be challenging for clinical competency committees (CCCs), especially for large programs where CCC members are not able to directly observe all residents.
The Gap Between EPAs and Milestones
Many medicine residency programs have traditionally used evaluation questions framed as scales that compare each trainee with others (e.g., numeric 1–9 Likert scales, or “above,” “at,” or “below” expectation). These tools frequently suffer from halo effects and response range restriction and may not provide the granular evaluation data needed to complete milestone reports to the ACGME.2,3
Independent of the Next Accreditation System’s milestone reporting requirements, training programs across different subspecialties have been exploring evaluations based on the degree of faculty supervision required for a resident to perform given tasks, or trainee “trustworthiness.” First conceived by ten Cate, “entrustable professional activities” (EPAs) describe activities integral to each specialty that require specific knowledge and skills applied to particular clinical contexts.4–7 Attending physicians already use this multidimensional construct of trustworthiness when deciding what level of supervision is necessary for trainees in each specific situation.8 Prior literature has suggested that multiple learner, supervisor, and situational factors influence the decisions by attending physicians to trust their trainees.8–12 Generally, trainees with a higher degree of competence for a given context are allowed a greater degree of “entrustment” and require less direct supervision compared with trainees with a lower degree of competence.7,9,11 Evaluations that capture the degree of supervision required for a given EPA offer the opportunity to gauge trainee competence using this practice familiar to attending physicians.
However, medicine residency program directors who choose to assess competence using this concept of entrustment face the pragmatic problem of how to complete the required semiannual ACGME milestone reports. Residency programs face challenges in connecting results from EPA-based evaluations to milestones in the 22 ACGME subcompetencies because a wide gap separates the constructs that underpin these two measures of trainee competence.
In this article, we describe a process to bridge the gap between EPA-based evaluations and ACGME milestone reports, developed in 2012 and 2013. We share early experiences from the University of Washington, a large urban academic training program with more than 170 residents spread across three primary teaching hospitals, and we discuss challenges and benefits of this method.
Mapping EPAs to Milestones
Conceptual framework and terminology
Prior literature has suggested that aligning scales and anchors with the priorities of evaluators is associated with improved reliability and discrimination.13,14 Figure 1 displays the conceptual framework guiding our process and highlights the priorities of the education and evaluation “stakeholders” involved. For residents and attending physicians, rotation-based EPAs contextualize feedback in the authentic work of the clinic or hospital. For members of the CCC, tracking resident milestones across different rotations is necessary not only for semiannual ACGME reports but also because doing so may help identify trainees who require remediation.
In this article, we use the term “subcompetency” to describe each of the 22 domains reported semiannually to the ACGME for individual medicine residents15; these subcompetencies have been labeled as “reporting milestones” by other authors.16 We use the term “milestone reports” for the electronic files submitted to the ACGME by programs about residents’ progress in these subcompetencies.
Instructions provided from the ACGME’s Internal Medicine Milestones Project indicate that selecting response boxes at the bottom of each column “implies milestones in that column as well as in previous columns have been substantially demonstrated,” while selecting response boxes on the line between columns “indicates that milestones in lower levels have been substantially demonstrated as well as some milestones in the higher columns.”15 Because the term “milestones” is used variably among different authors, we prefer the term “milestone elements” to refer to the individual narrative descriptors in the columns of each subcompetency (see Figure 2 for an example of one such subcompetency and its milestone elements). Earlier versions of the internal medicine milestones focused on 142 items eventually labeled “curricular milestones.”17 The process described here does not incorporate curricular milestones because they are not directly identified in the ACGME reporting milestones.
Step 1: Identifying rotation-specific EPAs to evaluate residents
The Alliance for Academic Internal Medicine described 16 “end-of-training EPAs” that were developed from the perspective of a medicine resident ready to enter unsupervised practice.18 Although potentially helpful for residents considering the overall arc of their training or for residency programs assessing their educational offerings, these end-of-training EPAs (e.g., “Improve the quality of health care at both the individual and systems level”) are too broad for attending physicians evaluating residents in a discrete clinical environment.
In contrast, the activities of an individual resident performing the typical activities specific to a given clinical rotation can provide a useful opportunity for faculty evaluation and feedback. We use the term “rotation-specific EPAs” or “rotational EPAs” (elsewhere called “observed practice activities”)19 to distinguish rotation- or clinic-specific activities from the end-of-training EPAs.
For each rotation at the University of Washington, key faculty have selected 8 to 10 rotational EPAs to represent the essential work expected of a resident for that specific experience (see Appendix 1 for examples). These rotational EPAs are meant to be representative rather than comprehensive, as trying to capture a list of every possible activity expected of a resident would be impractical. These rotational EPAs are distributed to residents as learning objectives at the start of each rotation.
Assessment of residents across different clinical contexts requires creation of multiple rotation-specific evaluations, with some overlapping EPAs across rotations. In our program we chose a single set of EPAs for each rotation regardless of training level (i.e., no distinction between “intern” and “senior resident” evaluations). This allows the program to see residents’ progression across the different stages of their training. For example, an early intern on a general ward rotation might reasonably be expected to be less trusted to coordinate discharge compared with a finishing senior resident. Faculty members are also asked to make this assessment of the degree of entrustment for a given EPA regardless of a resident’s level (e.g., “this intern could do this activity independently without my supervision”) rather than norm referenced to training-level peers (e.g., “this intern is independent compared to other interns”). Faculty members are asked to complete evaluation items for only those EPAs directly observed during a rotation or clinic experience.
As ten Cate and Scheele6 pointed out, entrustment is not usually given as an “all-or-nothing” but, rather, is afforded in degrees depending on both the clinical context and the competence of the trainee. Faculty members can gauge the degree of entrustment they give to a resident by attesting to the degree of supervision the trainee requires for rotational EPAs. For our program, we have written evaluation questions to include four levels of supervision (complete, partial, minimal, or ready to perform independently), plus a fifth level to describe a resident as capable of performing an EPA at an aspirational level (Appendix 1). This scale differs slightly from those suggested by other authors,7,20 but we chose anchors to align with ACGME expectations of resident supervision21: Complete supervision: a faculty member is “physically present with the resident and the patient”; partial supervision: a supervising physician is “physically within the hospital … immediately available”; minimal supervision: the supervisor is “not physically present … but is immediately available by means of telephonic and/or electronic modalities.”21 Two final anchors are used similarly to how they are described in the Internal Medicine Milestones Project15: As if independent: the trainee “is ready for unsupervised practice” in the particular activity; and aspirational: the trainee is truly exceptional and “reflect[s] the competence of an expert or role model.”15
Step 2: Selecting subcompetencies for rotational EPAs
After 8 to 10 representative EPAs are selected for a rotation or clinic, core faculty provide a description of the key behaviors related to each activity. The descriptions are edited to favor behaviors observable by faculty members (e.g., “diagnose” or “manage,” rather than “understand” or “appreciate”). These descriptions provide evaluators a shared frame of reference when assessing entrustment for each EPA.
Core rotation faculty then review the 22 ACGME subcompetencies and identify those that are integral to the work of each EPA, based on the description they have written for that activity. In our program, most of these EPAs have been associated with 4 to 8 subcompetencies, although this number depends on the complexity of the EPA.
Step 3: Mapping EPA-based evaluations to milestone elements
The milestone elements that provide the narrative descriptions within each of the 22 subcompetencies serve as a catalog of discrete behaviors that link the EPA-based evaluations to the program’s ACGME milestone reports. Figure 3 illustrates how the milestone elements from the 4 to 8 connected subcompetencies can be mapped to the different evaluation response categories for a given EPA. Within a subcompetency, milestone elements from columns to the right naturally align with greater degrees of entrustment compared with milestone elements from columns to the left. Only those milestone elements relevant to a given EPA are mapped to the levels of entrustment for that activity. For example, milestone elements that reference the intensive care unit or outpatient clinic for subcompetency PC3 (Figure 2) would not be mapped to the evaluation item for an EPA from a general medicine ward experience.
Linking evaluation responses at the level of the individual milestone elements, rather than at the level of each column within a subcompetency, allows programs to determine mapping according to their own culture, values, and expectations. For example, our program expects that for PC4 (Skill in Performing Procedures), each trainee “maximizes patient comfort and safety when performing procedures” not only at the “aspirational” level (where the Internal Medicine Milestone Project placed this element)15; rather, we prioritize this behavior for all resident procedures and expect it at every level of entrustment above “complete supervision.”
Displaying Milestone Progress
Using EPA-based evaluations to track progress among milestone elements
In this system, attending physicians do not directly view the mapping of milestone elements to these evaluation responses; they are asked instead to focus on providing residents formative feedback about the rotational EPAs. Once the individual responses for each evaluation item have been mapped to a list of milestone elements from several subcompetencies, these relationships can be integrated into a residency management evaluation system or other computerized database. As refinements are made, mapping can be edited and adjusted by the training program without requiring that new evaluations be completed by faculty members.
When a faculty member completes an evaluation and selects a degree of trustworthiness for a specific rotational EPA, the milestone elements mapped to that answer are marked as “confirmed.” For example, when a supervising physician attests that a resident can coordinate a patient discharge with “minimal supervision,” this would confirm multiple milestone elements (Figure 3), including “recognizes the importance of communication during times of transition” (from the subcompetency Systems-Based Practice 4) and “incorporates patient-specific preferences into plan of care” (from the subcompetency Interpersonal and Communication Skills 1). Milestone elements mapped to lower degrees of trustworthiness (in this example, “partial supervision” or “complete supervision”) are also confirmed because these behaviors are assumed to be present or surpassed by attainment of the higher level of trustworthiness. In this example, a resident is assumed to have confirmed the milestone element (from SBP1) “participates in team discussions when required but does not actively seek input from others” once he or she is able to “actively engage in team meetings and collaborative decision-making.”
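The cascade just described (a single evaluation response confirms the elements mapped to that level of entrustment plus all elements mapped to lower levels) can be sketched in code. This is a minimal illustration only: the EPA name, the milestone-element identifiers, and their level assignments below are hypothetical, not our program’s actual mapping or software.

```python
# Ordered entrustment levels, lowest to highest (per the scale in Step 1).
ENTRUSTMENT_LEVELS = [
    "complete supervision",
    "partial supervision",
    "minimal supervision",
    "as if independent",
    "aspirational",
]

# Hypothetical mapping for one rotational EPA: each response category maps
# to the set of milestone elements it confirms. Element IDs are invented
# labels of the form <subcompetency>-<column>.
EPA_MAP = {
    "coordinate a safe discharge": {
        "partial supervision": {"SBP4-c2", "ICS1-c2"},
        "minimal supervision": {"SBP4-c3", "ICS1-c3"},
        "as if independent": {"SBP4-c4"},
    },
}

def confirmed_elements(epa, selected_level):
    """Return all milestone elements confirmed by one evaluation response:
    those mapped to the selected level and to every lower level."""
    cutoff = ENTRUSTMENT_LEVELS.index(selected_level)
    elements = set()
    for level in ENTRUSTMENT_LEVELS[: cutoff + 1]:
        elements |= EPA_MAP[epa].get(level, set())
    return elements

# An attestation of "minimal supervision" also confirms the elements
# mapped to the lower "partial supervision" level.
confirmed = confirmed_elements("coordinate a safe discharge",
                               "minimal supervision")
```

Because attending physicians never see this mapping, the program can revise `EPA_MAP` at any time without asking faculty to redo evaluations.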
Tallying milestone elements for semiannual ACGME reports
Using this system, the residency program maintains a database that tallies the number of times each milestone element has been confirmed for each trainee. Once an element has been confirmed a predefined minimum number of times (or proportion of opportunities), it is marked as completed to program satisfaction. Because each milestone element is mapped to responses from multiple evaluation items, completing a milestone element to program satisfaction usually requires assessment by multiple supervising physicians across different clinical rotations.
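The tallying rule above might be sketched as follows; the element identifiers are hypothetical, and the threshold of three confirmations is an arbitrary illustration rather than our program’s actual standard.

```python
from collections import Counter

def completed_elements(confirmations, min_count=3):
    """Tally confirmed milestone elements across all of one resident's
    evaluations; an element is 'completed to program satisfaction' once
    it has been confirmed at least min_count times."""
    tally = Counter()
    for evaluation in confirmations:   # each evaluation contributes a set
        tally.update(evaluation)       # of confirmed element IDs
    return {elem for elem, n in tally.items() if n >= min_count}

# Three evaluations from different supervisors on different rotations.
evals = [{"SBP4-c2", "ICS1-c2"},
         {"SBP4-c2", "SBP4-c3"},
         {"SBP4-c2", "ICS1-c2"}]
completed = completed_elements(evals)  # {"SBP4-c2"}: confirmed 3 times
```

A proportion-based rule (confirmed on some fraction of the evaluations that could have confirmed it) would require also tracking the denominator of opportunities per element.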
This tracking of confirmed and completed milestone elements can be graphically displayed using evaluation software22; however, the process described here was developed without assistance from evaluation software vendors. Figure 4 illustrates how a program might mark progress for a hypothetical resident in subcompetency PC3 based on completed evaluations from several supervising physicians. In this example, the resident’s progress in subcompetency PC3 would be marked (“X”) at the box between the third and fourth columns because all milestone elements to the left of this mark have been completed to program satisfaction, while only one milestone element to the right of it has been completed.15 These graphical displays summarize milestone information in a manner that allows the CCC to efficiently review each resident’s progress prior to submission to the ACGME.
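The column-placement rule in this example (mark the box past the rightmost column whose elements are all completed, or the line between columns when some elements of the next column are also done) can be sketched as below. The column groupings and element identifiers are hypothetical.

```python
def milestone_mark(columns, completed):
    """Given a subcompetency's milestone elements grouped by column
    (left to right) and the set of elements completed to program
    satisfaction, return (n, between): n is the number of leftmost
    columns whose elements are all completed; between is True when some
    (but not all) elements of column n+1 are also completed, placing
    the mark on the line between columns n and n+1."""
    n = 0
    for column in columns:
        if all(elem in completed for elem in column):
            n += 1
        else:
            break
    between = n < len(columns) and any(e in completed for e in columns[n])
    return n, between

# Hypothetical subcompetency with four columns of element IDs.
pc_columns = [["c1-a"], ["c2-a", "c2-b"], ["c3-a"], ["c4-a", "c4-b"]]
done = {"c1-a", "c2-a", "c2-b", "c3-a", "c4-a"}
milestone_mark(pc_columns, done)  # (3, True): mark between columns 3 and 4
```

The result is only a suggested placement; as discussed below, the CCC still reviews each display before the program submits its reports.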
Benefits and Limitations
At the University of Washington Internal Medicine Residency Program, we have connected rotation-specific EPA-based evaluations to the milestones of the 22 internal medicine ACGME subcompetencies by mapping evaluation responses to individual milestone elements. This process has allowed our program to convert data from evaluations clinically meaningful to residents and faculty (rotational EPAs) into the format used for reporting progress to the ACGME (milestone reports). The graphical display produced by this process is used by the CCC to help report each resident’s progress in the 22 subcompetencies. Using this system, our CCC is now able to provide recommendations for ACGME milestone reporting for all 174 residents in a single meeting.
Although the ACGME does not explicitly restrict programs from using the 22 subcompetencies as evaluation questions, these were never intended by their developers to be used as evaluation tools for rotations14; we also believe that doing so would not provide accurate data for milestone reporting. Evaluation questions based on the 22 subcompetencies would be potentially subject to the same halo and range restriction biases of previous norm-based evaluations,2,3 while not meaningfully providing trainees the feedback grounded in the work of their actual clinical experiences. Additionally, several subcompetencies are problematic if used as evaluation questions at the level of a clinical rotation. For example, if asked to evaluate subcompetency PC3 (Manages Patients With Progressive Responsibility and Independence), an attending physician may be able to directly observe a resident in one context (e.g., on a medicine ward) but may not be able to comment about his or her skill in other contexts (e.g., in clinic).
These methods allow programs to use the tools best suited for each different “constituency” group (Figure 1). For the learners and the evaluators, the EPAs provide context-specific, observable areas of focus for coaching and feedback, grounded in the actual work that occurs during clinical rotations.23 For CCC members, the graphical display of individual attainment of milestone elements in the 22 subcompetencies allows efficient development of ACGME milestone reports, even for those residents not directly observed by CCC members.
The mapping strategy allows programs to design EPAs of varying degrees of difficulty and to create milestone mapping according to their values and philosophy. For example, although at the University of Washington we do not expect residents to perform many EPAs independently until near the end of training, we do expect residents in some instances to perform activities without need for close supervision before the end of the intern year (e.g., safe sign-outs between providers). We expect high levels of professionalism and interpersonal communication from our trainees, and interns understand that most of the behaviors described in the “ready for unsupervised practice” columns in these subcompetencies should be demonstrated early in their first year.
The described process does not automate generation of the ACGME reports, nor does it eliminate the work of the CCC, despite the greater efficiencies offered. The CCC must review the graphical displays to decide how the program reports residents’ milestone progress. This review is critical for interpreting changes over time and for reconciling results based on few observations or discordant data. CCC faculty must examine unexpected results, as apparent regression in an individual’s progress may reflect artifacts of scheduling or mapping. For example, a resident on a research rotation without EPA-based evaluations may artificially appear to have regressed because he or she has not had the opportunity to demonstrate certain milestone elements during a six-month reporting period.
Prior authors have debated the relative advantages and disadvantages of evaluation systems based on checklists versus global rating scales,24,25 with several reviews generally favoring the latter for reasons including better reliability.13,26,27 The evaluation system we have described here can be conceptualized as a series of global assessments of resident competence in multiple rotation-specific contexts by different supervising physicians. Although additional studies of reliability and validity will be needed to provide further support for these methods, in our program we find that confirmation of milestone elements appears to occur earlier (i.e., they are further to the right compared with peers) among those residents believed to be at higher degrees of global competence (e.g., future chief residents). In contrast, those residents who have been flagged for attention generally lag in confirmation of milestone elements (i.e., they are further to the left compared with peers) for a given subcompetency.
Although these methods have proven to be tremendously helpful in completing the ACGME milestone reports for our program, a small number of subcompetencies are not well captured using these methods. For example, the subcompetency “Learns and improves at the point of care” (Practice-Based Learning and Improvement, PBLI4) has proven difficult to connect with our rotational EPAs. Other subcompetencies contain milestone elements that do not easily map to evaluation responses; for example, “Skill in Performing Procedures” (PC4) includes milestone elements such as “Possesses technical skill and has successfully performed all procedures required for certification,” which is challenging to map to an EPA-based evaluation completed by an individual. Although we continue to adjust evaluation items and refine milestone element mapping, these limitations serve to emphasize that no single strategy is likely to provide comprehensive assessment in all areas. Programs may need to use other methods such as requiring completion of quality improvement projects (PBLI4) or procedure logs (PC4) to capture domains that are hard to assess using EPA-based evaluations.
Although potentially useful for many different training programs, these methods are especially helpful for large programs where trainees may not interact directly with the CCC members. The methods we describe here require significant initial program investment and are designed to facilitate evaluation data that inform milestone reporting for large numbers of residents. We would not necessarily recommend these methods for smaller programs. Programs in which a limited number of core faculty already directly observe most residents in multiple contexts may find that this work does not provide additional benefit for ACGME milestone reporting.
In our experience at the University of Washington, the investment of faculty time and effort to create rotation-specific EPAs for our evaluations, and to map responses to the milestone elements, has ultimately created a process that meets multiple requirements of the Next Accreditation System. This process provides substantial flexibility, creates an evaluation system that captures the essential work of each rotation, and connects data from rotation evaluations to resident progress reports in the 22 ACGME subcompetencies. Although initial effort was significant, this work has led to substantial savings in program time needed to create the semiannual ACGME milestone reports. The development of the rotational EPAs also gave our program opportunity to closely review the educational goals of each of our clinical experiences, and it has led to renewed dialogue regarding the curricular content and competency-based outcomes of each rotation. Ultimately, this process has strengthened our resident education by improving the quality of feedback from faculty and provides our program with improved evaluation data that help meet the requirements of the Next Accreditation System.
1. Nasca TJ, Philibert I, Brigham T, Flynn TC. The next GME accreditation system—rationale and benefits. N Engl J Med. 2012;366:1051–1056.
2. Haber RJ, Avins AL. Do ratings on the American Board of Internal Medicine resident evaluation form detect differences in clinical competence? J Gen Intern Med. 1994;9:140–145.
3. Thompson WG, Lipkin M Jr, Gilbert DA, Guzzo RA, Roberson L. Evaluating evaluation: Assessment of the American Board of Internal Medicine resident evaluation form. J Gen Intern Med. 1990;5:214–217.
4. ten Cate O. Entrustability of professional activities and competency-based training. Med Educ. 2005;39:1176–1177.
5. ten Cate O. Trust, competence, and the supervisor’s role in postgraduate training. BMJ. 2006;333:748–751.
6. ten Cate O, Scheele F. Competency-based postgraduate training: Can we bridge the gap between theory and clinical practice? Acad Med. 2007;82:542–547.
7. ten Cate O. Nuts and bolts of entrustable professional activities. J Grad Med Educ. 2013;5:157–158.
8. Kennedy TJ, Regehr G, Baker GR, Lingard L. Point-of-care assessment of medical trainee competence for independent clinical work. Acad Med. 2008;83(10 suppl):S89–S92.
9. Hauer KE, ten Cate O, Boscardin C, Irby DM, Iobst W, O’Sullivan PS. Understanding trust as an essential element of trainee supervision and learning in the workplace. Adv Health Sci Educ Theory Pract. 2014;19:435–456.
10. Hauer KE, Oza SK, Kogan JR, et al. How clinical supervisors develop trust in their trainees: A qualitative study. Med Educ. 2015;49:783–795.
11. Sterkenburg A, Barach P, Kalkman C, Gielen M, ten Cate O. When do supervising physicians decide to entrust residents with unsupervised tasks? Acad Med. 2010;85:1408–1417.
12. Biondi EA, Varade WS, Garfunkel LC, et al. Discordance between resident and faculty perceptions of resident autonomy: Can self-determination theory help interpret differences and guide strategies for bridging the divide? Acad Med. 2015;90:462–471.
13. Crossley J, Jolly B. Making sense of work-based assessment: Ask the right questions, in the right way, about the right things, of the right people. Med Educ. 2012;46:28–37.
14. Holmboe ES. Realizing the promise of competency-based medical education. Acad Med. 2015;90:411–413.
16. Caverzagie KJ, Iobst WF, Aagaard EM, et al. The internal medicine reporting milestones and the next accreditation system. Ann Intern Med. 2013;158:557–559.
17. Green ML, Aagaard EM, Caverzagie KJ, et al. Charting the road to competence: Developmental milestones for internal medicine residency training. J Grad Med Educ. 2009;1:5–20.
18. Caverzagie KJ, Cooney TG, Hemmer PA, Berkowitz L. The development of entrustable professional activities for internal medicine residency training: A report from the Education Redesign Committee of the Alliance for Academic Internal Medicine. Acad Med. 2015;90:479–484.
19. Warm EJ, Mathis BR, Held JD, et al. Entrustment and mapping of observable practice activities for resident assessment. J Gen Intern Med. 2014;29:1177–1182.
20. Chen HC, van den Broek WE, ten Cate O. The case for use of entrustable professional activities in undergraduate medical education. Acad Med. 2015;90:431–436.
22. MedHub [evaluation software]. Ann Arbor, Mich: MedHub LLC; 2015.
23. Choe JH. Beyond “good job!”: Using EPAs to improve resident feedback. SGIM Forum. 2015;38:5, 10–11.
24. Regehr G, MacRae H, Reznick RK, Szalay D. Comparing the psychometric properties of checklists and global rating scales for assessing performance on an OSCE-format examination. Acad Med. 1998;73:993–997.
25. Norman GR, Van der Vleuten CP, De Graaff E. Pitfalls in the pursuit of objectivity: Issues of validity, efficiency and acceptability. Med Educ. 1991;25:119–126.
26. Ilgen JS, Ma IW, Hatala R, Cook DA. A systematic review of validity evidence for checklists versus global rating scales in simulation-based assessment. Med Educ. 2015;49:161–173.
27. Norman G. Editorial—checklists vs. ratings, the illusion of objectivity, the demise of skills and the debasement of evidence. Adv Health Sci Educ Theory Pract. 2005;10:1–3.
28. Kogan JR, Conforti LN, Iobst WF, Holmboe ES. Reconceptualizing variable rater assessments as both an educational and clinical care problem. Acad Med. 2014;89:721–727.
Appendix 1 Examples of Rotation-Specific EPAs for Residents at the University of Washington Internal Medicine Residency Programa