Assessing Trainees and Making Entrustment Decisions: On the Nature and Use of Entrustment-Supervision Scales

ten Cate, Olle PhD; Schwartz, Alan PhD; Chen, H. Carrie MD, PhD

doi: 10.1097/ACM.0000000000003427
Competency-based medical education seeks to increase the quality and safety of health care by delineating standards of competence and ensuring that graduates of training programs—undergraduate or postgraduate—meet those standards.1–3 Assessment of competencies in the clinical workplace is not easy.4–11 A recent movement toward entrustment decision making for entrustable professional activities (EPAs)12 as an approach to assessing trainees has been welcomed as an alternative. The EPA model uses levels of supervision to define how much autonomy a trainee should be afforded—that is, how much responsibility the trainee can safely be given. This has led to the use of entrustment-supervision scales (ES scales), ranging from “no permission to execute the EPA” to “ready to supervise others for this EPA.” ES scales have been applied in different ways, have been questioned,13,14 and have occasionally been misunderstood. Our aim is to clarify (1) the underlying argument linking entrustment with assessment and its expression as a scale; (2) the distinction between ad hoc entrustment decisions and summative, formalized decisions; (3) the noncontinuous nature of entrustment decisions and ES scales; (4) the difference between retrospective and prospective scales; and (5) entrustment for unsupervised practice and supervision of others, as well as the program, context, and specialty specificity of ES scales. We will explain why ES scales should not be viewed and used as “just another rating scale” for workplace-based assessment.15

The Argument for Entrustment in the Assessment of Medical Trainees

EPAs make the connection between education and patient care16 by delineating activities that a licensed physician, certified specialist, or other credentialed health professional is expected and allowed to perform into manageable units of practice that can be overseen, observed, assessed, and documented. Key in the use of EPAs is the entrustment of trainees, under a designated level of supervision, with clinical activities while still in training.17 Entrustment aligns the construct of assessment with how supervising practitioners work with and make decisions about trainees in the workplace.18–20 In the clinical workplace, trainees learn by participating in the care of patients; they make legitimate contributions to patient care teams while gradually becoming autonomous members of such teams. Supervising clinicians make decisions about what work to entrust trainees with at what degree of supervision to arrive at safe care for the patient while still challenging the trainee to master relevant clinical activities.21,22 As trainees progress, supervising clinicians entrust them with increasingly greater patient care responsibility until they can enact patient care activities safely with minimal supervision.23 Entrustment as an assessment framework makes for strong consequential validity,24–26 as entrustment of a trainee informs not only decisions about their educational progression but also decisions about permission to engage in clinical activities, anticipating quality of patient care.27

Levels of Entrustment and Supervision as a Scale

The notion that defining a suitable level of supervision for a task can serve as a benchmark to track trainee progression has led to the creation of various measurement tools. Although increasingly called “entrustability scales,”28 we prefer to call these ES scales, to avoid conflation with scales of proficiency.29 Conceptually, ES scales operationalize the progressive autonomy for which health professions education strives.30–33 ES scales can guide teacher interventions within what Vygotsky has named the “Zone of Proximal Development” of a trainee.22,34

A common generic 5-level scale model for supervision in EPAs can be summarized (with small variations) as 1 = May be present but may not practice EPA, 2 = May practice EPA under direct (proactive) supervision, 3 = May practice EPA under indirect (reactive) supervision, 4 = May practice EPA unsupervised (under distant oversight), and 5 = May act as supervisor for junior trainees for this EPA.16,35 This scale is meant to be used prospectively—that is, to express a decision to be made regarding the trainee’s future responsibility or autonomy.

ES scales should therefore reflect the extent of permissible engagement in professional practice, rather than being a measure of competence. Some ES scales that have been proposed seem to be disconnected from decisions about permissible practice. Several authors use the word “entrustable” to reflect a high proficiency level of a trainee (e.g., in a 3-point scale: “requiring corrective response – developing skills – entrustable”36, a use of the word entrustable that we do not endorse) but do not state what the scale values mean for supervision or engagement in practice. Other frequently used scales focus on retrospectively reporting how much supervision was provided when the trainee was last observed (“e.g., 1: I had to do, 2: I had to talk them through, 3: I had to prompt them from time to time, 4: I needed to be in the room just in case, 5: I did not need to be there”37) rather than prospectively indicating a level of responsibility a trainee is ready to assume and an observer is ready to assign.28,38 Table 1 provides an overview of various ES scales proposed in the past decade and includes a classification of their prospective versus retrospective nature.

Table 1: Varieties of Entrustment-Supervision Scales Published Through 2018, as Known to the Authors

Reporting observed performance (what did she do?), proficiency (how good is she?), and entrustment (is she ready for autonomy?) are conceptually separate steps in trainee assessment and entrustment decision making, requiring increasingly higher levels of inference. The last step is the aim of the full workplace-based assessment process and of ES scales.

The last step of entrustment decision making is a 2-phase process. The first phase is an assessment of readiness to be trusted with a specific clinical responsibility, and the second is a decision to transfer that responsibility. The “intention to trust” (basically a feeling of trust) mediates the decision.39–41 It is helpful to separate these 2 phases of assessment and decision making, as they may operate somewhat independently. Depending on circumstances, a supervisor may have great trust but still not make the entrustment decision, while in other cases, there may be marginal trust, but a supervisor is willing to take the risk, for instance, when the workload on a clinical unit is overwhelming and all hands are needed, or when situational context features are favorable.33 The first phase may involve a scale to characterize a trainee’s stage of development and may show features of a continuous scale. In contrast, an ES scale for the second phase can only have discrete steps, as decisions about a given level of supervision or permission can only be dichotomous (yes or no).

Krupat has criticized ES scales because “what should be minimized at all costs is making judgments in which layers of inference are placed between the observation and the judgment”13 (emphasis in the original). However, supervisors must make entrustment decisions based on whatever observations and other information they have. Therefore, the purpose of entrustment is to add inference to observations. It is a weakness of existing workplace-based assessment that raters are not forced or even invited to think about the consequences of their ratings.

Entrustment Decisions in Ad Hoc Versus Summative Situations

Entrustment decisions made ad hoc critically differ from summative entrustment decisions.17,19 Supervising clinicians need to balance educational benefits with patient safety33,42 when making ad hoc entrustment decisions, which are influenced by trainee proficiency, the propensity of the supervisor to trust, the nature of the task, the context, and the supervisor–trainee relationship.43,44 Case complexity, clinic workload, and presence of attending staff with advanced nonmedical degrees (e.g., nurse practitioners) have been reported to be associated with supervision intensity.45 Such decisions are also discipline-specific. In surgery, for instance, the supervisor’s decision to scrub in or not to scrub in may be a significant gradation within direct supervision (as the unscrubbed supervisor cannot take over a procedure immediately). That would not be a meaningful distinction in primary care and internal medicine, which require different shades of supervisory presence than surgical specialties.15

Summative entrustment decisions, on the other hand, reflect the formal steps to increase autonomy and decrease supervision. Aggregate information from multiple ad hoc decisions made by different supervisors in a variety of contexts, plus other sources of information, can inform a reliable picture of trainee readiness and guide summative decision making. The summative entrustment decision is not an aspiration but a commitment to allowing the trainee to practice with a given level of supervision. The ultimate summative entrustment decision is awarding the license or certificate that allows a professional to exercise the breadth of their professional practice. Training programs that employ EPAs break this decision down into units of professional practice that can be overseen, observed, assessed, and documented. Summative entrustment decisions should result in consequences for the trainee’s privileges and duties: an actual change in supervision and autonomy.

At the same time, entrustment decisions cannot be made with perfect knowledge of their consequences; they require trust. The supervisor or educational team cannot and need not observe the trainee under all possible variations of events that could occur during an activity. Similarly, Clinical Competency Committees, required for progress decisions in U.S. graduate medical education,46 rarely have complete information about trainee performance in every context. Decision makers need only sufficient information to be confident that the trainee will also perform well in situations that have not yet been observed. This inference may not always prove right, and the decision can, in hindsight, be a poor one. Even a well-grounded entrustment decision cannot preclude adverse events. Trust, by definition, involves the acceptance of risk.41,47,48 Well-designed clinical education incorporates practices to decrease this risk to an acceptable level for entrustment, but cannot eliminate it.

The Ordinal Nature of ES Scales

Entrustment of a trainee with a task in health care is a decision. Although decisions to provide increasingly less supervision result in a gradual increase of autonomy, each such decision remains discrete and these decisions, therefore, do not form a true continuum. Ad hoc decisions about supervision in the daily clinical setting may be subtly titrated to what a trainee needs within context-specific demands. Even when formalized in a scale with many response options (e.g., “Comfortable to leave trainee to go on brief coffee break,” “Happy to leave the theater block but remain immediately available in the hospital”49), each option reflects a discrete decision.

Likewise, summative entrustment decisions should serve as certifications and have no “in-between” options. On an ES scale with 5 entrustment or supervision levels, “scores” of 3.5 or 3.7 are illogical. In entrustment decisions, the question is not what is the competence of this trainee?, but what level of supervision will this trainee require to facilitate learning and to limit the risk to patients? Indeed, choosing a supervision threshold for consistent safe patient care may be a noncompensatory judgment best informed by a trainee’s lowest score, rather than average score. A summative decision can require direct supervision (conventionally level 2: supervisor in the room) or indirect supervision (conventionally level 3: supervisor not in the room but quickly available), but not much in between (a supervisor “at the door post” is not a useful distinction). Summative entrustment decisions are like promotion decisions: no one is an assistant professor-and-a-third. Entrustment is a decision based on inferences about trainee performance, not an inference itself. Accordingly, we refer to, and recommend, “entrustment-supervision scales” rather than “entrustability scales”28,37 as better capturing the construct.

ES scales represent a set of ordered, nested decisions, in which permission to act at each increasing level of autonomy (or decreasing level of supervision) is an observable decision that explicitly permits actions at each identified “lower” level gradation. They are ordinal rather than interval scales. The order of the steps is fixed, but the distances between the steps need not be equal, in terms of development, time between decisions, or quality measures. As such, ES scales should not be summarized with interval-level statistics such as means.
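
The ordinal and nested properties described here can be illustrated with a minimal Python sketch. It assumes the generic 5-level scale summarized earlier; the names `ESLevel`, `summarize`, and `permitted` are hypothetical and not part of any published instrument. The summary reports frequency counts, the median level (the lower median, so that the result is always an actual scale step), and the lowest level observed, in line with the noncompensatory reading of supervision thresholds. It deliberately computes no mean.

```python
from collections import Counter
from enum import IntEnum
from statistics import median_low


class ESLevel(IntEnum):
    """Generic 5-level prospective ES scale (labels paraphrased)."""
    OBSERVE_ONLY = 1          # may be present but may not practice the EPA
    DIRECT_SUPERVISION = 2    # supervisor in the room
    INDIRECT_SUPERVISION = 3  # supervisor not in the room but quickly available
    UNSUPERVISED = 4          # distant oversight only
    SUPERVISE_OTHERS = 5      # may supervise junior trainees for this EPA


def summarize(levels):
    """Summarize ordinal ES-scale decisions without interval-level statistics."""
    if not levels:
        raise ValueError("at least one decision is required")
    counts = Counter(levels)
    return {
        # How often each discrete decision was made.
        "counts": {lvl.name: counts.get(lvl, 0) for lvl in ESLevel},
        # Lower median: always one of the observed scale steps, never 3.5.
        "median": ESLevel(median_low(levels)),
        # Noncompensatory view: the most cautious decision on record.
        "lowest": min(levels),
    }


def permitted(granted):
    """Nested decisions: a granted level also permits every lower level."""
    return [lvl for lvl in ESLevel if lvl <= granted]


if __name__ == "__main__":
    decisions = [ESLevel(3), ESLevel(2), ESLevel(3), ESLevel(4)]
    print(summarize(decisions))
    print(permitted(ESLevel.INDIRECT_SUPERVISION))
```

Whether the median, the lowest level, or the full frequency table is the most appropriate summary is a judgment for the program; the sketch only shows that each of them respects the ordering of the steps without assuming equal distances between them.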

There may be a linguistic reason for the tendency to treat ES scales as continuous. The adjective “entrustable” was meant to refer to activities,12 not to trainees or scales.29 However, as authors began to speak of “pre-entrustable” and “entrustable” trainees,50–52 the meaning of the word shifted. Recently, an “entrustability scale” was presented with 3 anchors (“pre-entrustable,” “emerging,” and “entrustable”) and 5 score values.53 Although we appreciate that “pre-entrustable” was chosen to avoid words like “untrustworthy,” we prefer “readiness” for a specific level of autonomy over “entrustable” when describing a trainee. We believe referring to trainees as entrustable is confusing and threatens to co-opt entrustment into serving as yet another label for a psychological continuum of competence, rather than a description of a set of actionable (yes or no) decisions.

Retrospective Versus Prospective ES Scales

ES scales published to date include some that can be classified as retrospective (“how much supervision did I provide this trainee with?”)37,54 and some that have a prospective nature (“how much supervision will this trainee need with future patients?”).49,55–57 Although both types consider supervision, retrospective scales reflect observed performance only; in contrast, prospective scales require clinicians to project ahead, consider the unknown, gauge the level of risk, and determine what level of supervision this trainee is ready for in upcoming cases.

The retrospective or prospective orientation of an ES scale is not a trivial distinction. It reflects fundamentally different views of assessment. Entrustment decisions focus on the unknown, on novel encounters with patients and contexts that may be different from those that were previously observed and from those that guided the decision. Usually, trainees are assessed for what they learned and how they demonstrably met educational objectives, rather than for what they are expected to achieve in the future. Yet, institutions and educators with a responsibility to prepare trainees for licensing for a profession with critical responsibilities need to estimate their graduates’ readiness for these duties. That is the core inference for entrustment. Although retrospective scales report observations that may form the basis for this inference, prospective scales directly capture the decision to entrust. The scales measure different things.58,59 An analogy can be drawn between reporting rainfall after a hurricane and issuing a warning to evacuate as the storm approaches. Knowledge of past patterns of precipitation contributes substantially to predictions of rainfall, but decisions about whether to evacuate necessarily involve forecasting and a consideration of risk.

Prospective decisions must take more into account than specific abilities, including general qualities of trainees, such as integrity, reliability, humility, and agency.17,19 These scales align better with decisions clinicians are forced to make routinely when conducting patient care activities with a trainee. Yet, judging how trainees will do in unfamiliar situations (e.g., whether they will show agency and adaptive competence) is not easy59: the less acquaintance assessors have with the trainee, the less likely it is that such trust judgments will be reliable.60 Thus, clinicians find retrospective questions easier to answer and may not provide equivalent responses when using both prospective and retrospective scales.59 There is a natural tendency to evaluate trainees in comparison to each other (i.e., normatively). Moving to criterion-referenced evaluation (i.e., estimating readiness to work with more autonomy) is challenging, even for trainees.61

One option may be to use a hybrid approach, by reporting (1) how much supervision the trainee needed for a current case, plus (2) an indication of the case complexity, and (3) an estimate of how much supervision the trainee will require on future cases. Such a combined measurement broadly presents 3 possibilities. First, the supervisor may report that he or she believes the trainee will require more supervision in the future than received with the current case (i.e., autonomy should be reduced or revoked, the trainee requires remediation, or the supervisor’s ad hoc judgments could be improved). Second, the supervisor may believe the trainee’s current level of supervision should be maintained. In this case, the trainee could be valuably advised as to what additional practice he or she requires or what additional skills or qualities to display to warrant a further decrease in supervision. Third, the supervisor may believe the trainee requires less supervision in a next practice of the activity than he or she has required so far.
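
The three possibilities of the hybrid approach can be sketched as a small Python record and comparison. This is an illustration under stated assumptions, not a validated instrument: the class `HybridReport`, its field names, and the function `interpret` are hypothetical, and supervision is coded on the generic 5-level scale, where a higher level means less supervision. Case complexity is recorded as context but not scored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridReport:
    """One hybrid report; field names are illustrative assumptions."""
    level_now: int     # level at which the trainee worked on this case (1-5)
    complexity: str    # e.g. "low", "medium", "high"; recorded, not scored
    level_future: int  # level judged appropriate for future cases (1-5)


def interpret(report):
    """Map a report onto the three possibilities of the hybrid approach.

    Higher levels mean less supervision, as in the generic 5-level scale.
    """
    for level in (report.level_now, report.level_future):
        # Only the discrete scale steps are valid; 3.5 is rejected.
        if level not in range(1, 6):
            raise ValueError("levels must be integers from 1 to 5")
    if report.level_future < report.level_now:
        # First possibility: reduce autonomy, remediate, or revisit
        # the supervisor's ad hoc judgment.
        return "more supervision in future"
    if report.level_future == report.level_now:
        # Second possibility: maintain, and advise the trainee on what
        # would warrant a further decrease in supervision.
        return "maintain current supervision"
    # Third possibility: the trainee is ready for less supervision.
    return "less supervision in future"


if __name__ == "__main__":
    print(interpret(HybridReport(level_now=3, complexity="high", level_future=2)))
```

Keeping the retrospective and prospective values as separate fields, rather than merging them into one score, preserves the distinction between what was observed and what is decided.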

Entrustment for Unsupervised Practice and for Supervising Others

Summative entrustment for unsupervised practice (conventionally level 4) is often viewed as unconditional. This differs from ad hoc decisions, which are always context dependent and involve case complexity, availability of assistance (e.g., experienced nurses or others), time of the day, and workload of the unit. Yet, someone with a level 4 certification starting unsupervised practice in a different hospital may still need some initial supervision or monitoring in a new role to become familiar with colleagues, rules, equipment, and patient mix. This “extra” (ad hoc) supervision is usually understood as temporary, and may be provided by the new employer or the clinician’s peers, either formally or informally.62 Presumptive trust may be justified, but an initial supervisory check can confirm that. This ad hoc supervision does not necessarily belie the summative level 4 entrustment decision.

Entrustment to supervise (conventionally level 5) has sometimes been misunderstood as “permission to teach.” An advanced trainee teaching a junior is not necessarily qualified at level 5. Conversely, level 5 does not reflect a generic ability to teach or supervise but relates to a specific EPA. Supervision in clinical education can be understood as

the provision of guidance and support in learning and working effectively in health care by observing and directing the execution of tasks or activities and making certain that everything is done correctly and safely, from a position of being in charge.63

Following this definition, an entrustment-to-supervise decision implies much more than instructing others. As a trainee develops, the breadth of responsibilities and required qualities increases. The trajectory may start with a focus on skills at level 2. Next, more generic qualities such as integrity, reliability, humility, and agency35 become important for entrustment at levels 3 and 4. Similarly, entrustment at level 5 demands additional qualities, including managerial, supervisory, and assessment skills, above and beyond mastery of the activity. Supervision implies the ability to manage trainee missteps and mitigate risks for the patient.

Specialty- and Context-Specific Features and Regulatory Constraints of ES Scale Values

Hatala et al15 have stressed that Ottawa Surgical Competency Operating Room Evaluation or Ottawa Clinic Assessment Tool entrustment scales, developed for surgery and recommended for use across all disciplines in Canada,28,37,64 may be less useful for internal medicine. This makes sense. Supervisors of different specialties operationalize entrustment in practice differently: “Can I leave the theater (anesthesiologist)?,”65 “Shall I assist scrubbed or unscrubbed (cataract ophthalmologist)?,” “Can I send the trainee for a patient home visit or do I need to join (family doctor)?,” “Shall I put on a lead apron or just watch the screen (interventional cardiologist)?,” “Do I need immediate report or is delayed report safe (pediatrician)?,” “Can I switch my phone to airplane mode or not (internist)?” Although one generic scale may serve as a general frame of reference, different operationalizations for stages of training, specialties, and professions may be necessary to align entrustment decisions with clinical practice.18 This suggests avoiding uniform scales.

Chen et al have added steps to the general 5-level framework to make it suitable for entrustment considerations in undergraduate medical education,55 just as Weller et al have done for postgraduate anesthesiology training.49 These more detailed scales with finer gradations appear useful in ad hoc decision making. In contrast, formal, summative entrustment decisions generally do not require more than 5 levels, and, depending on the program, can arguably use fewer. In undergraduate medical education, level 2 (direct supervision) and level 3 (indirect supervision) reflect the dominant decisions; in postgraduate medical education, levels 3 (supervised), 4 (unsupervised), and 5 (acting as supervisor) are dominant.

Finally, legislation may preclude withholding supervision, even for the most proficient trainees. Medical students are generally prohibited by law from unsupervised practice, no matter how competent and ready for entrustment they appear when performing activities. This inherent limit on decision making makes it tempting to speak of trainee “entrustability” (as “the decision I would make if I could”), which has become unfortunately misunderstood as a continuous competence construct. A better response would be to truncate the scale and include only those decisions that are actually permissible choices for a given activity for a given developmental phase; that is, a faculty member assessing a medical student would be presented with only the first 3 levels of the typical 5-level ES scale because decisions at those levels are the only decisions possible. An additional advantage is that one common set of supervision gradations can be used across the educational continuum, but with decisions made only on those levels that are relevant for a given phase of education. Alternatively, in a competency-based time-variable education system, a decision to entrust a student to practice all activities with the autonomy of a resident could act as the trigger to advance the student to residency.66
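
A truncated scale of this kind can be sketched in a few lines of Python. The sketch assumes one shared set of five supervision gradations and a lookup of permissible levels per phase of education; the names `GENERIC_ES_SCALE`, `PERMISSIBLE_LEVELS`, and `scale_for` are hypothetical. Only the medical student entry follows the example in the text; the resident entry is an illustrative assumption that a program would need to set for itself.

```python
# Generic 5-level ES scale, with labels paraphrased.
GENERIC_ES_SCALE = {
    1: "May be present but may not practice the EPA",
    2: "May practice the EPA under direct (proactive) supervision",
    3: "May practice the EPA under indirect (reactive) supervision",
    4: "May practice the EPA unsupervised (under distant oversight)",
    5: "May act as supervisor for junior trainees for this EPA",
}

# Levels that are actually permissible decisions in each phase.
# "medical_student" follows the example in the text (first 3 levels);
# "resident" is an assumption made only for illustration.
PERMISSIBLE_LEVELS = {
    "medical_student": (1, 2, 3),
    "resident": (1, 2, 3, 4, 5),
}


def scale_for(phase):
    """Return only the scale options an assessor may choose in this phase."""
    allowed = PERMISSIBLE_LEVELS[phase]
    return {
        level: label
        for level, label in GENERIC_ES_SCALE.items()
        if level in allowed
    }


if __name__ == "__main__":
    for level, label in scale_for("medical_student").items():
        print(level, label)
```

Because every phase draws from the same dictionary of gradations, a level 3 decision means the same thing for a student and for a resident; only the set of choices offered to the assessor differs.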


Entrustment decision making, as an approach to assessment of trainees in the workplace, is more than new anchors on an old scale. Fundamentally, this approach should force supervisors and programs to think about the consequences of trainee engagement in clinical practice, weighing the benefits and the risks of discrete steps toward maturity and autonomy as a clinical practitioner. Entrustment decisions for units of professional practice should focus educators on the goal of competency-based education: graduating practitioners who can be trusted, by whoever will depend on their service, to provide safe and high-quality care. We agree with Hatala et al15 that it is time to start learning from the experiences of various specialty domains with entrustment decision making. However, we should take care that the evidence is not derived from experience with scales that were not created or used for true entrustment decision making.


