Contextualizing Work-Based Assessments of Faculty and Residents

Is There a Relationship Between the Clinical Practice Environment and Assessments of Learners and Teachers?

Stroud, Lynfa, MD, MEd; Kulasegaram, Kulamakan, PhD; McDonald-Blumer, Heather, MD, MSc; Lorens, Edmund, MEd; St. Amant, Lisa; Ginsburg, Shiphra, MD, PhD

doi: 10.1097/ACM.0000000000002502
Research Reports

Purpose Competence is bound to context, yet environment is seldom explicitly considered in work-based assessments. This study explored faculty and residents’ perspectives of the environment during internal medicine clinical teaching unit (CTU) rotations, the extent to which each group accounts for environmental factors in assessments, and relationships between environmental factors and assessments.

Method From July 2014 to June 2015, 212 residents and 54 faculty across 5 teaching hospitals at University of Toronto rated their CTU environment using a novel Practice Environment Rating Scale (PERS) matched by block and hospital. Faculty-PERS data were paired to In-Training Evaluation Reports (ITERs) of residents supervised during each block, and Resident-PERS data to Resident Assessment of Teaching Effectiveness (RATE) scores of the same faculty. Differences between perceptions and assessments were tested using repeated-measures MANOVAs, ANOVAs, and correlations.

Results One hundred sixty-four residents completed the PERS; residents rated the CTU environment more positively than faculty (3.91/5 vs. 3.29, P < .001). Residents were less likely to report considering environmental factors when assessing faculty (2.70/5) compared with faculty assessing residents (3.40, P < .0001, d = 1.2). Whereas Faculty-PERS ratings did not correlate with ITER scores, Resident-PERS ratings had weak to moderate correlations with RATE scores (overall r = 0.27, P = .001).

Conclusions Residents’ perceptions of the environment had small but significant correlations with assessments of faculty. Faculty’s perceptions did not affect assessments of residents, potentially because they reported accounting for environmental factors. Understanding the interplay between environment and assessment is essential to developing valid competency judgments.

L. Stroud is associate professor, Department of Medicine, and education researcher, Wilson Centre for Education, University of Toronto, Toronto, Ontario, Canada.

K. Kulasegaram is assistant professor, Department of Family and Community Medicine, and education scientist, Wilson Centre for Education, University of Toronto, Toronto, Ontario, Canada.

H. McDonald-Blumer is associate professor, Department of Medicine, University of Toronto, Toronto, Ontario, Canada.

E. Lorens is research officer, Department of Medicine, University of Toronto, Toronto, Ontario, Canada.

L. St. Amant is research and curriculum coordinator for postgraduate medical education, University of Toronto, Toronto, Ontario, Canada.

S. Ginsburg is professor, Department of Medicine, and scientist, Wilson Centre for Education, University of Toronto, Toronto, Ontario, Canada.

Funding/Support: This study was funded by a Royal College of Physicians and Surgeons of Canada Medical Education Research Grant.

Other disclosures: None reported.

Ethical approval: This study was approved by the University of Toronto Health Sciences Research Ethics Board.

Previous presentations: International Conference on Resident Education, Niagara Falls, Ontario, Canada (peer-reviewed oral abstract presentation), September 30, 2016; and Research in Medical Education Conference, Association of American Medical Colleges (AAMC), Seattle, Washington (peer-reviewed oral abstract presentation), November 15, 2016.

Supplemental digital content for this article is available at

Correspondence should be addressed to Lynfa Stroud, Sunnybrook HSC, Room D4-70a, 2075 Bayview Ave., Toronto, Ontario, Canada M4N 3M5; telephone: (416) 480-6100, ext. 83627; e-mail:

Competency-based frameworks that guide postgraduate education in Canada (CanMEDS)1 and the United States (Accreditation Council for Graduate Medical Education)2 are increasingly focused on work-based assessments (WBAs). However, there is still a propensity to conceptualize competence as an individual trait,3 despite evidence that individuals are not similarly competent across different environments. For example, physician performance is known to vary over time and across clinical contexts,4–6 and factors such as patient mix, patient complexity, patient volumes, and teamwork are known to complicate WBAs of individual physicians.7 Additionally, it is recognized that not all problems related to physician performance stem from individual physician competence; some may instead reflect interactions between individual and system-related influences.3,8 Despite this, environmental factors are not typically measured or explicitly accounted for in WBAs. Failing to consider contextual factors may lead to erroneous assessments of behaviors9,10 and to missed opportunities for feedback and enhancement of learning.

The influence and direction of environmental effects on performance remain speculative and require investigation. Although we hypothesize that faculty and residents share similar perceptions of their clinical environment given the commonality of their experience and that both would report taking environmental factors into consideration when completing assessments of the other, the nature of this relationship is challenging to anticipate a priori. For example, a positive environment may allow an individual to perform better, which would be reflected in higher ratings; similarly, a challenging environment could cause poor performance and lower ratings. Conversely, a challenging environment might be accounted for during assessment, meaning that an individual may receive a high rating if he or she is judged to be performing well despite the environment.

The “context” of clinical encounters includes three domains and their interactions: patient factors, physician factors, and setting factors.5 The importance of the environment, or setting factors, in medical education is well described in relation to learning and curriculum design,11–13 and to patient safety and health care delivery,14 with the majority of related scales developed to provide feedback on improving the learning environment.15 Several existing scales specifically measure the clinical postgraduate educational environment, such as the Postgraduate Hospital Educational Environment Measure, known as PHEEM,16 and the Dutch Residency Educational Climate Test, or D-RECT.17 However, these also largely evaluate and track experiential aspects of the environment, such as support and supervision, so that programs may optimize the learning environment for their residents, rather than more specific or tangible aspects of the setting. For example, factors such as physical space and resources, hospital policies, care transitions, and access to consultative services may all influence how individuals are able to function in the clinical setting, yet are not incorporated in existing scales.16,17 These additional factors may be worthy of consideration when interpreting resident performance, particularly with implementation of competency-based medical education (CBME), where a greater emphasis is placed on WBAs of residents.

There is growing recognition that attempts to measure competency in the workplace should be contextualized.18–25 For example, for entrustable professional activities (EPAs), an entrustment decision is made about a particular competence within a particular setting and relating to a particular patient complexity.20 However, it is unclear, within the broad concept of context, which specific elements of a given environment—or setting factors—are important to consider in relation to assessments, and how they may affect different aspects of performance. To date, there has been little consideration given to the practice environment in which residents and faculty work and, more specifically, if or how it may affect their assessments.26 Understanding the interplay between the environment and assessment is essential to developing valid judgments about competence.

Therefore, using the previously developed Practice Environment Rating Scale (PERS),27 this study aimed to examine the relationship between ratings of the internal medicine (IM) clinical teaching unit (CTU) practice environment and faculty and residents’ WBA scores. Specifically, we explored faculty and residents’ perspectives of the practice environment during IM CTU rotations, the extent to which each group accounted for environmental factors in their assessments, and the relationship between environmental factors and WBAs.

Method

Rating scale development

In a previous study,28 we conducted focus group discussions with faculty and trainees on IM CTU rotations across five teaching hospitals at the University of Toronto from 2010 to 2011 to determine the major environmental factors, changeable or not, perceived to affect work-based performance in the clinical setting, or practice environment. We subsequently used these data to develop two PERSs, one for faculty (F-PERS) and one for residents (R-PERS), to rate the environment on IM CTU rotations.27 As part of their development, the scales were piloted on a cross-sectional group of faculty (n = 50) and residents (n = 172) on CTU in 2012.27 A principal components analysis determined that the faculty scale (21 items) and the resident scale (27 items) each displayed an eight-factor solution. Analysis demonstrated an adequate value for the Kaiser–Meyer–Olkin (KMO) criterion for sampling adequacy (0.71) for the R-PERS but a KMO of only 0.51 for the F-PERS. However, Bartlett’s test for the identity matrix was significant for both PERSs, indicating appropriateness for factor analysis (P < .02). Good overall scale reliability was also demonstrated for each, with an alpha coefficient of 0.77 for the F-PERS and 0.85 for the R-PERS. The eight factors that we identified were busyness (2 items on each); team function (3 items faculty, 4 items residents); transitions in care (4 items on each); patient complexity (2 items on each); assistance from others (5 items on each); hospital space and resources (3 items faculty, 5 items residents); others’ awareness of my role (1 item faculty, 3 items residents); and one factor that was different for faculty and residents, reflecting their positions: helpfulness of senior resident (for faculty, 1 item) and helpfulness of supervisor (for residents, 2 items).
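The scale reliability figures above are alpha coefficients, presumably Cronbach's alpha. As a minimal sketch of how such a coefficient is obtained, assuming a complete respondents-by-items matrix of Likert ratings, the following Python computes it from item variances and the variance of total scores. The function name `cronbach_alpha` and the toy ratings are illustrative assumptions and are not taken from the study, which used SPSS.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items matrix of numeric ratings."""
    k = len(rows[0])                      # number of items on the scale
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Toy data: 5 respondents x 4 items on a five-point scale (NOT study data).
toy = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 3, 4],
]
alpha = cronbach_alpha(toy)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together across respondents; the sketch does not handle missing items, which would need to be imputed or removed first.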

The proposed factors generated by the analysis were presented to the research team (L.S., H.M.B., S.G.) for expert input and contextualization with qualitative data that accompanied our pilot study. This input led to slight alterations to our scales. Specifically, we added an item about “time and effort required to discharge a patient” to the scales and subdivided two previous questions that combined physical space and resources for patient care and teaching on the ward and in the emergency room into separate items on each scale. Therefore, our final PERSs, after these minor adjustments, included 30 items for the R-PERS (see Supplemental Digital Appendix 1) and 24 items for the F-PERS (see Supplemental Digital Appendix 2). Participants rated each item on the extent to which they perceived that it was present in the CTU environment during their most recent rotation using a five-point Likert scale (1 = almost never, 2 = 25% of the time, 3 = 50% of the time, 4 = 75% of the time, 5 = almost always). Busyness was rated on a five-point scale, but with slightly different anchors (see Supplemental Digital Appendices 1 and 2 for details). Two additional questions asked participants to use five-point Likert scales to rate how optimal the work environment on CTU was (1 = strongly disagree to 5 = strongly agree) and the extent to which they perceived that they took the environment into account when assessing others (i.e., faculty’s perceptions when assessing their residents and residents’ perceptions when assessing their faculty; 1 = not at all, 2 = a little, 3 = somewhat, 4 = a lot, 5 = absolutely). Residents were also asked whether they would recommend their CTU rotation to others on the basis of their experiences.

Once we finalized our PERSs, a research assistant (L.S.A.) conducted cognitive testing with two faculty and three residents (two senior and one junior) to determine how respondents understood the items and how they chose their responses, and to make further small refinements to the clarity and language of the questions.

Data collection

We collected data across one full academic year (July 2014 to June 2015) to ensure that residents would remain in the same postgraduate year (PGY) and to optimize the likelihood that they experienced the same hospital CTU throughout the study period, as residents rotate hospitals between years but seldom within a year. This also permitted us to examine seasonal variation in responses. During this period, all general IM faculty and all residents on inpatient IM CTUs at each of the five teaching hospitals at the University of Toronto were asked to complete a single PERS after each CTU rotation they completed. The PERS was distributed electronically and, for faculty, accompanied the In-Training Evaluation Reports (ITERs) that they completed for each resident; for residents, it accompanied the Resident Assessment of Teaching Effectiveness (RATE) that they completed for faculty. Our ITER and RATE forms have been previously shown to be reliable tools with validity evidence for their use in WBAs in our setting.29–31 Two researchers (L.S.A., E.L.) worked together to extract, download, collate, and link all faculty and resident data from these three forms (PERS, ITER, RATE) into one anonymized database. Participation was voluntary, and all participants provided informed consent. The University of Toronto Research Ethics Board approved this study.

Data analysis

To be able to test all associations in our analysis, we included only data for which we could link a full set of all four forms (F- and R-PERS, ITERs, and RATEs) for a given rotation. To be explicit, for a given rotation this required faculty to have completed an F-PERS of the environment and ITERs of the residents supervised during the block, and the supervised residents to have completed an R-PERS of the environment and RATEs of the same faculty.

We first looked at descriptive data for faculty’s and residents’ ratings of the environment, and for whether they perceived that they took the environment into account in their assessments of others. For ratings of the environment, we calculated an aggregated mean of the items included within each of the eight environmental factors (or subscales) and an aggregated mean of all items on the scale to generate an overall PERS score. We then looked for differences by time of year, hospital site, and their interactions; and, for residents, by whether they were IM or non-IM residents. Differences between perceptions were tested using repeated-measures MANOVAs, with separate tests for differences between sites (coded as each CTU) and across time (season, coded in three-month quarters); for residents we also compared perceptions between IM and non-IM residents (two MANOVAs for faculty and three for residents). Specific comparisons between faculty and residents in willingness to account for environment in assessment and in overall rating of the environment were analyzed using a repeated-measures ANOVA. Missing data for reliability and factor analyses were handled using mean imputation if the respondent had completed over 80% of the scale; otherwise, data were removed listwise. For all other analyses, only participants providing complete data for each PERS form were included so that evaluations of the CTU environment were available from both faculty and residents. Only a small minority of analyzed PERS data had missing items (< 4%). The overall rating was substituted where appropriate.
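The scoring and missing-data rules described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's procedure: the text does not say whether mean imputation used the respondent's own mean or the item mean across respondents, so the sketch assumes the respondent's own mean; the subscale-to-item layout, the function names, and the toy responses are all hypothetical.

```python
from statistics import mean

def impute_or_drop(items, min_complete=0.8):
    """Fill missing items (None) with the respondent's own mean, or return
    None when the respondent did not complete over `min_complete` of the scale."""
    answered = [x for x in items if x is not None]
    if not items or len(answered) / len(items) <= min_complete:
        return None                       # respondent is removed listwise
    fill = mean(answered)                 # ASSUMPTION: person-mean imputation
    return [fill if x is None else x for x in items]

def score_pers(items, subscales):
    """Aggregated mean per subscale plus the overall mean of all items."""
    sub_scores = {name: mean(items[i] for i in idx)
                  for name, idx in subscales.items()}
    return sub_scores, mean(items)

# HYPOTHETICAL 10-item layout; the real F-PERS and R-PERS have 24 and 30 items.
layout = {
    "busyness": [0, 1],
    "team_function": [2, 3, 4, 5],
    "transitions": [6, 7, 8, 9],
}
responses = [4, 5, None, 3, 4, 4, 5, 3, 4, 4]   # one missing item (90% complete)

completed = impute_or_drop(responses)
if completed is not None:
    subscale_means, overall = score_pers(completed, layout)
    print(subscale_means, overall)
```

A respondent with exactly 80% of items answered is dropped here, because the rule in the text is "over 80%".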

To examine the relationship between the PERS and WBAs, we tested correlations between the F-PERS and ITERs, that is, the faculty’s perception of the environment and his or her assessment of the resident; and between the R-PERS and RATEs, that is, the resident’s perception of the environment and his or her assessment of the faculty, at the individual rotation level. Correlation coefficients were computed for the overall aggregated mean and for the eight subscales on the F-PERS with the overall score and with the CanMEDS domain-specific subscores on the ITER (also generated using aggregated means of items within each domain). As the RATE form only has seven items, for these relationships, correlations were computed between the R-PERS overall aggregated mean and between the R-PERS subscale scores and the RATE overall and each of the seven items on the RATE. Analysis of data was corrected for multiple comparisons where possible using the Bonferroni method and for post hoc comparisons using the least significant differences method. We used SPSS statistical software, version 22 (IBM Corporation, Armonk, New York) to analyze the data.
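As a rough illustration of the correlation step, the sketch below computes a Pearson correlation between paired scores and applies a Bonferroni-adjusted threshold to a family of P values. It assumes Pearson correlations (the text does not name the coefficient), and the paired scores, P values, and function names are made-up placeholders, not study data; the actual analysis was run in SPSS.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def bonferroni_flags(p_values, alpha=0.05):
    """True where a P value falls below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# Placeholder rotation-level pairs: overall R-PERS mean and overall RATE score.
r_pers = [3.2, 4.1, 3.8, 4.5, 2.9, 3.6]
rate = [3.5, 4.4, 3.9, 4.8, 3.4, 3.7]
print(f"r = {pearson_r(r_pers, rate):.2f}")

# Placeholder P values for three subscale correlations tested together.
print(bonferroni_flags([0.004, 0.02, 0.2]))
```

With three tests, the adjusted threshold is 0.05 / 3, so only the first placeholder P value would be flagged.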

Results

From July 2014 to June 2015, 54 faculty and 164 residents across 5 teaching hospitals rated their IM CTU environment using the PERS and could be paired with ITER and RATE forms. Overall, 212 unique residents were available to provide data throughout the year, but the number per rotation varied. Average completion rates across each rotation were 39% for the F-PERS and 32% for the R-PERS. Of the 164 learners for whom complete data sets existed, 122 were IM residents (100 PGY-1 and 12 PGY-2 or PGY-3), and 42 were non-IM PGY-1 residents. Responses were equally distributed across hospital sites and seasons.

Perceptions of the environment

On aggregate, residents rated the CTU environment more positively than did faculty, with a moderate effect size (3.91/5 vs. 3.29, F(1, 91) = 17.12, P < .001, d = 0.5) (see Table 1), and this effect was replicated in the repeated-measures MANOVA of all items. None of the correlations between F-PERS and R-PERS ratings reached thresholds for significance after correction for multiple comparisons; further, no correlations exceeded 0.3 in terms of absolute magnitude. There was no effect of hospital site (coded as each CTU) or season (coded in three-month quarters) on either the faculty or resident PERS ratings according to the MANOVA analysis. IM residents rated their environment more positively than non-IM PGY-1s (4.11/5 vs. 3.72, F(1, 160) = 9.32, P < .0001, d = 0.62). Faculty were more likely to report that they take environmental factors into account when assessing residents (3.40/5) compared with residents assessing faculty (2.70, F(1, 92) = 18.2, P < .0001, d = 1.2).

Table 1

Relationship between environmental ratings and assessments

In looking at the relationship between how individuals rated their environment and the assessments that they gave others, different patterns emerged for faculty and for residents. For faculty, there was no correlation between their ratings of the environment on the F-PERS and their assessments of residents working within that environment, either on the overall score or CanMEDS-specific domains of the ITER (Table 2). However, when we looked at residents’ ratings of the environment on the R-PERS and their assessments of faculty supervising them within the same environment, there were small to moderate correlations on the RATE forms (Table 3). For example, higher R-PERS scores on busyness correlated with lower faculty RATE scores (r = −0.23, P = .002). Conversely, better team functioning, better transitions in patient care, and greater awareness of the resident’s role by others were associated with more positive ratings of faculty on the RATE (r = 0.21, P = .006; r = 0.18, P = .02; r = 0.26, P = .01; respectively). The highest correlation was between having helpful supervisors and the overall RATE (r = 0.45, P < .001). There was also a correlation between the overall environment rating by residents and the overall RATE given to faculty (r = 0.27, P = .001).

Table 2

Table 3

Discussion

Our PERSs measured perceptions of factors thought to affect work-based performance in the practice environment. It was important that we determined specific factors in our setting that are not measured by existing scales.16,17 Our scales also sought the faculty perspective of the clinical practice environment, which, to our knowledge, is not typically characterized in other environmental surveys.

Overall, residents rated the CTU environment more positively than faculty did. Somewhat surprisingly, IM faculty’s ratings were even lower than those of non-IM PGY-1s rotating through medicine. Although we did not collect data to specifically explain this result, several possibilities may contribute to our finding. Faculty and residents may have different comparators against which they rated the CTU environment. Faculty may compare the CTU with other clinical settings, such as ambulatory clinics, or with nonclinical settings, such as their research environments, which may be perceived more positively. Conversely, residents may compare the CTU with other clinically demanding rotations, or simply not know any differently. Additionally, residents are able to regularly voice their concerns with the clinical environment as they routinely complete standard generic rotation effectiveness scales at the end of each rotation and undergo an in-person debriefing by the chief resident, which permits them a venue to document issues with each rotation. However, faculty are never asked their thoughts about their work environment; hence, this may have been their first opportunity to express their opinions, which therefore may have disproportionately focused on the negative. Providing faculty with a mechanism through which to anonymously express their perceptions about the CTU environment was an unanticipated positive aspect of this research, and an area that warrants further investigation to understand their perceptions.

Faculty reported being more likely to take environmental factors into account when assessing residents than vice versa; this could be despite or because of their perceptions of the environment. It may be that although they perceive the environment negatively, they are able to separate these perceptions from their assessments, or alternatively, they may situate their assessments in the level of challenge presented by the environment. For example, previous research demonstrates that IM faculty take context into account when assessing professionalism of behavior10 and that their assessments go beyond traditional competencies to include more holistic impressions of performance.32 Additionally, Govaerts and colleagues33,34 found that, in a research setting, more experienced family practice faculty gave greater consideration to contextual cues and made more interpretations about performance in their assessments of learners compared with less experienced colleagues. We did not determine the experience level of our participating faculty, but if particularly experienced faculty contributed a larger share of responses, this may also have influenced our findings. The experience of faculty and how they assess residents on WBAs may be an area for future study. Conversely, residents may view themselves foremost as learners and to an extent as consumers, and therefore perceive that faculty should be teaching/performing to a criterion standard regardless of environmental circumstances.

This tendency for faculty and residents to account differently for environmental factors in assessments may in part explain our findings about the relationship between PERS ratings and WBAs. If faculty take the practice environment into account when assessing residents, this may explain why our faculty PERS ratings did not correlate with their ITER assessments of residents, as faculty may be less influenced by environmental factors or more able to integrate them into their constructed assessments of performance. Conversely, if residents are less likely to account for the setting, their RATE assessments may be more heavily influenced by environmental factors. For example, in a clinical setting with high busyness ratings, faculty may give the benefit of the doubt to residents who perform adequately and adjust or inflate their scores accordingly.27 In contrast, in this setting residents may develop more negative perceptions and rate their faculty lower.

Our findings prompt questions about the conceptualization of WBAs in CBME, whether the practice environment should be considered a construct-relevant or irrelevant source of variance, and whether it explicitly requires measurement. In medical education, historically and more broadly, context specificity is considered as a potential source of nonlearner variance in performance,35 and recent recognition of the importance of the context for WBAs has been described.18–25 Although situating performance within the local environment is essential for WBAs to be perceived as authentic and meaningful for those being assessed,36 our findings suggest that more complexity and nuance may be required when considering how the practice environment may be incorporated, consciously or unconsciously, into different WBAs of residents. It is possible that the influence of the practice environment may differ between ITERs, which aggregate performance of many observations over several weeks, and EPAs, which focus on a specific observation at a particular time.

There has also been a greater appreciation recently for the constructivist and subjective nature of assessments and the value of expert judgment in rating learners.37 Our findings support the constructivist nature of assessments, in that faculty seem to incorporate situational factors in making a judgment. However, exactly how faculty incorporate environmental factors into WBAs is not well understood, and many may struggle with how to do this appropriately.38 Additionally, the way faculty incorporate situational factors may differ according to the type of WBA, such as ITERs or EPAs. These are areas worthy of future exploration that may be well suited to further qualitative studies or vignette studies.

Our study had several limitations worth consideration. By design, we limited our study to one type of environment, and therefore it is unclear whether similar patterns of perceptions and influences on assessments would be observed in other clinical settings, such as in the operating room or in an ambulatory clinic. We also cannot determine causality and directionality of our findings with respect to environmental factors’ influence on residents’ assessments of faculty, and therefore reverse explanations could exist. For example, rather than a rotation with high busyness scores leading to lower RATE evaluations of faculty, a weaker faculty member may make the environment seem worse and result in higher busyness scores. Our residents are also asked to complete other forms at the end of each rotation, and therefore receiving an additional form with the PERS may have led to survey fatigue and decreased our response rate. Also, as the PERS was collected at the end of the rotation, it may have been unduly influenced by recent events. Our conceptualization of environmental setting focused on observable factors, and although we strove to be comprehensive, there may have been unmeasured elements of context that played a role in assessment. Lastly, we did not explore how faculty took environmental factors into account in constructing their assessments. This is another natural next step for further research in this area.

Conclusions

Residents and faculty perceived the same practice environment differently. These perceptions had small but significant correlations with residents’ assessments of faculty, although causality cannot be determined from this study. No relationship was identified between faculty’s ratings of the environment and their assessments of residents, potentially because faculty accounted for environmental factors, thus diminishing their effect. Understanding the interplay between context and assessment is essential to developing valid judgments about competence.

Acknowledgments

The authors would like to thank Ms. Bochra Kurabi, the research assistant who assisted in the development of the initial Practice Environment Rating Scale (PERS) and in the cognitive testing interviews.

References

1. Royal College of Physicians and Surgeons of Canada. The CanMEDS Physician Competency Framework. Accessed September 19, 2018.
2. Accreditation Council for Graduate Medical Education. Milestones. Accessed September 19, 2018.
3. Lingard L. What we see and don’t see when we look at “competence”: Notes on a god term. Adv Health Sci Educ Theory Pract. 2009;14:625–628.
4. Handfield-Jones RS, Mann KV, Challis ME, et al. Linking assessment to learning: A new route to quality assurance in medical practice. Med Educ. 2002;36:949–958.
5. Durning SJ, Artino AR, Boulet JR, Dorrance K, van der Vleuten C, Schuwirth L. The impact of selected contextual factors on experts’ clinical reasoning performance (does context impact clinical reasoning performance in experts?). Adv Health Sci Educ Theory Pract. 2012;17:65–79.
6. Ginsburg S, Bernabeo E, Ross KM, Holmboe ES. “It depends”: Results of a qualitative study investigating how practicing internists approach professional dilemmas. Acad Med. 2012;87:1685–1693.
7. Norcini JJ. Current perspectives in assessment: The assessment of performance at work. Med Educ. 2005;39:880–889.
8. Sturman MC, Cheramie RA, Cashen LH. The impact of job complexity and performance measurement on the temporal consistency, stability, and test–retest reliability of employee job performance ratings. J Appl Psychol. 2005;90:269–283.
9. Lingard L. Rethinking competence in context of teamwork. In: Hodges BD, Lingard L, eds. The Question of Competence: Reconsidering Medical Education in the Twenty-First Century. London, UK: Cornell University Press; 2012:42–70.
10. Ginsburg S, McIlroy J, Oulanova O, Eva K, Regehr G. Toward authentic clinical evaluation: Pitfalls in the pursuit of competency. Acad Med. 2010;85:780–786.
11. Harden RM. The learning environment and the curriculum. Med Teach. 2001;23:335–336.
12. Genn JM. AMEE medical education guide no. 23 (part 1): Curriculum, environment, climate, quality and change in medical education—A unifying perspective. Med Teach. 2001;23:337–344.
13. Genn JM. AMEE medical education guide no. 23 (part 2): Curriculum, environment, climate, quality and change in medical education—A unifying perspective. Med Teach. 2001;23:445–454.
14. Bagian JP, Weiss KB; CLER Evaluation Committee. The overarching themes from the CLER national report of findings 2016. J Grad Med Educ. 2016;8(2 suppl 1):21–23.
15. Soemantri D, Herrera C, Riquelme A. Measuring the educational environment in health professions studies: A systematic review. Med Teach. 2010;32:947–952.
16. Roff S, McAleer S, Skinner A. Development and validation of an instrument to measure the postgraduate clinical learning and teaching educational environment for hospital-based junior doctors in the UK. Med Teach. 2005;27:326–331.
17. Boor K, van der Vleuten C, Teunissen P, Scherpbier A, Scheele F. Development and analysis of D-RECT, an instrument measuring residents’ learning climate. Med Teach. 2011;33:820–827.
18. Lurie SJ, Mooney CJ, Lyness JM. Measurement of the general competencies of the Accreditation Council for Graduate Medical Education: A systematic review. Acad Med. 2009;84:301–309.
19. Hoff TJ, Pohl H, Bartfield J. Creating a learning environment to produce competent residents: The roles of culture and context. Acad Med. 2004;79:532–539.
20. ten Cate O, Snell L, Carraccio C. Medical competence: The interplay between individual ability and the health care environment. Med Teach. 2010;32:669–675.
21. Crossley J, Jolly B. Making sense of work-based assessment: Ask the right questions, in the right way, of the right people. Med Educ. 2012;46:28–37.
22. Lurie SJ. History and practice of competency-based assessment. Med Educ. 2012;46:49–57.
23. Govaerts M, van der Vleuten CP. Validity in work-based assessment: Expanding our horizons. Med Educ. 2013;47:1164–1174.
24. Govaerts MJ, van der Vleuten CP, Schuwirth LW, Muijtjens AM. Broadening perspectives on clinical performance assessment: Rethinking the nature of in-training assessment. Adv Health Sci Educ Theory Pract. 2007;12:239–260.
25. Kuper A, Reeves S, Albert M, Hodges BD. Assessment: Do we need to broaden our methodological horizons? Med Educ. 2007;41:1121–1123.
26. Mitchell M, Srinivasan M, West DC, et al. Factors affecting resident performance: Development of a theoretical model and a focused literature review. Acad Med. 2005;80:376–389.
27. Herold J, Stroud L, Bryden P, Kurabi B, Ginsburg S. Measuring the perceived influence of environmental factors on work-based assessment in medical education. Presented at: Canadian Conference on Medical Education; April 21, 2013; Quebec City, Quebec, Canada.
28. Stroud L, Bryden P, Kurabi B, Ginsburg S. Putting performance in context: The perceived influence of environmental factors on work-based performance. Perspect Med Educ. 2015;4:233–243.
29. Ginsburg S, Eva K, Regehr G. Do in-training evaluation reports deserve their bad reputations? A study of the reliability and predictive ability of ITER scores and narrative comments. Acad Med. 2013;88:1539–1544.
30. Ginsburg S, Brydges R, Imrie K, Lorens E. How should we change our teaching evaluation forms? Lessons from one department of medicine [abstract]. Med Educ. 2012;46(suppl 1):61.
31. Ginsburg S, Brydges R, Imrie K, Lorens E. The nature of residents’ written comments on teaching evaluation forms [abstract]. Med Educ. 2012;46(suppl 1):61.
32. Ginsburg S, Gold W, Cavalcanti RB, Kurabi B, McDonald-Blumer H. Competencies “plus”: The nature of written comments on internal medicine residents’ evaluation forms. Acad Med. 2011;86(10 suppl):S30–S34.
33. Govaerts MJ, Schuwirth LW, van der Vleuten CP, Muijtjens AM. Workplace-based assessment: Effects of rater expertise. Adv Health Sci Educ Theory Pract. 2011;16:151–165.
34. Govaerts MJ, van de Wiel MW, Schuwirth LW, van der Vleuten CP, Muijtjens AM. Workplace-based assessment: Raters’ performance theories and constructs. Adv Health Sci Educ Theory Pract. 2013;18:375–396.
35. van der Vleuten CP. When I say … context specificity. Med Educ. 2014;48:234–235.
36. Eva KW, Bordage G, Campbell C, et al. Towards a program of assessment for health professionals: From training into practice. Adv Health Sci Educ Theory Pract. 2016;21:897–913.
37. Gingerich A, Kogan J, Yeates P, Govaerts M, Holmboe E. Seeing the “black box” differently: Assessor cognition from three research perspectives. Med Educ. 2014;48:1055–1068.
38. Essers G, Dielissen P, van Weel C, van der Vleuten C, van Dulmen S, Kramer A. How do trained raters take context factors into account when assessing GP trainee communication performance? An exploratory, qualitative study. Adv Health Sci Educ Theory Pract. 2015;20:131–147.

© 2019 by the Association of American Medical Colleges