The Teamwork Mini-Clinical Evaluation Exercise (T-MEX): A Workplace-Based Assessment Focusing on Collaborative Competencies in Health Care

Olupeliyawa, Asela M. PhD; O’Sullivan, Anthony J. MD; Hughes, Chris PhD; Balasooriya, Chinthaka D. PhD

doi: 10.1097/ACM.0000000000000115
Research Reports

Purpose Teamwork is an important and challenging area of learning during the transition from medical graduate to intern. This preliminary investigation examined the psychometric and logistic properties of the Teamwork Mini-Clinical Evaluation Exercise (T-MEX) for the workplace-based assessment of key competencies in working with health care teams.

Method The authors designed the T-MEX for direct observation and assessment of six collaborative behaviors in seven clinical situations important for teamwork, feedback, and reflection. In 2010, they tested it on University of New South Wales senior medical students during their last six-week clinical term to investigate its overall utility, including validity and reliability. Assessors rated students in different situations on the extent to which they met expectations for interns for each collaborative behavior. Both assessors and students rated the tool’s usefulness and feasibility.

Results Assessment forms for 88 observed encounters were submitted by 25 students. The T-MEX was suited to a broad range of collaborative clinical practice situations, as evidenced by the encounter types and the behaviors assessed by health care team members. The internal structure of the behavior ratings indicated construct validity. A generalizability study found that eight encounters were adequate for high-stakes measurement purposes. The mean times for observation and feedback and the participants’ perceptions suggested usefulness for feedback and feasibility in busy clinical settings.

Conclusions Findings suggest that the T-MEX has good utility for assessing trainee competence in working with health care teams. It fills a gap within the suite of existing tools for workplace-based assessment of professional attributes.

Supplemental Digital Content is available in the text.

Dr. Olupeliyawa is lecturer, Medical Education Development and Research Centre, Faculty of Medicine, University of Colombo, Colombo, Sri Lanka. At the time of writing, he was a doctoral candidate, School of Public Health and Community Medicine, UNSW Medicine, University of New South Wales, Sydney, Australia.

Dr. O’Sullivan is program authority, UNSW Medicine, and associate professor, Department of Medicine, St. George Clinical School, UNSW Medicine, University of New South Wales, Sydney, Australia.

Dr. Hughes is associate professor, Rural Clinical School, UNSW Medicine, University of New South Wales, Sydney, Australia.

Dr. Balasooriya is director, Medical Education Development, and senior lecturer, School of Public Health and Community Medicine, UNSW Medicine, University of New South Wales, Sydney, Australia.

Funding/Support: This research was part of Dr. Olupeliyawa’s doctoral research project. He was awarded the Australia and New Zealand Association for Medical Education (ANZAME) Postgraduate award, the UNSW Medicine Innovations in Learning & Teaching award, and a UNSW PhD completion scholarship.

Other disclosures: None reported.

Ethical approval: Ethical approval was obtained from the Human Research Ethics Advisory Panel, UNSW, as per ethical guidelines for conducting research in Australia (HREA reference no: 2010-7-29).

Previous presentations: A preliminary analysis of the psychometric and logistic data of this study was presented at the annual conference of the Australia and New Zealand Association for Health Professional Educators; July 2011; Alice Springs, Australia.


Correspondence should be addressed to Dr. Olupeliyawa, Medical Education Development and Research Centre, Faculty of Medicine, 25, Kynsey Road, Colombo-08, Sri Lanka; telephone: +94112695300; e-mail:

The ability to work effectively within health care teams is an important outcome for medical graduates and has significant implications for patient safety. A considerable proportion of medical errors made by trainees occur because of lapses in collaboration,1,2 and interns’ first day of work is associated with increased in-hospital mortality.3 Collaboration with health care teams is an area in which medical graduates may encounter difficulty as they make the transition from student to intern.4,5

Educational strategies for developing medical graduates’ skills in working with health care teams include interprofessional education and simulation-based education. However, reviews of the literature have identified structural and attitudinal barriers to interprofessional clinical education6 and have shown the applicability of simulation to be limited for diverse team practice situations.7 Situated learning theory suggests that apprentices in workplace environments learn by participating in work-based tasks, supported by experienced practitioners.8 It has been argued that stage-appropriate situated learning experiences are important for a medical practitioner’s development of professional attributes, including teamwork.9 Because the training of medical students who are nearing graduation occurs in settings similar to those in which they will practice as interns, using a situated learning approach in clinical clerkships may be an appropriate and sustainable educational strategy.

It has also been argued that assessment practices in higher education should include participation in and feedback on workplace practice and should support students’ learning.10 Direct observation of specific behaviors in realistic contexts has been considered to be important for assessing medical students’ professionalism.11 Self-monitoring informed by external feedback has been recommended as an effective feedback model for health care professionals.12 Thus, the literature suggests that participation in targeted work-based tasks, direct observation with focused feedback, and reflection are important considerations in the design of an effective assessment instrument that also promotes learning.

Existing instruments for workplace-based assessment (WBA) of professional attributes include the Professionalism Mini-Evaluation Exercise (P-MEX)13 for direct observation of behavior in ward-based encounters and the Team Assessment of Behaviors14 for multisource feedback from different health care team members. However, these and other existing WBA tools do not target the behaviors critical for collaboration and may not specifically promote situated learning of teamwork. On the basis of the theoretical concepts of promoting work-based learning described above, we developed, implemented, and evaluated an instrument for the WBA of medical students’ competence in collaborating with health care teams: the Teamwork Mini-Clinical Evaluation Exercise (T-MEX). The T-MEX targets behaviors identified as crucial for effective collaboration as an intern and provides a structured approach for the direct observation of student performance in key clinical team practice contexts, provision of feedback, and student reflection on the feedback.

Current concepts of validity suggest that the response format used in an instrument should be familiar to both the individuals being assessed and the assessors.15 We aimed to develop and validate the T-MEX on the basis of the format of the mini-Clinical Evaluation Exercise (mini-CEX),16 which is a well-accepted and validated instrument for WBA of medical trainees. The mini-CEX format is familiar to medical students and clinical assessors in many settings, including this study’s setting: the University of New South Wales (UNSW) in Sydney, Australia.

Method

T-MEX development

We developed the T-MEX using information we gathered through a review of the literature, interviews with supervisors of interns, discussion with clinical educators, and a Delphi process. In 2009, through our literature review, we identified teamwork competencies within the competency domains of shared understanding, communication, team support, and leadership.17 Using the critical incident technique,18 we then conducted interviews with specialist doctors serving as clinical supervisors of interns in Australia (n = 6) and Sri Lanka (n = 8) and developed behavioral descriptions of 16 important collaborative behaviors.19

Following the supervisor interviews, in a 2009 seminar with clinical educators at UNSW, we discussed options for the assessment of teamwork competencies in relation to the utility formula, which suggests that validity, reliability, acceptability, educational impact, and cost should all be considered in selecting an assessment method.20 There was consensus that direct observation of encounters would be more appropriate than simulation-based observation or multisource feedback, particularly when feasibility and educational impact are considered.

Through a subsequent Australia-wide Delphi study in 2010 with supervisors of interns (n = 103), incorporating exploratory factor analysis of the above-mentioned 16 behaviors, we determined that the “collaborative competencies” for an intern are “safe communication” with the team, “self-awareness [of one’s own role] and responsibility,” and skills to promote “supportive team relationships” (e.g., understanding team roles, respect, assertiveness, leadership skills).19 A recent review of WBA design suggests that outcome-level items (e.g., “establishes rapport”) are more reliable in observation-based assessments than are process-level items (e.g., “shook hands”).21 The first Delphi round19 helped us prioritize 12 of the behaviors and refine the phrasing of the behavioral descriptions at an outcome level. The second Delphi round helped us further prioritize 6 collaborative behaviors (see List 1).

List 1 Collaborative Competencies for Interns and Their Key Component Behaviors*
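
For readers interested in the mechanics of the factor-analytic step described above, a minimal sketch follows. It uses simulated data and our own variable and factor names; the actual analysis used the real Delphi panel ratings.19

```python
# Illustrative only: exploratory factor analysis grouping 16 behavior ratings
# into three factors, mirroring the Delphi-study analysis described above.
# Data are simulated; column and factor names are ours.
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n_raters, n_items = 103, 16  # 103 Delphi panelists rating 16 behaviors
ratings = pd.DataFrame(
    rng.integers(1, 6, size=(n_raters, n_items)).astype(float),  # 1-5 ratings
    columns=[f"behavior_{i + 1}" for i in range(n_items)],
)

# Three factors corresponding to the competency domains reported above.
fa = FactorAnalysis(n_components=3, rotation="varimax", random_state=0)
fa.fit(ratings)

loadings = pd.DataFrame(
    fa.components_.T,
    index=ratings.columns,
    columns=["safe_communication", "self_awareness_responsibility",
             "supportive_team_relationships"],
)
# Items loading highest on a factor define that competency domain.
print(loadings.round(2))
```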

The collaborative behaviors we selected for the T-MEX and the collaborative competencies they represent are shown in Table 1. We also added a measure of overall performance: “demonstrates appropriate collaborative behaviors overall to ensure patient safety.”

Table 1

We developed the response scale for the T-MEX behavior and overall performance items in accord with recommendations from the literature on scale validity and reliability of instruments that use direct observation. Previous research has found mini-CEX interrater reliability to be similar for nine-point and five-point scales,22 but mini-CEX assessors have reported difficulty and variability in synthesizing judgments to numerical rating scales with nine points.23 Therefore, we selected a five-point scale with the option of rating a given behavior as not applicable to the encounter. Another study of direct-observation WBA instruments found that scales anchored to a construct that clinical assessors understand well have better discrimination and better reproducibility than those lacking such anchors.24 Our interviews of intern supervisors19 suggested a common understanding of the level of supervision required for interns. Accordingly, we used “internship” as the frame of reference, asking assessors to rate students on the extent to which they met expectations for an intern. All scale points were coupled with descriptors to prompt developmental feedback (Table 2).

Table 2

We used our findings from the intern supervisor interviews19 to identify the clinical situations within which interns’ collaborative skills are critically important. Accordingly, we selected seven types of clinical encounters as the most relevant for the use of the T-MEX: clinical handover, calling a consult–medical, calling a consult–allied health, patient care discussion, discussing a discharge plan, asking for help, and team meeting. We allowed for additional contexts by including an “other” option. Depending on the encounter type, a student could interact with and be assessed by the same senior team member (e.g., clinical handover, patient care discussion) or interact with one team member and be assessed by another (e.g., calling a consult, team meeting). We asked assessors completing the instrument to identify their role in the team as specialist, registrar (senior resident), resident (postgraduate year 2 and above), senior nurse, or other. The literature suggests that more valid and reliable judgments can be made by assessors who frequently observe performance in contexts that clearly demonstrate the target domain being assessed.21 The T-MEX’s suggested encounter and assessor types were chosen in accord with these recommendations. We also asked assessors to record the time taken for observation and for feedback.

To encourage assessors to provide more specific feedback and students to engage in self-monitoring, we modified the mini-CEX’s format for written feedback. On the T-MEX, we placed two boxes following the ratings for feedback on “best aspects of collaboration” and “specific aspects of collaboration needing development” as well as four feedback descriptors that assessors could check off regarding “other suggestions for development” (e.g., “consider broader patient and team issues”). On the reverse side of the form, we provided two boxes for students’ written reflections: one for reflection on issues raised in the exercise that the student needs to focus on, and one for planning actions for improvement after discussion with the assessor.
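
To summarize the instrument’s structure in one place, the sketch below models the fields of a completed T-MEX form as described above. It is our own illustrative rendering; the field names are hypothetical and do not appear on the form itself.

```python
# Hypothetical data model of a completed T-MEX form (illustrative only).
from dataclasses import dataclass, field
from typing import Optional

ENCOUNTER_TYPES = [
    "clinical handover", "calling a consult - medical",
    "calling a consult - allied health", "patient care discussion",
    "discussing a discharge plan", "asking for help", "team meeting", "other",
]
ASSESSOR_ROLES = ["specialist", "registrar", "resident", "senior nurse", "other"]

@dataclass
class TMEXForm:
    encounter_type: str                    # one of ENCOUNTER_TYPES
    assessor_role: str                     # one of ASSESSOR_ROLES
    # Six behavior ratings, each 1-5 against the "expectations for an intern"
    # frame of reference, or None if not applicable to the encounter.
    behavior_ratings: list[Optional[int]]
    overall_rating: Optional[int]          # collaborative behaviors overall
    best_aspects: str                      # feedback box 1
    aspects_needing_development: str       # feedback box 2
    other_suggestions: list[str] = field(default_factory=list)  # checked descriptors
    student_reflection: str = ""           # reverse side: issues to focus on
    student_action_plan: str = ""          # reverse side: actions after discussion
    observation_minutes: Optional[int] = None
    feedback_minutes: Optional[int] = None
```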

Finally, we asked assessors and students to provide their views (using a five-point Likert scale ranging from strongly disagree to strongly agree) on the feasibility of the T-MEX and its usefulness for giving and soliciting feedback. The T-MEX instrument is available as Supplemental Digital Appendix 1 at

The draft T-MEX was reviewed in September 2010 by five clinical education experts, including three of the six Australian intern supervisors interviewed previously,19 who found that the instrument had face validity. It was then piloted within a clinical handover encounter by a registrar and an intern, who indicated that no further adjustments were required.

T-MEX implementation

In October 2010, two of us (A.O., A.S.) introduced the T-MEX to the approximately 75 final-year medical students in two of UNSW Medicine’s four large, metropolitan clinical schools during the orientation session for their last term. In this six-week Preparation for Internship (PRINT) term, students are assigned to different units where interns practice, and they are expected to actively engage with the health care team in preparation for their internships. Students have completed their summative assessments by the PRINT term. We considered this term to be an ideal time to introduce the T-MEX. All students were briefed, and those who volunteered to participate gave informed written consent. The supervising staff in the units were also informed that a new WBA instrument would be implemented during this term. Ethics approval for this study was obtained from the Human Research Ethics Advisory Panel, UNSW (reference number: 2010-7-29).

Preliminary investigations of assessment formats similar to the T-MEX, such as the mini-CEX16 and the P-MEX,13 have been largely naturalistic in design in that the selection of encounters and assessors was uncontrolled. Furthermore, the concept of constructive alignment suggests that learners should create their own learning and assessment tasks.25 Accordingly, the T-MEX process was student led in this study: Participating students could select both the encounters they wished to be assessed on and the assessors from whom they wished to receive feedback. They could also decide whether to return a particular T-MEX form to our research team via a drop box placed in their clinical teaching unit.

The literature on assessor training for rating-based instruments suggests mixed results, and a recent randomized trial of mini-CEX ratings concluded that accuracy and interrater reliability did not improve after such training.26 The literature on clinical assessment using direct observation suggests that more attention needs to be paid to learner preparation.27,28 We developed written instructions for assessors on the process for observation and feedback, which we included on the reverse of the T-MEX form, and we raised assessor awareness indirectly through student preparation. During the PRINT orientation session, we encouraged students to seek a variety of encounter types and relevant assessors, to review the T-MEX form and prepare for the assessment, and to engage in an active process of feedback and reflection. In addition to copies of the T-MEX form, we gave students two single-page instruction sheets: One described the possible T-MEX encounters, and the other provided answers to what we anticipated would be frequently asked questions. Students did not receive significant incentives for participation.

Analysis

We analyzed completed T-MEX forms for evidence on the overarching concept of construct validity29 and aspects of utility.20 The descriptive statistics on the clinical contexts attempted by the students and on the assessors’ team roles addressed content validity (i.e., whether these representative clinical contexts important for interns could be sampled by students in clinical settings). Assessors’ designations of collaborative behaviors as applicable (by providing a rating) or not applicable (by not providing a rating or by marking “not applicable”) to the encounters also addressed content validity. The time taken for observation and feedback provided evidence on feasibility. The assessors’ and students’ ratings on T-MEX feasibility and usefulness for feedback provided “reaction”-level30 evidence for acceptability and educational impact. The internal structure of the T-MEX was analyzed for internal consistency (Cronbach’s alpha), and item analysis (item–total correlations and highest/lowest-rated items) was conducted to investigate reliability and construct validity (in its traditional sense).
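
As an illustration of these reliability computations, a minimal sketch follows. It uses simulated stand-in ratings (88 forms by six behavior items, as in this study); the helper functions implement the standard formulas for Cronbach’s alpha and corrected item–total correlations.

```python
# Minimal sketch of the internal-consistency analysis; data are simulated.
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def corrected_item_total(items: pd.DataFrame) -> pd.Series:
    """Correlation of each item with the total of the remaining items."""
    return pd.Series(
        {c: items[c].corr(items.drop(columns=c).sum(axis=1)) for c in items.columns}
    )

# Simulated stand-in for 88 forms x 6 behavior ratings on a 1-5 scale.
rng = np.random.default_rng(1)
shared = rng.normal(size=(88, 1))  # common "collaborative ability" signal
ratings = pd.DataFrame(
    np.clip(np.round(3 + shared + rng.normal(scale=0.7, size=(88, 6))), 1, 5),
    columns=[f"behavior_{i + 1}" for i in range(6)],
)

print(f"Cronbach's alpha: {cronbach_alpha(ratings):.2f}")
print(corrected_item_total(ratings).round(2))
```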

We used the GENOVA program31 to perform a generalizability (G) study as well as decision studies to predict the number of encounters and items required for sufficient generalizability. The information provided on the completed T-MEX forms supported the inference that a different assessor rated a different student at each encounter. Thus, encounters (which include assessor variance) were nested within candidates (variance due to student performance). Each item score was considered individually to investigate the variance due to items. The items were considered a random facet: The universe of items was potentially much larger, because even the 16 collaborative behaviors initially identified as important by intern supervisors in the interviews were later reduced to 6 in the Delphi process of prioritization.19 Thus, the design of our G study was (Encounter:Candidate) × Item, or (E:C) × I. A similar design has been used in previous studies to measure the generalizability of other single-encounter rating forms such as the mini-CEX.32,33 To circumvent the possible violation of local independence of items among different encounters, we repeated the G study on the basis of a total score across items per encounter (i.e., per form), using a design of Candidate × Form, or C × F. We considered different suggestions for analyzing naturalistic data as balanced designs and selected a process of culling to minimize data loss.34 Consequently, we used a sample of three forms each from 18 students (54 forms) in the analysis.
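
For reference, under this (E:C) × I design the generalizability coefficient for a measurement based on $n_e$ encounters and $n_i$ items takes the standard form from generalizability theory (our notation, not taken from the article):

$$
E\rho^{2} = \frac{\sigma^{2}_{c}}{\sigma^{2}_{c} + \dfrac{\sigma^{2}_{e:c}}{n_{e}} + \dfrac{\sigma^{2}_{ci}}{n_{i}} + \dfrac{\sigma^{2}_{ei:c}}{n_{e}\,n_{i}}}
$$

where $\sigma^{2}_{c}$ is the candidate (true-score) variance, $\sigma^{2}_{e:c}$ the variance due to encounters (and assessors) nested within candidates, $\sigma^{2}_{ci}$ the candidate-by-item interaction, and $\sigma^{2}_{ei:c}$ the residual error. Decision studies project reproducibility by varying $n_e$ and $n_i$ in this expression.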

Results

Content validity of the encounters, assessors, and applicable behaviors

Forty of the approximately 75 eligible students (about 53%) volunteered to participate. Of the volunteers, 25 (62%) submitted T-MEX forms for 88 observed encounters. The students attempted all of the proposed clinical contexts (Table 3). Many students submitted three or four forms, and two students submitted six forms. On the occasions when the “other” encounter type was selected, the descriptions suggested similarity to the proposed contexts.

Table 3

The 88 T-MEX encounters were assessed by 23 clinicians. Eighty-three (92%) of the encounters were assessed by 20 medical professionals, mostly residents (69% of encounters). Two nurses and a physiotherapist also assessed students in 5 encounters (6%). Ten allied health consults and three team meetings were assessed, suggesting that in some encounters students interacted with allied health professionals while being observed by a medical professional. The range of encounter types suggests that the T-MEX provided opportunities for engagement with many members of the team.

Of the six collaborative behaviors, four were rated as applicable to > 95% of the encounters, and two were rated as applicable to > 87% of the encounters. These findings suggest that none of the behaviors were redundant and that all of the behaviors were applicable to the clinical contexts suggested for T-MEX encounters, supporting the content validity.


Time taken for the T-MEX

The mean time for T-MEX performance (i.e., interacting with and/or observing the student) was 11 minutes, and the mean time for giving feedback was 8 minutes. Although there was some variability (ranges: T-MEX performance, 2–50 minutes; feedback, 1–30 minutes), the assessment tasks were completed in 5 to 15 minutes in 81% of the encounters, while feedback was provided in another 5 to 15 minutes in 74% of the encounters. These results suggest that the T-MEX can be conducted as a brief activity in a clinical work-based setting, consistent with the mini-CEX format.16 This logistics evidence supports the feasibility and cost-effectiveness of the T-MEX.


Internal structure

The Cronbach’s alpha of the 88 T-MEX forms was high: 0.91 when just the six collaborative behavior items were considered, increasing to 0.93 when the overall performance item was included. The interitem correlations were moderately high, between 0.50 and 0.75. The differences in the mean scores for the six collaborative behavior items were statistically significant (P = .039; tested through ANOVA). Students were rated lowest on the two behaviors representing the competency of “safe communication” (see Table 1 for items). The item–total correlations were between 0.67 and 0.85, and Cronbach’s alpha did not improve with the deletion of any item. These findings suggest that all of the behaviors are relevant, that assessors are able to distinguish between the behaviors, and that no “outlier” behaviors are being rated.


Generalizability study

As described above, our G study used a sample of three forms each for 18 students (54 forms). The G coefficient for three T-MEX forms was 0.62. Student performance (C), or “true” variability, contributed the second largest variance component (Table 4). A student’s score on a particular item also varied with the assessor/encounter (E:C). The item variance component (I) was low, suggesting some range restriction. The largest variance component, EI:C (the “error variance”), is divided by the number of encounters when multiple T-MEX forms are combined.16 Norcini et al16 note that variance due to a candidate improving his or her performance is subsumed in this “error variance”; such improvement is in fact an artifact representing educational impact, an apt consideration given the formative use of the T-MEX.

Table 4

We used decision studies to predict the number of encounters and the number of items needed to improve generalizability. The G coefficients for increasing numbers of encounters were as follows: 0.77 for six encounters, 0.79 for seven encounters, and 0.81 for eight encounters. This suggests that a G coefficient > 0.8 can be achieved with eight T-MEX encounters. The G coefficients for increasing numbers of items were as follows: 0.64 for seven items, 0.65 for eight items, and 0.66 for nine items. Accordingly, the improvement in reproducibility that can be expected from increasing the number of assessed behaviors is small. In the design based on a total average score across items per encounter (i.e., C × F), the results were similar (G coefficient = 0.63, predicted G coefficient with eight encounters = 0.82).
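
To make these projections concrete, the arithmetic for the C × F design can be reproduced from the reported coefficients alone. A minimal sketch follows; the per-form error variance here is back-calculated from the reported G coefficient of 0.63 for three forms, not taken from Table 4.

```python
# Decision-study projection for the C x F analysis described above.
# For the mean score over n forms, G = var_c / (var_c + var_e / n), where
# var_c is true-score (candidate) variance and var_e is per-form error variance.

def g_coefficient(var_c: float, var_e: float, n_forms: int) -> float:
    """Generalizability coefficient for the mean of n_forms forms."""
    return var_c / (var_c + var_e / n_forms)

var_c = 1.0                            # normalize true-score variance to 1
var_e = 3 * var_c * (1 - 0.63) / 0.63  # ~1.76, implied by G = 0.63 at 3 forms

for n in (3, 6, 7, 8):
    print(f"{n} forms: G = {g_coefficient(var_c, var_e, n):.2f}")
# Output: 0.63 at 3 forms and 0.82 at 8 forms, matching the reported C x F
# results; the intermediate values approximate the (E:C) x I projections.
```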


Assessor and student perceptions of usefulness and feasibility

The assessor and student ratings of the usefulness and feasibility of the T-MEX are provided in Table 5. The majority of assessors (68%) and of students (74%) found the T-MEX to be useful for feedback purposes and practical to conduct in the clinical environment (72% and 64%, respectively). These findings support the view that the assessment was acceptable to assessors and students and had a positive educational impact.

Table 5

Discussion

The T-MEX is a WBA instrument applicable to a broad range of situations important for collaborative clinical practice by interns, as evidenced by the types of encounters selected and the behaviors assessed in this study. Previous research has shown that trainees mainly learn aspects of professionalism through the informal curriculum.35 Our findings suggest that the T-MEX can be used to provide some structure to optimize and formalize this learning process as it relates to teamwork in clinical settings.

Previous research has explored the dimensionality of the mini-CEX through factor analysis.33 The information compiled in this study’s completed T-MEX forms was insufficient for such analysis. However, construct validity of the items is suggested by two considerations: the T-MEX collaborative behavior items were selected through factor analysis and prioritization in our Delphi study,19 ensuring sufficient construct representation, and the T-MEX ratings in this study showed distinct patterns.

The reproducibility evidence for the T-MEX (a G coefficient of 0.8 predicted with eight forms) is consistent with that of other WBA tools of similar format. A G coefficient of 0.8 is acceptable for high-stakes judgments,34 and this level can be achieved with a sample of 10 encounters in WBAs using direct observation.28 Eight T-MEX forms are also sufficient to cover all of the different clinical situations that the T-MEX targets. Students found completing three to six T-MEX encounters feasible in a six-week term, which suggests that a sufficiently reliable assessment could feasibly be conducted over a longer period (e.g., over the course of two or three clerkships). Increasing the number of behaviors assessed would not greatly improve reproducibility, a finding consistent with the literature on clinical performance ratings.36 This is an important consideration for keeping the assessment succinct in clinical settings.

The T-MEX is acceptable to both students and assessors in terms of its educational value and its logistics. It has good feasibility for busy clinical settings, as supported by the brief mean times for observation of encounters and provision of feedback. These findings signal the T-MEX’s utility as a WBA instrument.

As this study shows, the T-MEX can be implemented within routine clinical activities. Our findings may be transferable to other clinical environments in which medical students prepare for internship. Some preliminary evidence on external validity is available from a supporting study37 we conducted at the University of Colombo in Colombo, Sri Lanka. The validity, reliability, and feasibility evidence from this second setting supports the findings discussed here.

Further research may address potential limitations of this study. Students may gain limited experience in some situations because of contextual constraints in clinical work-based settings (e.g., multidisciplinary team meetings). Clinical educators may need to purposefully focus on such encounters in future iterations of the T-MEX. Some contexts, such as “asking for help” in an emergency, may be better assessed through simulations, although other situations of uncertainty, such as discussing patient management plans, would be appropriate for the T-MEX. Further research with larger samples of students completing multiple T-MEXs in different clinical contexts, with different health professionals acting as assessors, and with datasets that address threats to validity (such as the free choice of contexts and assessors) would support more informed inferences about the validity of the T-MEX (e.g., the construct validity of the items).

Conclusions

This preliminary investigation suggests that the T-MEX is a valid, reliable, and feasible instrument for WBA of senior medical students’ competence in collaborating with health care teams. It is designed to facilitate focused feedback and reflection. Although existing WBA tools have targeted direct observation of professional behavior13 and feedback on professional behavior from different members of the health care team,14 there is a need in medical education for assessment tools that focus on the interactional component of clinical competence.38 By addressing collaborative behaviors in key, specific health care team practice situations, the T-MEX represents an important addition to WBA.

Acknowledgments: The authors wish to acknowledge the support of the heads of the UNSW Medicine clinical schools during the implementation of this study, and the students and clinical staff who participated. The authors also wish to acknowledge Dr. Gominda Ponnamperuma for his support in the generalizability analysis.

References

1. Singh H, Thomas EJ, Petersen LA, Studdert DM. Medical errors involving trainees: A study of closed malpractice claims from 5 insurers. Arch Intern Med. 2007;167:2030–2036
2. Coombes ID, Stowasser DA, Coombes JA, Mitchell C. Why do interns make prescribing errors? A qualitative study. Med J Aust. 2008;188:89–94
3. Jen MH, Bottle A, Majeed A, Bell D, Aylin P. Early in-hospital mortality following trainee doctors’ first day at work. PLoS One. 2009;4:e7103
4. Lempp H, Cochrane M, Rees J. A qualitative study of the perceptions and experiences of pre-registration house officers on teamwork and support. BMC Med Educ. 2005;5:10
5. Lawson M, Bearman M, Jones A. Department of Education, Science and Training, Australian Medical Education Study: Intern Case Study Report. Melbourne, Australia: Monash University; 2007
6. Davidson M, Smith RA, Dodd KJ, Smith JS, O’Loughlan MJ. Interprofessional pre-qualification clinical education: A systematic review. Aust Health Rev. 2008;32:111–120
7. Buljac-Samardzic M, Dekker-van Doorn CM, van Wijngaarden JD, van Wijk KP. Interventions to improve team effectiveness: A systematic review. Health Policy. 2010;94:183–195
8. Lave J, Wenger E. Situated Learning: Legitimate Peripheral Participation. New York, NY: Cambridge University Press; 1991
9. Hilton SR, Slotnick HB. Proto-professionalism: How professionalisation occurs across the continuum of medical education. Med Educ. 2005;39:58–65
10. Boud D, Falchikov N. Aligning assessment with long-term learning. Assess Eval Higher Educ. 2006;31:399–413
11. Ginsburg S, Regehr G, Hatala R, et al. Context, conflict, and resolution: A new conceptual framework for evaluating professionalism. Acad Med. 2000;75(10 suppl):S6–S11
12. Archer JC. State of the science in health professional education: Effective feedback. Med Educ. 2010;44:101–108
13. Cruess R, McIlroy JH, Cruess S, Ginsburg S, Steinert Y. The Professionalism Mini-evaluation Exercise: A preliminary investigation. Acad Med. 2006;81(10 suppl):S74–S78
14. Whitehouse A, Hassell A, Bullock A, Wood L, Wall D. 360 degree assessment (multisource feedback) of UK trainee doctors: Field testing of team assessment of behaviours (TAB). Med Teach. 2007;29:171–176
15. Cook DA, Beckman TJ. Current concepts in validity and reliability for psychometric instruments: Theory and application. Am J Med. 2006;119:166.e7–166.e16
16. Norcini JJ, Blank LL, Arnold GK, Kimball HR. The mini-CEX (clinical evaluation exercise): A preliminary investigation. Ann Intern Med. 1995;123:795–799
17. Olupeliyawa A, Hughes C, Balasooriya C. A review of the literature on teamwork competencies in healthcare practice and training: Implications for undergraduate medical education. South East Asian J Med Educ. 2009;3(2):61–72
18. Flanagan JC. The critical incident technique. Psychol Bull. 1954;51:327–358
19. Olupeliyawa AM, O’Sullivan A, Hughes C, Balasooriya CD. Transition to clinical practice as a medical graduate: What collaborative competencies and behaviors are critical? Focus Health Prof Educ. 2013;14(2):57–70
20. Van Der Vleuten CP. The assessment of professional competence: Developments, research and practical implications. Adv Health Sci Educ Theory Pract. 1996;1:41–67
21. Crossley J, Jolly B. Making sense of work-based assessment: Ask the right questions, in the right way, about the right things, of the right people. Med Educ. 2012;46:28–37
22. Cook D, Beckman T. Does scale length matter? A comparison of nine- versus five-point rating scales for the mini-CEX. Adv Health Sci Educ Theory Pract. 2009;14:655–664
23. Kogan JR, Conforti L, Bernabeo E, Iobst W, Holmboe E. Opening the black box of clinical skills assessment via observation: A conceptual model. Med Educ. 2011;45:1048–1060
24. Crossley J, Johnson G, Booth J, Wade W. Good questions, good answers: Construct alignment improves the performance of workplace-based assessment scales. Med Educ. 2011;45:560–569
25. Biggs J. Enhancing teaching through constructive alignment. High Educ. 1996;32:347–364
26. Cook DA, Dupras DM, Beckman TJ, Thomas KG, Pankratz VS. Effect of rater training on reliability and accuracy of mini-CEX scores: A randomized, controlled trial. J Gen Intern Med. 2009;24:74–79
27. Fromme HB, Karani R, Downing SM. Direct observation in medical education: A review of the literature and evidence for validity. Mt Sinai J Med. 2009;76:365–371
28. Pelgrim EA, Kramer AW, Mokkink HG, van den Elsen L, Grol RP, van der Vleuten CP. In-training assessment using direct observation of single-patient encounters: A literature review. Adv Health Sci Educ Theory Pract. 2011;16:131–142
29. Messick S. The interplay of evidence and consequences in the validation of performance assessments. Educ Res. 1994;23:13–23
30. Kirkpatrick D. Evaluation of training. In: Craig R, Bittel L, eds. Training and Development Handbook. New York, NY: McGraw-Hill; 1967
31. Crick JE, Brennan RL. GENOVA: A General Purpose Analysis of Variance System [computer program]. Version 3.1. Iowa City, Iowa: Center for Advanced Studies in Measurement and Assessment, University of Iowa; 2001
32. Kogan JR, Bellini LM, Shea JA. Feasibility, reliability, and validity of the mini-clinical evaluation exercise (mCEX) in a medicine core clerkship. Acad Med. 2003;78(10 suppl):S33–S35
33. Cook DA, Beckman TJ, Mandrekar JN, Pankratz VS. Internal structure of mini-CEX scores for internal medicine residents: Factor analysis and generalizability. Adv Health Sci Educ Theory Pract. 2010;15:633–645
34. Crossley J, Davies H, Humphris G, Jolly B. Generalisability: A key to unlock professional assessment. Med Educ. 2002;36:972–978
35. Stern DT. In search of the informal curriculum: When and where professional values are taught. Acad Med. 1998;73(10 suppl):S28–S30
36. Williams RG, Klamen DA, McGaghie WC. Cognitive, social and environmental sources of bias in clinical performance ratings. Teach Learn Med. 2003;15:270–292
37. Olupeliyawa A. Identification of Collaborative Competencies for Internship and the Development of an Assessment Strategy That Facilitates Learning [thesis]. Sydney, Australia: University of New South Wales; 2012
38. Holmboe ES, Sherbino J, Long DM, Swing SR, Frank JR. The role of assessment in competency-based medical education. Med Teach. 2010;32:676–682

Supplemental Digital Content

© 2014 by the Association of American Medical Colleges