Working effectively within teams has been recognized by medical educators as an important competency for learners.1,2 Teams are increasingly being used in medical education to enhance active learning and foster better interpersonal communication skills. A variety of methods that employ teams in classroom-based teaching settings, such as problem-based and team-based learning (TBL), have become part of many medical schools’ curricula. These methods were adopted because of the recognition that effective team processes improve learning outcomes.3,4
Even though the contribution of effective teams to learning outcomes has been recognized, there are few empirical data to help educators design and evaluate the performance of those teams, especially in classroom-based educational activities. Systematic study in this area is limited in part by a lack of validated practical tools to measure the quality of small-group or team interactions.5,6
The purpose of this study was to develop and test the validity and reliability of a simple instrument to measure the quality of learning team interactions in medical education settings.
Method
Item development and refinement
We reviewed and discussed the literature on team and small-group processes. From these discussions and the literature,7,8 we identified several overlapping features of high-functioning teams: high levels of engagement by all team members, discussion of the team's work at deep (rather than superficial) conceptual levels, and a strong sense of team identity. Using these features as a guide, a writing team consisting of clinical educators (P.H., R.E.L.), educational specialists (B.F.R., F.K.), and a measurement specialist (P.A.K.) generated a pool of 30 scale items to assess the quality of team interactions. The items were designed to be completed at the end of a course by students retrospectively reflecting on all of their team experiences during the course.
We distributed the 30 items to 409 students in 13 different courses that used TBL as the major pedagogical method at three different medical schools throughout the United States (Baylor College of Medicine, The University of Texas Medical Branch, and The Wright State Boonshoft School of Medicine). TBL settings in medical schools were chosen to test and refine our instrument because these settings create a structure that allows team processes to emerge mainly as a result of interactions between the individual students assigned to each team. Specifically, TBL uses teams without individual faculty facilitators (theoretically putting all teams on equal footing at the outset of a course), directs all of the teams in a course to solve a common set of real-world application-based tasks, and requires all teams to perform their work in parallel in the same educational space, such as a lecture hall. In these schools, students were also assigned to teams rather than being able to self-select their team members.
Psychometric evaluation and validation of the Team Performance Scale
To assess the psychometric properties of the Team Performance Scale (TPS) and validate it in a second sample, we next distributed the TPS to second-year students in an evidence-based medicine course at Baylor College of Medicine during the 2006–2007 and 2007–2008 academic years. This course used TBL as its major pedagogical method and also used a peer evaluation process at the end of the course. This additional measure used the method of Michaelsen and Fink.8 Briefly, each student was given 100 points to distribute to the other students on their team based on how "helpful" each member's contribution to the team activities had been. Students were required to distribute all 100 points and were not allowed to give points to themselves. Thus, each team member's total peer evaluation score was the sum of the points distributed to him or her by teammates.
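As an illustration of this scoring scheme, the following minimal Python sketch (using numpy; the point allocations shown are hypothetical) computes each member's total peer evaluation score for a single six-member team:

```python
import numpy as np

# Hypothetical illustration: rows = raters, columns = ratees, for one
# six-member team. Each rater distributes exactly 100 points among
# teammates and may not rate himself or herself (diagonal is zero).
points = np.array([
    [ 0, 20, 20, 20, 20, 20],
    [25,  0, 20, 20, 15, 20],
    [20, 20,  0, 25, 15, 20],
    [20, 20, 20,  0, 20, 20],
    [20, 25, 15, 20,  0, 20],
    [20, 20, 20, 20, 20,  0],
])

assert (points.sum(axis=1) == 100).all()  # every rater gave out all 100 points
assert (np.diag(points) == 0).all()       # no self-ratings allowed

total_peer_score = points.sum(axis=0)     # column sums: points each member received
print(total_peer_score)
```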
We calculated TPS scores at the individual and team levels. We determined an individual TPS score by calculating a summed score for the 18 TPS items (possible total of 108 points) for each student. We determined a team TPS score by averaging the individual TPS scores of all team members.
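A minimal sketch of this scoring, assuming responses are stored as a students-by-items array of 0–6 ratings (the data here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
responses = rng.integers(3, 7, size=(6, 18))  # hypothetical team: 6 students x 18 items, rated 0-6

individual_tps = responses.sum(axis=1)        # per-student summed score; 0-108 possible
team_tps = individual_tps.mean()              # team score: mean of members' individual scores
print(individual_tps, round(float(team_tps), 1))
```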
We also examined variability in peer evaluation scores at the individual and team levels. We determined the spread of each individual's peer evaluation ratings by calculating the standard deviation of the scores that each student had assigned to the others on his or her team. To determine variability at the team level, we calculated the standard deviation of team members' total peer evaluation scores within each team.
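Both variability measures can be computed directly from the peer-point matrix, as in this sketch (the matrix has the same hypothetical structure as in the earlier example, with zero self-ratings on the diagonal):

```python
import numpy as np

# Hypothetical 6x6 peer-point matrix (rows = raters, columns = ratees;
# zero diagonal because self-rating was not allowed).
points = np.array([
    [ 0, 20, 20, 20, 20, 20],
    [25,  0, 20, 20, 15, 20],
    [20, 20,  0, 25, 15, 20],
    [20, 20, 20,  0, 20, 20],
    [20, 25, 15, 20,  0, 20],
    [20, 20, 20, 20, 20,  0],
])

n = points.shape[0]
ratings_given = points[~np.eye(n, dtype=bool)].reshape(n, n - 1)  # drop self-ratings
individual_spread = ratings_given.std(axis=1, ddof=1)  # SD of the ratings each student gave
team_spread = points.sum(axis=0).std(ddof=1)           # SD of members' total received scores
print(individual_spread.round(2), round(float(team_spread), 2))
```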
We examined internal consistency for the overall instrument using Cronbach's alpha. We determined convergent validity at the individual and team levels by examining correlations between the TPS and peer evaluation ratings. Convergent validity assesses the "convergence," or correlation, of scores on one measure with scores on another measure that assesses related dimensions.9 On the basis of previous work, we expected that TPS scores would be inversely correlated with the degree of variability (measured by standard deviations for individual as well as team scores) in peer evaluation ratings.
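A sketch of both computations, assuming numpy and scipy are available (all data shown are hypothetical):

```python
import numpy as np
from scipy import stats

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items matrix."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of summed scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

rng = np.random.default_rng(1)
demo_items = rng.integers(3, 7, size=(50, 18)).astype(float)  # hypothetical responses
print(round(cronbach_alpha(demo_items), 2))

# Convergent validity: Pearson correlation between TPS scores and the
# spread (SD) of peer evaluation ratings; a negative r is expected.
tps_scores = np.array([101.0, 96.0, 88.0, 104.0, 92.0, 99.0])  # hypothetical
peer_spread = np.array([2.1, 3.5, 6.0, 1.2, 4.8, 2.9])         # hypothetical
r, p = stats.pearsonr(tps_scores, peer_spread)
print(round(r, 2), round(p, 3))
```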
We also examined descriptive and inferential statistics. To determine whether the TPS could be used to detect differences between learning teams, we conducted an ANOVA with team as the independent variable and TPS score as the dependent variable. We determined effect size using eta squared (η²). These studies were approved by the institutional review board of the sponsoring institution, Baylor College of Medicine.
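A minimal sketch of this analysis with scipy (the team memberships and scores below are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical: individual TPS scores grouped by team.
teams = [
    np.array([101, 96, 104, 99, 92, 97]),  # team 1
    np.array([ 80, 85,  78, 90, 83, 88]),  # team 2
    np.array([ 95, 98, 100, 94, 97, 96]),  # team 3
]

# One-way ANOVA with team as the factor.
f_stat, p_value = stats.f_oneway(*teams)

# Eta squared = between-group sum of squares / total sum of squares.
all_scores = np.concatenate(teams)
grand_mean = all_scores.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in teams)
ss_total = ((all_scores - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total
print(round(f_stat, 2), round(p_value, 4), round(eta_squared, 2))
```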
Results
We performed exploratory factor analysis with varimax rotation to identify latent dimensions among the 30 items, with an eye toward item reduction. In defining dimensions, we used scree plot examination, eigenvalues ≥ 1.0, and a minimum of two items with loadings > 0.40 on a factor. On the basis of these criteria, we identified one dimension with 18 items. We organized this reduced set of 18 items into a final instrument, which we named the Team Performance Scale (TPS). Each item stem was followed by a seven-point response scale anchored by "none of the time" (0) and "all of the time" (6). Instructions at the beginning of the TPS directed respondents to rate each item on the basis of their overall experience with their team during the entire course. The 18 TPS item stems are included in Table 1.
Table 1: Team Performance Scale (TPS) Items
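For readers wishing to reproduce this type of item-reduction analysis, the following sketch illustrates one approach using scikit-learn's FactorAnalysis with varimax rotation; the data are random placeholders, so the resulting factor structure is illustrative only:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 30))  # hypothetical: 400 respondents x 30 candidate items

# Eigenvalues of the item correlation matrix drive the scree/Kaiser criteria.
eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]  # descending
n_factors = int((eigvals >= 1.0).sum())  # Kaiser criterion: eigenvalues >= 1.0

fa = FactorAnalysis(n_components=n_factors, rotation="varimax").fit(X)
loadings = fa.components_.T  # items x factors loading matrix

# Flag items with a salient loading (> 0.40) on at least one factor.
keep = np.abs(loadings).max(axis=1) > 0.40
print(n_factors, int(keep.sum()))
```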
Overall, 309 second-year students divided into 60 teams completed the TPS across the two annual iterations of the evidence-based medicine course, representing a 95% response rate. Of those students, 47% were female and 44% were white, reflecting the demographics of the overall student body at Baylor College of Medicine. Students reported that the 18-item TPS took approximately five minutes to complete.
The instrument had high internal consistency, with a Cronbach's alpha of 0.97. The means for each item ranged from 4.9 to 5.5 (out of 6). Table 1 lists the items and the overall mean and standard deviation for each item. The 18 items explained 66% of the total item variance.
We calculated TPS scores at the individual and team levels. Most of the students rated their team performance highly, with a mean individual TPS score of 96.0 (SD = 14) out of a possible 108 points. The 60 learning teams had an average team TPS score of 95.7 (SD = 8.5), ranging from a low of 75.7 to a high of 107.
We assessed the convergent validity of the TPS by determining the correlation (r) between TPS scores and variability in peer evaluation ratings at the individual and team levels. Analysis revealed a statistically significant negative correlation between TPS scores and variability in peer evaluation ratings at both the individual (r = −0.23, P < .001) and the team (r = −0.38, P = .003) levels.
We assessed the ability of the TPS to differentiate between teams. Results from our ANOVA indicated that TPS scores differed significantly between teams (P < .001). These differences represented a large effect size (η² = 0.33).10
Discussion
In this study, we created an instrument that is simple and quickly administered, can be used at the end of a course that employs learning teams, and can give educators a measure of the quality of the interactions within teams over the duration of a course. In our validation sample, the TPS exhibited several favorable psychometric properties: a short administration time, high internal consistency, a proportion of explained item variance (66%) supporting construct validity, and evidence of triangulation with a published peer evaluation scheme8 that was administered separately. In addition, we noted that the TPS could distinguish between teams, suggesting that this tool has merit for assessing performance at the level of teams. With such high reliability as measured by Cronbach’s alpha, it would be feasible to use a subsample of items to further reduce the already low respondent burden of the instrument.
In a previous study, students who used a similar peer evaluation system cited contributions to the group process and advance preparation as the most important parameters that they used to make decisions about how to assign peer evaluation points.11 In addition, in other TBL courses that used a points-based peer evaluation system that forced students to differentiate (e.g., a given student could give no more than two of his or her teammates the same score), qualitative feedback from students about the peer evaluation process produced two different types of comments. Students who were in teams that they perceived were working well together often felt that the peer evaluation process unfairly forced them to differentiate, whereas students who were in more dysfunctional groups tended to feel empowered to score their peers higher or lower according to observed performance.11 In the present study, our peer evaluation process did not force students to differentiate, leaving them free to assign the same number of points to everyone in their group if they wished to do so. Nevertheless, our analysis showed that some groups did differentiate in the scores they assigned one another. The negative correlation between TPS scores and peer review spread that we observed adds quantitative evidence to the earlier qualitative observation that students in highly functioning teams feel that all team members should share in the benefits of the peer evaluation rewards process, whereas students in teams that function less well tend to differentiate when scoring team members.
Practical implications
The significance of the TPS is that it provides educators with a tool to quantitatively assess the quality of team or small-group interactions, especially in settings that employ small groups, such as problem-based learning or TBL. Educators commonly employ learning teams based on the notion that more favorable group processes promote better learning outcomes.3,4 In theory, the learning team or small-group process itself is influenced by a number of factors, including the characteristics of a group's members, the skill of the teacher, and the type of educational methods used. The TPS now gives researchers a tool to begin to collect empirical evidence about such theorized connections. Furthermore, the TPS gives educators a tool to evaluate educational innovations that utilize teams or small-group interactions, with a focus on the quality of team interactions. Future research should determine whether this instrument can detect changes in the quality of team interactions over time, such as by administering the instrument midway through and again at the end of a rotation or course. In our evidence-based medicine course, for example, we plan to use the TPS as a curriculum evaluation tool to measure the relative abilities of different team activities to produce high-quality intrateam interactions. With the growing emphasis on team interactions in the clinical setting, as well as on simulation-based team training experiences,12,13 this instrument may prove useful for evaluating team processes.
Limitations
This study has several limitations. Because we performed this study in the setting of a real-world course, the outcome measures that we employed were limited to those available as part of the regular course evaluation scheme. For practical reasons, we employed the TPS in TBL classrooms. Although it is unclear at this point how the TPS might perform in other learning team applications, such as problem-based learning or clinical ward teams, the content of the TPS items may prove germane to the conversations that often occur in such settings.
Conclusions
In conclusion, the TPS has evidence of very good psychometric properties and initial evidence of validity. It appears to measure the overall quality of learning-team or small-group interactions in medical education settings. Future study should be aimed at testing the performance of the instrument in additional small-group and team settings and at assessing correlations between group performance and other learning outcomes.
Acknowledgments
This research was supported in part through grants from the Fund for the Improvement of Postsecondary Education (#P116B010948) and the National Institutes of Health (K07 HL082629-01). The Houston Center for Quality of Care and Utilization Studies is supported by grant HFP 90-020 from the Office of Research and Development, U.S. Department of Veterans Affairs.
References
1 Accreditation Council for Graduate Medical Education. Common Program Requirements: General Competencies. Available at: http://www.acgme.org/outcome/Comp/compFull.asp. Accessed June 16, 2009.
2 Association of American Medical Colleges. Medical School Objectives Project. Learning Objectives for Medical Student Education: Guidelines for Medical Schools. Available at: https://services.aamc.org/Publications/showfile.cfm?file=version87.pdf&prd_id=198&prv_id=239&pdf_id=87. Accessed June 16, 2009.
3 Knight AB. Team-based learning: A strategy for transforming the quality of teaching and learning. In: Michaelsen LK, Knight AB, Fink LD, eds. Team-Based Learning: A Transformative Use of Small Groups. Westport, Conn: Praeger; 2002:201–212.
4 Levine RE, O’Boyle M, Haidet P, et al. Transforming a clinical clerkship with team learning. Teach Learn Med. 2004;16:270–275.
5 Healey AN, Undre S, Vincent CA. Developing observational measures of performance in surgical teams. Qual Saf Health Care. 2004;13(suppl 1):i33–i40.
6 Delphin E, Davidson M. Teaching and evaluating group competency in systems-based practice in anesthesiology. Anesth Analg. 2008;106:1837–1843.
7 Fink LD. Beyond small groups: Harnessing the extraordinary power of learning teams. In: Michaelsen LK, Knight AB, Fink LD, eds. Team-Based Learning: A Transformative Use of Small Groups. Westport, Conn: Praeger; 2002:3–26.
8 Michaelsen LK, Fink LD. Calculating peer evaluation scores. In: Michaelsen LK, Knight AB, Fink LD, eds. Team-Based Learning: A Transformative Use of Small Groups. Westport, Conn: Praeger; 2002:233–244.
9 Nunnally JC, Bernstein IH. Psychometric Theory. 3rd ed. New York, NY: McGraw-Hill; 1994.
10 Kline RB. Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research. Washington, DC: American Psychological Association; 2004.
11 Levine RE, Kelly PA, Karakoc T, Haidet P. Peer evaluation in a clinical clerkship: Students' attitudes, experiences, and correlations with traditional assessments. Acad Psychiatry. 2007;31:19–24.
12 Clancy CM, Tornberg DN. TeamSTEPPS: Assuring optimal teamwork in clinical settings. Am J Med Qual. 2007;22:214–217.
13 Rosen MA, Salas E, Wilson KA, et al. Measuring team performance in simulation-based training: Adopting best practices for healthcare. Simul Healthc. 2008;3:33–41.