Team-Based Learning Analytics: An Empirical Case Study

Koh, Ying Yun Juliana MA; Schmidt, Henk G. PhD; Low-Beer, Naomi MD, MEd; Rotgans, Jerome I. PhD

doi: 10.1097/ACM.0000000000003157

Abstract

In the past decade, there has been a move to blend information technology (IT) with active learning strategies. One such active learning strategy, team-based learning (TBL), serves as a good example of how IT can be successfully integrated with active learning. Blending IT with TBL not only allows for innovative ways to manage or structure the learning process but also presents many opportunities to explore learning analytics, which Siemens and Baker define as “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs.”1

To date, limited research exemplifies how data that are routinely collected and stored can be used to help inform educational practices or help students in need. We believe that this function is underused and holds many opportunities to enhance learning during TBL. In this article, we provide a case study that illustrates how learning analytics can be used to help TBL instructors understand and improve student learning.

Overview of TBL

TBL is an active learning strategy that is learner centered but instructor led.2 It is a highly structured approach that is made up of a sequence of activities designed to allow the instructor to provide frequent feedback to students as they work to master course concepts. The structure and activities in TBL are designed to hold students accountable for both their out-of-class preparation and in-class collaboration with peers.3 TBL moves beyond the basic acquisition of facts by emphasizing the importance of applying knowledge to real-life scenarios through intragroup and intergroup discussions of problems that are designed to foster complex reasoning and debate.4

TBL is composed of 3 distinct phases: (1) the preparation phase, (2) the readiness assurance phase, and (3) the knowledge application phase.5 The preparation phase occurs before a TBL session. Students are required to study the materials assigned by the instructor in advance so that they are prepared for the session. The readiness assurance phase occurs during the TBL classroom session. Students’ knowledge of the topic, gained from the preparation phase, is tested through 2 identical multiple-choice-type tests (typically consisting of 20 to 30 items). Students complete the first test, the individual readiness assurance test (iRAT), individually without consulting resources or discussing items or materials with other students. After all the students have completed the iRAT, they work together in their teams to complete the team readiness assurance test (tRAT). The tRAT consists of the same items as the iRAT, but team members are allowed to discuss the answers. They must reach a consensus and submit what they collectively think is the correct answer. Students receive immediate feedback on their tRAT responses through computer-generated feedback, indicating whether the response they have chosen is correct. They then get a chance to raise questions and seek clarifications from the instructor. The instructor, in turn, provides elaborative feedback to the students. The knowledge application phase also occurs during the TBL classroom session. Students complete an application exercise (AE) in which they are presented with case studies or real-life problems that professionals have faced. In their teams, the students apply what they have learned during the first 2 phases to work toward a resolution to the problem. The instructors provide answers and additional explanations in case students need further clarification.

Technology in TBL

As educational technologies have become increasingly available, the trend has been to incorporate them with TBL.6–11 This blending has taken on various forms, such as video conferencing, online lectures, or social media (e.g., Twitter).9 The most common way to incorporate educational technology within TBL is the use of a learning management system (LMS).9

An LMS is a web-based software application designed to assist instructors in meeting their pedagogical goals and delivering learning content.12 Functions of an LMS include but are not limited to posting course materials and content information, serving as a forum for discussions, and providing examination tools.8 One benefit of an LMS is that learners can access the content information posted by instructors at any time of the day, complete online quizzes, submit homework, and communicate with other learners and instructors in the same community. Examples of LMS platforms include iLAMS,11 Blackboard (https://www.blackboard.com/webapps/login; Washington, DC), and Moodle (https://moodle.com; Perth, Australia). These types of LMSs are used across a wide variety of instructional settings and for a wide variety of approaches.

The main use of the LMS in the TBL context is to assist the instructor in controlling the administrative and learning processes during TBL. As discussed by Robinson and Walker,6 the use of technology such as the LMS can help ease the logistic processes during TBL, making the whole process more efficient and reliable. In addition to accurately capturing students’ attendance, an LMS allows the instructor to control when students may access the materials for each TBL phase. For instance, the instructor may create a “gate” so that students can access the tRAT only after everyone has completed the iRAT.11 This control allows the instructor to guide the students, especially those who are new to TBL, through the learning process. At the same time, the LMS has a major role as an assessment system. LMS functionality includes marking the students’ iRATs and tRATs immediately during each TBL session, recording individual student scores and team scores in real time, and providing accurate feedback to the students during the tRAT portion of the TBL session.

An LMS incorporated into TBL can capture what happens before an actual TBL session, for instance, how students are accessing and downloading the prereading materials. Beyond enhancing the efficiency of conducting TBL sessions and assessing learners’ performance, each LMS features an underused function: the ability to routinely collect and store a large amount of data concerning students’ online activities and performance. For instance, the LMS can capture the amount of time each student spends answering every individual item in iRAT, tRAT, and AE. All these data allow for learning analytics, and findings from these analyses can be used both to track students’ learning progress and, importantly, to identify specific students whose performances indicate they may need help.

Given the capacity of an LMS to accurately capture and record student-specific and item-specific data, it is surprising that there are hardly any accounts in the literature demonstrating how such data can be used for monitoring and improving the process of learning in TBL. We believe that the availability of these data is an important but currently neglected feature of technology-blended TBL that deserves to be highlighted. Therefore, in this article, we present real data from a relatively new medical school (graduating its first class in 2018) that has adopted TBL as its main instructional approach. By means of a narrative case study, we illustrate how we used data collected by the LMS to explore the extent to which teams in TBL were underperforming, to demonstrate the breadth and depth of the data available, and to explain the kinds of analyses that can be performed with the existing LMS-generated data.

Educational Context and Case Study

Background

The data used in this case study originated from, as mentioned, a newly established medical school. Lee Kong Chian School of Medicine (LKCSoM) at Nanyang Technological University in Singapore uses TBL as its main pedagogical approach during the first 2 years of its undergraduate Bachelor of Medicine and Bachelor of Surgery (MBBS) program. The program is an integrated systems-based curriculum, sequenced and structured according to different body systems and foundational concepts. All students must complete 8 modules over the span of 2 years (4 modules per year). The 4 modules in Year 1 are (1) Introduction to Medical Sciences, (2) The Cardiorespiratory System, (3) The Renal and Endocrine System, and (4) The Musculoskeletal System and Skin. In Year 2, the 4 modules are (1) The Gastrointestinal System, Blood, and Infection (GIBI), (2) The Neuro System, Ear-Nose-Throat, and Eyes, (3) Reproduction Medicine and Child Health, and (4) Mental Health, Aging and Family Medicine.

On average, students attend 2 TBL sessions per week. Each TBL session lasts about 6 hours and requires an additional 4 to 6 hours of preclass preparation (see Rajalingam and colleagues11 for an overview). At LKCSoM, all TBL sessions are managed and delivered through an LMS, referred to as iLAMS.11

Our case began with a simple observation made during the 2017–2018 academic year by an instructor at the end of one of the Year 2 modules—specifically GIBI, the first module all students take at the start of their second year. The module was 14 weeks in duration and comprised 25 TBL sessions. Standard procedure at LKCSoM is to conduct an End-of-Module Review Meeting, during which all instructors and the module lead come together to discuss how the module went and to consider any possible changes. During this meeting, one of the instructors noted a concern: Some teams appeared to struggle throughout the module. The instructor—who had been present during several TBL sessions and had listened in while some teams discussed the tRAT questions—observed that some teams appeared unable to adequately justify their answers, whereas other teams did a perfect job in explaining and justifying their answers. This instructor asked if data were available to determine empirically if some teams were performing significantly better than others.

This question started an investigation in which we accessed data routinely collected by iLAMS. Here we present the 4 steps of our investigation to illustrate how the LMS provided answers both to our original question and the subsequent questions that arose based on the insights gained.

A total of 107 Year 2 medical students (71 males [66.4%]) were grouped into 18 TBL teams. Each team consisted of 6 students, with the exception of one team, which consisted of 5 students. Of the 25 TBL sessions for the GIBI module, 17 had full attendance. The remaining 8 sessions had 1 to 2 absentees. During each TBL session for the GIBI module, students completed an average of 22 multiple-choice questions (MCQs) on both the iRAT and the tRAT. The iRAT took an average of 15.2 minutes to complete, and the tRAT took an average of 81.1 minutes to complete. While all the students had to respond to the iRAT, only the team leader of each team submitted the team’s responses to the MCQs in the tRAT. At LKCSoM, none of the TBL activities are graded; that is, scores on the iRAT, tRAT, and AE are not part of the students’ final grade.

Analysis 1: Can we identify a team that is performing significantly weaker than the other teams?

In our view, the first logical step in addressing the question of whether there were indeed weaker performing teams during the GIBI module was to retrieve the mean tRAT scores for that module. The mean tRAT scores represented the average score of all 25 TBL sessions for each team (see Figure 1 for a depiction of the results). To determine whether any teams’ performances were actually poorer than those of others, we determined a cutoff score, below which we considered all scores to be statistically significantly lower than the mean. The overall mean tRAT score of the class was 93.8%, and the value 2 standard deviations (2 SD = 2.7%) below this mean was 91.1%; thus, we considered all scores below 91.1% to be statistically significantly lower than the mean.
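The cutoff rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the use of the sample standard deviation is an assumption (the article does not say which estimator was used), and the example scores are invented rather than taken from the study.

```python
from statistics import mean, stdev


def low_score_cutoff(team_means, n_sd=2.0):
    """Return mean(team means) minus n_sd sample standard deviations."""
    scores = list(team_means.values())
    return mean(scores) - n_sd * stdev(scores)


def teams_below_cutoff(team_means, n_sd=2.0):
    """Return the sorted team labels whose mean score is below the cutoff."""
    cutoff = low_score_cutoff(team_means, n_sd)
    return sorted(team for team, score in team_means.items() if score < cutoff)


# Hypothetical mean tRAT percentages for 10 teams (NOT the study data).
example_means = {team: 94.0 for team in range(1, 10)}
example_means[10] = 90.0

print(round(low_score_cutoff(example_means), 2))
print(teams_below_cutoff(example_means))
```

With the figures reported in the article, the same rule gives 93.8% − 2.7% = 91.1%, and Team 4's mean of 90.9% falls just below that cutoff.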

Figure 1:
Graphical representation of the mean team readiness assurance test (tRAT) scores (expressed as the percentage correct) for all 18 teams on the Gastrointestinal System, Blood, and Infection (GIBI) module for Year 2 medical students who matriculated into Lee Kong Chian School of Medicine at Nanyang Technological University in 2016. The y-axis range is from 90.0% to 96.0%.

Only one team fell below the 91.1% threshold; Team 4 had a mean tRAT score of 90.9%, which is statistically significantly lower than the mean scores of all teams. Of course, statistical significance does not imply educational significance. Figure 1 puts Team 4’s mean tRAT score into perspective. The y-axis in Figure 1 is plotted on a narrow range of just 90.0% to 96.0%. When the data are plotted on a y-axis ranging from 0.0% to 100.0%, the findings appear quite different. On that scale, the difference between the teams is hardly visible, and Team 4 does not really appear to be an underperforming team.

Notably, all teams appeared to have performed rather well since they all had a mean tRAT score of 90.0% or above. Team 4 did have a significantly lower mean tRAT score, but, considering the overall high scores and the fact that the difference was only marginal and hardly of educational significance, we did not worry that any team was performing at a critically low level during the GIBI module.

However, at this point, we realized that we could do more with the data we had access to. Indeed, we detected no large differences in mean tRAT scores across teams, but we also realized that the mean scores could be hiding important information. For instance, mean scores did not show trends over time. In theory, Team 4 could have performed well at the start of the GIBI module but poorly during the second half. The same could have been true for other teams. Thus, we decided to explore how teams performed longitudinally over the course of the GIBI module, across all 25 TBL sessions. To that end, we again accessed our routinely collected TBL data from iLAMS.

Analysis 2: Is the weaker team consistently performing more poorly throughout the module?

To better understand how Team 4 performed longitudinally during the entire GIBI module, we retrieved the tRAT scores for each of the 25 TBL sessions for further analysis. We hypothesized that if Team 4 was really a weaker team, then its tRAT scores would be substantially lower for many of the TBL sessions when compared with the other teams. To compare Team 4’s performance with those of the other teams, we first generated the mean tRAT scores of all teams for each of the 25 TBL sessions and then plotted these scores on a graph. Given the large number of teams (N = 18), we have not presented a line graph here. Instead, we generated the mean tRAT score of the other 17 teams for each TBL session and compared those scores with Team 4’s scores across the 25 TBL sessions (see Figure 2).
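The per-session comparison described above (one team against the mean of all remaining teams) might be computed as follows. This is a sketch under stated assumptions: the data layout (a dictionary mapping each team to its list of per-session tRAT percentages) and the function name are hypothetical, and the scores shown are invented for illustration, not exported from iLAMS.

```python
from statistics import mean


def team_vs_rest(scores, team):
    """Compare one team with the mean of all other teams, session by session.

    scores maps a team label to a list of per-session percentages; all lists
    are assumed to have the same length. Returns a list of
    (session_number, team_score, mean_of_other_teams) tuples.
    """
    rows = []
    for index, team_score in enumerate(scores[team]):
        others = [vals[index] for label, vals in scores.items() if label != team]
        rows.append((index + 1, team_score, mean(others)))
    return rows


# Hypothetical tRAT percentages for 3 teams over 2 sessions (NOT the study data).
example_scores = {
    4: [80.0, 100.0],
    1: [90.0, 90.0],
    2: [100.0, 80.0],
}

for session, team_score, rest_mean in team_vs_rest(example_scores, team=4):
    print(session, team_score, rest_mean)
```

Plotting the second and third columns against the session number would give a two-line chart of the kind described for Figure 2.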

Figure 2:
Graphical representation of how one team (Team 4) of Year 2 medical students (at Lee Kong Chian School of Medicine at Nanyang Technological University) performed in comparison to the remaining 17 teams on the team readiness assurance test (tRAT) for each individual team-based learning session (N = 25) for the Gastrointestinal System, Blood, and Infection (GIBI) module during academic year 2017–2018. The y-axis range is from 70.0% to 100.0%.

Figure 2 illustrates that Team 4 had generally somewhat lower tRAT scores compared with the rest of the class but performed well and even better on some occasions (e.g., TBL sessions 2, 7, and 21). Given this outcome, we felt we could not definitively conclude that Team 4 was performing poorly on the majority of TBL sessions. Still, we could not deny that Team 4 more consistently displayed a weaker tRAT performance on numerous TBL sessions compared with other teams. Curious to learn more, we raised a new question: Was Team 4 perhaps composed of academically weaker students?

Analysis 3: Are the students in Team 4 generally weaker academically?

To address this third question, we decided to investigate whether Team 4 was accidentally composed of students who were weaker academically. Importantly, at LKCSoM, student teams stay together for the duration of a full academic year, so extensive planning goes into the team composition. For instance, teams typically comprise students with similar educational backgrounds (i.e., specialization in biology vs physics in junior college), age, and gender distribution. To study the academic strength of the individuals in Team 4, we extracted students’ performance data, specifically entry test results and prior academic achievement scores. At LKCSoM, all students had to complete both the BioMedical Admission Test (BMAT, www.bmat.org.uk) and the Multiple Mini Interviews (MMI13) before enrollment. For the prior academic achievement scores, we included 2 sets of scores from end-of-year examinations that all students had completed at the end of their first year.

Using SPSS software (version 25; IBM, Armonk, New York), we conducted 4 analyses of variance (ANOVAs) and detected no significant differences on the 4 measures among the teams:

  • BMAT; F = 0.84, P = .64
  • MMI; F = 0.89, P = .59
  • Examination 1; F = 0.55, P = .92
  • Examination 2; F = 0.28, P = .99
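For readers without SPSS, the F statistic of a one-way ANOVA can be computed directly from its definition. The sketch below uses only the Python standard library; the function name and the toy data are hypothetical, and it does not compute the P value, which requires the F distribution (available in statistical packages). If all 107 students had a score on a measure, the degrees of freedom for 18 teams would be 17 (between teams) and 89 (within teams).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists, one per team."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)

    # Between-group sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-group sum of squares: squared distance of each score
    # from its own group mean.
    ss_within = 0.0
    for group in groups:
        group_mean = mean(group)
        ss_within += sum((score - group_mean) ** 2 for score in group)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy data for 3 groups (NOT the study data).
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova_f(toy_groups))
```

An F value near or below 1, as reported for all 4 measures, indicates that the variation between team means is no larger than would be expected from the variation among students within teams.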

Given these findings, we ruled out the possibility that Team 4 was composed of students with generally weaker academic ability.

We then considered an alternative explanation for Team 4’s generally poorer tRAT performance: that the students in Team 4, compared with the other teams, simply prepared less effectively for the TBL sessions in the GIBI module. To test this alternative hypothesis, we extracted the iRAT scores from the LMS database.

Analysis 4: Are students in Team 4 preparing less for class?

The iRAT scores are the individual test scores that represent how well each student had studied the preparation materials for the TBL sessions. We reasoned that if Team 4 did indeed prepare less thoroughly, then its members’ mean iRAT scores would be significantly lower than the mean iRAT scores of other teams.

Figure 3, which shows the mean iRAT scores for all teams, illustrates that Team 4 had a lower mean team iRAT score compared with all other teams: M = 66.9% (vs 70.0% or higher). While this difference in mean iRAT scores did not reach statistical significance (F = 1.33, P = .20), Team 4’s lower iRAT scores are quite obvious. The fact that Team 4 consistently (but not always) scored lower on various TBL indicators fueled further investigation.

Figure 3:
Graphical representation of the mean individual readiness assurance test (iRAT) percentage scores on the Gastrointestinal System, Blood, and Infection (GIBI) module for all 18 teams of Year 2 medical students at the Lee Kong Chian School of Medicine at Nanyang Technological University during academic year 2017–2018. The y-axis range is from 0.0% to 100.0%.

We realized that mean iRAT scores may be masking information (similar to what we observed with mean tRAT scores). Theoretically, some students on a team might perform well, while others might not. The mean iRAT score would not reflect these variations. We considered the possibility that only 1 or 2 students were not preparing well, thereby affecting the mean score for the whole team. To make the variance within teams visible, we used a box-and-whisker plot.14

The box-and-whisker plot provides information about each team’s median, upper and lower quartiles, and highest and lowest observations (see Figure 4). The median score of each team is indicated by the thick line in each box; it marks the midpoint of the data and divides the box into 2. Half of the scores are greater than this midpoint, and the other half are less than this midpoint. Since there are 6 students on each team (with the exception of Team 17, which had 5 students), 3 students’ iRAT scores fall above, and 3 students’ scores fall below, the median. The ends of the box are the upper quartile (the 75th percentile; 75% of the scores fall below it) and the lower quartile (the 25th percentile; 25% of the scores fall below it), so the box spans the interquartile range (IQR). The IQR represents the middle 50% of the scores of the whole team, and since there are 6 students on each team, the IQR represents the middle 3 students’ iRAT scores. The 2 lines outside the box are referred to as “whiskers,” and these mark the highest and lowest observations that are not flagged as outliers. The length of the box indicates whether the team is heterogeneous or homogeneous. For instance, Team 8 is a more heterogeneous group; the length of its box is relatively long, indicating that the iRAT scores were spread widely. Team 2, on the other hand, is a more homogeneous group for iRAT performance, and the length of its box is shorter.

Figure 4:
Box-and-whisker plot demonstrating the distribution of the individual readiness assurance test (iRAT) percentage scores on the Gastrointestinal System, Blood, and Infection (GIBI) module for all 18 teams of Year 2 medical students at the Lee Kong Chian School of Medicine at Nanyang Technological University during academic year 2017–2018.

Based on this box-and-whisker plot (Figure 4), Team 4 does not appear to have performed poorly when compared with the other teams; indeed, its median (67.2%) is not much lower than that of Team 12 (68.2%). In addition, Team 4 appears to be among the more homogeneous groups given that its box is relatively short, similar to those of Teams 5 and 7, for example.

More important, the box-and-whisker plot also shows outliers. In the plot, an outlier is a data point that lies more than 3 times the IQR beyond the edge of the box, and it is represented by a filled circle or star (see Team 2 in Figure 4). A data point that lies more than 1.5 times (but no more than 3 times) the IQR beyond the edge of the box is regarded as a suspected outlier and is represented by an unfilled circle (see Teams 4 and 7 in Figure 4).
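The outlier rule described above can be illustrated with a short sketch. It assumes that distances are measured from the nearest edge of the box and uses the "inclusive" quartile method of Python's `statistics.quantiles`; statistical packages use slightly different quartile conventions, so the exact fences may differ from those in Figure 4. The function name, the labels, and the scores are hypothetical.

```python
from statistics import quantiles


def classify_outliers(scores):
    """Label each score as "within", "suspected", or "extreme".

    "suspected": more than 1.5 x IQR beyond the nearest box edge.
    "extreme":   more than 3 x IQR beyond the nearest box edge.
    """
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    labels = []
    for score in scores:
        # Distance beyond the nearest box edge (0 if the score is inside the box).
        distance = max(q1 - score, score - q3, 0)
        if distance > 3 * iqr:
            labels.append((score, "extreme"))
        elif distance > 1.5 * iqr:
            labels.append((score, "suspected"))
        else:
            labels.append((score, "within"))
    return labels


# Hypothetical iRAT percentages (NOT the study data).
print(classify_outliers([50, 60, 62, 64, 66, 68, 70]))
print(classify_outliers([40, 60, 62, 64, 66, 68, 70]))
```

In both example lists the quartiles are 61 and 67 (IQR = 6), so a score of 50 lies between 1.5 and 3 IQRs below the box, whereas a score of 40 lies more than 3 IQRs below it.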

Among the 3 identified outliers, the outlier in Team 4 received the lowest mean iRAT score (on average, 57.5%) in the class, and as a result, the team obtained the lowest median score of 67.2%. This outcome suggests that one student on Team 4 did not perform well. Intrigued by this finding, we retrieved all 25 iRAT scores of this student for the entire GIBI module. We found that this student’s iRAT score was below 50% on 6 TBL sessions. Furthermore, on 2 sessions, the student’s iRAT scores were as low as 33.3%.

Discussion

Our objective was to highlight how data that are routinely collected by an LMS during TBL can be used for learning analytics. In our view, this integration of IT and TBL makes sense and has benefits that go beyond streamlining the learning process for students and instructors. We intended to highlight how data that are routinely stored in an LMS are often neglected but can be used to generate detailed insights into the learning process. By examining data that the institution routinely collects, medical educators may be able to respond more easily to learning or curricular needs identified by faculty. To that end, we presented a case study illustrating how we used such data. Of course, we could have presented many other scenarios, but we thought showing the breadth and depth of our data would be informative. Our use of LMS data for learning analytics began with a single instructor’s general observation that some teams were struggling in a module, and our step-by-step analyses, driven by data-informed questions, resulted in the identification of one individual student who caused a ripple in the sea of data stored in the LMS. (After we identified the student, a study skills counselor provided support; the student improved academically and is now coping well.)

An interesting by-product of our analysis is that the data provide insights into the effectiveness of TBL as an instructional approach. Although Team 4 obtained the lowest tRAT and iRAT scores in the class, the data demonstrate that teamwork in TBL seems to do its intended job. Team 4 turned out to be a relatively weak team because of one student’s lack of preparation, yet the weaker team member benefited from the stronger ones. Indeed, Team 4’s Team Gain Score (i.e., the difference between the mean tRAT and mean iRAT scores) was the highest in the class: mean tRAT − mean iRAT = 90.9% − 66.9% = 24.0%. Thus, Team 4 gained, on average, 24.0% as a result of the team discussions. Among all 18 teams, the average Team Gain Score was 18.7% (SD = 2.6%), with a minimum value of 13.6% and a maximum value of 24.0%. This is a testament to the effectiveness of TBL and the important role the team discussions fulfill in TBL.
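The Team Gain Score is simple enough to compute for every team at once. In the sketch below, the function name and the data layout are hypothetical; the Team 4 values are those reported in this article, and the second team's values are invented for contrast.

```python
def team_gain_scores(team_scores):
    """Map each team to its gain: mean tRAT minus mean iRAT (percentage points).

    team_scores maps a team label to a (mean_trat, mean_irat) pair.
    """
    return {team: trat - irat for team, (trat, irat) in team_scores.items()}


example_team_scores = {
    4: (90.9, 66.9),  # values reported for Team 4 in this article
    1: (95.0, 80.0),  # hypothetical values, NOT from the study
}

gains = team_gain_scores(example_team_scores)
top_team = max(gains, key=gains.get)
print(top_team, round(gains[top_team], 1))
```

Ranking teams by this value shows where team discussion adds the most to individual preparation, which is how Team 4's gain of 24.0% compares with the class average of 18.7%.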

Notably, the case study we have presented here provides just one example of the many ways LMS data can be used in TBL. We have not explored other types of data stored in the LMS. For instance, time-on-task data (e.g., time taken to complete iRAT and tRAT) can provide important instructional information. We must also emphasize that the data presented and used in this case study are specific to the questions we needed to answer, as well as to the type and amount of data collected by iLAMS. We were able to show teams’ and individual students’ trend data because of how our TBL was structured and how the LMS was designed. Depending on the queries posed and the type of data available, data may be mined, applied, and presented differently. In this particular case, we used only the well-established measures of TBL and thus examined only a small portion of what is possible.

In conclusion, the case presented here illustrates how institutions that are considering adopting TBL—or those that already have—can use data routinely collected via the LMS during TBL to answer important educational questions and make informed decisions about interventions that can help students improve their learning. This accurate data collection and recording tool is a powerful hidden feature of TBL that has often been neglected in characterizing this popular instructional approach.

References

1. Siemens G, Baker RSJd. Learning analytics and educational data mining: Towards communication and collaboration. Paper presented at: Proceedings of the 2nd International Conference on Learning Analytics and Knowledge; April 29, 2012; Vancouver, British Columbia, Canada. https://www.upenn.edu/learninganalytics/ryanbaker/LAKs%20reformatting%20v2.pdf. Accessed December 11, 2019.
2. Koles PG, Stolfi A, Borges NJ, Nelson S, Parmelee DX. The impact of team-based learning on medical students’ academic performance. Acad Med. 2010;85:1739–1745.
3. Parmelee DX, Hudes P. Team-based learning: A relevant strategy in health professionals’ education. Med Teach. 2012;34:411–413.
4. Haidet P, Levine RE, Parmelee DX, et al. Perspective: Guidelines for reporting team-based learning activities in the medical and health sciences education literature. Acad Med. 2012;87:292–299.
5. Hrynchak P, Batty H. The educational theory basis of team-based learning. Med Teach. 2012;34:796–801.
6. Robinson DH, Walker JD. Technological alternatives to paper-based components of team-based learning. In: Michaelsen LK, Sweet M, Parmelee DX, eds. Team-Based Learning: Small-Group Learning’s Next Big Step. San Francisco, CA: Jossey-Bass; 2008:79–85.
7. Palsolé S, Awalt C. Team-based learning in asynchronous online settings. In: Michaelsen LK, Sweet M, Parmelee DX, eds. Team-Based Learning: Small-Group Learning’s Next Big Step. San Francisco, CA: Jossey-Bass; 2008:87–95.
8. Antoun J, Nasr R, Zgheib NK. Use of technology in the readiness assurance process of team based learning: Paper, automated response system, or computer based testing. Comput Human Behav. 2015;46:38–44.
9. River J, Currie J, Crawford T, Betihavas V, Randall S. A systematic review examining the effectiveness of blending technology with team-based learning. Nurse Educ Today. 2016;45:185–192.
10. Gomez EA, Wu D, Passerini K. Computer-supported team-based learning: The impact of motivation, enjoyment and team contributions on learning outcomes. Comput Educ. 2010;55:378–390.
11. Rajalingam P, Rotgans JI, Zary N, Ferenczi MA, Gagnon P, Low-Beer N. Implementation of team-based learning on a large scale: Three factors to keep in mind. Med Teach. 2018;40:582–588.
12. Machado M, Tao E. Blackboard vs. Moodle: Comparing user experience of learning management systems. Paper presented at: 2007 37th Annual Frontiers in Education Conference—Global Engineering: Knowledge Without Borders, Opportunities Without Passports; October 13, 2007; Milwaukee, Wisconsin. http://fie2012.fie-conference.org/sites/fie2012.fie-conference.org/history/fie2007/papers/1194.pdf. Accessed December 11, 2019.
13. Eva KW, Rosenfeld J, Reiter HI, Norman GR. An admissions OSCE: The multiple mini-interview. Med Educ. 2004;38:314–326.
14. McGill R, Tukey JW, Larsen WA. Variations of box plots. Am Stat. 1978;32:12–16.
Copyright © 2020 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the Association of American Medical Colleges.