Item analysis is an important quality-monitoring strategy for written exams. However, authors urge caution because item statistics may be unstable with small cohorts, making the application of item removal guidelines potentially detrimental. Given the small cohorts common in health professions education, this study aimed to determine the impact of cohort size on outcomes arising from the application of item analysis guidelines.
The authors performed a Monte Carlo simulation study in fall 2015 to examine the impact of applying 2 commonly used item analysis guidelines on the proportion of items removed and on overall exam reliability as a function of cohort size. Three variables were manipulated: cohort size (6 levels), exam length (6 levels), and exam difficulty (3 levels). Study parameters were chosen based on data provided by several Canadian medical schools.
The proportion of items removed increased as exam difficulty decreased and as cohort size decreased; exam length had no effect on this outcome. After the item analysis guidelines were applied, exam length had a greater impact on exam reliability than did cohort size: that is, exam reliability decreased more with shorter exams than with smaller cohorts.
Although program directors and assessment creators have little control over their cohort sizes, they can control the length of their exams. Longer exams permit items to be removed with less damage to overall reliability than shorter exams, thereby mitigating the negative impact of small cohorts when item removal guidelines are applied.
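The length–reliability tradeoff underlying this conclusion can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric result (the formula and the numbers below are illustrative, not drawn from the study): removing the same number of items costs a long exam less reliability than a short one.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing exam length by length_factor
    (new length / old length), per the Spearman-Brown prophecy formula."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Illustrative numbers (assumed, not from the study): both exams start at
# reliability 0.80 and lose 10 items to removal guidelines.
long_exam = spearman_brown(0.80, 90 / 100)   # 100-item exam shortened to 90 items
short_exam = spearman_brown(0.80, 20 / 30)   # 30-item exam shortened to 20 items
print(round(long_exam, 2), round(short_exam, 2))  # → 0.78 0.73
```

The longer exam retains most of its reliability after removal, while the shorter exam loses noticeably more, which is the mechanism behind the recommendation to build longer exams when cohorts are small.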