The results are disappointing and provide little support for the validity of case-passing decisions based on this simple approach to scoring and standard setting. On average, the case-passing decisions predicted what the case author intended for only about 73% or 74% of the students and, with agreement expected by chance removed, for only about 25% of the students. Even with the use of the optimal pass/fail cutoffs and the dropping of students with ambiguous borderline global ratings, the case-passing decisions failed to agree with the case authors' global ratings for 15% to 30% of the students.

The findings might be dismissed as simply due to low reliabilities of passing decisions and global ratings based on a single case. Although this concern would apply to intercase reliabilities, which would be subject to case specificity, the appropriate reliabilities here would seem to be intracase (i.e., intrarater), which should be fairly high (if they could be computed). In any case, it seems reasonable to expect much better agreement between the scoring and standard setting that the case author developed for a case and that same author's global ratings of performance on the case, given that the case author might recall the checklist, assign a weight to each item, and so forth. Case-passing decisions might also agree more closely with global ratings of live or videotaped performances than with ratings of written summaries of performance; that question remains a challenge for further research.

In conclusion, the study provides only weak evidence, at best, for the validity of the scoring and standard setting commonly used with SP assessment. The results do not undermine claims about the realism of the SP approach, however, nor do they call into question the standardization afforded by this method of assessing clinical competence.
The results do raise serious concerns about this simple approach to scoring and standard setting for SP-based assessments and suggest that we should focus more on the observation and evaluation of actual student performance on SP cases in the development of valid scoring and standard setting.
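
To make the distinction between raw agreement and agreement with chance removed concrete, the following is a minimal sketch of how the two figures can differ so sharply. It assumes the chance correction is Cohen's kappa, which the text above does not name, and it uses a hypothetical 2 x 2 table of 100 students whose counts were chosen only to give figures of roughly the reported size; they are not the study's data. The function name `agreement_stats` and its parameters are likewise illustrative.

```python
def agreement_stats(both_pass, decision_only_pass, rating_only_pass, both_fail):
    """Return (observed agreement, chance agreement, Cohen's kappa) for a 2x2 table.

    Rows are case-passing decisions and columns are the case author's
    global ratings, each dichotomized as pass/fail.
    """
    n = both_pass + decision_only_pass + rating_only_pass + both_fail

    # Observed agreement: proportion of students on the diagonal.
    p_observed = (both_pass + both_fail) / n

    # Marginal pass rates for each of the two classifications.
    decision_pass = (both_pass + decision_only_pass) / n
    rating_pass = (both_pass + rating_only_pass) / n

    # Agreement expected if the two classifications were independent.
    p_chance = decision_pass * rating_pass + (1 - decision_pass) * (1 - rating_pass)

    # Kappa: share of the possible beyond-chance agreement actually achieved.
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, p_chance, kappa


# Hypothetical counts (an assumption, not taken from the study).
p_o, p_e, kappa = agreement_stats(
    both_pass=65, decision_only_pass=13, rating_only_pass=13, both_fail=9
)
print(f"observed agreement: {p_o:.2f}")
print(f"chance agreement:   {p_e:.2f}")
print(f"kappa:              {kappa:.2f}")
```

With these illustrative counts, observed agreement is 0.74, but because most students pass under both classifications, about 0.66 agreement would be expected by chance alone, leaving a kappa of about 0.24. High pass rates can therefore make raw agreement look much stronger than the chance-corrected figure.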
Created Date: 22 February 1996; Completed Date: 22 February 1996; Revised Date: 18 December 2000
© 1996 Association of American Medical Colleges