The results show that the SP checklist scores and the SP ratings of interpersonal and communication skills have comparable psychometric properties. The reliabilities of the five-item rating form (.76) and the single global rating of patient satisfaction (.70) were slightly higher than the reliability of the 17-item checklist (.65). This finding is of particular significance because the checklist is much longer, and longer instruments would ordinarily be expected to yield more reliable scores. Also, the checklist scores and the ratings appear to be measuring the same underlying dimension: the checklist correlated .82 with the five ratings and .81 with the single global rating.
Van der Vleuten and associates, in two excellent articles, noted a recent shift away from subjective measures of clinical competence, such as rating scales, toward presumably more objective measures, such as SP checklists. Their concern was that these objective measures may focus on somewhat trivial and easily measured aspects of the clinical encounter, and that more subtle but critical factors in clinical performance may be overlooked or ignored. They referred to such measurement as “objectified” rather than objective. The shift rests on the presumption that objective or objectified measurement is superior to subjective measurement, such as ratings, with respect to psychometric properties such as reliability. On the basis of a survey of several studies, though, the authors concluded that “objectified methods do not inherently provide more reliable scores” and “may even provide unwanted outcomes, such as negative effects on study behavior and triviality of the content being measured.” The results of the present study support this conclusion, showing somewhat higher reliabilities for the subjective ratings than for the objective (or perhaps objectified) checklist. Also, the high uncorrected correlations suggest that the more reliable ratings are measuring the same underlying dimension as the checklist scores.
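The remark about checklist length can be made concrete with the Spearman-Brown prophecy formula, which projects the reliability of an instrument lengthened or shortened by a given factor. The sketch below is illustrative only: the study does not report such projections, the function name `spearman_brown` is hypothetical, and the formula assumes that added or removed items are parallel to the existing ones, which may not hold for these instruments.

```python
def spearman_brown(reliability, length_factor):
    """Project reliability after changing test length by length_factor.

    Assumes parallel items (an assumption, not something the study tests).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Reported reliabilities: five-item rating form .76, 17-item checklist .65.
# Hypothetical projection of the rating form lengthened to 17 items.
rating_at_17_items = spearman_brown(0.76, 17 / 5)

# Hypothetical projection of the checklist shortened to 5 items.
checklist_at_5_items = spearman_brown(0.65, 5 / 17)

print(f"Rating form at 17 items: {rating_at_17_items:.2f}")
print(f"Checklist at 5 items:    {checklist_at_5_items:.2f}")
```

Under these assumptions, equating the two instruments for length widens the gap between them rather than narrowing it, which is consistent with the point that the ratings' advantage is notable given the checklist's greater length.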
The present study also illustrates the application of a recently proposed method for constructing a valid SP checklist, one consisting of the items that best reflect global ratings of performance. In this study, the ratings were provided by the SPs themselves, but ratings could instead be obtained from faculty-physician experts who observe student performance on the SP case. Performance on individual checklist items would then be correlated with the expert ratings to identify the items that best predict the ratings. The checklist would be constructed of just those items, and it could be used for future testing without the need for further faculty ratings, yet its scores would still reflect the faculty ratings. With this approach, it would seem possible to construct checklists for history-taking and physical-examination skills, as well as for interpersonal and communication skills. Thus, the faculty ratings would provide a basis for case development and refinement, including scoring and standard setting, and scores on the checklist would serve as a proxy for the gold-standard faculty ratings.
In summary, the study suggests that SP ratings may be more efficient and more reliable than SP checklists for assessing interpersonal and communication skills. The study also demonstrates that global ratings by SPs (or by expert physician observers) can provide a basis for SP-test construction.
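The item-selection procedure described above (correlate each checklist item with the global ratings, then keep the best predictors) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the text does not specify the correlation statistic or the selection rule, so the Pearson correlation (equivalent to a point-biserial correlation for 0/1 items) and a fixed number of retained items are assumptions. The item names, the toy data, and the helper names `pearson` and `select_items` are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # An item that every student passes (or fails) cannot predict ratings.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_items(item_scores, ratings, n_keep):
    """Rank checklist items by correlation with ratings and keep the top n_keep.

    item_scores maps an item name to a list of 0/1 scores, one per student.
    ratings holds one global rating per student, in the same order.
    """
    correlations = {
        name: pearson(scores, ratings) for name, scores in item_scores.items()
    }
    ranked = sorted(correlations, key=correlations.get, reverse=True)
    return ranked[:n_keep], correlations


# Hypothetical toy data for six students (not taken from the study).
ratings = [1, 2, 3, 4, 5, 6]
item_scores = {
    "greets_patient": [1, 1, 1, 1, 1, 1],      # no variance across students
    "explains_plan": [0, 0, 0, 1, 1, 1],       # tracks the ratings closely
    "asks_open_question": [0, 1, 0, 1, 0, 1],  # tracks the ratings weakly
}

kept, correlations = select_items(item_scores, ratings, n_keep=1)
print(kept)
for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}")
```

In practice the number of items to retain, or a minimum correlation for retention, would be a judgment call for the test developers.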
Created Date: 22 February 1996; Completed Date: 22 February 1996; Revised Date: 18 December 2000
© 1996 Association of American Medical Colleges