To examine whether residents in either group were changing their scores significantly in the direction of the experts' scores, we compared the pre- and post-video self-assessments for residents in both groups, using a two-tailed t test. Examining the raw scores alone, the changes were not significant (p = .26 for the low group, p = .19 for the high group). However, using the z scores rescaled relative to the videos, the change in self-assessment toward the experts' scores was only marginally significant for the low-performing group (.16 ± .28, p = .074) but highly significant for the high-performing group (.37 ± .24, p = .002). The difference in the amount of correction between the two groups (.16 versus .37) was marginally significant (p = .06).
In fact, using the z score method, the difference in the patterns of correction between the two groups became quite evident. For the top-rated group, every single resident after viewing the videos demonstrated an appropriate but somewhat conservative correction in self-assessment scores toward the scores given by the experts. By contrast, the low-performing group demonstrated a variety of patterns after viewing the videos. Although three residents showed an appropriate shift toward the experts' scores, two residents adjusted too radically and began to underestimate their scores relative to the experts. One resident showed no change in self-assessment score at all, and two residents actually rated themselves higher after seeing the videos, despite the fact that they were clearly overestimating their performances prior to seeing the videos.
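As a rough illustration of the pre- and post-video comparison reported above, the following Python sketch computes a t statistic on each resident's change in distance from the experts' score. It assumes a paired design, which the text does not state explicitly; the function name `correction_t` and the numbers are hypothetical and are not study data.

```python
import math
import statistics


def correction_t(pre_gap, post_gap):
    """Paired t statistic for the change in distance from the experts' score.

    pre_gap / post_gap: one value per resident, giving the distance between
    the resident's self-assessment and the experts' score before and after
    the benchmark videos (assumed layout, for illustration only).
    Returns (mean correction, SD of corrections, t statistic, degrees of freedom).
    """
    if len(pre_gap) != len(post_gap) or len(pre_gap) < 2:
        raise ValueError("need two equal-length samples with at least 2 residents")
    # Positive values mean the self-assessment moved toward the experts' score.
    corrections = [pre - post for pre, post in zip(pre_gap, post_gap)]
    mean_c = statistics.mean(corrections)
    sd_c = statistics.stdev(corrections)  # sample SD (n - 1 denominator)
    if sd_c == 0:
        raise ValueError("all corrections are identical; t statistic undefined")
    t = mean_c / (sd_c / math.sqrt(len(corrections)))
    return mean_c, sd_c, t, len(corrections) - 1


# Made-up gaps for three hypothetical residents.
mean_c, sd_c, t, df = correction_t([1.0, 2.0, 3.0], [0.5, 1.0, 2.5])
print(f"correction = {mean_c:.2f} ± {sd_c:.2f}, t({df}) = {t:.2f}")
```

A two-tailed p value would then be read from the t distribution with the returned degrees of freedom; `scipy.stats.ttest_rel` performs the same paired test and reports the p value directly.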
Clearly this study was limited in scope, deriving results from a small sample of physicians from one specialty (family practice), in a specific domain (interviewing skills) in one setting. Thus, it is important to note that our findings may not generalize to physicians at other levels of experience or in other specialties. Furthermore, we used only one standardized patient case, and it is possible that this particular case might have had elements that made errors in self-assessment more likely. However, the results of this study do replicate the pattern that Kruger and Dunning found across a wide range of subjects in a wide range of tasks of knowledge and skills outside of medicine. While studies of self-assessment should be replicated in different contexts in medicine, even these preliminary results raise interesting questions for further research and for medical education.
We found that those residents in the highest performance group were able to recalibrate their self-assessments more accurately when presented with benchmark videos. Some residents in the lowest performance group were also able to do this, although not as consistently, and two of eight individuals actually worsened their already inflated self-assessments.
From a technical point of view, we feel this work might make an important contribution to the evolving area of self-assessment research. To date, almost all reports of accuracy of self-assessment have relied on comparisons of the raw scores of a group of subjects on a measure of performance, compared with scores of expert(s).3 In this study, examining raw scores alone did not illustrate the problems of self-assessment shown in previous studies because different residents were using the measurement scale in different ways.2 It was only by recalibrating their self-assessments relative to the way that they rated others that it was possible to see that the highest performers were gaining accuracy while the lowest performers were not. We would suggest that future studies of self-assessment consider using a similar method.
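The recalibration idea can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: it assumes that each rating is converted to a z score using the mean and standard deviation of that same rater's scores for the benchmark videos, and that the experts' scores are rescaled in the same way. The function names and all numbers are hypothetical.

```python
import statistics


def rescale(score, video_ratings):
    """Express a score as a z score relative to the same rater's video ratings.

    Assumption: the rater's use of the scale is summarized by the mean and
    sample SD of the scores that rater gave to the benchmark videos.
    """
    mean = statistics.mean(video_ratings)
    sd = statistics.stdev(video_ratings)
    if sd == 0:
        raise ValueError("rater gave every video the same score; z score undefined")
    return (score - mean) / sd


def gap_to_expert(self_score, resident_videos, expert_score, expert_videos):
    """Rescaled self-assessment minus rescaled expert score.

    Positive values indicate overestimation relative to the expert.
    """
    return rescale(self_score, resident_videos) - rescale(expert_score, expert_videos)


# Hypothetical resident and expert who use the rating scale differently.
resident_videos = [4, 5, 6]  # resident's ratings of the benchmark videos
expert_videos = [2, 4, 6]    # expert's ratings of the same videos

pre = gap_to_expert(6.0, resident_videos, 4.0, expert_videos)
post = gap_to_expert(5.5, resident_videos, 4.0, expert_videos)
print(f"gap before videos: {pre:.2f}, after: {post:.2f}, correction: {pre - post:.2f}")
```

Because each score is expressed relative to how its rater scored the same videos, two raters who use different parts of the scale can still be compared, which raw scores do not allow.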
Turning to the educational implications, this work begins to reinforce a growing concern expressed by educators in health professional education.3 Self-directed learning is based on the assumption that adult learners can identify and remedy deficits in their knowledge and skills. This is particularly important for self-regulating professions such as medicine, where continuing education is left entirely in the hands of individual professionals. It is only through accurate self-assessment that physicians can identify areas in which they are deficient in order to pursue further learning. But what if some physicians in some circumstances cannot accurately self-assess their skills? What if this inaccuracy persists, even when they are exposed to performances of peers? How will these learners overcome incompetence if left to direct their own learning?
Fortunately, improvements in self-assessment did occur in most of Kruger and Dunning's subjects and in our own. Indeed, we have confirmed that exposure to benchmark performances can lead to better self-assessments. However, that change in self-assessment was small and only marginally significant for the lowest group of residents, who were presumably at the greatest risk of incompetence.
Jeremy Taylor said: “It is impossible to make people understand their ignorance, for it requires knowledge to perceive it; and therefore, he that can perceive it hath it not.” What are we to do about the bottom quartile or tertile? While more studies are needed to replicate and extend our findings, we feel that these results might suggest roles for any or all of:
* selection tests of self-assessment ability prior to medical training;
* teaching/testing self-assessment ability during medical school and residency;
* modeling of self-assessment and self-directed learning by teachers; and
* introduction of continuing education principles, including the development of self-assessment skills during undergraduate and postgraduate education.
1. Kruger J, Dunning D. Unskilled and unaware of it: How difficulties in recognizing one's own incompetence lead to inflated self-assessments. J Pers Soc Psychol. 1999;77:1121–34.
2. Martin D, Regehr G, Hodges B. Improving self-assessment of family practice residents using benchmark videos. Acad Med. 1999;73:1201–6.
3. Gordon MJ. A review of the validity and accuracy of self-assessments in health professional training. Acad Med. 1991;66:762–9.
Research in Medical Education: Proceedings of the Fortieth Annual Conference. November 4–7, 2001. © 2001 Association of American Medical Colleges