The maintenance of physician competence represents a challenge for the profession. As a part of its overall assessment and enhancement strategy, the medical licensing authority in Ontario, the College of Physicians and Surgeons of Ontario (CPSO), sponsors an intensive tertiary physician competency assessment program, the Physician Review Program (PREP), at McMaster University. About thirty of the province’s 25,000 active physicians are assessed each year by PREP, referred by statutory committees within the CPSO because of competency concerns. Of these, some are found to be moderately to severely incompetent. Attempts to improve the performance of such physicians through extensive remedial medical education are sometimes unsuccessful.1 Several possible explanations exist for this futility, including cognitive dysfunction and serious mood disturbance.
To better understand the importance of these factors, we undertook a standardized cognitive screening of all 27 PREP participants in a 12-month period and reported these results in 2000.2* We found that 6 of the 19 physicians (about one third) who performed poorly at PREP had moderate or severe neuropsychological disturbance sufficient to explain their poor performance. Serious mood disturbance was identified in a small minority.
Since the results were of interest yet the numbers were small, the CPSO sponsored an additional 20 PREP assessments with cognitive screening in 2000–01. This has allowed us to extend our initial observations to 47 assessments of 45 physicians (two physicians were reassessed within the study). Some of these additional physicians had previously been assessed at PREP, and some physicians from the initial cohort by now have had repeat PREP assessment. (Reassessment may be requested by the CPSO to determine whether deficiencies identified at PREP have been rectified.) Consequently, we have been able to relate the outcome of PREP reassessment to neuropsychological function in 18 of the 45 physicians studied.
The PREP program has previously been described.3,4 In brief, the PREP assessment is an intensive, full-day process, conducted by trained peer assessors, using multiple tools (presently, multiple-choice questions, simulated patients, and chart-stimulated recall). Each physician is assigned to one of five categories by the chief assessor (JC) based on the results of all tests. Categories 1 and 2 consist of physicians with no or minor deficiencies, and categories 3 and 4 denote those with moderate to major deficiencies and thereby serious concerns with competence. Previously, category 5 consisted of physicians deemed unsafe to practice without direct supervision, and category 6 of physicians deemed unsafe to practice even with direct supervision. We now report both as category 5.
The cognitive test battery and interpretive process have also been described.2 In brief, one or more tests were chosen to assess each of 5 broad cognitive domains felt important for the practice of medicine (see Table 1). Raw scores for each test were converted to z scores using normative data from a reference population matched for education and age5 (see below). (The z score equates to the distance in standard deviations from the mean in a standardized normal distribution.) A score for each domain was then derived from the contributing z scores and, finally, each physician was assigned a global summary score, reflecting the likelihood in the professional opinion of our neuropsychologists that the physician’s cognitive difficulty would impair performance. This score ranged from 0 to 4, indicating no, minimal, mild, moderate, or severe difficulty respectively. A higher score could be assigned on the basis of a large deficit in a single domain (for example, learning and memory), or lesser deficits across several domains.
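The z score conversion described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring procedure used in the study. The test names and the normative means and standard deviations are hypothetical, and averaging the z scores into a domain score is an assumed rule, since the text says only that each domain score was derived from the contributing z scores. The global 0 to 4 rating was a professional judgment and is shown only as a label lookup.

```python
from statistics import mean

# Labels for the global summary score, as described in the text.
GLOBAL_LABELS = {0: "no", 1: "minimal", 2: "mild", 3: "moderate", 4: "severe"}


def z_score(raw, norm_mean, norm_sd):
    """Distance of a raw score from the reference mean, in reference SDs."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw - norm_mean) / norm_sd


# Hypothetical normative (mean, SD) pairs for two tests in one domain.
# These values are illustrative only and are not taken from the study.
HYPOTHETICAL_NORMS = {
    "test_a": (50.0, 10.0),
    "test_b": (30.0, 5.0),
}

# Hypothetical raw scores for one physician.
raw_scores = {"test_a": 35.0, "test_b": 27.5}

z_by_test = {
    name: z_score(raw, *HYPOTHETICAL_NORMS[name])
    for name, raw in raw_scores.items()
}

# Assumption: the domain score is the plain mean of the contributing z scores.
domain_z = mean(z_by_test.values())

print(z_by_test)
print(f"domain z = {domain_z:.2f}")
```

Swapping the age-matched norms for mid-adult norms changes only the `(mean, SD)` pairs passed to `z_score`, which is the substance of the age-adjusted versus age-independent comparison described in the next paragraph.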
To maintain consistency with our initial report, we derived the z scores for each test using age-matched reference data. Performance in most tests drops with age, and age-adjustment lessens the apparent difficulty. Age-adjustment is standard practice, and has a certain utility from a formative viewpoint. However, if public protection is the objective, correcting for age is less appropriate. (That is, we wish physicians to be competent, not just competent for their age.) To explore the impact of age-adjustment on our results, we rederived an “age-independent” score for all 45 physicians, using normative data from a mid-adult reference population, matched for education but aged 35–40. (The reference source of the normative data for each test is indicated in Table 1.)
We assessed mood using the Profile of Mood States (POMS). The POMS score was not factored into the global score, and was analyzed separately. All tests were administered at the end of the regular PREP testing by an experienced neuropsychology research associate, and were interpreted by a clinical neuropsychologist unaware of the physician assessment outcome or the reason for the physician’s referral to PREP. The reliability of the ratings is good, and was addressed in our earlier report.2
We analyzed the results using 2 × 2 tables, and used correlation and multiple linear regression to refine the analysis. Covariates were age in years, mother tongue (English, 1; Other, 0), scores for the 5 neuropsychological domains, mood score, age-adjusted global assessment, age-independent global assessment and PREP category all as described above, and success or failure at the PREP reassessment.
In total, there were 47 assessments of 45 physicians, since two physicians were reassessed within the study. These two physicians had near-identical results on reassessment, so we report the results in terms of the 45 physicians.
For analysis, the PREP scores reasonably divide into satisfactory (categories 1 and 2) and unsatisfactory (categories 3, 4, and 5). Fourteen of the 45 physicians (31%) achieved satisfactory results at PREP and 31 (69%) had unsatisfactory results. Of the 14 physicians scoring well, 12 had no, minimal, or mild cognitive impairment using age-adjusted reference norms (Table 2, top panel). Of the other two, one was a physician who did not cooperate fully with the neuropsychological assessment (the only one we were aware of), and the other had a moderate and presumably well-compensated deficit. Of the 31 physicians scoring poorly at PREP, 12 (38%) had moderate (7 physicians) or severe (5 physicians) cognitive impairment, likely sufficient to explain their poor performance.
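The age-adjusted counts in this paragraph can be laid out as a 2 × 2 table. The sketch below is illustrative only and uses the numbers stated in the text. It assumes that both of the remaining two physicians in the satisfactory group, including the one who did not cooperate fully, fall in the moderate-or-severe column. The count of 19 is derived as 31 minus 12.

```python
# Rows: PREP outcome. Columns: age-adjusted global cognitive rating.
table = {
    "satisfactory (categories 1-2)": {
        "none_to_mild": 12,
        # Assumption: both remaining physicians are counted here.
        "moderate_or_severe": 2,
    },
    "unsatisfactory (categories 3-5)": {
        "none_to_mild": 31 - 12,  # derived from the totals in the text
        "moderate_or_severe": 12,  # 7 moderate + 5 severe
    },
}

row_totals = {row: sum(cells.values()) for row, cells in table.items()}
grand_total = sum(row_totals.values())

# Share of each PREP group with moderate or severe impairment.
impaired_share = {
    row: cells["moderate_or_severe"] / row_totals[row]
    for row, cells in table.items()
}

for row in table:
    print(f"{row}: n = {row_totals[row]}, "
          f"moderate or severe = {impaired_share[row]:.1%}")
print(f"total physicians = {grand_total}")
```

Evaluated to one decimal place, 12/31 is 38.7%, which the text gives as 38%.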
The results using the corresponding age-independent neuropsychological scores are qualitatively similar, although as expected a higher percentage of physicians were judged to have moderate to severe cognitive difficulty; consequently a higher percentage performing poorly at PREP were judged to have moderate to severe impairment (17/31, 55%; see Table 2, bottom panel).
The best predictor of the PREP score was the age-independent global rating (Pearson correlation = 0.57), while the age-adjusted global rating was less predictive (0.39). Other correlates included age (0.41), and English as mother tongue (−0.35). A multiple linear regression using only the age-independent global score, English as mother tongue, and age as covariates accounted for 42% of the total variance in the PREP score (r = .65), and all predictors were significant.
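The link between the reported correlations and the "42% of the total variance" figure is the usual one: the squared correlation is the proportion of variance accounted for. The sketch below only re-expresses the coefficients quoted in the text. The variable names are ours, and no new analysis of the study data is implied.

```python
# Simple (Pearson) correlations with the PREP score, as quoted in the text.
correlations = {
    "age-independent global rating": 0.57,
    "age-adjusted global rating": 0.39,
    "age": 0.41,
    "English as mother tongue": -0.35,
}

# Squared correlation = share of PREP-score variance each predictor
# would account for on its own in a simple linear regression.
shared_variance = {name: r ** 2 for name, r in correlations.items()}

# Multiple correlation reported for the three-covariate regression.
multiple_r = 0.65
variance_explained = multiple_r ** 2

for name, share in shared_variance.items():
    print(f"{name}: r = {correlations[name]:+.2f}, r^2 = {share:.0%}")
print(f"regression: R = {multiple_r:.2f}, R^2 = {variance_explained:.0%}")
```

On these figures the age-independent rating alone accounts for roughly a third of the variance, about twice the share of the age-adjusted rating.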
All individual neuropsychological domains contributed significantly to the age-adjusted global rating, although some did so slightly more than others: memory (0.75), attention and tracking (0.72), verbal fluency (0.64), visual-spatial problem solving (0.51), and verbal problem solving (0.44). A qualitatively similar contribution was seen for the age-independent global rating: memory (0.65), attention and tracking (0.64), verbal fluency (0.54), visual-spatial problem solving (0.57), and verbal problem solving (0.35).
Eighteen of the 45 physicians for whom we had neuropsychological testing were reassessed at PREP because of poor initial performance. The time between PREP assessments varied from 1 to 3 years.
Six physicians improved significantly at reassessment (achieved PREP category 1 or 2, or improved by 3 or more categories); in all six there was no or minimal neuropsychological impairment using age-adjusted reference norms (Table 3, top panel). Two of the six were judged to have moderate neuropsychological impairment if the more demanding age-independent norms were used (Table 3, bottom panel). Twelve physicians remained unsatisfactory at retesting. Of these, five had evidence of moderate or severe neuropsychological dysfunction that could explain their inability to improve, and this number rose to 9 (75%) when age-independent scoring was used.
Using multiple regression, the strongest predictor of successful remediation was the age-independent global rating (−0.44), which was of marginal significance (p = .065). There was a low (−0.17) and nonsignificant relation to age.
While mood disturbance did not correlate with either the age-independent global score or the PREP category, moderate or severe mood disturbance (POMS rating 3 or 4) was identified in 6 of 45 physicians. In two physicians, mood disturbance was associated with poor performance at PREP and with significant cognitive difficulty. In three, mood disturbance was associated with poor PREP performance but had a minor effect only on neuropsychological testing, and in one, severe mood disturbance was an isolated finding, not associated with impairment of neuropsychological screening or of PREP results.
The estimation of the prevalence and severity of neuropsychological difficulty in the physician population we studied is prone to several errors of measurement and interpretation, as set out in our first report. These include the underestimation of difficulty that might have become more apparent with extensive neuropsychological testing, the overestimation of difficulty because the testing occurred at the end of a busy and stressful day, and the subjective component inherent in the interpretative process. With these same caveats, we can confirm and extend the findings of our previous study. Overall, a significant minority (12/31; 38%) of physicians found deficient in a standardized test of competency were found to have moderate or severe impairment in age-adjusted neuropsychological testing sufficient to explain their poor results. This is a sizeable number, and the common attribution of physician incompetence to laziness or “failure to keep up” would seem overly simplistic in many cases.
A higher number, indeed a small majority (17/31; 55%), showed difficulty if the results are referenced to an education-matched, mid-adult-age normative population. This is so because neuropsychological performance drops with age, and no physician tested was below the chosen reference range age. Older physicians are overrepresented in our sample because difficulties with physician performance, and hence referrals from the licensing authority, increase with age. Age-adjustment of neuropsychological results, even though standard practice, will underestimate the magnitude of any underlying neuropsychological difficulty in absolute terms, and may be less relevant from a quality assurance viewpoint. In the present study, age-independent global scoring correlated far better with the PREP rating, and with improvement at PREP retesting, than did age-adjusted scores. In Ontario, over 1,000 physicians are still in practice above age 70 (about 5% of all practicing physicians in Ontario).
It is important to emphasize that these results will have limited generalizability, and may not be applicable to self-referred physicians seeking assessment of their competence for educational direction, since all physicians in the present study were referred by statutory committees within the CPSO because of preexisting concerns with clinical performance. Many elderly physicians continue to function at a high level, with little cognitive impairment and with a wealth of practice experience. In a computerized cognitive assessment, 91% of 356 physicians over 65, living independently in the community, had scores consistent with normal cognitive aging.6 Indeed, in our study, age was not identified as a major impediment to remediation, and three of the six physicians who were successful at retest were 55 or older. Given the present shortage of physicians world-wide, the development of alternative approaches to the maintenance of competence in the elderly physician should be an area of interest.
With respect to remediation, it does appear that neuropsychological testing can help identify physicians who are less apt to improve on a PREP retest. Although the numbers are small, no physician in our study with moderate or severe global difficulty for age was successful at retest. We should emphasize that there was no uniform attempt at remediation between test and retest; this was left to the discretion of the CPSO and the individual physician. Nonetheless, since licensure is in the balance, motivation should be high. Clearly, the ability to predict educational futility would be important. For younger physicians, the presence of unsatisfactory professional performance, and the inability to successfully remediate by virtue of neuropsychological dysfunction, should permit access to physician disability programs. For physicians past retirement age, several years of expensive and futile educational efforts could be spared.
In summary, we have found neuropsychological testing to be useful in the assessment of physician competence. Our study indicates that such testing is a strong predictor of performance in a standardized competency assessment test, especially when results are referenced to a mid-adult population, and also is a strong predictor of remediation. Thus, it seems reasonable to include cognitive screening as an element of intensive assessment of competence, either within the assessment itself or afterwards, as a prelude to educational interventions for those physicians found deficient.
The authors acknowledge the input and assistance of Dr. Daniel Klass and Dr. William McCauley at the CPSO.
1 Hanna E, Premi J, Turnbull J. Results of remedial CME in dyscompetent physicians. Acad Med. 2000;75:174–76.
2 Turnbull J, Carbotte R, Hanna E, Norman G, Cunnington J, Ferguson B, Kaigis T. Cognitive difficulty in physicians. Acad Med. 2000;75:177–81.
3 Norman GR, Davis DA, Lamb S, Hanna E, Caulford P, Kaigis T. Competency assessment of primary care physicians as part of a peer review program. JAMA. 1993;270:1046–51.
4 Cunnington JPW, Hanna E, Turnbull J, Kaigis TB, Norman GR. Defensible assessment of the competency of the practicing physician. Acad Med. 1997;72:9–12.
5 Mitrushina MN, Boone KB, D’Elia LF. Handbook of Normative Data for Neuropsychological Assessment. New York: Oxford University Press, 1999.
6 Powell DH, Whitla DK. Normal cognitive aging: towards empirical perspectives. Curr Dir Psychol Sci. 1994;3:27–31.
References found only in Table 1.
7 Heaton R, Grant I, Matthews C. Comprehensive Norms for an Expanded Halstead-Reitan Neuropsychological Battery: Demographic Corrections, Research Findings, and Clinical Applications. Odessa, Florida: Psychological Assessment Resources Inc, 1991.
8 Heaton R, Chelune GJ, Talley JL, Kay GG, Curtiss GC. Wisconsin Card Sorting Test Manual (WCST) - Revised and Expanded. Odessa, Florida: Psychological Assessment Resources, Inc., 1993.
9 Kolb B, Whishaw I. Fundamentals of Human Neuropsychology (3d ed). New York: WH Freeman, 1990.
10 Wechsler D. Wechsler Memory Scale-Revised. San Antonio, Texas: The Psychological Corporation, 1987.
11 Delis DC, Kramer JH, Kaplan E, Ober BA. California Verbal Learning Test Manual. San Antonio: Harcourt Brace Jovanovich, 1987.
12 Yeudall LT, Fromm D, Reddon JR, Stefanyk WO. Normative data stratified by age and sex for 12 neuropsychological tests. J Clin Psychol. 1986;42:918–46.
13 Denburg S, Carbotte RM, Denburg JA. Cognitive impairment in systemic lupus erythematosus: a neuropsychological study of individual and group deficits. J Clin Exp Neuropsychol. 1987;9:323–39.
14 Stuss DT, Stethem LL, Pelchat G. Three tests of attention and rapid information processing: an extension. Clin Neuropsychol. 1988;2:246–50.
*During the preparation of the present report, we became aware that a formatting error was introduced during the editing of Table 1 of our 2000 paper.2 Readers wishing a correct version of this table may contact the corresponding author.