Data from a sample of 304 specialty physicians (n = 101 psychiatrists, n = 100 pediatricians, n = 103 IM specialists), recruited by the CPSA-PAR Program, were used to assess the medical colleague instrument. All specialists in the study sample had been in practice a minimum of five years. Each participating specialist was responsible for identifying eight medical colleagues who could answer the questions on the survey on their behalf. Recognizing the diversity of practices, the colleagues could be any combination of peer, referring, or referral physicians.
To assess whether the peer instrument could provide a valid and reliable assessment of competencies across specialties, several statistical analyses were performed on the data for each specialty group. Internal consistency reliability was assessed using Cronbach's alpha coefficient, both for each physician group and for each scale/factor within each group. A generalizability analysis was conducted to determine the generalizability coefficient (Ep2). This analysis is needed to ensure that there are sufficient numbers of items and raters to reach a coefficient of at least .70.1,5,8 Too low an Ep2 suggests the need to modify the measurement procedure, as the analysis can identify the sources of measurement error as well as the numbers of items and observers required to obtain a desired level of generalizability.
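The two reliability statistics described above can be sketched with a minimal numpy implementation, assuming a complete (no missing data) score matrix; this is an illustration of the formulas, not the software actually used in the study:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def generalizability(scores):
    """Relative Ep2 for a fully crossed persons x raters design.

    scores: (n_persons x n_raters) matrix of ratings. Variance
    components are estimated from the usual ANOVA expected mean
    squares for a p x r design.
    """
    scores = np.asarray(scores, dtype=float)
    n_p, n_r = scores.shape
    grand = scores.mean()
    ms_p = n_r * ((scores.mean(axis=1) - grand) ** 2).sum() / (n_p - 1)
    resid = (scores - scores.mean(axis=1, keepdims=True)
                    - scores.mean(axis=0, keepdims=True) + grand)
    ms_e = (resid ** 2).sum() / ((n_p - 1) * (n_r - 1))
    var_p = max((ms_p - ms_e) / n_r, 0.0)        # person variance component
    return var_p / (var_p + ms_e / n_r)          # relative Ep2
```

For a decision study, the same variance components can be reused with a different number of raters n′, i.e., Ep2(n′) = var_p / (var_p + var_e / n′), to project how many raters are needed to reach a target coefficient such as .70.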
Factor analyses were done to identify the underlying common set of variables for each specialty. Exploratory factor analysis allows one to determine which items belong together (i.e., form a “factor”), the ordering of items within each factor, and the commonalities and differences among the three specialty groups. This type of analysis is used to investigate common but unobserved sources of influence in a collection of variables. Its empirical basis is the observation that variables from a carefully chosen domain are often intercorrelated. Thus, it is natural to hypothesize that the variables all reflect a more fundamental influence that contributes to individual differences on each measure.13,14 Factor analysis, in this study, was used to decompose the variability of items into two parts: one part attributable to a factor and shared with other items, and a second part that is specific to that item and unrelated to other factors.
Using individual-physician data as the unit of analysis, the 36 items were intercorrelated using Pearson product-moment correlations. The correlation matrix was then decomposed into principal components, which were subsequently rotated to the normalized varimax criterion. This principal component extraction with varimax rotation was employed to determine the factor structure of the instrument for each specialty and the appropriateness of the items for assessing those factors. Items were considered part of a factor if their primary loading was on that factor. The number of factors to extract was based partly on the Kaiser rule (i.e., eigenvalues > 1.0) and partly on results from our previous research.1,2 With three specialty groups, a comparison of factors by specialty group allowed us to identify group differences as well as similarities, even though the same instrument was used. Factor analysis thus allowed us to identify the factors (and their number) for each specialty group; describe the relative variance accounted for by each factor, their coherence, and their theoretical interpretability; compare factors across specialty groups; and determine whether the instrument was sensitive to differences and similarities in the practice of the different groups.
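As an illustration, the extraction-and-rotation procedure (principal components, Kaiser rule, varimax) can be sketched in a few lines of numpy. This is a generic raw varimax (without the Kaiser row normalization implied by "normalized varimax") and is not the statistical package used in the study:

```python
import numpy as np

def pca_loadings(X):
    """Principal-component loadings from raw data (observations x items),
    keeping components by the Kaiser rule (eigenvalue > 1.0)."""
    R = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]              # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    n_keep = int((eigvals > 1.0).sum())            # Kaiser rule
    loadings = eigvecs[:, :n_keep] * np.sqrt(eigvals[:n_keep])
    return loadings, eigvals

def varimax(L, max_iter=100, tol=1e-6):
    """Raw varimax rotation of a loading matrix via iterative SVD updates."""
    L = np.asarray(L, dtype=float)
    n, k = L.shape
    R, d = np.eye(k), 0.0
    for _ in range(max_iter):
        Lr = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / n))
        R, d_old, d = u @ vt, d, s.sum()
        if d_old != 0 and d / d_old < 1 + tol:     # criterion stopped improving
            break
    return L @ R
```

Because the rotation matrix is orthogonal, varimax redistributes variance across factors without changing each item's communality, which is why the rotated solution remains a faithful re-expression of the extracted components.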
The study received approval from the Conjoint Health Research Ethics Board of the University of Calgary.
A total of 2,306 peer surveys out of a possible 2,432 (94.8%) were completed for the 304 participants in the study. The 101 psychiatrists provided a mean of 7.56 surveys (range 5–8) for a total of 764 (94.6%) surveys. The 100 pediatricians provided a mean of 7.64 surveys (range 5–8) for a total of 764 (95.5%) surveys. The 103 IM specialists provided a mean of 7.55 surveys (range 6–8) for a total of 778 (94.4%) surveys.
The generalizability coefficient (Ep2) was calculated using the entire sample of physicians as well as for each specialty group. The two facets used in this analysis were raters and items on the instrument. The mean number of peer assessors across all physicians was 7.6 producing an Ep2 of .83 (for pediatricians Ep2 = .78; for psychiatrists Ep2 = .81; and for IM specialists Ep2 = .82).
The factor analysis for psychiatrists showed that four eigenvalues were greater than one (21.4, 1.5, 1.3, 1.2); this solution accounted for 66.86% of the variance. The varimax rotation converged in 12 iterations. Factor 1 (communication skills) accounted for 56.2% of the total variance, Factor 2 (patient management) for 3.9%, Factor 3 (clinical assessment) for 3.5%, and Factor 4 (professional development) for 3.2%. For pediatricians, four eigenvalues were greater than one (21.3, 1.6, 1.4, 1.3), accounting for 67.6% of the variance. The varimax rotation converged in eight iterations. Factor 1 (patient management) accounted for 56.0% of the total variance, Factor 2 (clinical assessment) for 4.3%, Factor 3 (professional development) for 3.8%, and Factor 4 (communication) for 3.5%. For IM specialists, four eigenvalues were greater than one (24.0, 1.6, 1.2, and 1.0), accounting for 73.4% of the variance. Factor 1 (patient management) accounted for 63.2% of the total variance, Factor 2 (clinical assessment) for 4.3%, Factor 3 (professional development) for 3.2%, and Factor 4 (communication) for 2.7%. See Table 1.
Cronbach's alpha coefficients were computed to assess reliability for each specialty and each factor. For psychiatrists, the Cronbach's alpha coefficient was .98 with an average standard error of measurement (SEM) of .17. For pediatricians, the Cronbach's alpha coefficient was .98 and the SEM was .18. For IM specialists, the Cronbach's alpha coefficient was .99 and the SEM was .09. For each specialty, the Cronbach's alpha for each factor was >.90 (see Table 1).
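The reported SEMs follow from the classical relation SEM = SD × √(1 − α). A minimal sketch (the SD value below is illustrative only and is not reported in the study):

```python
import math

def sem(sd, alpha):
    """Classical standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - alpha)

# Illustrative only: with an assumed score SD of 1.2 and alpha = .98,
# the SEM is about .17, the order of magnitude reported above.
print(round(sem(1.2, 0.98), 2))  # prints 0.17
```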
The items that aligned with each of the factors were similar across all specialty groups. Patient management encompassed those aspects of specialist practice in which the physician provides consulting expertise for the patient's care, manages resources, and coordinates overall care. Clinical assessment encompassed items like selecting and using diagnostic information in treatment choice. The professional development factor included involvement with professional development, contributing to quality improvement programs, and facilitating learning for others. The communication factor included verbal communication with medical colleagues, other health professionals, and patients.
The response rate was extremely high and consistent across all three groups, due in large part to the fact that the CPSA-PAR program in Alberta is mandatory. The Cronbach's analyses for all groups provide evidence of high internal consistency reliability. The average standard error of measurement ranged from .09 to .18, showing that the instruments had good distributional and psychometric properties for item discrimination (i.e., discerning between different physicians). These results were consistent with our previous studies of family physicians and surgeons.1,2 We achieved an Ep2 > .80 with between seven and eight raters per physician. This confirms the appropriateness of using fewer raters but more items on the instrument and contrasts with the instruments used by the American Board of Internal Medicine, which use fewer items but require more raters.4–6 In Alberta, the CPSA's goal in assessing physicians was to provide quality improvement data on a full range of competencies.1 In a province with a limited number of physicians, there are advantages to reducing the number of times each physician may be called upon to assess his or her colleagues while maximizing the practice information (i.e., items of data) provided to each physician who is assessed.
The factor analyses revealed the same four factors for all three physician-specialty groups: patient management, clinical assessment, professional development, and communication. While these groupings are similar for all three groups, they reveal some important differences. The pattern of importance of the factors (amount of variance accounted for and cohesiveness) was the same for pediatricians and IM specialists but different for psychiatrists. For IM specialists and pediatricians, patient management was the most important factor, whereas for psychiatrists communication was the most important factor, followed by patient management. These differences likely reflect the differences in practice among the three physician groups. Effective communication is generally considered to be the foundation of psychiatry15–17 and its “daily business.” Psychiatry values communication skills above all else, as these are central to diagnosis and assessment as well as therapy.17 A psychiatrist's ability to communicate will determine his or her effectiveness in eliciting information and in establishing and carrying out patient care interventions related to the patient's mental health problems over relatively long periods. Conversely, the nature of Canadian IM and pediatric referral practice is often based on managing urgent requests from physician colleagues, in which the specialist draws on his or her diagnostic skills and information to establish a protocol for the patient, stabilize the patient, and return the patient to the family physician for ongoing care. These findings related to similarities and differences in factors provide both convergent and divergent evidence for the validity of peer assessments in the present study.
In conclusion, our results indicate that the instrument is appropriate for use across the three specialties. The tools are sensitive to the differences inherent in the practice of psychiatry and emphasize the key components of the specialized consultative nature of IM, pediatrics, and psychiatry. The psychometric quality of the instruments is high. Furthermore, while these tools were developed for use in a Canadian context, they may be useful in other jurisdictions. The family physician instruments have been used in a pilot project in Nova Scotia.7 Medical organizations in Germany, New Zealand, Malaysia, France, and California have all indicated their desire or intention to use the instruments (personal communication) developed by the CPSA-PAR program.12
Funding for the study was provided by the College of Physicians and Surgeons of Alberta. Data collection was provided by Customer Information Services, Edmonton. Special thanks to Robert Burns, John Swiniarski, and Bryan Ward at the CPSA for allowing us to continue to be part of this work. At the University of Calgary, we thank our colleagues Herta Fidler, John Toews, Ray Lewkonia, and Keith Brownell for their ongoing interest and review of documents from our multisource (360-degree) studies.
1. Hall W, Violato C, Lewkonia R, et al. Assessment of physician performance in Alberta: the Physician Achievement Review. CMAJ. 1999;161:52–7.
2. Violato C, Lockyer J, Fidler H. Multisource feedback: a method of assessing surgical practice. BMJ. 2003;546–8.
3. Norcini JJ. Peer assessment of competence. Med Educ. 2003;37:539–43.
4. Ramsey PG, Carline JD, Blank LL, Wenrich MD. Feasibility of hospital-based use of peer ratings to evaluate the performances of practicing physicians. Acad Med. 1996;71:364–70.
5. Ramsey PG, Wenrich MD, Carline JD, Inui TS, Larson EB, LoGerfo JP. Use of peer ratings to evaluate physician performance. JAMA. 1993;269:1655–60.
6. Lipner RS, Blank LL, Leas BF, Fortna GS. The value of patient and peer ratings in recertification. Acad Med. 2002;77(suppl 10):S64–6.
7. Sargeant JM, Mann KV, Ferrier SN, Langille DB, Muirhead PD, Sinclair DE. Responses of rural family physicians and their colleagues and coworker raters to a multi-source feedback process: a pilot study. Acad Med. 2003;78(suppl 10):S42–4.
8. Lockyer J. Multisource feedback in the assessment of physician competencies. J Contin Educ Health Prof. 2003;23:4–12.
9. Epstein RM, Hundert EM. Defining and assessing professional competence. JAMA. 2002;287:226–35.
10. Levine AM. Medical professionalism in the new millennium: a physician charter. Ann Intern Med. 2002;136:243–6.
11. Fidler H, Toews J, Lockyer J, Violato C. Changing physicians' practices: The effect of individual feedback. Acad Med. 1999;74:702–14.
12. College of Physicians and Surgeons of Alberta. Physician Achievement Program 〈http://www.par-program.org/〉. Accessed 20 January 2004.
13. Russell DW. In search of underlying dimensions: the use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Pers Soc Psychol Bull. 2002;28:1629–46.
14. Cudeck R. Exploratory factor analysis. In: Tinsley HEA, Brown SD (eds), Handbook of Multivariate Statistics and Mathematical Modeling. San Diego: Academic Press, 2000:95–124.
15. Schreiber SC, Kramer TA, Adamowski SE. The implications of core competencies for psychiatric education and practice in the US. Can J Psychiatry. 2003;48:215–21.
16. Tuhan J. Mastering CanMEDS roles in psychiatric residency: a resident's perspective. Can J Psychiatry. 2003;222–4.
17. Martin L, Saperson K, Maddington B. Residency training: challenges and opportunities in preparing trainees for the 21st century. Can J Psychiatry. 2003;48:225–30.
© 2004 Association of American Medical Colleges