Papadakis, Maxine A. MD; Hodgson, Carol S. PhD; Teherani, Arianne PhD; Kohatsu, Neal D. MD, MPH
The professional behavior of physicians and trainees has received increasing attention from medical school educators, the broader community of medicine, and society at large.1–4 The University of California, San Francisco (UCSF), School of Medicine has a professionalism evaluation system that monitors students’ professional behavior longitudinally across their four years of medical school.5,6 Begun in 1995, this system aims to identify medical students who demonstrate unprofessional behavior, to provide a uniform evaluation of and response to that behavior, and to remediate the deficiency. If a student receives a less-than-satisfactory rating on professional skills at the end of any course or clerkship, a Professionalism Evaluation Form is submitted. The student and the school then work to remediate the student's deficiencies. Deficiencies in professional skills identified in two or more clerkships (or two courses in the first two years and then a clerkship) are considered to reflect a pattern of deficiencies in professional behavior. At minimum, the dean's letter for application to residency programs will document these areas of concern or deficiencies. In addition, the student is placed on academic probation and, if the professional violations are severe, may be dismissed despite attaining passing grades in all courses and clerkships.
There are other professionalism assessment tools for medical students, but the adequacy of existing assessment tools is uncertain.7,8 For example, we do not know whether professionalism inadequacies in students affect their subsequent professional performance as physicians. We hypothesized that unprofessional behavior in medical school, rather than more traditional measures such as demographic characteristics and undergraduate and medical school measures of academic performance, predicts subsequent state board disciplinary action. To test this, we conducted a case–control study of UCSF School of Medicine graduates disciplined by the Medical Board of California.
Selection of Case Subjects
Approximately 6,330 medical students graduated from the UCSF, School of Medicine between 1943 and 1989. The 70 cases in this study were all the UCSF, School of Medicine graduates who were disciplined by the Medical Board of California from 1990 to 2000. They were identified through a search of the Medical Board of California's computerized database of disciplined physicians. Discipline, ranging from public reprimand to license revocation, is imposed by the Medical Board of California for violations defined in law.9 A single disciplinary action may be imposed for multiple violations of law. The discipline history of physicians licensed in California is public as mandated by state law.10,11
The Medical Board of California classified its reasons for discipline into nine major categories: negligence, inappropriate prescribing, unlicensed activity, sexual misconduct, mental illness, acts endangering patients through the physician's use of drugs or alcohol, fraud, conviction of a crime, and unprofessional conduct.11 For purposes of this analysis, the staff at the Medical Board of California identified the reference violation as that which represented the highest risk to the public and, thus, subject to the most severe level of discipline.
The American Board of Internal Medicine defines professionalism as requiring “the physician to serve the interests of the patient above his or her self-interest. Professionalism aspires to altruism, accountability, excellence, duty, service, honor, integrity and respect for others.”1 The medical director of the state medical board (NDK) used this definition to determine which of the state's working definitions of the nine categories for disciplinary action were violations of professionalism. He determined that all but one, mental illness, were violations of professionalism. From the perspective of the Medical Board of California, a physician disciplined for negligence should have known that the behavior in question could result in patient harm. For example, an anesthesiologist who chooses to ignore the repeated calls of the nursing staff to see a postoperative patient with a compromised airway is negligent. Such behavior differs from mental illness (e.g., an anesthesiologist with early dementia who has difficulty performing endotracheal intubations under usual circumstances).
Physicians with alcoholism or drug addiction who commit acts that endanger or injure patients are disciplined for those acts by the Medical Board of California. Physicians in this category were included as cases in our study. However, physicians with alcoholism or drug addiction who have not endangered or injured patients may be referred (or may self-refer) to the board's Diversion Program for monitored treatment and do not face board discipline. Physicians in this latter category were not included as cases in our study.
Selection of a Control Group
Members of the control group were UCSF, School of Medicine graduates chosen from a randomized sample, stratified by year of graduation (within one year) and medical specialty, from the Directory of Physicians in the United States.12 We confirmed that members of the control group had not been disciplined in another state by reviewing the Federation of State Medical Boards' database of disciplinary actions.
The UCSF, School of Medicine's Office of Student Affairs maintains academic files of graduates that contain the student's application to medical school, all course evaluation narratives, grades, administrative correspondence while in medical school, and the dean's letter of recommendation for residency programs. These files remain complete even for students who graduated decades ago. Records are not purged. A research assistant with previous experience writing dean's letters of recommendation to residency programs abstracted data from these records. All investigators involved in data abstraction remained blinded to whether the files were cases or controls until the completion of the data abstraction.
We abstracted demographic data, undergraduate grade point average (GPA), raw Medical College Admission Test (MCAT) scores, medical school course and clerkship grades, and raw National Board of Medical Examiners (NBME) Part 1 scores on the first attempt. Any negative excerpts about students’ professional and personal attributes (defined as one or more words describing less than satisfactory professional and personal attributes) were abstracted from course evaluations, including narratives, dean's letter of recommendation to residency programs, narratives in the students’ admission interviews, or any other document in the students’ files dated before the student graduated from medical school.
The negative excerpts were assigned a severity rank by the research assistant. Two deans of students (MAP and another dean), each with at least a ten-year history of writing and interpreting student evaluations, independently reviewed all negative excerpts and assigned severity rankings. The two deans of students could refer back to the academic file, if necessary, to contextualize the excerpts. When there was discordant ranking, the two deans discussed the rationale and agreed on the appropriate classification while still blinded to the status of the subjects.
The ranking system established thresholds for the severity of the negative excerpts based on those in our current UCSF, School of Medicine professional evaluation system established in 1995. The ranks were:
* Good = no adverse comments.
* Trace = Student had an occasional constructive or negative comment from an isolated instructor such as “immature,” but the composite evaluations and narratives from a course were good. The occasional constructive or negative comment was mild.
* Concern = Student had problematic comments in one course. These comments were qualitatively serious (beyond the occasional “immaturity” above), such as “resistant to accepting feedback,” “needs continuous reminders to fulfill ward responsibilities,” “unnecessary interruptions in class,” “inappropriate behavior in small groups both with peers and with faculty,” and would have warranted the submission of a Professionalism Evaluation Form now used in the UCSF, School of Medicine professionalism evaluation system.5,6
* Problem = Student had problematic comments in two or more courses at the level of Concern, demonstrating a longitudinal pattern of problematic professional behavior. These students would have received two or more Professionalism Evaluation Forms in the current UCSF, School of Medicine professionalism evaluation system.5,6
* Extreme = Student had extremely problematic comments, such as “dismissed from the PhD portion of an MD–PhD program because he could not work with peers.” Student received this rating based on the severity of the comment, even if it was made only once.
The distribution of specialties practiced by UCSF, School of Medicine graduates was taken from the Directory of Physicians in the United States.12
Undergraduate GPA was converted to a four-point scale (A = 4 points, D = 1 point) by adding one point to each grade when a three-point scale was used (A = 3 points, D = 0 points).
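As a minimal sketch of this rescaling, the conversion amounts to a one-point shift for grades recorded on the three-point scale. The function name `to_four_point` and the `scale_max` parameter are illustrative assumptions, not part of the study's procedure, and the example value is hypothetical.

```python
def to_four_point(grade_points: float, scale_max: int) -> float:
    """Return grade points on the four-point scale (A = 4, D = 1).

    scale_max is the value of an A on the transcript's own scale:
    4 for the four-point scale, 3 for the three-point scale (A = 3, D = 0).
    """
    if scale_max == 4:
        return grade_points        # already on the target scale
    if scale_max == 3:
        return grade_points + 1    # shift so that A = 4 and D = 1
    raise ValueError(f"unsupported grading scale: A = {scale_max}")


# Hypothetical example: a 2.4 average recorded on the three-point scale.
print(to_four_point(2.4, scale_max=3))
```

Because the shift is the same for every grade, adding one point to each grade raises the resulting average by exactly one point, so the function can be applied either to individual grades or to a GPA computed entirely on the three-point scale.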
The first graduating class for which all students had MCAT scores available was that of 1952. The MCAT at that time was a four-part examination: verbal, quantitative, general, and science. Medical school graduation classes since 1977 have taken a six-part examination: biology, chemistry, physics, quantitative, problem solving, and reading. The scoring system changed as well: before 1977, scores were reported on a scale of 200–800; after 1977, on a scale of 1–15. Subsequent changes in the MCAT did not affect our cohort. Because of the differences in scoring, number of subscales, and percentile rank changes over time, raw score data were analyzed separately for students who took the test before and after 1977. Mean scores on the total MCAT were compared between graduates who received disciplinary action and those who did not. The MCAT scores were then dichotomized according to whether a student fell in the bottom quartile for his or her MCAT time period [i.e., test administration year (1) 1952–1976 or (2) 1977–1985]. The dichotomized MCAT scores were used in all subsequent data analyses. Variables were compared by using the t test or the chi-square test.
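The dichotomization within each MCAT time period can be sketched as follows. This is an illustration under stated assumptions, not the study's actual procedure: the scores are hypothetical, the helper `flag_bottom_quartile` is an invented name, and "bottom quartile" is taken to mean at or below the first quartile of the scores in the same period, since the exact cut-point rule is not stated.

```python
from statistics import quantiles


def flag_bottom_quartile(scores):
    """Return one boolean per score: True if it is in the bottom quartile.

    Assumption: a score is in the bottom quartile when it is at or below
    the first quartile of the scores from the same time period.
    """
    first_quartile = quantiles(scores, n=4, method="inclusive")[0]
    return [score <= first_quartile for score in scores]


# Hypothetical total scores, kept separate because the two periods
# used different examinations and scoring scales.
scores_by_period = {
    "1952-1976": [600, 450, 700, 520, 500, 650, 560, 610],
    "1977-1985": [54, 41, 66, 50, 46, 62, 55, 58],
}

flags_by_period = {
    period: flag_bottom_quartile(scores)
    for period, scores in scores_by_period.items()
}

for period, flags in flags_by_period.items():
    print(period, flags)
```

Computing the cut point separately for each period keeps the two scoring scales from being mixed, which is the reason the dichotomized variable can be pooled across decades.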
The required course work and grading system (letter grades, honors/pass/provisional nonpass/fail) also changed over the decades. Therefore, the analysis of medical school grades was performed by comparing the number of graduates in each group who had one or more less-than-satisfactory course grades (letter grade of D or F, or a provisional nonpass or fail) the first time they took the course.
The NBME Part 1 percentile scores were used in all analyses. Only data for graduates after 1977 were available. Mean score differences on NBME Part 1 percentile scores were compared using an independent t test for graduates who did and did not receive disciplinary action.
Our a priori hypothesis was that severity rankings of Concern, Problem, or Extreme would be associated with disciplinary action. Therefore, we dichotomized the severity rankings into Good and Trace versus Concern, Problem, or Extreme, and used the dichotomized ranking in all subsequent data analyses. Interrater agreement was 92%.
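The dichotomization of the five severity ranks, together with a simple percent-agreement calculation, can be sketched as below. The rankings shown are hypothetical, the function names are illustrative, and it is an assumption that agreement was computed as simple percent agreement; the text does not say how the 92% figure was derived or at which level (five ranks or two) it was measured.

```python
RANKS = ("Good", "Trace", "Concern", "Problem", "Extreme")
FLAGGED = {"Concern", "Problem", "Extreme"}


def is_flagged(rank: str) -> bool:
    """Collapse the five ranks into Good/Trace (False) versus the rest (True)."""
    if rank not in RANKS:
        raise ValueError(f"unknown severity rank: {rank!r}")
    return rank in FLAGGED


def percent_agreement(rater_a, rater_b) -> float:
    """Percentage of items on which two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


# Hypothetical independent rankings of five files by two reviewers.
dean_1 = ["Good", "Trace", "Concern", "Problem", "Good"]
dean_2 = ["Good", "Good", "Concern", "Extreme", "Good"]

five_level = percent_agreement(dean_1, dean_2)
two_level = percent_agreement(
    [is_flagged(r) for r in dean_1],
    [is_flagged(r) for r in dean_2],
)
print(five_level, two_level)
```

In this toy example the reviewers disagree on two of five files at the five-rank level, but both disagreements fall within the same half of the dichotomy, so agreement on the dichotomized variable is complete.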
We analyzed our data with logistic regression (SPSS version 11; SPSS, Inc., Chicago, Illinois), using disciplinary action (yes/no) as the dependent variable. The independent variables were (1) gender, (2) undergraduate measures (undergraduate GPA and MCAT), and (3) medical school measures (medical school grades, severity ranking); all were entered into the model in one step.
An estimate of sample size showed that the study had 80% power to determine its primary objective if each group contained 49 subjects. To enhance power, we chose a case to control ratio of 1:3.
All researchers participated in data analyses and data interpretation. The UCSF Committee on Human Research approved this study without requiring informed consent from the graduates.
Seventy graduates of the UCSF, School of Medicine (1% of the graduates) were disciplined by the State Medical Board of California between 1990 and 2000. The control group contained 200 physician–graduates. The academic files of four graduates (two from the case group and two from the control group) were unavailable for data abstraction; the remaining 68 (case) and 196 (control) graduates were included. All but two control-group graduates resided in California. Characteristics of the two groups are shown in Table 1. Graduation years ranged from 1943 to 1989, and most graduates were men. There was a small, but statistically significant, difference in undergraduate GPA (3.3 for the case group and 3.4 for the control group; p = .04). The specialty distributions for all UCSF, School of Medicine graduates, the case group, and control group, are shown in Table 2. Two specialties (obstetrics and gynecology and psychiatry) were overrepresented among the case group when compared with the specialties entered by all UCSF, School of Medicine graduates.
The principal reason for disciplinary action in 65 of 68 disciplined physicians was a violation of professionalism (see Table 3). As shown in Table 4, the prevalence of negative comments regarding professionalism in the medical school files was 38% (case group) and 19% (control group). Disciplined physicians were more likely to have negative comments regarding professionalism in their medical school file (odds ratio, 2.15; 95% confidence interval, 1.15–4.02; p = .02; see Table 5). The sensitivity of negative comments for disciplinary action is 38% and the specificity is 81%. The other variables were not associated with disciplinary action by the state medical board. These odds ratios were essentially unchanged after removing the three physicians with mental illness from the case group.
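The relationship among these figures can be checked with a short calculation. The sketch below uses only the rounded percentages and interval reported above; it assumes that the confidence interval is a Wald interval, symmetric on the log-odds scale, which is typical of logistic regression output but is not stated in the text. The crude odds ratio it computes is unadjusted and based on rounded percentages, so it is not expected to equal the reported model-based estimate of 2.15.

```python
from math import log, sqrt

# Reported proportions with negative professionalism comments.
sensitivity = 0.38     # case group
control_rate = 0.19    # control group
specificity = 1 - control_rate

# Crude (unadjusted) odds ratio from the rounded proportions.
crude_or = (sensitivity / (1 - sensitivity)) / (control_rate / (1 - control_rate))

# Reported odds ratio and 95% confidence interval from the regression.
reported_or, ci_low, ci_high = 2.15, 1.15, 4.02

# Under the Wald assumption the point estimate is the geometric mean of
# the limits, and the standard error of log(OR) follows from their ratio.
midpoint_or = sqrt(ci_low * ci_high)
se_log_or = (log(ci_high) - log(ci_low)) / (2 * 1.96)

print(f"specificity = {specificity:.2f}")
print(f"crude OR    = {crude_or:.2f}")
print(f"midpoint OR = {midpoint_or:.2f}")
print(f"SE(log OR)  = {se_log_or:.3f}")
```

The geometric mean of the interval limits reproduces the reported odds ratio, which is consistent with the interval having been computed on the log scale.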
The NBME Part 1 scores were available for graduates beginning in 1977 (n = 119). There was no difference in mean (± standard deviation) NBME Part 1 scores between case and control groups (cases, 78.1 ± 6.7; controls, 79.6 ± 5.5; p = .22). Because the scores did not differ significantly between groups and were missing for many graduates, this variable was not included in the model.
In the control group, students who entered psychiatry (10 of 28) had the greatest number of comments regarding unprofessional conduct in their files (see Table 4).
We found that UCSF, School of Medicine students who received comments regarding unprofessional behavior were more than twice as likely to be disciplined by the Medical Board of California when they became practicing physicians than were students without such comments. The more traditional measures of medical school performance, such as grades and passing scores on national standardized tests, did not identify students who later had disciplinary problems as practicing physicians.
These data add validity to the assessment of professionalism in medical school and support the use of the UCSF, School of Medicine's professionalism evaluation system. We have, for the first time, demonstrated that unprofessional behavior in medical school is associated with unprofessional behavior in practice. Nonetheless, comments regarding unprofessionalism in the students’ medical school files had a low sensitivity and a high specificity; therefore, the state's medical board did not discipline the majority of medical students who received comments regarding unprofessionalism. Test sensitivity and specificity depend on the threshold above which a test is interpreted to be abnormal. If the threshold is lowered, sensitivity is increased at the expense of lowered specificity. If the threshold is raised, sensitivity is decreased while specificity is increased. We believe it reasonable that the serious outcome of disciplinary action by the state medical board has a high threshold. The risk to the individual student who is identified as a false positive is low unless that student is unduly stigmatized as a “problem student.” The high specificity underscores the importance of the evaluation of professionalism not only to the student but also to society because events that result in disciplinary action by the state medical board have their impact on patients. Our study did not examine whether remediation can reduce this association. However, the demonstration that inadequate professional behavior as a student portends poor professional behavior in practice can now serve as evidence to some resistant students that they must commit to professional growth.
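The observation that most students with negative comments were never disciplined can be made concrete with a rough positive predictive value calculation. This is a back-of-the-envelope sketch, not an analysis performed in the study: it assumes that the sensitivity (38%) and specificity (81%) estimated from the case and control groups apply to all graduates, and that the 70 disciplined physicians among approximately 6,330 graduates approximate the prevalence of discipline.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Probability of the outcome given a positive marker (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Figures reported in the article; applying them to the whole graduate
# population is an assumption made only for this illustration.
prevalence = 70 / 6330
ppv = positive_predictive_value(0.38, 0.81, prevalence)

print(f"prevalence = {prevalence:.3%}")
print(f"PPV        = {ppv:.1%}")
```

Under these assumptions only about 2 in 100 students with negative comments would later be disciplined, which illustrates why a rare outcome yields many false positives even when specificity is fairly high.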
The vast majority of the approximately 105,000 physicians licensed by the State of California practice competent and professional medicine. Only about 350 physicians are disciplined annually by the Medical Board of California.11 Previous studies have shown that disciplined physicians are more likely to be men, in practice for more than 20 years, and less likely to be board certified. The majority of actions taken against physicians are for deficiencies in professional behavior rather than for incompetence.13,14 In our study, negligence was included as a cause of unprofessional behavior rather than incompetence. Even if negligence were not included as an unprofessional behavior, over half of disciplinary actions were for unprofessional behavior.
Our study has limitations. During the decades that these students attended medical school, changes occurred in the competitiveness of medical school admission, curriculum, grading system, and evaluation forms. We believe, however, these changes enhance the generalizability of our findings. To our surprise, narratives dating back to the 1940s regarding the evaluation of professionalism were available and seemed candid. Investigations and disciplinary actions by the Medical Board of California may have become more aggressive between 1990 and 2000 because the public began to demand greater accountability from the medical profession. In addition, we may have overmatched the case and control groups, particularly as it relates to psychiatry and obstetrics and gynecology, which are two of the most overrepresented specialties among disciplined physicians. Although only 6% of physicians are psychiatrists, 28% of physicians disciplined for sex-related offenses are psychiatrists. Only 6% of physicians are obstetricians and gynecologists, yet they represent 13% of physicians disciplined for sex-related offenses.14 We chose to match by specialty practice because we could not determine its contribution as a confounder. Indeed, psychiatrists in the control group had the highest number of unprofessional comments when they were in medical school. Therefore, we probably underestimated the true differences in the frequency of unprofessional comments between the two groups.
Another limitation of our study is that physicians disciplined by a medical board comprise an unknown percentage of the total group of physicians engaging in unprofessional behavior. Furthermore, various social biases may well influence which physicians behaving unprofessionally are ultimately disciplined. Thus, we caution against generalizing the identified associations to all types of unprofessional behavior in physicians.
We have shown that problematic behavior in medical school at UCSF predicted subsequent disciplinary action of the physician by the state medical board. Our findings add to the call for better evaluation tools of personal characteristics of medical students and of applicants to medical school.15 Although mindful that only a small number of physicians come to the attention of state medical boards, we now have evidence that medical students display warning signs of future disciplinary action. We hope this early identification will lead to improved methods of remediation and reduce the subsequent behaviors that result in disciplinary action. At the same time, we can now advocate from an evidence-based position that professionalism is an essential competency that must be demonstrated for a student to graduate from medical school.
This project was funded in part by the NBME Edward J. Stemmler, MD Medical Education Research Fund grant. The project does not necessarily reflect NBME policy, and NBME support provides no official endorsement. The authors gratefully acknowledge the assistance of Bonnie Hellevig in the development of the data abstraction instrument and for the data abstraction; the expertise of Helen Loeser, MD, MSc, in ranking the severity of the unprofessional behavior; the assistance of Ellen R. Julian, PhD, for providing corresponding MCAT percentile scores over a 40-year period; and the support and encouragement of David Irby, PhD. They also thank Ron Joseph, executive director, Medical Board of California, and his staff for assistance in data gathering.