Relationship of Pass/Fail Grading and Curriculum Structure With Well-Being Among Preclinical Medical Students: A Multi-Institutional Study
Reed, Darcy A. MD, MPH; Shanafelt, Tait D. MD; Satele, Daniel W.; Power, David V. MD, MPH; Eacker, Anne MD; Harper, William MD; Moutier, Christine MD; Durning, Steven MD; Massie, F. Stanford Jr MD; Thomas, Matthew R. MD; Sloan, Jeff A. PhD; Dyrbye, Liselotte N. MD, MHPE
Dr. Reed is assistant professor, Department of Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota.
Dr. Shanafelt is associate professor, Department of Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota.
Mr. Satele is statistician, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota.
Dr. Power is associate professor of family medicine, University of Minnesota Medical School, Minneapolis, Minnesota.
Dr. Eacker is assistant professor of medicine, University of Washington School of Medicine, Seattle, Washington.
Dr. Harper is associate professor of medicine, University of Chicago Pritzker School of Medicine, Chicago, Illinois.
Dr. Moutier is associate professor of medicine, University of California, San Diego, School of Medicine, La Jolla, California.
Dr. Durning is professor of medicine and pathology, Uniformed Services University of the Health Sciences, Bethesda, Maryland.
Dr. Massie is associate professor of medicine, University of Alabama School of Medicine, Birmingham, Alabama.
Dr. Thomas is assistant professor, Department of Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota.
Dr. Sloan is professor of oncology, Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, Minnesota.
Dr. Dyrbye is associate professor of medicine, Department of Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota.
Correspondence should be addressed to Dr. Reed, Mayo Clinic College of Medicine, 200 First St. SW, Rochester, MN 55901; telephone: (507) 284-6391; fax: (507) 266-0038; e-mail: email@example.com.
First published online September 26, 2011
Purpose: Psychological distress is common among medical students. Curriculum structure and grading scales are modifiable learning environment factors that may influence student well-being. The authors sought to examine relationships among curriculum structures, grading scales, and student well-being.
Method: The authors surveyed 2,056 first- and second-year medical students at seven U.S. medical schools in 2007. They used the Perceived Stress Scale, Maslach Burnout Inventory, and Medical Outcomes Study Short Form (SF-8) to measure stress, burnout, and quality of life, respectively. They measured curriculum structure using hours spent in didactic, clinical, and testing experiences. Grading scales were categorized as two categories (pass/fail) versus three or more categories (e.g., honors/pass/fail).
Results: Of the 2,056 students, 1,192 (58%) responded. In multivariate analyses, students in schools using grading scales with three or more categories had higher levels of stress (beta 2.65; 95% CI 1.54–3.76, P < .0001), emotional exhaustion (beta 5.35; 95% CI 3.34–7.37, P < .0001), and depersonalization (beta 1.36; 95% CI 0.53–2.19, P = .001) and were more likely to have burnout (OR 2.17; 95% CI 1.41–3.35, P = .0005) and to have seriously considered dropping out of school (OR 2.24; 95% CI 1.54–3.27, P < .0001) compared with students in schools using pass/fail grading. There were no relationships between time spent in didactic and clinical experiences and well-being.
Conclusions: How students are evaluated has a greater impact than other aspects of curriculum structure on their well-being. Curricular reform intended to enhance student well-being should incorporate pass/fail grading.
Psychological distress is prevalent among U.S. medical students1,2 and is associated with suicidal ideation3 and serious thoughts of dropping out of medical school.4 Among practicing physicians, distress has been linked to medical errors5–7 and suboptimal patient care.8 To address the problem of distress in the medical profession, researchers must identify the modifiable factors influencing student and physician well-being. Studies have shown that age, gender, and race influence well-being,2,9–11 as does experiencing a major positive or negative life event.12–15 Medical schools have little to no control over these individual student factors, but they can influence the learning environment—and they face regulatory and societal obligations to do so.16
The learning environment is thought to consist of the formal and hidden curricula,17 the institutional culture, and the learning climate.18,19 Of these elements, the formal curriculum is the most easily measured, modified, and controlled by medical schools. The formal curriculum includes curriculum structure (such as the amount of time in class and the relative proportion of didactic, clinical, and testing experiences) as well as the type of scales used to evaluate student performance. Relationships between curriculum structure and student well-being have not been studied extensively. Preliminary reports suggest that two-category grading (pass/fail) may reduce distress among medical students,20,21 but, to our knowledge, this relationship has not been demonstrated in multi-institutional studies incorporating assessment of other curricular factors.
A thorough understanding of such relationships is needed to guide curriculum reforms aimed at reducing student distress. The objective of this study was to examine relationships among curriculum structures, grading scales, and student well-being in a large, multi-institutional sample of medical students to identify modifiable curricular factors related to student distress and burnout.
Study design and sample
During the 2006–2007 academic year, we conducted a multi-institutional, cross-sectional study of all first- and second-year medical students at seven U.S. medical schools with 12 distinct campuses: Mayo Medical School; Uniformed Services University of the Health Sciences F. Edward Hébert School of Medicine; University of Alabama School of Medicine; University of California, San Diego, School of Medicine; University of Chicago Pritzker School of Medicine; University of Minnesota Medical School–Duluth; University of Minnesota Medical School–Minneapolis; University of Washington School of Medicine–Seattle; University of Washington School of Medicine–Alaska; University of Washington School of Medicine–Idaho; University of Washington School of Medicine–Montana; and University of Washington School of Medicine–Wyoming.
All 2,056 first- and second-year students at these institutions were invited by e-mail to complete a Web-based survey in spring 2007. Nonresponders were sent up to two reminders. Participation was elective, and responses were anonymized. No compensation or other reward was provided for participation. The institutional review board at each institution approved the study prior to the participation of its students.
The survey included the Perceived Stress Scale (PSS), the Maslach Burnout Inventory (MBI), and the Medical Outcomes Study Short Form (SF-8) to measure student stress, burnout, and quality of life (QOL), respectively. The PSS is a 10-item instrument with high internal consistency and criterion validity.22 The mean PSS score among the general U.S. population of 18- to 29-year-olds is 14.2 ± 6.2.23 The MBI is a 22-item instrument with three subscales: emotional exhaustion (EE), depersonalization (DP), and low sense of personal accomplishment (PA).24 The MBI has been shown to have strong content, internal structure, and criterion validity, and it is considered to be the gold standard for measuring burnout among health professionals.24,25 Students whose scores on the EE (≥27) and/or DP (≥10) subscales fall in the high range for medical professionals are considered to have at least one manifestation of professional burnout.24–26 The SF-8 is an eight-item instrument measuring physical and mental QOL with acceptable reliability and content and criterion validity.27 Norm-based scoring is used to calculate scores; the mean SF-8 mental QOL score for the U.S. population is 49.2 ± 9.46.27,28
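The burnout criterion described above reduces to a simple threshold rule. The following is a minimal sketch in Python, assuming the EE and DP subscale totals have already been computed for each student. The function name, constant names, and example scores are illustrative only; they are not part of the MBI or of the study's analysis code.

```python
# Thresholds taken from the text: a high EE score is >= 27, a high DP score is >= 10.
EE_HIGH = 27
DP_HIGH = 10


def has_burnout(ee_score: int, dp_score: int) -> bool:
    """Return True if either subscale meets its high-score threshold.

    `ee_score` and `dp_score` are hypothetical names for a student's
    emotional exhaustion and depersonalization subscale totals.
    """
    return ee_score >= EE_HIGH or dp_score >= DP_HIGH


# Hypothetical example scores, not study data.
examples = [(30, 4), (12, 11), (26, 9)]
flags = [has_burnout(ee, dp) for ee, dp in examples]
print(flags)
```

Because the rule is "and/or," a student who exceeds only one of the two thresholds is still counted as having burnout.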
The survey also included six questions about recent life events (within the past 12 months) previously shown to affect student distress: marriage, divorce, birth or adoption of a child, major personal illness, major illness of a close family member, and death of a family member.12 Students were asked to indicate whether they perceived the life event as positive, negative, or neutral—with the exception of major personal illness or major illness of a close family member, which were assumed to be negative events based on the literature.29,30 Students were also asked whether they had seriously considered dropping out of medical school in the past year.
Curriculum measures: Structure and grading scale
We measured curriculum structure for first- and second-year medical students during the 2006–2007 academic year using curriculum schedules and grading policies that we retrieved directly from the dean's office at each of the 12 participating medical school campuses. Using these data, we obtained the following variables:
* total contact days (number of days students were scheduled to attend one or more formal learning experiences during the academic year),
* total contact hours (number of hours students were scheduled for any formal learning or testing experience during the academic year),
* percent didactic learning (the percentage of total contact hours allocated to large- or small-group learning experiences that did not involve an actual, standardized, or simulated patient),
* percent clinical experiences (the percentage of total contact hours allocated to learning experiences involving an actual, standardized, or simulated patient),
* percent testing experiences (the percentage of total contact hours spent taking summative exams—i.e., written quizzes/tests, standardized exams, laboratory and practical exams, and clinical exams [e.g., objective structured clinical examinations]), and
* absolute number of tests students were administered during the academic year.
For percent testing experiences, we included only time spent taking summative exams, defined as those that contributed to the students' final grades; we did not include in this category time spent on formative assessments because they are frequently not documented and, therefore, could not be reliably measured using data housed in deans' offices.
Lastly, we obtained the grading scale used for first- and second-year students at each campus and categorized these scales as follows: two categories (pass/fail) versus three or more categories (e.g., honors/pass/fail, honors/high pass/pass/marginal pass/fail).
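The curriculum variables defined above can be derived mechanically from a schedule. The sketch below assumes each scheduled session is recorded with a date, a duration, and one of three kinds; this record layout, the function names, and the sample schedule are hypothetical and are not taken from the study's data.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One scheduled learning or testing event (hypothetical record layout)."""
    day: str      # calendar date, e.g. "2006-09-05"
    hours: float  # scheduled contact hours
    kind: str     # "didactic", "clinical", or "testing" (summative exams only)


def curriculum_measures(sessions):
    """Summarize a schedule into the structure variables listed above."""
    total_hours = sum(s.hours for s in sessions)
    if total_hours <= 0:
        raise ValueError("schedule has no contact hours")

    def percent_of_hours(kind):
        kind_hours = sum(s.hours for s in sessions if s.kind == kind)
        return 100.0 * kind_hours / total_hours

    return {
        "total_contact_days": len({s.day for s in sessions}),
        "total_contact_hours": total_hours,
        "percent_didactic": percent_of_hours("didactic"),
        "percent_clinical": percent_of_hours("clinical"),
        "percent_testing": percent_of_hours("testing"),
        # Assumes each testing session corresponds to exactly one test.
        "number_of_tests": sum(1 for s in sessions if s.kind == "testing"),
    }


def grading_category(scale_levels):
    """Classify a grading scale by its number of levels, as in the text."""
    if len(scale_levels) < 2:
        raise ValueError("a grading scale needs at least two levels")
    if len(scale_levels) == 2:
        return "two categories (pass/fail)"
    return "three or more categories"


# Hypothetical three-day schedule, not study data.
schedule = [
    Session("2006-09-05", 4.0, "didactic"),
    Session("2006-09-05", 1.0, "clinical"),
    Session("2006-09-06", 3.0, "didactic"),
    Session("2006-09-07", 2.0, "testing"),
]
measures = curriculum_measures(schedule)
print(measures)
print(grading_category(["honors", "pass", "fail"]))
```

Formative assessments are left out of the "testing" kind here, mirroring the restriction to summative exams described above.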
We collected data for the 12 specific campuses rather than for the seven overall medical schools because curriculum structures and grading scales varied among campuses within individual medical schools. We also obtained mean (standard deviation [SD]) United States Medical Licensing Examination (USMLE) Step 1 scores for each of the seven medical schools in 2007.
We described student demographic characteristics, curriculum structure variables, and student well-being outcomes using counts and percentages for categorical variables and means and SDs for continuous variables. We analyzed overall burnout as a dichotomous variable, and we analyzed MBI subscale (EE, DP, PA), PSS, and SF-8 scores as continuous variables. We used two-tailed t tests and simple logistic regression to compare continuous and categorical variables, respectively, in bivariate analyses.
We used multivariate generalized linear regression for continuous well-being outcomes and multivariate logistic regression for dichotomous well-being outcomes to identify curriculum variables independently associated with student well-being. We used generalized estimating equations for all models to account for clustering of students within medical school campuses. We adjusted all models for student-level characteristics known to influence well-being including age, gender, and positive and negative life events.2,9–15 We examined independent variables for collinearity, and variables with correlation coefficients >.05 were not modeled together. We set the threshold for statistical significance at P < .01 to account for multiple comparisons. The sample size in this study was adequate to detect relationships between grading scales (two categories versus three or more categories) and well-being outcomes with 98% power. We analyzed data using SAS version 9.1 (SAS Institute, Cary, North Carolina).
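One way to write the general form of the models described above is as a marginal regression model. This is a schematic sketch only: the notation is introduced here for illustration, and details such as the working correlation structure and which curriculum variables entered each model together are not specified in the text.

```latex
g\bigl(\mathrm{E}[Y_{ij}]\bigr)
  = \beta_0
  + \beta_1\,\mathrm{Grading}_j
  + \beta_2\,\mathrm{Curriculum}_j
  + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij}
```

Here Y_ij is a well-being outcome for student i at campus j; g is the identity link for continuous outcomes and the logit link for dichotomous outcomes; Grading_j indicates a grading scale with three or more categories; Curriculum_j is a campus-level structure variable such as percent testing experiences; and x_ij holds the student-level adjusters (age, gender, and life events). Under the logit link, exp(β1) is the adjusted odds ratio for multicategory versus pass/fail grading.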
Medical student well-being
Of 2,056 first- and second-year students at the 12 medical school campuses, 1,192 (58.0%) responded to the survey. Table 1 shows the respondents' demographic characteristics and well-being measures. The majority of respondents were male (642; 53.9%), and more than half (665; 55.8%) were younger than 25 years of age. Approximately one-quarter (310; 26.0%) were non-Caucasian, and 715 (60.0%) were single. Nearly half of the responding students (543; 45.6%) met criteria for burnout. Students reported high mean (SD) perceived stress scores of 17.0 (7.5), range 0 to 40. Mean (SD) mental QOL among students as measured by the SF-8 was 42.4 (11.2), which was lower than that of the general population.27 Almost 10% (117) of the respondents reported having seriously considered dropping out of medical school within the past year.
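The response rate and the respondent percentages quoted above can be recomputed directly from the reported counts. The short check below uses only numbers stated in the text; the variable names are illustrative.

```python
INVITED = 2056
RESPONDED = 1192

# Counts of respondents as reported in the text.
counts = {
    "male": 642,
    "younger than 25": 665,
    "non-Caucasian": 310,
    "single": 715,
    "met burnout criteria": 543,
    "considered dropping out": 117,
}


def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


response_rate = percent(RESPONDED, INVITED)
shares = {label: percent(n, RESPONDED) for label, n in counts.items()}

print(f"response rate: {response_rate}%")
for label, share in shares.items():
    print(f"{label}: {share}%")
```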
Curriculum structures and grading scales
During the first two years of medical school, students were scheduled to participate in a mean (SD) of 760 (83.8) total contact hours of learning experiences. Of these hours, a mean of 456 (60.0%) were spent in large-group didactic learning, 160 (21.1%) were spent in small-group didactic learning, 84 (11.1%) were spent in clinical experiences, and 60 (7.9%) were spent in testing experiences. Of the 1,192 responding students, 701 (58.8%) at four campuses were in a curriculum using two-category (pass/fail) grading, and 491 (41.2%) at eight campuses were in a curriculum using grading scales with three or more categories (e.g., honors/pass/fail).
Figure 1 shows the mean (SD) USMLE Step 1 scores for each school in 2007. The mean (SD) USMLE Step 1 score among all seven schools was 224.45 (20.17). Schools had similar scores with the exception of schools A and E, which had significantly higher mean scores compared with all others (P < .0001 and P = .0005, respectively), and school F, which had a significantly lower mean score compared with all others (P < .0001).
Associations among curriculum structures, grading scales, and medical student well-being
In bivariate analysis, we found a positive association between the percentage of time students spent in testing experiences and higher EE levels (beta 0.34; 95% confidence interval [CI] 0.04 to 0.06, P = .02). Additionally, students who spent a greater percentage of their contact hours in clinical experiences were less likely to have burnout (odds ratio [OR] 0.98; 95% CI 0.97 to 0.99, P = .01) and were less likely to have seriously considered dropping out of medical school within the past year (OR 0.93; 95% CI 0.93 to 0.99, P = .03). Compared with students in curricula graded pass/fail, students in curricula using grading scales with three or more categories had higher levels of perceived stress (beta 1.91; 95% CI 1.05 to 2.78, P < .0001), greater EE (beta 2.92; 95% CI 1.67 to 4.16, P < .001), and lower mental QOL (beta −2.79; 95% CI −4.09 to −1.50, P < .0001). They also had increased odds of burnout (OR 1.58; 95% CI 1.24 to 2.01, P = .0002) and were more likely to report having seriously considered dropping out of medical school in the past year (OR 1.91; 95% CI 1.30 to 2.80, P = .001).
Table 2 shows the results of multivariate linear and logistic regression models examining relationships among curriculum structures, grading scales, and well-being outcomes. There was a modest, statistically significant association between the percentage of time students spent in testing experiences and higher perceived stress (beta 0.29; 95% CI 0.10 to 0.48, P = .003) and lower mental QOL (beta −0.63; 95% CI −0.96 to −0.29, P = .0003). There were strongly significant associations between grading scales and all well-being outcomes except mental QOL. Compared with students in pass/fail curricula, students in curricula using grading scales with three or more categories reported significantly higher levels of perceived stress (beta 2.65; 95% CI 1.54 to 3.76, P < .0001), were more likely to have burnout (OR 2.17; 95% CI 1.41 to 3.35, P = .0005), experienced higher levels of EE (beta 5.35; 95% CI 3.34 to 7.37, P < .0001) and greater DP (beta 1.36; 95% CI 0.53 to 2.19, P = .001), and were more likely to have seriously considered dropping out of medical school within the past year (OR 2.24; 95% CI 1.54 to 3.27, P < .0001). There were no significant associations between total contact days or the percentage of time spent in didactic learning and clinical experiences and any measures of student well-being.
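Odds ratios from logistic regression are typically reported with confidence intervals that are symmetric on the log-odds scale, so the geometric mean of the two bounds should land close to the point estimate. The sketch below applies that check to the two multivariate odds ratios reported above. It assumes a Wald-type 95% interval with z = 1.96, which the text does not state explicitly, and the function name is illustrative.

```python
import math


def log_scale_summary(ci_low, ci_high, z=1.96):
    """Recover the implied point estimate and standard error from an OR interval.

    Assumes the interval is exp(beta -/+ z * se), i.e. symmetric in log-odds.
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    implied_or = math.exp((log_low + log_high) / 2)  # geometric mean of the bounds
    se_log_or = (log_high - log_low) / (2 * z)
    return implied_or, se_log_or


# (reported OR, lower bound, upper bound) from the multivariate results in the text.
reported = {
    "burnout": (2.17, 1.41, 3.35),
    "considered dropping out": (2.24, 1.54, 3.27),
}

summaries = {}
for label, (odds_ratio, low, high) in reported.items():
    implied_or, se = log_scale_summary(low, high)
    summaries[label] = (implied_or, se)
    print(f"{label}: reported OR {odds_ratio}, implied OR {implied_or:.2f}, SE(log OR) {se:.3f}")
```

The same check is a quick way to spot transcription errors in a reported interval, since a point estimate that falls well away from the geometric mean of its bounds cannot come from an interval of this form.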
Psychological distress in U.S. medical students is common,1,2 regardless of curriculum structure or grading scale. However, in this study, students in curricula graded on a pass/fail scale experienced less burnout and stress, and were less likely to have seriously considered dropping out of medical school, than students in curricula graded using three or more categories (e.g., honors/pass/fail). Furthermore, it seems that the more time students spent taking examinations, the lower their well-being measures were. These findings suggest that medical schools that emphasize testing and grades may be cultivating learning environments that exacerbate anxiety and stress. These results have important implications for medical schools' efforts to optimize their learning environments to enhance student well-being.
Among all curricular elements examined in this study, the grading scale was most strongly associated with students' well-being. This suggests that how students are evaluated is more important relative to their well-being than how contact hours are spent. We found no associations between well-being and the total number of hours students spent in class or the division of hours between didactic and clinical experiences, despite adequate sample size and study power. This implies that curriculum reforms aimed at promoting well-being should focus on assessment strategy rather than on the scheduling of learning activities. Although scheduling changes such as increasing independent learning time or clinical exposure may benefit students in other ways,31–33 these reforms seem unlikely to enhance student well-being.
It is often said that assessment drives learning,34–36 and it is well known that students require adequate assessment and feedback to develop academically and professionally.34 If medical schools were to transition from multicategory grading to pass/fail grading, would students still learn? Four prior studies have examined this question, and all concluded that switching to pass/fail grading did not worsen students' academic performance: No differences were found in students' academic achievement during the first and second years of medical school with implementation of pass/fail grading,21,37,38 and changing to pass/fail grading did not affect students' subsequent grades in third-year clerkships21 or their USMLE Step 1 scores.20,21 Likewise, we found that most medical schools in this study had similar mean USMLE Step 1 scores, although grading scales varied across campuses. Together, these data indicate that transitioning to pass/fail grading during the first two years of medical school is unlikely to adversely affect students' academic achievement. Finally, it is noteworthy that these studies show consistent relationships between grading scales, academic performance, and well-being despite using different instruments to assess well-being.
There are several limitations to this study. First, 58% of eligible students responded to the survey. Although this response rate compares favorably with other medical education studies,39 a response bias cannot be excluded. Second, these data are cross-sectional, so we cannot confirm that associations are causal. Third, we measured relationships between just one aspect of the learning environment (the formal curriculum) and students' well-being. We cannot determine from these data the relative influence of other aspects of the learning environment, such as the hidden curriculum17 and institutional culture.18 Fourth, curriculum structure measures were based on scheduled class hours as documented by the deans' offices at the medical school campuses; actual attendance by students could not be verified. Finally, further research is needed to determine the impact of curriculum structure and grading scales on other important outcomes such as students' competency and preparedness for residency. Although prior studies indicate that pass/fail grading is unlikely to adversely affect students' academic performance21,37,38 and does not impair students' success in the residency match,21 other important measures of competency (e.g., professionalism and interpersonal skills) have not been examined, nor has long-term career success.
Limitations notwithstanding, these data from a large, multi-institutional sample of U.S. medical students with well-validated measures of well-being demonstrate an important relationship between grading scales and students' well-being. Medical schools are obligated to optimize the learning environment to both facilitate learning and promote development of compassionate and professional graduates.16 To do so, schools must target modifiable factors in the environment such as curriculum structure and grading scales to reduce student distress, which adversely affects professional development.40 Although further study is needed, these data suggest that curricular reform efforts aimed at enhancing student well-being should consider pass/fail grading.
This study was approved by the institutional review boards at Mayo Clinic; Uniformed Services University of the Health Sciences; University of Alabama School of Medicine; University of Chicago Pritzker School of Medicine; University of California, San Diego; University of Minnesota Medical School; and University of Washington School of Medicine.
This study was presented in a plenary presentation at the 32nd Annual Meeting of the Society of General Internal Medicine; May 13–16, 2009; Miami Beach, Florida.
3 Dyrbye LN, Thomas MR, Massie FS, et al. Burnout and suicidal ideation among U.S. medical students. Ann Intern Med. 2008;149:334–341.
5 West CP, Huschka MM, Novotny PJ, et al. Association of perceived medical errors with resident distress and empathy: A prospective longitudinal study. JAMA. 2006;296:1071–1078.
6 West CP, Tan AD, Habermann TM, Sloan JA, Shanafelt TD. Association of resident fatigue and distress with perceived medical errors. JAMA. 2009;302:1294–1300.
7 Fahrenkopf AM, Sectish TC, Barger LK, et al. Rates of medication errors among depressed and burnt out residents: Prospective cohort study. BMJ. 2008;336:488–491.
8 Shanafelt TD, Bradley KA, Wipf JE, Back AL. Burnout and self-reported patient care in an internal medicine residency program. Ann Intern Med. 2002;136:358–367.
9 Dyrbye LN, Thomas MR, Eacker A, et al. Race, ethnicity, and medical student well-being in the United States. Arch Intern Med. 2007;167:2103–2109.
10 Dyrbye LN, Thomas MR, Huschka MM, et al. A multicenter study of burnout, depression, and quality of life in minority and nonminority US medical students. Mayo Clin Proc. 2006;81:1435–1442.
11 Zoccolillo M, Murphy GE, Wetzel RD. Depression among medical students. J Affect Disord. 1986;11:91–96.
13 Frank E, Tu XM, Anderson B, et al. Effects of positive and negative life events on time to depression onset: An analysis of additivity and timing. Psychol Med. 1996;26:613–624.
14 Moerk KC, Klein DN. The development of major depressive episodes during the course of dysthymic and episodic major depressive disorders: A retrospective examination of life events. J Affect Disord. 2000;58:117–123.
15 Honkalampi K, Koivumaa-Honkanen H, Hintikka J, et al. Do stressful life-events or sociodemographic variables associate with depression and alexithymia among a general population? A 3-year follow-up study. Compr Psychiatry. 2004;45:254–260.
18 Gershon RR, Stone PW, Bakken S, Larson E. Measurement of organizational culture and climate in healthcare. J Nurs Adm. 2004;34:33–40.
20 Rohe DE, Barrier PA, Clark MM, Cook DA, Vickers KS, Decker PA. The benefits of pass–fail grading on stress, mood, and group cohesion in medical students. Mayo Clin Proc. 2006;81:1443–1448.
22 Cohen S, Kamarck T, Mermelstein R. A global measure of perceived stress. J Health Soc Behav. 1983;24:385–396.
23 Cohen S, Williamson GM. Perceived stress in a probability sample of the United States. In: Spacapan S, Oskamp S, eds. The Social Psychology of Health. Newbury Park, Calif: Sage; 1988.
24 Maslach C, Jackson SE, Leiter MP. Maslach Burnout Inventory Manual. 3rd ed. Palo Alto, Calif: Consulting Psychologists Press; 1996.
25 Thomas NK. Resident burnout. JAMA. 2004;292:2880–2889.
26 Dyrbye LN, West CP, Shanafelt TD. Defining burnout as a dichotomous variable. J Gen Intern Med. 2009;24:440.
27 Ware J, Kosinski M, Dewey J, Gandek B. How to Score and Interpret Single-Item Health Status Measures: A Manual for Users of the SF-8 Health Survey. Lincoln, RI: QualityMetric Inc.; 2001.
28 Turner-Bowker DM, Bayliss MS, Ware JE Jr, Kosinski M. Usefulness of the SF-8 Health Survey for comparing the impact of migraine and other conditions. Qual Life Res. 2003;12:1003–1012.
29 Osvath P, Voros V, Fekete S. Life events and psychopathology in a group of suicide attempters. Psychopathology. 2004;37:36–40.
30 Kendler KS, Karkowski LM, Prescott CA. Causal relationship between stressful life events and the onset of major depression. Am J Psychiatry. 1999;156:837–841.
31 Putnam CE. Reform and innovation: A repeating pattern during a half century of medical education in the USA. Med Educ. 2006;40:227–234.
33 Kuo AA, Slavin SJ. Clerkship curricular revision based on the Ambulatory Pediatric Association and the Council on Medical Student Education in Pediatrics guidelines: Does it make a difference? Pediatrics. 1999;103(4 pt 2):898–901.
34 Epstein RM. Assessment in medical education. N Engl J Med. 2007;356:387–396.
35 Ben-David MF. The role of assessment in expanding professional horizons. Med Teach. 2000;22:472–477.
36 Karpicke JD, Roediger HL 3rd. The critical importance of retrieval for learning. Science. 2008;319:966–968.
38 White CB, Fantone JC. Pass–fail grading: Laying the foundation for self-regulated learning. Adv Health Sci Educ Theory Pract. 2010;15:469–477.
39 Reed DA, Cook DA, Beckman TJ, Levine RB, Kern DE, Wright SM. Association between funding and quality of published medical education research. JAMA. 2007;298:1002–1009.
40 Dyrbye LN, Massie FS Jr, Eacker A, et al. Relationship between burnout and professional conduct and attitudes among U.S. medical students. JAMA. 2010;304:1173–1180.
© 2011 Association of American Medical Colleges