Evaluation of student performance in prelicensure clinical courses is an ongoing challenge. Clinical performance encompasses a complex combination of activities including skilled know-how, clinical reasoning, situational awareness, and emotional sensitivity.1 Consequently, assessment of clinical performance requires evaluation of student knowledge, preparation, judgment, and ability to respond to a changing environment.2 With oversight of 8 to 10 students, faculty observation of students at all times is impossible, so evaluation is based on a “snapshot” of performance. Objectives may fail to capture measurable elements of clinical practice, but nevertheless, at semester’s end, faculty must determine a grade. This appraisal “not only informs students about their achievement, but also serves as the basis for certifying students as clinically competent to employers and regulatory groups.”2 (p113)
Despite the obvious importance of precise evaluation, there is significant grade inflation in undergraduate clinical nursing courses.2 Nine years after implementation of evidence-based strategies to improve precision in clinical grading, the present research was undertaken to determine whether grade inflation had decreased.
Review of the Literature
Grade inflation is pervasive in postsecondary education and has been of concern for decades.3 Definitions of grade inflation include a persistent increase in awarding grades of A and B with a corresponding decrease in awarding grades of D and F,4 “a greater percentage of excellent scores than student performances warrant,”5 (p112) and a situation where “a grade is viewed as being less rigorous than it ought to be.”6 (p5) In 2013, 45% of grades awarded in US colleges were A’s, compared with only 15% in 1940.7
Factors Involved in Grade Inflation
Among the factors associated with grade inflation are faculty issues, student behavior and beliefs, nature of the faculty-student relationship, imprecision of instrumentation,8 and discipline-specific grading systems.9 Faculty issues such as unwillingness to admit to inflation,10 failure to apply strict standards in order to receive better student evaluations,11 and the desire to reduce or avoid grade appeals may contribute to grade inflation.9 Lack of formal education in how to evaluate may cause errors in grading practices, including leniency (clustering grades at the high end), severity (clustering grades at the low end), or central tendency (clustering grades at the middle).12 Faculty judgment can also be influenced by the halo effect, where a general impression of a student has some bearing on the grade awarded.12 When these errors are coupled with inexperience2,13 and lack of confidence in evaluation,14 faculty may unwittingly inflate grades.
Student behavior and beliefs affecting the grading process include pressuring faculty members to award a “good” grade regardless of the quality of performance15 and the view that effort, persistence, and attendance should count in determination of grades.2 Increasingly, students perceive education as a “purchased service,” and good grades are expected as part of the buyer-seller relationship rather than something to be earned.15-18
The nature of the student-faculty relationship is yet another factor. As a closer relationship develops, there is greater likelihood that faculty will be influenced by personal information leading to leniency in grading.19 In clinical nursing education where faculty supervise 8 to 10 students, this close relationship may contribute to grade inflation.2
Imprecision in instrumentation is also an issue. Clinical evaluation tools, often deliberately broad to cover a range of situations, may fail to capture measurable elements of clinical practice.2,9 Performance criteria, written in academic jargon and/or imprecise language, may mean different things to different faculty, leading to variability in grading.20 Although widely used in nursing education, most clinical evaluation tools have not undergone reliability and validity testing.21
Finally, the grading system may also contribute to grade inflation. In many professional programs, the grading system uses grades of A through D, but C is the minimum passing grade. Consequently, there is a tendency to recenter the scale upward and award a grade of C to weak students and a grade of B to average students.2,9
Why Grade Inflation Matters
Inflated ratings of student performance in field or clinical experiences have been documented not only in nursing but also in social work, medicine, and other disciplines.8,14,20–24 Academicians in health and human services are concerned that falsely elevated grades allow students to enter professional practice, where they make high-stakes decisions.5,20,23 In a recent survey of nursing faculty from 9 community colleges and 5 universities, 43% of respondents “had awarded higher grades than merited,” and 72% had given students the benefit of the doubt when determining clinical competence.25 (p230) Thus, “failing to fail” could be another factor in grade inflation.25-27 The ultimate question may be one of safety and preparation for practice: Does grade inflation in clinical courses mean that programs are graduating students “who may be less clinically competent than their academic records suggest?”28 (p377)
Grade Inflation in Nursing
Five articles have been published since 2011, 2 examining existing literature on grade inflation and 3 empirical studies. A systematic review of articles on grade inflation in the health professions from 1996 to 2009 examined arguments for and against awarding grades for clinical experiences, reasons for grade inflation, and suggestions for controlling it.8 The authors recommended further quantitative research using methodologies other than description and surveys, with increased sample sizes, and expansion to multiple sites. King-Jones and Mitchell29 described factors related to grade inflation in nursing education and possible solutions. Synthesizing information from several published reports, they did not provide any new research evidence. Recommendations included research with larger samples in a variety of settings.
Susmarini and Hayati30 explored Indonesian nursing faculty experiences with clinical grade inflation. Themes emerging from this qualitative study included (1) causal factors (required minimum grade to advance, student attitude/effort, and imperfections in evaluation tools), (2) impact (school reputation, student’s future, public safety), and (3) solutions (improvement in evaluation tools, use of multiple assessment measures). No sample size was reported, and the study was limited to data collection at a single site. The authors suggested evaluation of the components of grade inflation to isolate those that contribute most. Paskausky and Simonelli,28 in a US study of clinical and theory course grades, found that grades in maternity clinical were inflated by at least 1 letter grade when compared with theory grades. Low to moderate correlations (r = 0.357) were reported between theory and clinical grades over a 6-year period (n = 281). This study was limited to 1 theory/clinical course pair at a single site. Replication at other institutions and in other clinical specialties was suggested.
Finally, White and Heitzler investigated the “effect of increased objectivity of evaluation methods on grade inflation in a graduate nursing research course.”31 When grades from 5 semesters of a traditional course offering were compared with grades from a modified course offering (with rigorously developed multiple-choice tests and use of grading rubrics), the researchers found a statistically significant decrease in grades, suggesting a reduction in grade inflation. Limitations included a single site and a focus on 1 theory course.
Given the limited body of recent research in the nursing literature, sound scientific conclusions cannot be drawn. However, recommendations from these 5 articles stressed the need for improved precision in evaluation instruments and for investigation using quantitative approaches with larger sample sizes, in a variety of courses, and at multiple sites.
Strategies to Reduce Grade Inflation in Clinical Courses
The literature suggests a multimethod approach to combat grade inflation.8,30 Inclusion of objective measures of knowledge such as quizzes and NCLEX-style questioning has been proposed.2,8,30 Using rubrics with clearly defined performance criteria is recommended for both formative and summative evaluation,21,32 but data on their efficacy in reducing grade inflation are limited. Evaluating performance over a series of clinical encounters, rather than a single encounter, can reveal a student’s ability to adjust behavior to a specific context and lead to a “more realistic and fair assessment.”33
Simulations using high-fidelity manikins for evaluating student performance standardize experiences and are tailored to learning outcomes.34,35 To ensure interrater reliability, faculty training with various scenarios using reliable and valid instruments is needed.13 Objective structured clinical examinations and simulations using standardized patients (SPs) can evaluate students’ cognitive, affective, and psychomotor skills.36 Monitoring of SP performance for accuracy and consistency in portraying scenarios is critical to strengthen reliability and validity.36
This research was a follow-up to our original study examining clinical grade inflation2 and was undertaken to answer the following question: Is there a difference in clinical grade distribution when comparing 2 cohorts from 2 different time periods, the first (1997-2002) collected before the implementation of grade inflation-reducing strategies and the second (2009-2016) collected after their implementation? No data were collected for the period 2003 to 2008 because faculty were examining the research findings and making plans for possible interventions.
A retrospective descriptive design was used to examine grades of students from 5 required undergraduate nursing clinical courses at a mid-Atlantic public university. Clinical grades for cohort I were based on percentages achieved on projects, presentations, care plans, and performance in “live” patient care situations. Clinical grades for cohort II included those plus grade inflation-reducing strategies such as quizzes, rubrics, and simulations. The courses for both cohorts were the same: Adult I, Adult II, Pediatrics, Maternity, and Mental Health. After institutional review board approval, data were collected, deidentified, and analyzed using SPSS (IBM Corp, Armonk, New York). Grades were recorded as A (≥90%), B (80%-89%), C (70%-79%), D (60%-69%), or F (<60%), with a grade of C as the minimum passing grade. Because data were ordinal (A’s, B’s, etc) and non–normally distributed, a Mann-Whitney U test, rather than an independent t test, was performed to detect differences in mean ranks for cohort I compared with cohort II for the same clinical courses.37 Before these inferential procedures were run, an a priori power analysis was performed using G*Power 3.1.2 (University of Düsseldorf, Düsseldorf, Germany),38 which showed that a sample size of at least 134 was necessary for a 2-tailed test, using an α level of .05, a power of 0.80, and an effect size of 0.50.39
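The cohort comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' SPSS procedure: the function names (`letter_grade`, `mann_whitney_u`), the ordinal coding F=0 through A=4, and the two small grade lists are made up for the example. The p value uses a tie-corrected normal approximation without a continuity correction, so it may differ slightly from SPSS output.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist

# Ordinal coding of letter grades (an assumption made for this sketch).
GRADE_RANK = {"F": 0, "D": 1, "C": 2, "B": 3, "A": 4}


def letter_grade(percent):
    """Map a percentage to a letter grade using the cutoffs stated in the text."""
    if percent >= 90:
        return "A"
    if percent >= 80:
        return "B"
    if percent >= 70:
        return "C"  # minimum passing grade
    if percent >= 60:
        return "D"
    return "F"


def mann_whitney_u(x, y):
    """Return (U for x, z, two-sided p), using midranks for tied values."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(x) + Counter(y)  # pooled frequency of each value

    # Assign each distinct value the average of the ranks it occupies.
    midrank, seen = {}, 0
    for value in sorted(counts):
        t = counts[value]
        midrank[value] = seen + (t + 1) / 2
        seen += t

    rank_sum_x = sum(midrank[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Normal approximation with the standard correction for ties.
    tie_term = sum(t**3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation identical: no evidence of a shift
        return u_x, 0.0, 1.0
    z = (u_x - n1 * n2 / 2) / sqrt(variance)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, z, p


# Made-up grades for illustration only; these are NOT the study data.
cohort_1 = [GRADE_RANK[g] for g in ["A", "A", "A", "B"]]
cohort_2 = [GRADE_RANK[g] for g in ["A", "B", "B", "C"]]
u, z, p = mann_whitney_u(cohort_1, cohort_2)
```

A positive z here means the first list ranks higher on average. With real data, the same call would be made once per course, and heavy ties (most grades being A or B) are exactly why the tie correction matters.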
Results and Discussion
The total number of participants for cohort II was 672, with variations in individual courses due to differences in enrollment caused by curricular sequencing. The total number of participants for cohort I was 184, again with variations in individual courses caused by curricular sequencing. Increases in enrollment beginning in 2009 led to the differences in student numbers.
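The a priori sample-size requirement reported in the methods can be roughly cross-checked by hand. The sketch below is an approximation under stated assumptions, not a reproduction of G*Power: it sizes a 2-group t test with the normal approximation and then inflates the result by the asymptotic relative efficiency of the Mann-Whitney test under a normal parent distribution (3/π). The source does not state which G*Power options were used, and the function name `approx_total_n` is hypothetical.

```python
from math import ceil, pi
from statistics import NormalDist


def approx_total_n(effect_size=0.50, alpha=0.05, power=0.80, are=3 / pi):
    """Approximate total N for a 2-tailed, 2-group Mann-Whitney comparison.

    Assumes equal group sizes. `are` is the assumed efficiency of the
    Mann-Whitney test relative to the t test (3/pi for normal data).
    """
    z = NormalDist().inv_cdf
    # Per-group n for a t test, normal approximation.
    n_per_group_t = 2 * (z(1 - alpha / 2) + z(power)) ** 2 / effect_size**2
    # Inflate for the rank-based test and round up to whole participants.
    n_per_group_mw = ceil(n_per_group_t / are)
    return 2 * n_per_group_mw


total_n = approx_total_n()
```

With the inputs stated in the methods, this approximation returns 132, close to the reported minimum of 134. A normal approximation typically runs slightly below an exact noncentral-t calculation, which may account for the small gap.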
The results of the Mann-Whitney U procedure show statistically significantly lower grades for cohort II when compared with cohort I in the Adult I and Maternity courses, suggesting some reduction in grade inflation but not to the extent anticipated (Table). In Adult Health I, rubrics to provide feedback on performance (direct care and clinical judgment) and written work (care plans) were instituted in 2009. Creation, piloting, and refinement of the rubrics promoted discussion and consensus on exactly what was to be evaluated. Concerns about recency have decreased as each clinical experience counts equally toward the final grade. Objectively graded items such as pharmacology and skills quizzes were added and accounted for 35% of the final clinical grade. Reliability of these quizzes was assessed through item analysis, and those questions with poor performance were revised or eliminated. A simulation in patient safety assessment was initiated in 2011. Although it is currently a pass/fail experience, extremes in performance are factored into the clinical grade. Lastly, whenever possible, the course was staffed so that students would have the same faculty all semester.
In Maternity, rubrics for evaluating performance and written work were implemented as described previously beginning in 2012. Ungraded simulation experiences were added in 2012 with grading of student performance starting in 2014. While mean clinical grades were still high, the significant downward trend might be evidence that these interventions were beginning to improve precision in grading. On the other hand, it is possible that this trend is statistical regression toward the mean. Constancy in location and clinical faculty in this course is difficult, given the desire to provide student experiences that include prenatal, labor and delivery, postpartum, and newborn care and the limited number of faculty who can teach across all these areas.
Both Adult Health II and Mental Health showed trends toward reduced grade inflation. Using rubrics to evaluate performance and written work, counting all clinical experiences in the final grade, and incorporating quizzes and simulations began in 2009. Reliability of these quizzes was assessed as previously described. However, these courses did not integrate the principle of constancy in location and clinical faculty because faculty believed that breadth of experience was more important despite possible fragmentation in learning.
Pediatrics clinical grades showed essentially no change despite the implementation of rubrics and simulations. Faced with challenges related to constancy in location and clinical faculty as well as low patient census, this course increased the number and complexity of simulations and began grading them in 2014. However, it may be too soon to see any effect on grade inflation.
Despite reduction in grade inflation in some courses, continued effort is needed. To improve precision in clinical evaluation, all course teams have increased the frequency of team meetings, added discussions about how to use grading rubrics within each course, and expanded orientation of new full- and part-time faculty to their role in evaluation. Finally, a structured month-long teacher preparation workshop with simulated clinical teaching encounters and year-long mentorship were initiated. Attendance was strongly recommended but was not compulsory.40
The war on grade inflation in clinical courses is not over. Precision in grading can be improved through adherence to the following principles of evaluation:
- Agree on what is being evaluated. Clear, criterion-referenced standards for clinical performance are needed, written in language that faculty and students understand.41 Derived from course objectives, they should specify essential behaviors and show how expectations increase in sophistication across a curriculum. Competency-based, developmental outcomes, similar to the Milestones-Based Evaluation System, could be developed by specialty and demonstrate student progression from program start to finish.42 Faculty should resist the temptation to evaluate students on every aspect of care every week.1 Clinical experiences, no matter where they take place, can be divided into discrete, targeted experiences with specific objectives for each. Evaluation of student performance would be limited to the identified weekly objectives. If faculty desire an evaluation where students “put it all together,” a simulated experience is recommended.
- Reduce rater error. Education is needed to ensure that faculty have a robust theoretical foundation in the principles of clinical evaluation. A graduate-level course in assessment is advised for all full-time nurse educators.43 Expert clinicians recruited into part-time teaching roles may be ill-equipped to evaluate student performance, so preparation for evaluation is also needed.40 Use of reliable and valid instruments is essential, yet most nursing programs continue to use rubrics that have not undergone appropriate testing.21 Several reliable and valid instruments are available to assess student performance in simulations44,45 and in vivo46 and should be used whenever possible. It is critical to provide adequate training for all faculty who will be using any instruments selected, as most errors in performance ratings can be attributed to evaluators.47
- Standardize experiences. Faculty also should determine the environment (low-fidelity laboratory, high-fidelity simulation, SPs, in vivo) and the sequence most appropriate for achieving desired learning outcomes.1 For example, the skill of insulin injection might be mastered through practice in a low-fidelity laboratory setting and applied in a high-fidelity simulation requiring clinical judgment such as adjusting insulin dose based on a blood glucose report. Finally, in vivo, both skill and judgment would be used in retrieval of a blood glucose report and selection and administration of an appropriate insulin dose based on provider orders and laboratory results. Simulation is especially useful early in a semester for practice and repetition until students have developed confidence and competence. The ability to control simulations also provides standardization for evaluation, for example, same environment and same degree of difficulty.13
- Provide repetition and constancy. Deliberate structuring of clinical learning is needed to provide opportunities for repeat experiences and, whenever possible, a constant clinical location and same faculty. We also suggest focusing on common health issues rather than whatever experiences are available.1 For the most effective learning, students need 2 kinds of repetition: the opportunity to do the same (or similar) things over and over and repeated exposure to similar clinical situations, with higher performance expectations as they progress in course work.1,48 To accomplish this and improve internalization, for each specialty faculty could select the top 5 to 8 medical conditions where student experiences (laboratory, simulation, in vivo) would be guaranteed and documented in an electronic portfolio. Repeat experiences would be planned to focus on the 2 kinds of repetition. By program’s end, students would have had repeated experiences with common health issues and at increasing levels of complexity. Classroom learning about less common disorders would continue, but clinical learning would focus on the commonplace rather than the unusual. To improve constancy in clinical location and instructors, dedicated education units where staff nurses serve as clinical instructors for individual students over a period could be developed or used if available, adding further opportunities for repeat experiences.49
The data from this study suggest that grade inflation has been reduced in some clinical nursing courses at a single university after deliberately applying certain principles of evaluation and strategies. Although this study examined a larger data set collected over time, limitations include a single site, retrospective descriptive design, and the possible inequality of groups. However, the average preadmission grade point average (GPA) for cohort I was 3.094 as compared with 3.559 for cohort II. If preadmission GPA was a “true” measure of ability, grades in nursing courses should have been higher for cohort II. These data cause the authors to wonder if there is growing grade inflation in liberal arts and science prerequisite courses as noted by others.7 Another limitation was the lack of standardized faculty training across the curriculum in the use of grading rubrics, although discussion about how to use grading rubrics has occurred on a course-by-course basis. Finally, the teacher-prep workshop was voluntary rather than mandatory.
Multisite studies are needed to determine the extent to which clinical grade inflation exists and if similar patterns are seen at other institutions. Instrumentation studies and investigation of factors that influence quality of assessment are also needed. Experimental designs testing proposed solutions and the effectiveness of various strategies to improve performance of assessors would add to our understanding as to which principles of evaluation and which specific interventions improve precision in clinical grading. Professional excellence demands that we continue the quest to ensure that grades awarded truly reflect student ability.
1. Benner P, Sutphen M, Leonard V, Day L. Educating Nurses: A Call for Radical Transformation. Stanford, CA: Jossey-Bass; 2010.
2. Walsh CM, Seldomridge LA. Clinical grades: upward bound. J Nurs Educ
3. Kostal JW, Kuncel NR, Sackett PR. Grade inflation marches on: grade increases from the 1990s to 2000s. Educ Meas Issues Prac
4. McKenzie RB, Staaf RJ. An Economic Theory of Learning: Student Sovereignty and Academic Freedom. Blacksburg, VA: University Publications; 1974.
5. Speer AJ, Solomon DJ, Fincher RM. Grade inflation in internal medicine clerkships: results of a national survey. Teach Learn Med
6. Mullen R. Indicators of grade inflation. AIR 1995 Annual Forum Paper. Eric # ED386970. 1995;3-18. Available at http://eric.ed.gov/?id=ED386970. Published May 1995. Accessed June 4, 2017.
7. Rojstaczer S. Grade Inflation at American Colleges and Universities. 2002-2013. Available at www.gradeinflation.com. Published March 29, 2016. Accessed May 20, 2017.
8. Donaldson JH, Gray M. Systematic review of grading practice: is there evidence of grade inflation? Nurse Educ Pract
9. Isaacson JJ, Stacy AS. Rubrics for clinical evaluation: objectifying competence-based assessments. Research Matters
11. Franz IW. Grade inflation under the threat of students’ nuisance: theory and evidence. Econ Educ Rev
12. Oermann MH, Kardong-Edgren S, Rizzolo MA. Summative simulated-based assessment in nursing programs. J Nurs Educ
13. Cacamese SM, Elnicki M, Speer AJ. Grade inflation and the internal medicine subinternship: a national survey of clerkship directors. Teach Learn Med
14. Weaver CS, Humbert AJ, Besinger BR, Graber JA, Brizendine EJ. A more explicit grading scale decreases grade inflation in a clinical clerkship. Acad Emerg Med
15. Alsop R. The ‘Trophy Kids’ Go To Work. Stanford, CA: Jossey-Bass; 2008.
16. Cain J, Romanelli F, Smith K. Academic entitlement in pharmacy education. Am J Pharm Educ
17. Gentry J. Radical change in faculty and student evaluation: a justifiable heresy? Admin Issues J
19. Fletcher P. Clinical competence examination: improvement of validity and reliability. Int J Osteopath Med
20. Walsh CM, Seldomridge LA, Badros KK. Developing a practical evaluation tool for preceptor use. Nurse Educ
21. Oermann MH, Saewert KJ, Charasika M, Yarbrough SS. Assessment and grading practices in schools of nursing: national survey findings part I. Nurs Educ Perspect
22. Larocque S, Luhanga F. Exploring the issue of failure to fail in a nursing program. Int J Nurs Educ Scholarsh
23. Miller G. Grade inflation, gate keeping, and social work education: ethics and perils. J Soc Work Values Ethics
24. Lowe SK, Borstorff PC, Landry RJ. An empirical examination of the phenomenon of grade inflation in higher education: a focus of grade divergence between business and other fields of study. Acad Educ Leadersh J
25. Docherty A, Dieckmann N. Is there evidence of failing to fail in our schools of nursing? Nurs Educ Perspect
26. Sowbel LR. Gatekeeping in field performance: is grade inflation a given? J Soc Work Educ
27. Jervis A, Tilki M. Why are nurse mentors failing to fail student nurses who do not meet clinical performance standards? Br J Nurs
28. Paskausky AL, Simonelli MC. Measuring grade inflation: a clinical grade discrepancy score. Nurse Educ Pract
29. King-Jones M, Mitchell A. Grade inflation: a problem in nursing? Creat Nurs
30. Susmarini D, Hayati Y. Grade inflation in clinical stage. Am J Health Sci
31. White KA, Heitzler E. Effect of increased evaluation objectivity on grade inflation: precise grading rubrics and rigorously developed tests [published online ahead of print July 14, 2017]. Nurse Educ
32. Shipman D, Roa M, Hooten J, Wang ZJ. Using the analytic rubric as an evaluation tool in nursing education: the positive and the negative. Nurse Educ Today
33. Oerlemans M, Dielissen P, Timmerman A, et al. Should we assess clinical performance in single patient encounters or consistent behaviors of clinical performance over a series of encounters? A qualitative exploration of narrative trainee profiles. Med Teach
35. Patton SK. A pilot study to evaluate consistency among raters of a clinical simulation. Nurs Educ Perspect
36. McWilliam PL, Botwinski CA. Identifying strengths and weaknesses in the utilization of objective structured clinical examination (OSCE) in a nursing program. Nurs Educ Perspect
37. Plichta S, Kelvin E. Munro’s Statistical Methods for Health Care Research. 6th ed. Philadelphia, PA: Wolters Kluwer; 2012.
38. Faul F, Erdfelder E, Lang AG, Buchner A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods
40. Hinderer KA, Jarosinski JM, Seldomridge LA, Reid TP. From expert clinician to nurse educator: outcomes of a faculty academy initiative. Nurse Educ
41. O’Halloran KB, Gordon ME. A synergistic approach to turning the tide of grade inflation. High Educ
42. Kuo LE, Hoffman RL, Morris JB, et al. A milestone-based evaluation system: the cure for grade inflation. J Surg Educ
45. Mikasa A, Cicero T, Adamson K. Outcome-based evaluation tool to evaluate student performance in high fidelity simulation. Clin Simul Nurs
46. Lasater K. Clinical judgment development: using simulation to create an assessment rubric. J Nurs Educ
47. Pangaro L, Holmboe ES. Evaluation forms and global rating scales. In: Holmboe ES, Hawkins RE, eds. Practical Guide to the Evaluation of Clinical Competence. Philadelphia, PA: Mosby; 2008:24–41.
48. Crookes K, Crookes PA, Walsh K. Meaningful and engaging teaching techniques for student nurses: a literature review. Nurse Educ Pract
49. Polvado KJ, Sportsman S, Bradshaw P. Pilot study of a dedicated education unit: lessons learned. J Nurs Educ Pract