
The Trouble with Using Provider Assessments for Rating Clinical Performance: It’s a Matter of Bias

McCarthy, Robert J. PharmD; De Oliveira, Gildasio S. MD, MSCI

doi: 10.1213/ANE.0000000000000593
Editorials: Editorial

From the Department of Anesthesiology, Northwestern University Feinberg School of Medicine, Chicago, Illinois.

Accepted for publication November 6, 2014.

Funding: None.

The authors declare no conflicts of interest.

Reprints will not be available from the authors.

Address correspondence to Robert J. McCarthy, PharmD, Department of Anesthesiology, Northwestern University Feinberg School of Medicine, 251 E. Huron St., F 5–704, Chicago, IL 60611. Address e-mail to

The International Association for the Study of Pain has referred to pain as the fifth vital sign, and acute pain management after surgery has been shown to be a key factor in quality of recovery. In addition, the establishment of pain management benchmarks by the Joint Commission on the Accreditation of Healthcare Organizations more than a decade ago has resulted in a greater awareness of patients’ right to optimal pain control by health care practitioners and administrators. Postoperative pain control has become a priority for hospitals across the United States.

Optimization of postoperative pain management has been demonstrated through the implementation of protocols that include multimodal analgesic regimens, surgical-specific treatment pathways, a 24-hour anesthesiology pain service, and pain-specific training for physicians and nurses involved in postoperative care.1 Importantly, pain as assessed by the numeric rating scale (NRS), for which 0 = no pain and 10 = maximal pain, has been shown to be significantly reduced after the implementation of a postoperative analgesia protocol. These data suggest that NRS pain scores might be a useful metric to evaluate the quality of anesthesia care provided after the transition from surgery to recovery.

Wanderer et al.2 from Vanderbilt University have applied this principle in a research report in the current issue of Anesthesia & Analgesia. They attempted to use rank ordering by initial postanesthesia recovery unit NRS pain scores, as collected by nurses in a clinical setting, to compare supervising anesthesiologists after adjustment for confounding factors. The analysis included 26 680 cases using electronic documentation and excluded physician and nurse providers who had not cared for ≥100 patients. When admission postanesthesia recovery unit scores were compared among anesthesiologists after adjusting for patient and surgical factors, only 6.4% of the 69 supervising anesthesiologists were found to differ in median postanesthesia recovery unit admission pain scores. This finding clearly demonstrates that, as presently assessed, initial postanesthesia recovery scores are a poor metric for identifying differences among supervising anesthesiologists, and the authors correctly concluded that they should not be included in the assessment of anesthesia care performance.

Interestingly, the study also found that 16 of 66 (24%) recovery room nurses elicited median pain scores lower than the median value for the entire group, and 33 nurses (50%) elicited median pain scores significantly higher than the median for the entire group. These differences translated into a range of odds ratios from 0.16 (95% confidence interval, 0.11–2.4) for the lowest to 2.95 (95% confidence interval, 2.43–3.59) for the highest nurse compared with the nurse who ranked at the median value for the overall group. In fact, pain assessments using the 0 to 10 NRS were found to depend more on the nurse making the assessment than on patient age, gender and race, preoperative use of opioids, American Society of Anesthesiologists physical status, or procedure. This finding should not be interpreted to suggest dishonest recording of NRS values by nurses (86%–90% of nurses accurately record pain ratings provided by patients), but rather to indicate that personal opinions, knowledge, and attitudes toward pain strongly influence assessments and management.3

Wanderer et al. discuss the use of nonstandard item descriptors rather than “no pain” equals 0 and “the worst possible pain” equals 10 as anchors for the NRS pain scales as a possible explanation for the variability in the nurse assessor–recorded NRS. They cited a recent systematic review that identified 24 distinct anchoring phrases in 54 included studies.4 The use of anchors during decision making can create a form of cognitive bias because individuals tend to rely heavily on these points when making decision adjustments. Factors that have been shown to affect the influence of anchoring on decision making include happier moods and conscientious personalities, certainly desirable attributes in postanesthesia care nurses. These are factors that patients are likely to perceive, and substituting anchors could clearly influence the perceived value reported by patients.5

The way the evaluator presents the NRS score range can influence the choice made by the decision maker. This is called the framing effect and is another type of cognitive bias.6 The presenter in this situation is referred to as the choice architect. This practice is not uncommon when using Likert scales because the differences between scores in the range are not necessarily interpreted as equidistant by respondents, and using descriptors to define ranges of values can assist with defining the meaning of individual values. Wanderer et al. discuss the substitution of “worst pain” with the phrase “as in childbirth” to encourage the patient to lower his/her reported NRS score from 10 to 5. Other framing descriptions presented by the assessor, such as mild (0–3), moderate (4–7), and severe (8–10) pain, may also influence the patient’s decision.

In addition to the problems identified with anchor points and presentation of the NRS scales, assessments by nurses of postsurgical pain frequently depend on physical cues such as level of alertness and facial expressions.7 This effect is demonstrated in the study by Wanderer et al. in the analysis comparing NRS >0 with NRS = 0. The increase in the strength of the recovery room nurse effect in the mixed-effects model suggests that there are substantial differences in handling the report of no pain. Some nurses likely record NRS = 0 if the patient is somnolent or nonresponsive, whereas other nurses likely stimulate patients to obtain a response. The effect of this assessment is substantial in the recovery room, where a large proportion of patients are likely to be sleeping on arrival. Physical cues such as a sleeping patient can easily give rise to an illusory correlation, which is a type of confirmation bias. This type of selective perception can lead to poor decision making because the belief of the assessor can be maintained or even reinforced in the face of contradictory information, such as a smiling patient in no apparent distress reporting a high NRS or a grimacing patient reporting no pain.

In a study of a surgical department, nurse investigators found that knowledge and attitudes toward postoperative pain management were dependent on the types of postsurgical patients.8 The authors concluded that knowledge, attitudes, and possibly biases regarding postoperative pain varied as a function of the surgical area in which the nurses worked. The projection by the nurse of the anticipated pain level of the patient is an expectation bias when there is an a priori response that the assessor has determined should be present. When the outcome matches the assessor’s expectation, there is subjective validation of the belief, reinforcing the expectation for future assessments. When the assessment does not match the assessor’s expected response, additional input may be collected in an effort to confirm the expectation. The decision to accept or reject the information can also be affected by outcome bias because care providers generally evaluate a single assessment with respect to the eventual outcome, a pain-free recovered patient, rather than on the specific decision at the time it was made. Nurses’ pain assessments have been demonstrated to vary as a function of educational level and experience.7 Although experience may reduce the effect of anchoring or framing on decision making, even experts’ decisions have been demonstrated to be affected by biases. Nurses’ perceptions of postoperative pain influence their assessment and management of pain more than education level or amount of professional experience.3

One of the strengths of the study by Wanderer et al. is that it used preexisting data, and the nurses making the observations were likely unaware of the intent to use the data for clinician performance assessment. Although it is likely that NRS-specific assessment training would have reduced between-nurse variability, the effect of a single educational intervention has been shown to be temporary without reinforcement.7 There would be problems with developing a performance assessment model after training and disclosure of intent because this could introduce additional biases, such as the halo effect or the reverse halo effect, arising from the nurses’ perception of the clinicians whom the rating would affect.

The findings of this study also have important implications beyond the assessment of anesthesiologists’ performance. Clinical studies frequently rely on nursing assessment of pain in the postanesthesia care unit for evaluating efficacy and directing postoperative pain management. Based on the findings of the current investigation, validation of assessments by trained research study personnel who are blinded to study group and not involved in clinical care should probably be considered when pain assessment is a critical study outcome. For consistency, these personnel should also be instructed on how to report NRS values in sleeping patients. In retrospective evaluations of clinical data, assessor bias should be considered when modeling responses based on pain assessments in the recovery room, or at least disclosed as a limitation of the study design.

We believe that the authors should be congratulated for demonstrating that what appears to be a simple method to objectify a complex process, NRS pain scores, is principally dependent on the nurse making the assessment. The effect of nurse assessment of NRS could adversely influence clinician performance ranking if an anesthesia provider were disproportionately paired with a nurse whose initial NRS pain assessments fall at either extreme compared with the median. Examining the underlying association between confounding variables is often overlooked when developing explanatory models.
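The pairing concern above can be illustrated with a small sketch. It is a hypothetical toy example, not the model or data used by Wanderer et al.: every patient is assumed to have the same true pain score, the three nurse offsets are invented, and the names `TRUE_PAIN`, `NURSE_OFFSET`, and `median_by_anesthesiologist` are introduced here only for illustration.

```python
from statistics import median

# Hypothetical setup: every patient has the same true pain (4), so any
# difference between anesthesiologists comes only from who records the score.
TRUE_PAIN = 4
NURSE_OFFSET = {"low_scorer": -2, "typical": 0, "high_scorer": 2}  # assumed biases


def recorded_score(nurse: str) -> int:
    """Score a nurse records, clamped to the 0-10 NRS range."""
    return max(0, min(10, TRUE_PAIN + NURSE_OFFSET[nurse]))


def median_by_anesthesiologist(cases):
    """cases: iterable of (anesthesiologist, nurse) pairs, one per patient."""
    scores = {}
    for anesthesiologist, nurse in cases:
        scores.setdefault(anesthesiologist, []).append(recorded_score(nurse))
    return {name: median(vals) for name, vals in scores.items()}


nurses = list(NURSE_OFFSET)

# Balanced pairing: each anesthesiologist's patients are spread evenly over nurses.
balanced = [(a, n) for a in ("A", "B") for n in nurses for _ in range(10)]

# Unbalanced pairing: A's patients mostly go to the low scorer, B's to the high scorer.
unbalanced = (
    [("A", "low_scorer")] * 20 + [("A", "typical")] * 10
    + [("B", "high_scorer")] * 20 + [("B", "typical")] * 10
)

balanced_medians = median_by_anesthesiologist(balanced)
unbalanced_medians = median_by_anesthesiologist(unbalanced)
print(balanced_medians, unbalanced_medians)
```

Under these assumptions, the two anesthesiologists have identical medians when nurses are evenly distributed, but their medians diverge when the pairing is unbalanced, even though the patients’ true pain is the same in both scenarios.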



Name: Robert J. McCarthy, PharmD.

Contribution: This author helped write the manuscript.

Attestation: Robert J. McCarthy approved the final manuscript.

Name: Gildasio S. De Oliveira, MD, MSCI.

Contribution: This author helped write the manuscript.

Attestation: Gildasio S. De Oliveira approved the final manuscript.

This manuscript was handled by: Franklin Dexter, MD, PhD.



1. Usichenko TI, Röttenbacher I, Kohlmann T, Jülich A, Lange J, Mustea A, Engel G, Wendt M. Implementation of the quality management system improves postoperative pain treatment: a prospective pre-/post-interventional questionnaire study. Br J Anaesth. 2013;110:87–95
2. Wanderer JP, Shi Y, Schildcrout JS, Ehrenfeld JM, Epstein RH. Supervising anesthesiologists cannot be effectively compared according to their patients’ postanesthesia care unit admission pain scores. Anesth Analg. 2015;120:923–32
3. McCaffery M, Ferrell BR, Pasero C. Nurses’ personal opinions about patients’ pain and their effect on recorded assessments and titration of opioid doses. Pain Manag Nurs. 2000;1:79–87
4. Hjermstad MJ, Fayers PM, Haugen DF, Caraceni A, Hanks GW, Loge JH, Fainsinger R, Aass N, Kaasa S; European Palliative Care Research Collaborative (EPCRC). Studies comparing Numerical Rating Scales, Verbal Rating Scales, and Visual Analogue Scales for assessment of pain intensity in adults: a systematic literature review. J Pain Symptom Manage. 2011;41:1073–93
5. Furnham A, Boo HC, McClelland A. Individual differences and the susceptibility to the influence of anchoring cues. J Individual Differences. 2012;32:89–93
6. Strough J, Karns TE, Schlosnagle L. Decision-making heuristics and biases across the life span. Ann N Y Acad Sci. 2011;1235:57–74
7. Duke G, Haas BK, Yarbrough S, Northam S. Pain management knowledge and attitudes of baccalaureate nursing students and faculty. Pain Manag Nurs. 2013;14:11–9
8. Kiekkas P, Gardeli P, Bakalis N, Stefanopoulos N, Adamopoulou K, Avdulla C, Tzourala G, Konstantinou E. Predictors of nurses’ knowledge and attitudes toward postoperative pain in Greece. Pain Manag Nurs. 2014 [Epub ahead of print]
© 2015 International Anesthesia Research Society