Sepsis is a leading cause of mortality among hospitalized patients (1). Mortality from hospital-acquired sepsis is two to three times higher than that from sepsis present at admission (2, 3). Delayed recognition postpones life-saving interventions, increasing the risk of progression to shock, organ failure, and death (4). Many hospitals have developed electronic health record (EHR)-based sepsis surveillance and alert systems to improve early detection and intervention (5).
The first surveillance tools targeted detection of the systemic inflammatory response syndrome (SIRS). These detection systems showed good diagnostic accuracy and prompted more frequent and more timely diagnostic testing, as well as escalation of care (6–12). Our group previously developed an automated SIRS-based sepsis detection tool (Early Warning System [EWS] 1.0) that resulted in a nonsignificant trend toward reduced mortality (9).
Our group and others have more recently developed predictive tools to identify high-risk patients before clinical criteria are apparent (13–21). Using machine learning (ML) algorithms, large patient datasets can be mined for novel clinical patterns and characteristics predictive of clinical decompensation (18, 22, 23). ML algorithms to predict septic shock in ICU patients have demonstrated good predictive accuracy in retrospective validation (22, 24, 25), although few have reported prospective implementation outcomes. One small nonacademic hospital reported improved sepsis-related mortality (26), and a small randomized trial demonstrated decreased length of stay and improved mortality in ICU patients (23).
To our knowledge, our group is the first to evaluate large-scale prospective implementation of an ML-based sepsis prediction tool (EWS 2.0) in non-ICU patients. Algorithm validation suggested excellent predictive characteristics for severe sepsis and septic shock, with a positive likelihood ratio of 13 (27). We linked EWS 2.0 to predictive alerts deployed to clinical care teams on non-ICU inpatient services and performed a prospective preimplementation and postimplementation analysis of its impact on clinical care processes and patient outcomes (27).
In addition to good algorithm performance, stakeholder acceptance of clinical decision support systems is crucial for sepsis care improvement. We previously reported that a minority of providers perceived our earlier sepsis detection system, EWS 1.0, to be helpful despite observed changes in management resulting in increases in early sepsis care and documentation (28). We postulated that acceptance was limited by alert fatigue. Provider acceptance of ML algorithm prediction tools may be further limited by their complexity and lack of transparency (28). This study describes clinician perceptions of our predictive ML-based EWS 2.0 deployed prospectively across our healthcare system.
MATERIALS AND METHODS
Study Design, Setting, and Subjects
This was a prospective observational study. The EWS 2.0 alert was deployed throughout our multihospital academic healthcare system, the same study site as EWS 1.0. This study was conducted in our flagship 782-bed academic hospital given the higher volume of alerts at this location and on-site availability of the research team. Study subjects were bedside clinicians caring for patients who triggered the alert, including registered nurses (nurses) and physicians or advanced practitioners (providers). This project was reviewed and determined to qualify as Quality Improvement by the University of Pennsylvania’s Institutional Review Board.
To create EWS 2.0, we trained an ML algorithm to predict severe sepsis or septic shock. The algorithm was developed using a random forest classifier trained on EHR data from adult patients discharged from July 2011 to June 2014 from any of our three urban acute care hospitals (n = 162,212). Positive cases (n = 943) were defined as having an International Classification of Diseases, 9th Edition, code of 995.92 (Severe Sepsis) or 785.52 (Septic Shock) associated with their encounter, with a positive blood culture and either a lactate greater than 2.2 mmol/L or systolic blood pressure less than 90 mm Hg. The earliest of these events was labeled as “time zero” of sepsis onset. Only non-ICU patients were included in the derivation population.
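The case definition above can be read as a simple rule. The following Python sketch is one interpretation under stated assumptions, not the study's actual labeling code: the `Encounter` record and its field names are hypothetical, and the rule assumes that all three criteria must hold (a qualifying diagnosis code, a positive blood culture, and either the lactate or the systolic blood pressure threshold).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# ICD-9 codes named in the text: 995.92 (Severe Sepsis), 785.52 (Septic Shock).
SEPSIS_ICD9_CODES = frozenset({"995.92", "785.52"})


@dataclass(frozen=True)
class Encounter:
    """Hypothetical per-encounter summary; field names are illustrative only."""
    icd9_codes: FrozenSet[str]
    positive_blood_culture: bool
    max_lactate_mmol_l: Optional[float]     # None if never measured
    min_systolic_bp_mm_hg: Optional[float]  # None if never measured


def is_positive_case(enc: Encounter) -> bool:
    """Assumed reading: code AND positive culture AND (lactate > 2.2 OR SBP < 90)."""
    has_code = bool(SEPSIS_ICD9_CODES & enc.icd9_codes)
    high_lactate = enc.max_lactate_mmol_l is not None and enc.max_lactate_mmol_l > 2.2
    low_sbp = enc.min_systolic_bp_mm_hg is not None and enc.min_systolic_bp_mm_hg < 90
    return has_code and enc.positive_blood_culture and (high_lactate or low_sbp)
```

Both thresholds are strict inequalities, so under this reading a lactate of exactly 2.2 mmol/L or a systolic pressure of exactly 90 mm Hg does not qualify on its own.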
The algorithm’s sensitivity threshold was set to generate approximately 10 alerts across the hospital system per day, with the goal of generating a feasible alert response workload and minimizing false positives that would exacerbate alert fatigue and erode alert confidence. This target daily alert rate was determined a priori and informed by our experience with EWS 1.0, which, based on a threshold determined to capture the patients most likely to decompensate, generated about six alerts per day (9). We retrospectively validated the algorithm on hospitalized patients from October to December 2015 (n = 10,448; screen positive = 314). The positive likelihood ratio for “severe sepsis or septic shock” was 13, with positive and negative predictive values of 29% and 97%, respectively.
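The validation statistics reported above all derive from a 2 × 2 table of alert status against outcome. The sketch below shows the standard formulas in Python; the function name is hypothetical, and the counts in the usage example are invented for illustration and are not the study's validation data.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard screening-test statistics from a 2x2 confusion table.

    tp/fp: alerted patients with/without the outcome.
    fn/tn: non-alerted patients with/without the outcome.
    Assumes every margin is nonzero (no division-by-zero handling).
    """
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    return {
        "sensitivity": sensitivity,
        "specificity": 1 - false_positive_rate,
        # Positive likelihood ratio: sensitivity / (1 - specificity)
        "lr_positive": sensitivity / false_positive_rate,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for illustration only (not the study's data).
example = screening_metrics(tp=30, fp=70, fn=20, tn=880)
```

Because predictive values depend on prevalence as well as on sensitivity and specificity, the same alert threshold can yield different positive and negative predictive values in populations with different rates of sepsis.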
Clinicians throughout our hospital system were educated about EWS 2.0 via informational emails before alert deployment. On June 16, 2016, EWS 2.0 was activated. When EWS 2.0 was triggered, an EHR-based alert was sent to the patients’ nurse, and a text message alert was sent to the patient’s provider as well as to a rapid response coordinator who ensured clinical teams received the alert and completed an immediate bedside assessment. Alerts stated that EWS 2.0 had fired for a given patient and included relevant recent laboratory data along with 48 hours of vital sign trends.
Survey Deployment and Administration
We created two web-based questionnaires to assess clinician perceptions of EWS 2.0 (Supplemental Digital Content 1, http://links.lww.com/CCM/E590; and Supplemental Digital Content 2, http://links.lww.com/CCM/E591). The surveys were adapted from a previous questionnaire used to evaluate perceptions of EWS 1.0 (28) and refined through an iterative process with feedback from an interdisciplinary team of critical care and general medicine attendings, medical residents, and nurses. The final questionnaires included categorical and Likert scale survey items with options for open-ended response. Questions were designed to assess clinicians’ perceptions regarding: 1) the patient’s condition; 2) new information discovered at the time of alert; 3) whether and how the alert changed management; and 4) whether and how the alert was useful and/or improved patient care.
Surveys were administered over 6 consecutive weeks (November 7, 2016, to December 19, 2016) 5 months after the EWS 2.0 alert was deployed across the healthcare system. For every alert triggered, a rapid response coordinator directed the covering nurse and provider to the first 16-item questionnaire (first survey) to be completed confidentially and independently within 6 hours of the alert.
Clinicians who completed the first survey were emailed a link to the 11-item second survey 48 hours after the initial alert. The second survey included a subset of questions from the first survey, with a focus on clinicians’ perceptions of patients’ clinical state and the alert’s utility and impact on care after 48 hours of clinical evolution. Up to two reminders were sent by email or text to nonresponders 12–24 hours after the initial second survey request. Completion of surveys was strictly voluntary.
Study data were collected and managed using Research Electronic Data Capture, a secure, web-based application designed to support data capture for research studies (29). To facilitate interpretation of Likert scale survey responses, grades 1 and 2 were grouped and considered as negative, grade 3 was considered neutral, and grades 4 and 5 were grouped and considered positive. Categorical questions included options for open-ended responses; these were reviewed for themes, and some were recoded to the appropriate corresponding categorical response groups. Results were calculated as percentages of total responses within each group (provider and nurse), and comparisons were made between clinician types using the chi-square test and two-tailed Fisher exact test, as appropriate. p values less than 0.05 were considered significant and are reported here.
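The response grouping and between-group comparison described above can be sketched as follows. This is a minimal Python illustration, not the study's analysis code: the function names are hypothetical, and the Fisher exact test shown uses the common two-sided definition that sums the probabilities of all tables no more likely than the observed one.

```python
from collections import Counter
from math import comb


def group_likert(grade: int) -> str:
    """Collapse a 1-5 Likert grade: 1-2 negative, 3 neutral, 4-5 positive."""
    if grade not in (1, 2, 3, 4, 5):
        raise ValueError(f"Likert grade must be 1-5, got {grade!r}")
    if grade <= 2:
        return "negative"
    if grade == 3:
        return "neutral"
    return "positive"


def percentages(grades):
    """Percentage of responses in each collapsed category within one clinician group."""
    counts = Counter(group_likert(g) for g in grades)
    total = sum(counts.values())
    return {k: 100 * counts[k] / total for k in ("negative", "neutral", "positive")}


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
```

In this sketch, each row of the 2 × 2 table would be a clinician type (provider or nurse) and each column a response category (for example, positive vs not positive).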
RESULTS
During the 6-week study period, 362 EWS 2.0 alerts were triggered, resulting in a median of eight alerts per day (interquartile range, 7–10; range, 4–15). For the 724 potential first survey responses (one each for a nurse and provider per alert), 287 first surveys were completed by 252 individual clinicians (overall response rate, 40%). Nurses completed 180 first surveys (50% response rate), and providers completed 107 first surveys (30% response rate). Of these, 43 nurses who completed a first survey completed a second survey (24% response rate) and 44 providers who completed a first survey completed a second survey (41% response rate), with an overall second survey response rate of 30%. Of these 77 respondents, 49 (64%, 33 providers, 16 nurses) reported sufficient continuity with the alerted patient to accurately complete the second survey.
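The response rates above follow from the reported counts. A short Python check, using only figures stated in this paragraph (the helper name `response_rate` is hypothetical):

```python
def response_rate(completed: int, eligible: int) -> int:
    """Response rate as a whole-number percentage."""
    return round(100 * completed / eligible)


alerts = 362
potential_first_surveys = 2 * alerts  # one nurse and one provider per alert

overall_first = response_rate(287, potential_first_surveys)  # 40
nurse_first = response_rate(180, alerts)                      # 50
provider_first = response_rate(107, alerts)                   # 30

# Second-survey rates use first-survey completers as the denominator.
nurse_second = response_rate(43, 180)     # 24
provider_second = response_rate(44, 107)  # 41
```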
Findings and Management at the Time of Alert
The alert and subsequent patient assessment infrequently provided new information (Table 1). Few clinicians (13% providers, 24% nurses) reported new clinical findings at the time of alert trigger (p = 0.03 for providers vs nurses). Perceptions of the presence of sepsis at the time of patient evaluation after alert were discrepant between providers (40%) and nurses (13%) (p < 0.001). In addition, following the alert, most clinicians remained unchanged in their impression that the patient would develop critical illness (62% providers, 55% nurses). At 48 hours, fewer clinicians in both groups believed that the patient was septic (26% providers, 6% nurses), when compared with their impressions within the first 6 hours after alert.
Sepsis was thought to be the primary driver of alert trigger in about one third of cases (40% providers, 21% nurses, p < 0.001), followed by dehydration (14% providers, 14% nurses) (Table 2). One tenth of providers (11%) and one fifth of nurses (21%) did not know why the alert triggered because they discovered no clinical change. Although providers’ impressions of sepsis driving alert trigger remained consistent over time (40% within 6 hr of alert, 39% at 48 hr after alert), nurses were less likely to attribute alert firing to sepsis at 48 hours (21% within 6 hr of alert, 0% at 48 hr after alert, p < 0.05) (Supplemental Table 1, Supplemental Digital Content 3, http://links.lww.com/CCM/E592).
Few providers (9%) but a third of nurses (30%) reported that the alert changed management (Table 3) (p < 0.001). Clinicians most commonly reported increased frequency of bedside rounding, followed by increased frequency of vital sign checks, and ordering of new diagnostic tests.
Overall impressions of EWS 2.0’s utility to clinical teams and impact on patient care were mixed (Figs. 1 and 2). Almost half of nurses (42%) but less than a fifth of providers (16%) found the alert helpful at 6 hours (p < 0.001). Although the proportion of providers finding the alert helpful nearly doubled by 48 hours (30%), this was not statistically significant (Supplemental Table 1, Supplemental Digital Content 3, http://links.lww.com/CCM/E592). The proportion of nurses finding the alert helpful or unhelpful remained stable over time (helpful: 42% at 6 hr, 44% at 48 hr; unhelpful: 22% at 6 hr, 31% at 48 hr). Nurses were more likely than providers to describe the alert as improving care, at both 6 hours (11% providers, 33% nurses, p < 0.001) and 48 hours (12% providers, 38% nurses, p = 0.05).
Of the 26 clinicians reporting helpful features of the alert, 73% cited improved team communication and 46% cited more frequent monitoring; fewer cited the prompting of diagnostic testing (23%) or interventions (2%). Of the 19 clinicians reporting unhelpful features, 37% cited triggering for known clinical abnormalities, 21% cited patients’ clinical stability, 16% each believed that the alert was a poor use of resources or that it fired too late, and 11% reported irrelevant clinical abnormalities. When asked for suggestions for improvements, clinicians most frequently requested transparency regarding factors leading to alert trigger (44% of 48 suggestions).
DISCUSSION
Nurses and providers frequently differed in their perceptions of alerted patients and EWS 2.0 in general. Nurses were less likely to think patients were septic; by 48 hours, none of the surveyed nurses attributed the alert to sepsis. Given that nurses are often the most proximal caregiver and may be the first to encounter signs of sepsis, this finding of differing sepsis assessments may reveal a crucial opportunity for improved sepsis awareness and highlights the potential importance of objective automated monitoring systems. Despite infrequent concerns for sepsis, nurses were more likely to report perceived changes in management and favorable overall impressions of the alert compared with providers, with nearly one third reporting changed management, half finding the alert helpful, and one third reporting improved care. Discrepancy in nurse and provider perceptions of EWS 2.0’s impact on care suggests that it conferred differential benefits and prognostic value to each group. Reported improved interdisciplinary communication may be particularly important given discrepant clinician impressions of sepsis risk in these patients.
EWS 2.0 was less favorably received than EWS 1.0. As previously reported, clinicians reported that EWS 1.0 changed management in about half of cases (56% nurses, 44% providers). Clinicians reported less frequent management changes with EWS 2.0 (30% nurses, 9% providers). Furthermore, although nurses’ impressions of the two systems were similar, providers more frequently reported that EWS 1.0 was helpful (40% nurses, 33% providers) and improved care (35% nurses, 24% providers), and less frequently reported that EWS 2.0 was helpful (42% nurses and 16% providers at 6 hr, 42% nurses and 32% providers at 48 hr) or improved care (33% nurses, 11% providers). The poorer perceptions of EWS 2.0 may reflect poor clinician acceptance of predictive alert systems more generally compared with alerts designed to detect clinical deterioration.
Although others have reported on the development and small-scale implementation of predictive alerts informed by ML algorithms, this is the first study to report on clinician perceptions of such tools. These results reveal potential barriers to positive clinical reception of EWS 2.0 including: 1) relative clinical stability of patients at the time of alert, 2) confidence in clinician judgment, 3) lack of transparency of the ML algorithm, and 4) uncertain response to alerts on high-risk patients who are not yet decompensating. These may be generalizable to other predictive alert systems, particularly those informed by ML algorithms.
Patients’ clinical stability at the time of alert may have contributed to poor confidence in the alerts’ clinical accuracy and relevance. As a predictive tool, EWS 2.0 triggered at a median of 5–6 hours, and in some cases several days before the onset of severe sepsis and septic shock. We suspect that many clinicians perceived EWS 2.0 as a traditional detection tool and dismissed its firing as erroneous or unhelpful when they discovered no evidence of clinical deterioration. The expectation of an immediate bedside evaluation likely contributed to this false perception that the alert was monitoring for decompensation requiring an immediate response. Although implementation campaigns may mitigate such misperceptions, optimal lead time of predictive alerts remains unclear.
Poor acceptance of EWS 2.0 may reflect little perceived added value to clinicians’ judgment given clinician confidence in their clinical reasoning and prognostication. Although EWS 2.0 demonstrated positive predictive values comparable to other widely accepted screening tools (30, 31), its ability to identify at-risk patients may not exceed that of clinicians. In fact, clinicians reported already suspecting sepsis in almost half of patients who triggered the alert. Although objective risk assessment through predictive alerts may help standardize otherwise subjective clinical impressions, clinicians may not find the alert helpful if they believe they already know which patients to monitor closely. The utility of such predictive alerts may thus be limited by a relatively small target population: high-risk patients not yet viewed as high risk by clinicians. Further studies are needed to identify the subset of patients most likely to benefit from predictive alerts.
Clinicians may find it difficult to trust alerts developed using complex algorithms. ML algorithms in particular have been described as “black box models” because the variables informing their prediction are often not explicit or easily available to the user (32). Because ML algorithms can incorporate hundreds of variables, the factors that contribute to a prediction may be too unwieldy to distill into a meaningful narrative for clinicians. Furthermore, because the ML process identifies important variables that may not have previously been associated with particular outcomes, predictions based on these variables may be less clinically intuitive. Although challenging, transparency in ML algorithm design and alert trigger may help to justify risk assessments from the clinicians’ perspective.
Lack of established action items to implement after an alert may have also contributed to the perceived lack of alert value. It is unclear what, if any, management changes clinicians should implement for high-risk patients “before” clinical onset of severe sepsis or septic shock. Although there are several interventions one might expect to improve sepsis outcomes if implemented prior to clinically apparent disease, including increased monitoring, there is a paucity of data regarding their efficacy and cost-effectiveness. Further research is needed to avoid increasing unnecessary cost, inappropriate testing, and poor antibiotic stewardship.
Although response rates for the first survey were comparable to those expected for web-based clinician surveys (33, 34), we cannot exclude nonresponder bias. Low response rates and limited continuity reported by clinicians at 48 hours may limit the interpretation of the second survey results. However, it is unclear in which direction nonresponder bias would influence our results because both satisfied and dissatisfied clinicians might respond more frequently.
To be most useful, systems to predict severe sepsis and septic shock will require an iterative development process informed by clinician perceptions. Thorough implementation campaigns are important to familiarize clinicians with the role and utility of predictive systems that are distinct from detection systems. Whether alerts are the optimal modality for communicating risk predictions remains in question. Rather than triggers for rapid response deployment, sepsis prediction systems may be most useful as longitudinal risk stratification tools to inform objective risk assessments during team handoffs and diagnostic decision-making. More research is needed to determine the most useful lead time and the most cost-effective and high impact interventions to deploy when patients are predicted to be high risk but do not yet have disease. Future predictive systems may be strengthened by screening for real-time sepsis-related orders from the EHR to more specifically target at-risk patients who would otherwise go undetected. In order to be trusted and adopted, ML predictive systems will need to be both accurate and interpretable (32). Interpretability will require transparency, ideally through interactive explanations and data visualization to translate the complex logic behind “black box models.”
CONCLUSIONS
Clinician perceptions of EWS 2.0 were mixed and, in general, poor. Despite excellent predictive characteristics, the EWS 2.0 alert infrequently provided new clinical information or changed management. Alerted patients’ relative clinical stability may have contributed to alert skepticism and uncertainty in response. Clinicians may find it difficult to trust complex predictive algorithms over their own clinical intuition if not provided with explanations to facilitate alert interpretation.
ACKNOWLEDGMENTS
We thank Neil O. Fishman, MD, William C. Hanson III, MD, Mark E. Mikkelsen, MD, and Joanne Resnic, MBA, BSN, RN, for their contributions to the design, testing, and implementation of the clinical decision support intervention examined in this article.
REFERENCES
1. Rhee C, Dantes R, Epstein L, et al.; CDC Prevention Epicenter Program: Incidence and trends of sepsis in US hospitals using clinical vs claims data, 2009-2014. JAMA 2017; 318:1241–1249
2. Jones SL, Ashton CM, Kiehne LB, et al. Outcomes and resource use of sepsis-associated stays by presence on admission, severity, and hospital type. Med Care 2016; 54:303–310
3. Levy MM, Rhodes A, Phillips GS, et al. Surviving sepsis campaign: Association between performance metrics and outcomes in a 7.5-year study. Intensive Care Med 2014; 40:1623–1633
4. Kumar A, Roberts D, Wood KE, et al. Duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock. Crit Care Med 2006; 34:1589–1596
5. Bhattacharjee P, Edelson DP, Churpek MM. Identifying patients with sepsis on the hospital wards. Chest 2017; 151:898–907
6. Buck KM. Developing an early sepsis alert program. J Nurs Care Qual 2014; 29:124–132
7. Palleschi MT, Sirianni S, O’Connor N, et al. An interprofessional process to improve early identification and treatment for sepsis. J Healthc Qual 2014; 36:23–31
8. Brandt BN, Gartner AB, Moncure M, et al. Identifying severe sepsis via electronic surveillance. Am J Med Qual 2015; 30:559–565
9. Umscheid CA, Betesh J, VanZandbergen C, et al. Development, implementation, and impact of an automated early warning and response system for sepsis. J Hosp Med 2015; 10:26–31
10. Amland RC, Hahn-Cover KE. Clinical decision support for early recognition of sepsis. Am J Med Qual 2016; 31:103–110
11. McRee L, Thanavaro JL, Moore K, et al. The impact of an electronic medical record surveillance program on outcomes for patients with sepsis. Heart Lung 2014; 43:546–549
12. Kurczewski L, Sweet M, Halbritter K, et al. Reduction in time to first action as a result of electronic alerts for early sepsis recognition. Crit Care Nurs Q 2015; 38:182–187
13. Hackmann G, Chen M, Chipara O, et al. Toward a two-tier clinical warning system for hospitalized patients. AMIA Annu Symp Proc 2011; 2011:511–519
14. Bailey TC, Chen Y, Mao Y, et al. A trial of a real-time alert for clinical deterioration in patients hospitalized on general medical wards. J Hosp Med 2013; 8:236–242
15. Kollef MH, Chen Y, Heard K, et al. A randomized trial of real-time automated clinical deterioration alerts sent to a rapid response team. J Hosp Med 2014; 9:424–429
16. Escobar GJ, LaGuardia JC, Turk BJ, et al. Early detection of impending physiologic deterioration among patients who are not in intensive care: Development of predictive models using data from an automated electronic medical record. J Hosp Med 2012; 7:388–395
17. Churpek MM, Yuen TC, Park SY, et al. Using electronic health record data to develop and validate a prediction model for adverse outcomes in the wards. Crit Care Med 2014; 42:841–848
18. Churpek MM, Yuen TC, Winslow C, et al. Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards. Crit Care Med 2016; 44:368–374
19. Escobar GJ, Ragins A, Scheirer P, et al. Nonelective rehospitalizations and postdischarge mortality: Predictive models suitable for use in real time. Med Care 2015; 53:916–923
20. Thiel SW, Rosini JM, Shannon W, et al. Early prediction of septic shock in hospitalized patients. J Hosp Med 2010; 5:19–25
21. Sawyer AM, Deal EN, Labelle AJ, et al. Implementation of a real-time computerized sepsis alert in nonintensive care unit patients. Crit Care Med 2011; 39:469–473
22. Henry KE, Hager DN, Pronovost PJ, et al. A targeted real-time early warning score (TREWScore) for septic shock. Sci Transl Med 2015; 7:299ra122
23. Shimabukuro DW, Barton CW, Feldman MD, et al. Effect of a machine learning-based severe sepsis prediction algorithm on patient survival and hospital length of stay: A randomised clinical trial. BMJ Open Respir Res 2017; 4:e000234
24. Calvert JS, Price DA, Chettipally UK, et al. A computational approach to early sepsis detection. Comput Biol Med 2016; 74:69–73
25. Desautels T, Calvert J, Hoffman J, et al. Prediction of sepsis in the intensive care unit with minimal electronic health record data: A machine learning approach. JMIR Med Inform 2016; 4:e28
26. McCoy A, Das R. Reducing patient mortality, length of stay and readmissions through machine learning-based sepsis prediction in the emergency department, intensive care unit and hospital floor units. BMJ Open Qual 2017; 6:e000158
27. Giannini HM, Ginestra JC, Chivers C, et al. A Machine Learning Algorithm to Predict Severe Sepsis and Septic Shock: Development, Implementation, and Impact on Clinical Practice. Crit Care Med 2019; 47:1485–1492
28. Guidi JL, Clark K, Upton MT, et al. Clinician perception of the effectiveness of an automated early warning and response system for sepsis in an academic medical center. Ann Am Thorac Soc 2015; 12:1514–1519
29. Harris PA, Taylor R, Thielke R, et al. Research electronic data capture (REDCap)–A metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform 2009; 42:377–381
30. Humphrey LL, Deffebach M, Pappas M, et al. Screening for lung cancer with low-dose computed tomography: A systematic review to update the US Preventive services task force recommendation. Ann Intern Med 2013; 159:411–420
31. Sprague BL, Arao RF, Miglioretti DL, et al. National performance benchmarks for modern diagnostic digital mammography: Update from the breast cancer surveillance consortium. Radiology 2017; 283:59–69
32. Cabitza F, Rasoini R, Gensini GF. Unintended consequences of machine learning in medicine. JAMA 2017; 318:517–518
33. Beebe TJ, Jacobson RM, Jenkins SM, et al. Testing the impact of mixed-mode designs (Mail and Web) and multiple contact attempts within mode (Mail or Web) on clinician survey response. Health Serv Res 2018; 53(Suppl 1):3070–3083
34. Cunningham CT, Quan H, Hemmelgarn B, et al. Exploring physician specialist response rates to web-based surveys. BMC Med Res Methodol 2015; 15:32
Key Words: early warning system; electronic medical record; machine learning; predictive medicine; septic shock; severe sepsis
Copyright © 2019 by the Society of Critical Care Medicine and Wolters Kluwer Health, Inc. All Rights Reserved.