Algorithm and Alert Implementation
Clinical Characteristics of Screen Positive Patients
Demographics of the total study population in the silent and alert periods were clinically similar, as were the characteristics of screen positive patients from each group (Table 1). EWS 2.0 triggered for 7.4% of admissions (n = 1,540) during the silent period and 7.1% of admissions (n = 2,137) during the alert period. During the silent period, the tool triggered a median of 6 hr:34 min (IQR, 0 hr:50 min to 53 hr:19 min) prior to the onset of severe sepsis or septic shock. This was similar to the alert period (median, 5 hr:25 min; IQR, 0 hr:45 min to 45 hr:0 min). Almost 60% of screen positive patients met two of four SIRS criteria at the time of alert, increasing to 84% by 48 hours post alert (Fig. 1). Only 11% of patients met study criteria for the outcomes of severe sepsis or septic shock at time of alert. By 48 hours after the alert, 30% of screen positive patients met study criteria for the outcomes of severe sepsis or septic shock. Screen positive patients demonstrated marked abnormalities in vital signs and laboratory data compared with those who did not trigger EWS 2.0 (Supplemental Fig. 1, Supplemental Digital Content 3, http://links.lww.com/CCM/E784; legend, Supplemental Digital Content 10, http://links.lww.com/CCM/E791).
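The SIRS screen referenced above counts how many of four consensus criteria a patient meets. A minimal sketch of that count (the function name and argument layout are ours; the thresholds are the standard consensus cutoffs, not the study's algorithm):

```python
def sirs_count(temp_c, hr, rr, paco2_mmhg, wbc_k_per_ul, bands_pct):
    """Count how many of the four SIRS criteria are met.

    Thresholds are the standard consensus cutoffs: temperature >38 or
    <36 degrees C, heart rate >90/min, respiratory rate >20/min or
    PaCO2 <32 mm Hg, and WBC >12 or <4 (x10^3/uL) or >10% bands.
    """
    return sum([
        temp_c > 38.0 or temp_c < 36.0,                       # temperature
        hr > 90,                                              # heart rate
        rr > 20 or (paco2_mmhg is not None and paco2_mmhg < 32),  # respiratory
        wbc_k_per_ul > 12.0 or wbc_k_per_ul < 4.0 or bands_pct > 10,  # WBC
    ])

# Illustrative patient: febrile and tachycardic, meeting 2 of 4 criteria.
n_criteria = sirs_count(38.5, 102, 18, None, 9.0, 2)
```

A patient meeting two or more criteria would satisfy the classic SIRS screen described in the text.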
The alert prompted a modest but statistically significant increase in lactate testing, administration of IV fluid boluses, and complete blood count or basic metabolic panel testing within 3 hours following the alert (Table 3). Increases in lactate testing and IV fluid bolus administration were sustained at 6 hours post alert, but only lactate testing remained significantly increased at the 48-hour mark (Supplemental Table 3, Supplemental Digital Content 4, http://links.lww.com/CCM/E785). Transfusion of packed RBCs was also significantly increased in the first 6 hours post alert (Supplemental Table 3, Supplemental Digital Content 4, http://links.lww.com/CCM/E785). Frequency of blood cultures or initiation of antibiotics did not significantly differ between the silent and alert periods (Supplemental Table 3, Supplemental Digital Content 4, http://links.lww.com/CCM/E785). Time to administration of broad-spectrum sepsis antibiotics also did not differ significantly (silent period: median 11 hr:12 min; IQR, 2 hr:27 min to 36 hr:34 min; alert period: median 9 hr:47 min, IQR, 2 hr:37 min to 35 hr:39 min; p = 0.59).
In the alert period, 27% of screen positive patients met criteria for Suspected Sepsis. Postalert increases in lactate testing were significant in both the Suspected and Unsuspected Sepsis groups (Supplemental Table 4, Supplemental Digital Content 5, http://links.lww.com/CCM/E786; and Supplemental Table 5, Supplemental Digital Content 6, http://links.lww.com/CCM/E787). Increases in IV fluid bolus administration, telemetry, and laboratory testing were primarily observed in Unsuspected Sepsis (Supplemental Table 5, Supplemental Digital Content 6, http://links.lww.com/CCM/E787). Antibiotic initiation did not significantly differ for either group.
Compared with screen positive patients during the silent period, screen positive patients during the alert period had a statistically significant decrease in time to ICU transfer, but no significant change in the frequency of ICU transfer or median length of stay in the ICU (Table 4). There were also no statistically significant differences in the development of severe sepsis or septic shock, all-cause mortality, or discharge disposition.
The observed decrease in median time to ICU transfer among patients in the alert group was primarily driven by the Unsuspected Sepsis cohort (24 hr [IQR, 3–117 hr] vs 8 hr [IQR, 2–73 hr]; p < 0.01) (Supplemental Table 6, Supplemental Digital Content 7, http://links.lww.com/CCM/E788; and Supplemental Table 7, Supplemental Digital Content 8, http://links.lww.com/CCM/E789). There was no significant change observed for the Suspected Sepsis cohort (Supplemental Table 6, Supplemental Digital Content 7, http://links.lww.com/CCM/E788). Neither cohort had postalert changes in frequency of ICU transfer, median length of stay in ICU, or mortality; however, we did observe increased frequency of discharge to inpatient hospice among patients with Suspected Sepsis at the time of the alert (3.0% vs 5.9%; p = 0.04) (Supplemental Table 6, Supplemental Digital Content 7, http://links.lww.com/CCM/E788).
We developed a ML algorithm to predict severe sepsis and septic shock and implemented the tool on non-ICU services across our multihospital healthcare system. Here, we confirmed the feasibility of widespread implementation of a ML predictive alert but observed a limited impact on clinical practice and outcomes.
To train our algorithm, we sought to identify patients with unequivocal sepsis physiology. Our selected Sepsis Training Criteria included hypotension and lactic acidosis as markers of impaired perfusion and shock (20) and a positive blood culture as a specific marker of infection. Although recent sepsis definitions do not include bacteremia, and in fact up to 50% of sepsis cases have no confirmed source of infection, we used narrower criteria to improve the specificity and predictive value of our resulting algorithm. SIRS criteria and clinical data related to end-organ dysfunction were heavily weighted in the algorithm, thus supporting our approach to algorithm development. However, this study’s results may be limited by our use of more specific sepsis definitions that have not been externally validated.
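The Sepsis Training Criteria described above combine three signals: hypotension and lactic acidosis as markers of impaired perfusion, plus a positive blood culture as a specific marker of infection. A hypothetical labeling rule in that spirit (the function name and the specific cutoffs below are our illustrative assumptions, not the study's exact criteria):

```python
def sepsis_training_label(sbp_mmhg, lactate_mmol_l, blood_culture_positive):
    """Hypothetical training label combining the three criteria types
    named in the text. The numeric cutoffs are assumed for illustration
    only; the study's actual thresholds are not reproduced here.
    """
    hypotension = sbp_mmhg < 90            # assumed hypotension cutoff
    lactic_acidosis = lactate_mmol_l >= 2.2  # assumed lactate cutoff
    return hypotension and lactic_acidosis and blood_culture_positive
```

Requiring all three conditions illustrates why such a label is narrower than consensus sepsis definitions: a bacteremic patient without shock physiology, or a shocked patient without a positive culture, would not be labeled positive.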
The resulting algorithm accurately identified hospitalized patients at risk for developing severe sepsis or septic shock, despite the inherent limitations of EHR data, which can be plagued by missingness, inaccuracies, and changes in practice patterns over time. Importantly, the sensitivity of the tool was limited to minimize alert fatigue given that hospital providers are estimated to receive greater than 50 EHR alerts on average per day (21), leading to providers declining, ignoring, or deferring a majority of the alerts they encounter (22). Our lower sensitivity resulted in higher specificity and an excellent positive likelihood ratio.
Ultimately, EWS 2.0 did not significantly improve our main outcome measures. We hypothesize that the alert’s impact on clinical processes and patient outcomes was limited by multiple factors, including a lack of prespecified interventions, limited alert format, long alert lead-times, and perhaps most importantly, minimal predictive value beyond predictions already made by the clinical teams.
Despite good predictive values, in many cases the postalert bedside evaluation resulted in minimal changes to clinical care or outcomes. Ambiguity may have arisen about how to manage patients with a positive screen but apparent clinical stability: before disease is clinically overt, the utility of further laboratory testing, fluid administration, and empiric antibiotics is unclear, which may have contributed to the low level of practice change following the alert. In addition, our study is limited in that we did not evaluate whether observed practice changes were appropriate for the clinical context of each patient.
We previously reported that with our prior alert system, EWS 1.0, clinical teams already strongly suspected sepsis in greater than 50% of cases (23). A survey administered to alerted providers and nurses during the alert period of this current study revealed a similar sentiment (24). However, in the subgroup analysis of patients who were not suspected of having sepsis at the time of alert, there was a statistically significant decrease in time to ICU transfer. On the other hand, there was an increase in referral to inpatient hospice for patients already suspected of sepsis. Thus, it appears that the algorithm may have provided information that supported either escalating care for those not suspected of having sepsis or adjusting overall goals of care for those initially suspected of having sepsis.
The format of our intervention, as a one-time alert, may have affected the alert’s impact on clinical care and outcomes. There may be multiple critical opportunities for clinical teams to integrate clinical information with a sepsis risk assessment. Yet, we did not require reevaluation of alerted patients at later time-points, even though in many cases our one-time alert triggered hours prior to the onset of sepsis physiology. For these cases, the lead-time of the alert and evaluation prior to clinically overt disease may have been too long. The question remains as to whether alerts are the most effective method of communicating real-time predictive information or whether a continuous score may more dynamically support clinical decision-making. Furthermore, although some clinical data were reported with the alerts, the variables and logic leading to alert trigger were not clearly delineated, creating what has been referred to as a ML “black box model” (25). This lack of transparency may have reduced overall trust in the algorithm and may have affected the clinician perception of the reliability of the prediction.
Future Algorithm Optimization
Considerations must also be made for optimizing algorithm design. Recent studies have shown that ML predictions in sepsis and critical care may be strengthened by incorporating free text from provider documentation using natural language processing (10, 26, 27). Importantly, we derived our algorithm using criteria that, although guided by sepsis consensus guidelines, were defined by the study team and not externally validated. Although we prioritized specificity, a more sensitive algorithm may pick up subtle clinical trends in patients who are less likely to be captured by clinicians' usual risk assessments (although at the risk of alert fatigue). Additionally, the most actionable moment in the course of a patient's sepsis trajectory may be the time just prior to, or during, the onset of clinical change. In this case, our alert frequently fired at a time when the patient appeared clinically well, sometimes many hours ahead of later decompensation. Finally, severe sepsis and septic shock may not be the most relevant outcomes to target when predicting unsuspected active clinical deterioration requiring a response from frontline providers. Algorithms trained for general decline, which may predict ICU transfer (17) or even mortality (28), might be more impactful with respect to changing process and outcome measures and preventing these critical events.
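The sensitivity-versus-alert-fatigue tradeoff discussed above can be sketched by sweeping an alert threshold over scored patients. Everything below is synthetic and illustrative; the scores, labels, and thresholds are our assumptions, not study data:

```python
def alert_tradeoff(scores, labels, threshold):
    """Sensitivity and alert rate for one alert threshold.

    Raising the threshold lowers alert volume (less alert fatigue)
    but also lowers sensitivity. Synthetic inputs; illustrative only.
    """
    alerts = [s >= threshold for s in scores]
    true_pos = sum(1 for a, y in zip(alerts, labels) if a and y)
    positives = sum(labels)
    sensitivity = true_pos / positives if positives else 0.0
    alert_rate = sum(alerts) / len(scores)
    return sensitivity, alert_rate

# Hypothetical risk scores for five patients; label 1 = later met criteria.
scores = [0.9, 0.7, 0.4, 0.2, 0.1]
labels = [1, 1, 1, 0, 0]
low = alert_tradeoff(scores, labels, 0.3)   # more sensitive, more alerts
high = alert_tradeoff(scores, labels, 0.5)  # more specific, fewer alerts
```

On this toy cohort the lower threshold catches every case but alerts on more patients, which is the balance a deployed tool must strike against the alert-fatigue estimates cited earlier.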
This study demonstrates the feasibility of implementing a ML algorithm for real-time analysis of EHR data to accurately predict the development of severe sepsis or septic shock. We have also shown the potential implications of alerting clinicians to this prediction throughout a multihospital healthcare system. In this study, the alert did not significantly alter clinical practice or outcomes. Training the algorithm on more traditional definitions of clinical deterioration, enhancing ML algorithms through incorporation of natural language processing, and effectively communicating risk while avoiding alerts in patients already suspected of clinical deterioration, represent potential opportunities to improve the impact of sepsis prediction on clinical care outcomes.
We thank Mark E. Mikkelsen, MD, and Joanne Resnic, MBA, BSN, RN, for their contributions to the design, testing, and implementation of the clinical decision support intervention examined in this article.
References
1. Rhee C, Dantes R, Epstein L, et al. Incidence and trends of sepsis in US hospitals using clinical vs claims data, 2009–2014. JAMA 2017; 318:1241–1249
2. Torio CM, Andrews RM. National Inpatient Hospital Costs: The Most Expensive Conditions by Payer, 2011. HCUP Statistical Brief #160. August 2013. Agency for Healthcare Research and Quality, Rockville, MD. Available at: http://www.hcup-us.ahrq.gov/reports/statbriefs/sb160.pdf. Accessed July 9, 2019
3. Liu VX, Fielding-Singh V, Greene JD, et al. The timing of early antibiotics and hospital mortality in sepsis. Am J Respir Crit Care Med 2017; 196:856–863
4. Escobar GJ, LaGuardia JC, Turk BJ, et al. Early detection of impending physiologic deterioration among patients who are not in intensive care: Development of predictive models using data from an automated electronic medical record. J Hosp Med 2012; 7:388–395
5. Bellomo R, Ackerman M, Bailey M, et al.; Vital Signs to Identify, Target, and Assess Level of Care Study (VITAL Care Study) Investigators: A controlled trial of electronic automated advisory vital signs monitoring in general hospital wards. Crit Care Med 2012; 40:2349–2361
6. Churpek MM, Yuen TC, Winslow C, et al. Multicenter development and validation of a risk stratification tool for ward patients. Am J Respir Crit Care Med 2014; 190:649–655
7. Umscheid CA, Betesh J, VanZandbergen C, et al. Development, implementation, and impact of an automated early warning and response system for sepsis. J Hosp Med 2015; 10:26–31
8. Khurana HS, Groves RH Jr, Simons MP, et al. Real-time automated sampling of electronic medical records predicts hospital mortality. Am J Med 2016; 129:688–698.e2
9. Deo RC. Machine learning in medicine. Circulation 2015; 132:1920–1930
10. Horng S, Sontag DA, Halpern Y, et al. Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PLoS One 2017; 12:e0174708
11. Taylor RA, Pare JR, Venkatesh AK, et al. Prediction of in-hospital mortality in emergency department patients with sepsis: A local big data-driven, machine learning approach. Acad Emerg Med 2016; 23:269–278
12. Berger T, Birnbaum A, Bijur P, et al. A computerized alert screening for severe sepsis in emergency department patients increases lactate testing but does not improve inpatient mortality. Appl Clin Inform 2010; 1:394–407
13. Churpek MM, Yuen TC, Winslow C, et al. Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards. Crit Care Med 2016; 44:368–374
14. Henry KE, Hager DN, Pronovost PJ, et al. A targeted real-time early warning score (TREWScore) for septic shock. Sci Transl Med 2015; 7:299ra122
15. Hackmann G, Chen M, Chipara O, et al. Toward a two-tier clinical warning system for hospitalized patients. AMIA Annu Symp Proc 2011; 2011:511–519
16. Shimabukuro DW, Barton CW, Feldman MD, et al. Effect of a machine learning-based severe sepsis prediction algorithm on patient survival and hospital length of stay: A randomised clinical trial. BMJ Open Respir Res 2017; 4:e000234
17. Wellner B, Grand J, Canzone E, et al. Predicting unplanned transfers to the intensive care unit: A machine learning approach leveraging diverse clinical elements. JMIR Med Inform 2017; 5:e45
18. McCoy A, Das R. Reducing patient mortality, length of stay and readmissions through machine learning-based sepsis prediction in the emergency department, intensive care unit and hospital floor units. BMJ Open Qual 2017; 6:e000158
19. Mao Q, Jay M, Hoffman JL, et al. Multicentre validation of a sepsis prediction algorithm using only vital sign data in the emergency department, general ward and ICU. BMJ Open 2018; 8:e017833
20. Singer M, Deutschman CS, Seymour CW, et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 2016; 315:801–810
21. Murphy DR, Reis B, Sittig DF, et al. Notifications received by primary care practitioners in electronic health records: A taxonomy and time analysis. Am J Med 2012; 125:209.e1–209.e7
22. Ancker JS, Edwards A, Nosal S, et al.; with the HITEC Investigators: Effects of workload, work complexity, and repeated alerts on alert fatigue in a clinical decision support system. BMC Med Inform Decis Mak 2017; 17:36
23. Guidi JL, Clark K, Upton MT, et al. Clinician perception of the effectiveness of an automated early warning and response system for sepsis in an academic medical center. Ann Am Thorac Soc 2015; 12:1514–1519
24. Ginestra JC, Giannini HM, Schweickert WD, et al. Clinician perception of a machine learning-based early warning system designed to predict severe sepsis and septic shock. Crit Care Med 2019 May 24. [Epub ahead of print]
25. Cabitza F, Rasoini R, Gensini GF. Unintended consequences of machine learning in medicine. JAMA 2017; 318:517–518
26. Weissman GE, Hubbard RA, Ungar LH, et al. Inclusion of unstructured clinical text improves early prediction of death or prolonged ICU stay. Crit Care Med 2018; 46:1125–1132
27. Marafino BJ, Park M, Davies JM, et al. Validation of prediction models for critical care outcomes using natural language processing of electronic health record data. JAMA Netw Open 2018; 1:e185097
Keywords: early warning system; electronic medical record; machine learning; predictive medicine; septic shock; severe sepsis
Copyright © 2019 by the Society of Critical Care Medicine and Wolters Kluwer Health, Inc. All Rights Reserved.