Beware of the Magic Eight Ball in Medicine*

Blum, James M. MD, FCCM

doi: 10.1097/CCM.0000000000004007
Editor's Choice

Department of Anesthesiology; and Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, Atlanta VA Medical Center, Decatur, GA

*See also pages 1477 and 1485.

Dr. Blum has disclosed that he does not have any potential conflicts of interest.

As a child, I found myself sitting with my friends playing with a magic eight ball, asking for insights on a variety of important questions, such as whether the Tigers would win the World Series. For those who are unfamiliar with the device, it is a toy that looks like a pool eight ball, with a 20-sided die floating inside a fluid-filled container with a transparent window (1). When you invert the ball, the die floats up to the window, and you read “an answer” to the question you posed.

Sometimes the eight ball would offer up what appeared to be a solid answer, like “it is certain.” Often, however, you would get a more obscure response, like “most likely” or “signs point to yes.” Although nobody would take the recommendations of the eight ball seriously, they were entertaining. In this issue of Critical Care Medicine, Ginestra et al (2) and Giannini et al (3) describe a new sepsis prediction algorithm that offers the obscured positive insights of the eight ball, but based upon scientific probability rather than random chance. Such alerts have the remarkable potential to change the way we practice by moving the majority of sepsis care from reactive to proactive, but they introduce the complexities of working with predictive machine learning-based models in acute care.

Medical alarms today are focused on one specific element, situational awareness. This is a fundamental concept in maintaining safety in a complex environment (4). In examining many accidents outside of the medical domain, loss of situational awareness is frequently the reason cited for a subsequent negative outcome (5, 6). In aviation, alarms informing the pilot that he/she is approaching a stall have been common for decades (7). Similar alarms exist today in many hemodynamic monitors (8). Providers can set such alarms to provide awareness that a patient has developed a concerning hemodynamic picture. These alarms allow the bedside provider to take immediate action, like titrating medications or calling for help.

Alarms in the acute care environment have continued to evolve, and complex alerting mechanisms now exist to inform providers on the floor when a patient may be entering a concerning state. Investigators have developed early warning scores like MEWS and NEWS over the last 2 decades (9, 10). Although initially implemented on paper, such scores can usually be implemented in the electronic medical record using routine informatics techniques (11). These types of warnings are of less utility because the clinician is faced with more options as to what to do. “Why exactly is a patient generating this alert and what should I do about it?” is a question that faces a clinician when dealing with one of these alerting tools. “Is this patient suffering cardiovascular collapse and why? Are they simply dehydrated or are they septic or are they having a myocardial infarction?” All of these are common questions facing a provider.

Because these early warning scores do not point to a specific diagnosis, newer, more sophisticated algorithms have been developed. Many of these algorithms provide early information on the actual development of a condition like sepsis, and Umscheid et al (12) previously described how such an alert was warmly received by users. Ginestra et al (2) and Giannini et al (3) describe a new sepsis algorithm across two manuscripts. The alert is not based on situational awareness; rather, it is a tool to predict the subsequent potential development of sepsis.

The results of these studies are interesting. This is the first large-scale prospective evaluation of a sepsis prediction algorithm. The first article describes the development and subsequent implementation of the algorithm across two large academic health system hospitals (2). Overall, the algorithm, which was based on a random forest classifier of 587 features, was tuned to have a positive predictive value (PPV) of 29% with a negative predictive value (NPV) of 97%. For a positive signal to be validated as sepsis, the patient had to have 1) been coded as having sepsis during his/her hospital admission, 2) a positive blood culture, and 3) either an elevated lactate or a systolic blood pressure less than 90 mm Hg. This places the model that was developed at a distinct disadvantage, as up to 50% of sepsis cases never have a positive culture result. The overall sensitivity of 26% and specificity of 98% are to be expected and are likely of less importance than the PPV and NPV, considering the relative difficulty of predicting blood culture-positive sepsis in the acute care setting and its relative rarity.
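The relationship among these four figures can be made concrete with a short calculation. The following is a minimal sketch, not the authors' method: it applies Bayes' rule to the rounded values quoted above (sensitivity 26%, specificity 98%, PPV 29%) to back-calculate the prevalence they imply, and then shows how PPV and NPV would shift at other prevalences. The prevalence is not stated in this editorial; the value computed here is an inference from rounded numbers, and the function names and the alternative prevalences are illustrative assumptions.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary alert using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def implied_prevalence(sensitivity, specificity, ppv):
    """Solve PPV = s*p / (s*p + (1-c)*(1-p)) for the prevalence p."""
    false_pos_rate = 1 - specificity
    return ppv * false_pos_rate / (
        sensitivity * (1 - ppv) + ppv * false_pos_rate
    )


# Rounded figures quoted in the editorial (assumed exact for illustration).
SENSITIVITY, SPECIFICITY, PPV = 0.26, 0.98, 0.29

p = implied_prevalence(SENSITIVITY, SPECIFICITY, PPV)
print(f"implied prevalence: {p:.1%}")

# Hypothetical prevalences, chosen only to show the trend.
for prevalence in (0.01, p, 0.10, 0.25):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:6.1%}  PPV {ppv:6.1%}  NPV {npv:6.1%}")
```

With these rounded inputs, the implied prevalence comes out near 3%, and the NPV recomputed at that prevalence falls between 97% and 98%, consistent with the reported figure. The sketch illustrates why, for a rare outcome, even a highly specific alert yields a modest PPV, and why the same alert would look quite different in a population where the outcome is more common.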

After the design period, the authors proceeded to silently and then actively validate the efficacy of the algorithm. Results of the alerts were mildly disappointing to the authors, as outcomes did not change, but there was movement in process measures, including significant changes in time to ICU admission and increases in lactate testing, transfusion, and IV fluid administration. This all occurred despite the fact that there was no protocol associated with the alert.

The second study examines users’ perceptions of the alert and may shed some light on why more interventions, like broad-spectrum antibiotics, were not immediately used in association with the alerts (3). Although the algorithm was tuned to provide a reasonable PPV (29%, or roughly one in three cases), users, in general, were not supportive of the alert. This was particularly true of licensed independent providers and members of the medical team, among whom only 30% found benefit in the alerts after 48 hours. Although nurses were slightly more supportive (44% found benefit at 48 hours), they were still not overwhelmingly impressed with the benefits of the algorithm.

Combined, these two studies (2, 3) appear to suggest that in the vast majority of patients who were actually becoming septic at the time of the alerts, clinicians were aware and actively treating the condition. Otherwise, as the authors (3) suggest in the Discussion section, the alerts may have come too early to spark active intervention. This would explain the increase in diagnostic testing of lactate, an inexpensive, quick, and helpful study to assess for possible sepsis. One can also interpret these results as showing that the alerts drew attention to other conditions, perhaps hypotension, hemorrhage, or anemia, given the increased use of IV fluids and transfusions.

Taken together, it appears the alerts were not profoundly useful in changing the trajectory of patients in general. The authors (3) provide extensive discussion of possible reasons for the outcomes of the studies. One item they suggest is that specific conditions may not be an appropriate target for alerts and that alerts focused on “general deterioration” may instead be more impactful. This has been the approach of commercially available products. However, the utility of such general alerts can also be questioned because, like MEWS and NEWS, they offer little direction for the care of patients.

Although the individual results of such studies will continue to vary, the studies by Ginestra et al (2) and Giannini et al (3) provide a very concerning initial piece of evidence regarding disease-specific predictive alerts. Providers appear not to do well with a prediction that something may happen, even if the alert is reasonably predictive. Most clinicians did not feel that the alerts in the studies by Ginestra et al (2) and Giannini et al (3) subsequently modified a care plan, and from the actions documented, this appeared to be true with regard to sepsis treatment. As such, these alerts were likely often seen as additional noise by the clinicians attempting to provide care.

As we move further into the realm of predictive medicine, it is imperative that we begin to understand the actual impact of such alerts and their perceived utility. Otherwise, we risk such tools being seen as the magic eight ball of medicine, potentially restricting innovation in the predictive medical domains due to limited believability, increased burden on the workforce, and subsequent poor adoption. Studies like those of Ginestra et al (2) and Giannini et al (3) will need to be repeated to better develop this understanding, for without such investigations, we risk profoundly limiting this opportunity to dramatically improve care in the acute care environment.


1. Magic 8-Ball. Available at: Accessed September 16, 2019
2. Ginestra JC, Giannini HM, Schweickert WD, et al. Clinician Perception of a Machine Learning-Based Early Warning System Designed to Predict Severe Sepsis and Septic Shock. Crit Care Med 2019; 47:1477–1484
3. Giannini HM, Ginestra JC, Chivers C, et al. A Machine Learning Algorithm to Predict Severe Sepsis and Septic Shock: Development, Implementation, and Impact on Clinical Practice. Crit Care Med 2019; 47:1485–1492
4. Schulz CM, Burden A, Posner KL, et al. Frequency and type of situational awareness errors contributing to death and brain damage: A closed claims analysis. Anesthesiology 2017; 127:326–337
5. Falkland EC, Wiggins MW. Cross-task cue utilisation and situational awareness in simulated air traffic control. Appl Ergon 2019; 74:24–30
6. Kelly D, Efthymiou M. An analysis of human factors in fifty controlled flight into terrain aviation accidents from 2007 to 2017. J Safety Res 2019; 69:155–165
7. AOPA: How It Works: Stall Horn Warning a Pilot to Take Action. Available at:
8. Schmid F, Goepfert MS, Kuhnt D, et al. The wolf is crying in the operating room: Patient monitor and anesthesia workstation alarming patterns during cardiac surgery. Anesth Analg 2011; 112:78–83
9. Peng LS, Hassan A, Bustam A, et al. Using modified early warning score to predict need of lifesaving intervention in adult non-trauma patients in a tertiary state hospital. Hong Kong J Emerg Med 2018; 25:146–151
10. Hill K. National early warning score. Nurs Crit Care 2012; 17:318–318
11. Finlay GD, Rothman MJ, Smith RA. Measuring the modified early warning score and the Rothman index: Advantages of utilizing the electronic medical record in an early warning system. J Hosp Med 2014; 9:116–119
12. Umscheid CA, Betesh J, VanZandbergen C, et al. Development, implementation, and impact of an automated early warning and response system for sepsis. J Hosp Med 2015; 10:26–31

acceptance; communication; early warning; prediction; sepsis

Copyright © 2019 by the Society of Critical Care Medicine and Wolters Kluwer Health, Inc. All Rights Reserved.