The Third International Consensus Task Force defined sepsis as an acute dysregulated host response to infection leading to life-threatening organ dysfunction (Sepsis-3) (1). The Task Force explored the predictive validity of a number of clinical criteria for sepsis in multiple clinical datasets. Balancing several aspects of predictive validity, reliability, and practicality, they recommended criteria for sepsis as an increase of 2 points or more in the Sequential (or Sepsis-related) Organ Failure Assessment (SOFA) score in response to infection (2–4). However, for patients outside the ICU, a set of three simple measures available without laboratory tests (respiratory rate ≥ 22 breaths/min, systolic blood pressure ≤ 100 mm Hg, and Glasgow Coma Scale [GCS] score < 15) had similar predictive validity for sepsis compared with the SOFA score. Termed the quick Sepsis-related Organ Failure Assessment (qSOFA), these signs may be a useful clinical prompt to help clinicians identify those patients with suspected infection most likely to be septic. Subsequent studies have confirmed the predictive validity of a single qSOFA measurement for outcomes such as hospital mortality (5–9).
Yet, important knowledge gaps about qSOFA remain. In particular, the predictive validity of repeated measurements is unknown and has not been tested in prior cohorts (6–9). Similar to other bedside tools (10–12), repeated measurements of qSOFA may help clinicians find patients with infection who are at greatest risk of sepsis, while using only objective, routinely measured vital signs. These measurements may better capture the evolving host response and organ dysfunction in sepsis. Therefore, reexamining the cohort in whom the qSOFA score was developed, we studied whether repeated measurements of qSOFA in the 48 hours after suspected infection improve predictive validity for sepsis compared with a single measurement alone and whether repeated measurements can identify distinct clinical trajectories of qSOFA.
The institutional review board of the University of Pittsburgh approved this study with waiver of informed consent. These data have been reported previously in abstract form (13).
Study Design, Setting, and Population
As reported previously (4), we performed a retrospective cohort study among adult hospital encounters (age ≥ 18 yr) with suspected infection in 2012 at 12 community and academic hospitals in southwestern Pennsylvania. We included all medical and surgical encounters in the emergency department (ED), hospital ward, postanesthesia care unit, and ICU. For each encounter, we abstracted demographic, disposition, and administrative data, as well as time- and location-stamped vital signs, laboratory results, and orders (e.g., body fluid cultures, medications) from the electronic health record (EHR). We defined a cohort with suspected infection using the first combination of a body fluid culture order (e.g., blood, urine, cerebrospinal fluid) and at least one dose of antibiotics (oral or parenteral) within a specified time frame (4). Please see the online supplement (Supplemental Digital Content 1, http://links.lww.com/CCM/D867) for more details.
Measurement of qSOFA
We evaluated the first 48 hours after suspected infection in 6-hour epochs from the occurrence of the first culture or antibiotic event (time zero). We chose 6-hour epochs a priori, as this provides a pragmatic window in the EHR for reassessment and measurement of qSOFA variables. In each epoch, we calculated the maximum qSOFA score as previously reported (4), using the most parsimonious model accounting for GCS score of less than 15, systolic blood pressure of 100 mm Hg or less, and respiratory rate of 22/min or more (1 point each; score range, 0–3). A qSOFA score greater than or equal to 2 was considered high, a qSOFA score of 1 was moderate, and qSOFA score of 0 was low. The qSOFA score during the first 6-hour epoch after time zero was considered the initial qSOFA. Encounters missing an initial qSOFA score were excluded.
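The scoring rule and the 6-hour epochs described above can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the tuple layout of an observation, and the assumption that all three variables are recorded together at each time point are hypothetical choices made for illustration only.

```python
from typing import Iterable, List, Optional, Tuple

EPOCH_HOURS = 6    # epoch length stated in the text
WINDOW_HOURS = 48  # observation window after time zero

# Hypothetical record layout: (hours since time zero, respiratory rate,
# systolic blood pressure, GCS score), all measured at the same time.
Observation = Tuple[float, float, float, int]


def qsofa_score(resp_rate: float, systolic_bp: float, gcs: int) -> int:
    """One point each for RR >= 22/min, SBP <= 100 mm Hg, and GCS < 15."""
    return int(resp_rate >= 22) + int(systolic_bp <= 100) + int(gcs < 15)


def qsofa_category(score: int) -> str:
    """High (>= 2), moderate (1), or low (0), as defined in the text."""
    if score >= 2:
        return "high"
    return "moderate" if score == 1 else "low"


def max_qsofa_by_epoch(observations: Iterable[Observation]) -> List[Optional[int]]:
    """Maximum qSOFA in each 6-hour epoch; None marks an epoch with no data."""
    epochs: List[Optional[int]] = [None] * (WINDOW_HOURS // EPOCH_HOURS)
    for hours, rr, sbp, gcs in observations:
        if not 0 <= hours < WINDOW_HOURS:
            continue  # outside the 48-hour window
        idx = int(hours // EPOCH_HOURS)
        score = qsofa_score(rr, sbp, gcs)
        if epochs[idx] is None or score > epochs[idx]:
            epochs[idx] = score
    return epochs
```

In this sketch the first list element corresponds to the initial qSOFA, and an encounter whose first element is None would be excluded.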
Trajectory of qSOFA in the 48 Hours After the Digital Signature of the Suspicion of Infection
We evaluated the trajectory of qSOFA in three ways: 1) simple summary measures, 2) crude trajectory groups of qSOFA, and 3) group-based trajectory modeling (GBTM). First, we calculated the mean qSOFA over 48 hours and the maximum qSOFA at both 24 and 48 hours. Second, we determined six groups corresponding to the crude trajectory of qSOFA using a threshold of 2 or more qSOFA points at the initial and maximum measurements. Crude groups included the following: 1) initial qSOFA equal to 0, maximum qSOFA less than 2; 2) initial qSOFA equal to 0, maximum qSOFA greater than or equal to 2; 3) initial qSOFA equal to 1, maximum qSOFA less than 2; 4) initial qSOFA equal to 1, maximum qSOFA greater than or equal to 2; 5) initial qSOFA equal to 2 or 3, maximum qSOFA less than 2; and 6) initial qSOFA equal to 2 or 3, maximum qSOFA greater than or equal to 2. Third, we used GBTM to further explore the trajectory of qSOFA and further describe these methods in the appendix (Supplemental Digital Content 1, http://links.lww.com/CCM/D867) (14).
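As an illustration of the first two approaches, the sketch below derives the simple summary measures and the six crude groups from a list of eight per-epoch maximum qSOFA values. It rests on assumptions that the text does not state: missing epochs (None) are skipped, and the "maximum" used for grouping is taken over the epochs after the first one, which is the only reading under which groups 5 and 6 can both occur. For groups 1 to 4 the two readings agree, because an initial score below 2 cannot itself reach the threshold. The function names are hypothetical.

```python
from typing import Dict, List, Optional


def summary_measures(epoch_scores: List[Optional[int]]) -> Dict[str, float]:
    """Mean qSOFA over 48 hours and maximum qSOFA at 24 and 48 hours.

    Assumes eight 6-hour epochs with a non-missing first epoch;
    missing epochs (None) are skipped rather than imputed.
    """
    observed = [s for s in epoch_scores if s is not None]
    first_24h = [s for s in epoch_scores[:4] if s is not None]
    return {
        "mean_48h": sum(observed) / len(observed),
        "max_24h": max(first_24h),
        "max_48h": max(observed),
    }


def crude_trajectory_group(epoch_scores: List[Optional[int]]) -> Optional[int]:
    """Return crude group 1-6, or None if it cannot be determined.

    Assumption: the maximum is taken over epochs after the initial one.
    """
    initial = epoch_scores[0]
    later = [s for s in epoch_scores[1:] if s is not None]
    if initial is None or not later:
        return None
    high_later = max(later) >= 2
    # Odd-numbered groups stay below 2; even-numbered groups reach 2 or more.
    base = {0: 1, 1: 3}.get(initial, 5)  # initial qSOFA of 2 or 3 maps to 5
    return base + int(high_later)
```

For example, an encounter scored 1, 1, 2 in its first three epochs would fall in group 4 (initial qSOFA equal to 1, maximum greater than or equal to 2).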
To compare the predictive validity of repeated qSOFA, we assessed the gain in discrimination of in-hospital deaths not predicted by baseline risk (i.e., the gain in ability to discriminate those deaths not predicted by age, sex, race/ethnicity, and preexisting comorbidity), as such unexpected deaths are presumed to be more common among infected patients who are septic than among infected patients who are not.
For descriptive statistics, we calculated mean (SD) and median (interquartile range) as appropriate to normality for continuous variables and used frequency (proportions) for categorical variables. To illustrate the general behavior of qSOFA over the 48 hours after suspected infection, we created heat maps of qSOFA among a randomly selected subset stratified by initial qSOFA (n = 3,538). Please see the appendix (Supplemental Digital Content 1, http://links.lww.com/CCM/D867) for additional details on the generation of the heat map cohort.
To compare the predictive validity of repeated measurements of qSOFA, we followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) recommendations (15). We evaluated results among all eligible encounters, as well as in a preplanned subgroup analysis restricted to encounters with suspected infection outside the ICU. First, we assessed the frequency and pattern of missing qSOFA data by each 6-hour epoch (eTable 1, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). Then, to account for information already available to clinicians at the bedside when they recognize infection, we created a baseline model for in-hospital mortality using multivariable logistic regression that included age (as a fractional polynomial), sex, race/ethnicity (black, white, or other), and the weighted Charlson comorbidity score (as a fractional polynomial) as a measure of chronic comorbidities (16, 17). Race/ethnicity was derived from clinical documentation in the EHR.
Next, we determined predictive validity of the baseline model and after adding 1) the initial qSOFA, 2) simple summary measures of qSOFA over time, and 3) crude trajectory groups of qSOFA. We report the adjusted odds ratios (ORs) with 95% CIs and changes in the area under the receiver operating characteristic (AUROC) curve from baseline risk for each qSOFA measure. For the purposes of understanding whether a measure improved predictive validity, we considered a greater than 10% relative gain in discrimination to be a significant improvement between models. We determined relative gains based on the residual uncaptured discrimination (defined as 1 - AUROC) between a new model and the comparator. For example, if a baseline risk model’s AUROC is 0.7, then the uncaptured residual discrimination is 0.3. A model incorporating a new measure with an AUROC of 0.74 would thus capture 13% of the residual uncaptured discrimination compared with the baseline model.
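The relative gain defined above is the change in AUROC divided by the residual uncaptured discrimination of the comparator. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the input values are the ones from the worked example in the text.

```python
def relative_gain(auroc_comparator: float, auroc_new: float) -> float:
    """Share of the comparator's uncaptured discrimination (1 - AUROC)
    that is captured by the new model."""
    residual = 1.0 - auroc_comparator
    if residual <= 0:
        raise ValueError("comparator AUROC must be below 1")
    return (auroc_new - auroc_comparator) / residual


# Worked example from the text: baseline AUROC 0.70, new model AUROC 0.74.
gain = relative_gain(0.70, 0.74)
print(f"{gain:.0%}")  # 13%
```

Under the prespecified threshold, this gain of about 13% would count as a significant improvement because it exceeds 10%.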
We performed multiple sensitivity analyses to assess the robustness of our findings. These included assessment of the predictive validity of qSOFA measurements across deciles of baseline risk, changes to the cohort, the addition of limitations in life-sustaining therapies at the time of suspected infection to the baseline model, and the use of multiple imputation for missing data. Please see the online supplement (Supplemental Digital Content 1, http://links.lww.com/CCM/D867) for more details.
All analyses were performed with STATA 14.2 (StataCorp, College Station, TX) and used a two-sided p value of less than or equal to 0.05 for all tests of significance. Heat maps were generated in R 3.2.3 (R Foundation for Statistical Computing, Vienna, Austria) using the “heatmap3” package. GBTM was performed using the STATA “traj” plugin.
Cohort and Patient Characteristics
Among 1,309,025 encounters (eFig. 1, Supplemental Digital Content 1, http://links.lww.com/CCM/D867), 48,319 had suspected infection. After excluding those missing an initial qSOFA (n = 10,728), we studied 37,591 encounters, of whom 1,769 (4.7%) died in hospital. Compared with survivors, those who died were older, more frequently male, and more likely to have suspected infection occur in the ICU (p < 0.001 for all) (Table 1). Encounters who died were less likely to have onset of infection within 48 hours of admission, had greater serum lactate measurements on the day of suspected infection, and were more likely to require intensive care compared with survivors (p < 0.001 for all). Compared with the analysis cohort, encounters excluded due to missing an initial qSOFA were younger, less frequently male, more likely to have suspected infection occur in the ED, had lower systemic inflammatory response syndrome (SIRS) and SOFA scores, and had lower overall mortality (n = 82; 0.8%) (eTable 2, Supplemental Digital Content 1, http://links.lww.com/CCM/D867).
Repeated Measurement of qSOFA
Most initial qSOFA values were low (qSOFA = 0, n = 25,337; 67%), one quarter were moderate (qSOFA = 1, n = 9,801; 26%), and only a small proportion were high (qSOFA = 2, n = 2,292; 6% and qSOFA = 3, n = 161; < 1%). The mean initial qSOFA was greater among encounters who died than those who survived (eTable 3, Supplemental Digital Content 1, http://links.lww.com/CCM/D867) and remained higher during the 48-hour period after suspected infection (Fig. 1). When the initial qSOFA was low, it usually remained low or moderate (n = 24,136; 95%), and overall mortality was less than 2%. When the initial qSOFA was moderate, one quarter (n = 2,494; 25%) increased to high and incurred a mortality of 16%, whereas mortality was low (4%) among the remainder, whose qSOFA remained below 2 (n = 7,307; 75%). When the initial qSOFA was high, it usually remained high (n = 1,925; 78%), and overall mortality was 25%. A minority of those with initially high qSOFA decreased to low (n = 528; 22%) but still had an 11% mortality (Table 2).
Predictive Validity of Single and Repeated qSOFA
When added to a baseline model of characteristics available to clinicians at the time of suspected infection, initial qSOFA was associated with an increase in the odds of death among all encounters (OR, 3.58; 95% CI, 3.37–3.80) and those outside the ICU (OR, 3.07; 95% CI, 2.81–3.34) (eTable 4, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). When summary measures of repeated qSOFA were added, the adjusted odds of death were greatest for the mean qSOFA score over 48 hours (OR, 8.55; 95% CI, 7.91–9.24). When comparing crude qSOFA trajectory groups to a referent group with consistently low qSOFA, the adjusted odds of death were greatest among groups 2) initial qSOFA equal to 0, maximum qSOFA greater than or equal to 2 (OR, 8.76; 95% CI, 7.14–10.74), 4) initial qSOFA equal to 1, maximum qSOFA greater than or equal to 2 (OR, 12.10; 95% CI, 10.40–14.09), and 6) initial qSOFA equal to 2 or 3, maximum qSOFA greater than or equal to 2 (OR, 23.16; 95% CI, 19.96–26.88). These associations were similar among encounters outside the ICU (eTable 4, Supplemental Digital Content 1, http://links.lww.com/CCM/D867).
The initial qSOFA improved the predictive validity for sepsis compared with the baseline model alone (AUROC for in-hospital mortality among all encounters, 0.79; 95% CI, 0.78–0.80 vs 0.63; 95% CI: 0.62–0.65; p < 0.001), a 43% reduction in uncaptured discrimination (Fig. 2; and eTable 5, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). Mean qSOFA over 48 hours had the greatest reduction in uncaptured discrimination compared with the baseline model alone (AUROC, 0.86; 95% CI, 0.85–0.86; p < 0.001; 62% reduction). Crude qSOFA trajectory groups also improved predictive validity compared with both the baseline model and initial qSOFA (p < 0.001). Results were similar among encounters outside the ICU (Fig. 2; and eTable 5, Supplemental Digital Content 1, http://links.lww.com/CCM/D867).
Using GBTM, the best-fitting model had five groups: 1) low (n = 13,317; 35%), 2) increasing (n = 3,208; 9%), 3) decreasing (n = 10,921; 29%), 4) moderate (n = 7,301; 19%), and 5) high (n = 2,853; 8%) (Fig. 3; and eTable 2, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). Crude mortality was greater in moderate (33%) and high trajectory groups (45%), and these groups had a greater adjusted odds of in-hospital mortality (eTable 6, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). The predictive validity of GBTM qSOFA trajectories (AUROC, 0.85; range, 0.84–0.85) was similar to that of the mean qSOFA and greater than that of both initial qSOFA and the crude trajectory groups (p < 0.001) (eFig. 2 and eTable 7, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). Results were similar among encounters outside the ICU.
In sensitivity analyses, repeated measurements of qSOFA improved predictive validity for sepsis across deciles of baseline risk compared with both the baseline model and initial qSOFA (eFig. 2, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). Analysis restricted to infections suspected in the ED (eTable 8, Supplemental Digital Content 1, http://links.lww.com/CCM/D867) also found similar ORs (eTable 4, Supplemental Digital Content 1, http://links.lww.com/CCM/D867) and predictive validity (eTable 5, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). Model performance was unchanged when incorporating limitations in life-sustaining therapies in the baseline model (eTables 9 and 10, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). Complete case analysis found similar results for both ORs (eTable 11, Supplemental Digital Content 1, http://links.lww.com/CCM/D867) and predictive validity (eFig. 3, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). Similarly, in multiple imputations, both ORs (eTable 6, Supplemental Digital Content 1, http://links.lww.com/CCM/D867) and predictive validity for in-hospital mortality were unchanged (eTable 7, Supplemental Digital Content 1, http://links.lww.com/CCM/D867).
In an integrated health system, repeated measures of qSOFA during the 48 hours after suspected infection had a greater predictive validity for sepsis using in-hospital mortality than a baseline measurement alone. These findings were consistent among encounters outside the ICU and in multiple sensitivity analyses. Encounters with a high or increasing qSOFA were at greater risk than those with low or decreasing qSOFA trajectory. Taken together, these findings suggest that repeated measurements of qSOFA could help prompt clinicians about patients more likely to be septic.
There are several reasons why repeated measurement of qSOFA can improve predictive validity for sepsis among patients with suspected infection. First, individuals may present for care at different stages in their episode of illness. Second, among those who present at the same stage, host response and organ injury may evolve on different timelines. In both cases, additional measurement of data over time may account for the dynamics of illness course and the effect of treatment on organ dysfunction. Our findings support recent literature confirming that qSOFA improves predictive validity for in-hospital mortality compared with SIRS (5–9) but add to it by showing an incremental benefit using repeated measurements.
Our findings have important clinical implications. The Third International Sepsis Definitions Task Force recommended a qSOFA of 2 or more points as a prompt to suspect sepsis outside the ICU. A significant advantage of qSOFA is that it requires no laboratory tests. Yet, a threshold of 2 or more points has a low sensitivity (47–70%) (5, 6, 8, 9) and may miss patients at high risk for sepsis if measured at a single time. Our data suggest greater predictive validity with repeated measurements (Fig. 2; and eTable 5, Supplemental Digital Content 1, http://links.lww.com/CCM/D867). As shown in Table 2, patients with initial high qSOFA who decrease to low have lower mortality (11%) than those who remain high (25%). Such data could inform family discussions, deescalation of interventions, or down-triage decisions in resource-constrained settings. An initial moderate qSOFA that increases to high (16% mortality) could prompt up-triage or initiation of organ support therapies (18). Yet if a moderate qSOFA remains below 2 (4% mortality), therapies might be focused with greater precision (e.g., avoiding fluid overresuscitation) or deescalated more rapidly. Repeated measures may also lead to greater intensity of monitoring, particularly among those outside the ICU, and may be scalable inside the EHR. Last, our data may have particular importance for sepsis care in low- and middle-income countries, where repeated measurements of vital signs may be the only way to monitor patients over time (19).
From a research perspective, these data suggest that trajectory modeling may identify clusters of patients at greater risk for sepsis. The GBTM technique identified five distinct groups of septic encounters. Similar to the efforts of other groups (20), these clusters may provide a method to identify different pathophysiology of sepsis, leading to more clues about the underlying mechanisms of host response and organ dysfunction. They may be useful in future studies evaluating the heterogeneity of treatment effects of new therapies aimed at preventing organ dysfunction and shock (21). However, GBTM groups require additional validation in separate cohorts.
This study has several limitations. First, because there is no gold standard for sepsis in the EHR, we used an outcome more common in septic patients than not: in-hospital mortality. This approach is now used in multiple qSOFA validation studies (5–9). Second, missing data were common, particularly among encounters discharged prior to 48 hours after infection. However, in both complete case analysis and after multiple imputation, our results were similar. Third, we only used data in an integrated health system in southwestern Pennsylvania, and our results may not be generalizable to regions with a different sepsis case mix or to low- and middle-income countries. Fourth, the effects of interventions may impact the trajectory of qSOFA. Future studies that model the effect of ongoing treatment and location status in the hospital (e.g., transfer from ward to ICU after infection suspected) on qSOFA trajectories are warranted. Fifth, our study did not compare the predictive validity of qSOFA trajectories to other repeated measures, such as the full SOFA score, which may not have complete laboratory values at frequent intervals.
Among encounters with suspected infection, the predictive validity of repeated qSOFA measurements for sepsis using in-hospital mortality was greater than a single measurement alone.
1. Singer M, Deutschman CS, Seymour CW, et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 2016; 315:801–810
2. Angus DC, Seymour CW, Coopersmith CM, et al. A framework for the development and interpretation of different sepsis definitions and clinical criteria. Crit Care Med 2016; 44:e113–e121
3. Seymour CW, Coopersmith CM, Deutschman CS, et al. Application of a framework to assess the usefulness of alternative sepsis criteria. Crit Care Med 2016; 44:e122–e130
4. Seymour CW, Liu VX, Iwashyna TJ, et al. Assessment of clinical criteria for sepsis: For the third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 2016; 315:762–774
5. Raith EP, Udy AA, Bailey M, et al; Australian and New Zealand Intensive Care Society (ANZICS) Centre for Outcomes and Resource Evaluation (CORE): Prognostic accuracy of the SOFA score, SIRS criteria, and qSOFA score for in-hospital mortality among adults with suspected infection admitted to the intensive care unit. JAMA 2017; 317:290–300
6. Freund Y, Lemachatti N, Krastinova E, et al; French Society of Emergency Medicine Collaborators Group: Prognostic accuracy of sepsis-3 criteria for in-hospital mortality among patients with suspected infection presenting to the emergency department. JAMA 2017; 317:301–308
7. Singer AJ, Ng J, Thode HC Jr, et al. Quick SOFA scores predict mortality in adult emergency department patients with and without suspected infection. Ann Emerg Med 2017; 69:475–479
8. Churpek MM, Snyder A, Han X, et al. Quick sepsis-related organ failure assessment, systemic inflammatory response syndrome, and early warning scores for detecting clinical deterioration in infected patients outside the intensive care unit. Am J Respir Crit Care Med 2017; 195:906–911
9. Henning DJ, Puskarich MA, Self WH, et al. An emergency department validation of the SEP-3 sepsis and septic shock definitions and comparison with 1992 consensus definitions. Ann Emerg Med 2017; 70:544–552.e5
10. Vincent JL, Moreno R, Takala J, et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. Intensive Care Med 1996; 22:707–710
11. Teasdale G, Jennett B. Assessment of coma and impaired consciousness. A practical scale. Lancet 1974; 2:81–84
12. Lyden P, Brott T, Tilley B, et al. Improved reliability of the NIH Stroke Scale using video training. NINDS TPA Stroke Study Group. Stroke 1994; 25:2220–2226
13. Kievlan D, Zhang LA, Chang JC-C, et al. 1345: Serial evaluation of qSOFA among patients with suspected infection. Crit Care Med 2016; 44:412
14. Nagin DS, Odgers CL. Group-based trajectory modeling in clinical research. Annu Rev Clin Psychol 2010; 6:109–138
15. Collins GS, Reitsma JB, Altman DG, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement. Ann Intern Med 2015; 162:55–63
16. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care 2005; 43:1130–1139
17. Royston P, Sauerbrei W. Building multivariable regression models with continuous covariates in clinical epidemiology–with an emphasis on fractional polynomials. Methods Inf Med 2005; 44:561–571
18. Ferreira FL, Bota DP, Bross A, et al. Serial evaluation of the SOFA score to predict outcome in critically ill patients. JAMA 2001; 286:1754–1758
19. Huson MA, Kalkman R, Grobusch MP, et al. Predictive value of the qSOFA score in patients with suspected infection in a resource limited setting in Gabon. Travel Med Infect Dis 2017; 15:76–77
20. Knox DB, Lanspa MJ, Kuttler KG, et al. Phenotypic clusters within sepsis-associated multiple organ dysfunction syndrome. Intensive Care Med 2015; 41:814–822
21. Prescott HC, Calfee CS, Thompson BT, et al. Toward smarter lumping and smarter splitting: rethinking strategies for sepsis and acute respiratory distress syndrome clinical trial design. Am J Respir Crit Care Med 2016; 194:147–155
epidemiology; mortality; multiple organ failure; organ dysfunction scores; sepsis
Copyright © 2018 by the Society of Critical Care Medicine and Wolters Kluwer Health, Inc. All Rights Reserved.