Most critically ill patients experience significant morbidity or mortality despite receiving intensive care by a multidisciplinary medical team (1,2). The ability to accurately predict mortality for critically ill patients can help healthcare providers optimize care and provide valuable information for patients and caregivers (3,4). Various prognostic indices have been proposed, and several scoring systems to calculate the severity of organ dysfunction have been externally validated and are globally used (4–7).
The Sequential Organ Failure Assessment (SOFA) score is one of the well-accepted scales to quantify organ function and predict in-hospital mortality (4). Serial use of the SOFA score to assess changes in patient status over time in the ICU has been validated as providing better mortality prediction (8). Both the European Society of Intensive Care Medicine and Society of Intensive Care Medicine have proposed that acute changes in the SOFA score may be used to define organ dysfunction among patients with infection and to diagnose sepsis (9,10). However, the SOFA score, as well as other prognostic scales such as Acute Physiology and Chronic Health Evaluation (APACHE) II and Simplified Acute Physiology Score, does not include biological markers correlated with systemic inflammation as a variable for score calculation (5,6,8). Notably, such inflammation markers, including cytokines/chemokines and acute phase proteins, are associated with unfavorable clinical outcomes (11–13).
Interleukin (IL)–6 is a cytokine released by immune cells and plays a role in systemic inflammatory changes caused by infection or tissue injury (14). Several studies have reported that serum IL-6 concentration is associated with disease severity, adverse events, and overall mortality among patients with sepsis, burn and trauma injury, cardiovascular diseases, and hemodialysis (15–19). However, the diagnostic accuracy of IL-6 for sepsis has been inconclusive, despite extensive analysis (13), and the clinical feasibility of IL-6 for mortality prediction in critically ill patients remains unclear (11,20,21). This study sought to elucidate whether serum IL-6 concentration can be a valid component of the SOFA scoring system to better predict mortality in critically ill patients. The hypothesis is that the addition of serum IL-6 concentration to SOFA score provides better mortality prediction in this population.
MATERIALS AND METHODS
Study Design and Setting
This prospective observational study used emergency department (ED) and ICU data from five university hospitals in Japan. All hospitals received individual local institutional review board approval for conducting research on human subjects. The Ethics Committee at the Keio University School of Medicine approved this study (approval number 16-03-007). Informed consent for participation was obtained from all patients included in the study.
The study enrolled critically ill patients admitted to the participating centers between September 2016 and September 2018. The inclusion criteria were as follows: age greater than or equal to 20 years, greater than or equal to two systemic inflammatory response syndrome (SIRS) criteria of the American College of Chest Physicians/Society of Critical Care Medicine on ED/ICU admission (22), and expected ICU stay greater than or equal to 48 hours. In addition to meeting these criteria, burn patients with a burn index (full-thickness burn area + 1/2 partial-thickness burn area) of greater than or equal to 15 and trauma patients with injuries in greater than or equal to two body regions on the abbreviated injury scale coding system and with an Injury Severity Score of greater than or equal to 10 were included. The exclusion criteria were as follows: current medications that affect serum IL-6 concentration (e.g., corticosteroids, immunosuppressants) within 1 week before study inclusion; discharge or death within 48 hours after admission; deviation from study protocol for biomarker tests and SOFA score calculation (e.g., blood sample was not obtained per protocol, SOFA score was not calculated); HIV infection; pregnancy; and any other condition precluding suitability for enrollment in the investigators’ opinion.
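The burn index criterion above can be sketched as a small helper. This is only an illustration: the function names are hypothetical, and the assumption that burn areas are expressed as a percentage of total body surface area is ours, since the text does not state the unit.

```python
def burn_index(full_thickness_pct: float, partial_thickness_pct: float) -> float:
    """Burn index = full-thickness area + 1/2 partial-thickness area.

    ASSUMPTION: both areas are given as % of total body surface area.
    """
    if full_thickness_pct < 0 or partial_thickness_pct < 0:
        raise ValueError("burn areas must be non-negative")
    return full_thickness_pct + 0.5 * partial_thickness_pct


def meets_burn_criterion(full_thickness_pct: float,
                         partial_thickness_pct: float,
                         threshold: float = 15.0) -> bool:
    """True when the burn index reaches the inclusion threshold (>= 15)."""
    return burn_index(full_thickness_pct, partial_thickness_pct) >= threshold
```

For example, 10% full-thickness plus 10% partial-thickness burn gives an index of 15 and would satisfy this one criterion; the other inclusion and exclusion criteria still apply.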
Data Collection and Definitions
The following patient information was collected: demographic characteristics, admission source, comorbidities, medications administered within 1 week before study inclusion, etiology at admission, episode of cardiac arrest before study inclusion, presence of hemodynamic instability defined as vasopressor requirement or persistent hypotension despite fluid resuscitation, and any ED/ICU treatments.
Blood samples were obtained within 6 hours after ED/ICU admission (day 0) and the next morning (day 1). Blood tests were then performed daily on days 2 and 3 and as needed until 7 days after admission. The following inflammatory biological markers were measured: C-reactive protein (CRP), IL-6, IL-8, IL-10, tumor necrosis factor (TNF)–α, and procalcitonin. Serum CRP concentration was measured immediately with commercially available assays at each hospital; ILs, TNF-α, and procalcitonin were measured at an outside facility, with results blinded to treating physicians, after serum samples were frozen and stored at –20°C (IL-6, procalcitonin; Roche Diagnostics, Mannheim, Germany and IL-8, IL-10; BioSource Europe, Nivelles, Belgium and TNF-α; R&D Systems, Minneapolis, MN).
Arterial blood gas analysis and other blood tests for calculating each SOFA score component were performed at each hospital simultaneously with blood sampling for biological markers. The SOFA score was recorded daily until day 3 and as needed until day 7; APACHE II score was also calculated on ED/ICU admission.
The primary outcome was 28-day all-cause mortality. The secondary outcome was ICU-free days, defined as the number of days alive and out of the ICU between admission and day 28.
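As a minimal sketch of the secondary outcome, the definition of ICU-free days can be computed as below. The function name and arguments are hypothetical. The sketch assumes a single continuous ICU stay starting at admission (no readmissions) and applies the stated definition literally. Some studies instead assign zero ICU-free days to all nonsurvivors; the text does not say which convention was used.

```python
from typing import Optional


def icu_free_days(icu_days: int, death_day: Optional[int] = None,
                  window: int = 28) -> int:
    """Days alive and out of the ICU between admission and the end of the window.

    icu_days:  length of the ICU stay in days, counted from admission
               (ASSUMPTION: one continuous stay, no readmission).
    death_day: day of death counted from admission, or None if the patient
               is alive at the end of the window.
    """
    if icu_days < 0:
        raise ValueError("icu_days must be non-negative")
    # Days the patient was alive inside the observation window.
    alive_days = window if death_day is None else min(death_day, window)
    # ICU days can only be subtracted from days the patient was alive.
    return max(0, alive_days - min(icu_days, alive_days))
```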
To assess improved accuracy for mortality prediction by adding a biomarker test to SOFA score, a baseline model was developed using logistic regression analysis to predict 28-day mortality, in which the SOFA score at day 2 was chosen as a sole explanatory variable. The day 2 score was chosen because the original SOFA score validation study showed that the SOFA score at 48 hours post admission had a higher discrimination power for predicting ICU mortality than the score at admission or difference in scores between admission and 48 hours later (8). To identify the best time for each biomarker to predict mortality, receiver operating characteristic (ROC) curves were drawn for serum concentration of each biomarker at days 0–3 based on 28-day mortality, and area under the ROC curve (AUROC) was calculated. The day with the highest AUROC was considered as the best time point for each biomarker.
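The time-point selection described above can be illustrated with a short sketch. It computes AUROC as the probability that a nonsurvivor has a higher biomarker value than a survivor (the Mann-Whitney formulation) and returns the day with the highest value. The function names and the data layout are assumptions made for illustration, not the study's actual code, which used SPSS and R.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def auroc(values: Sequence[float], died: Sequence[int]) -> float:
    """AUROC: chance that a nonsurvivor's value exceeds a survivor's (ties = 0.5)."""
    pos = [v for v, d in zip(values, died) if d == 1]  # nonsurvivors
    neg = [v for v, d in zip(values, died) if d == 0]  # survivors
    if not pos or not neg:
        raise ValueError("need at least one survivor and one nonsurvivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_day(values_by_day: Dict[int, List[Optional[float]]],
             died: List[int]) -> Tuple[int, float]:
    """Return (day, AUROC) for the day with the highest AUROC.

    Missing measurements (None) are dropped per day; nothing is imputed.
    """
    results = {}
    for day, values in values_by_day.items():
        pairs = [(v, d) for v, d in zip(values, died) if v is not None]
        results[day] = auroc([v for v, _ in pairs], [d for _, d in pairs])
    day = max(results, key=results.get)
    return day, results[day]
```

Because AUROC depends only on the ranking of values, it is unchanged by the log transformation applied to biomarkers in the regression models.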
Logistic regression analyses to predict 28-day mortality were performed again to derive the linear combination of the baseline model (day 2 SOFA score) and an additional biomarker measured at the best time point just described. As biomarkers were expected to have skewed distributions, they were log transformed before analyses (11,15,16). Some biomarkers, including ILs, were analyzed together with sex, as suggested in other studies (23,24). The ROC curves were drawn, and AUROC was compared between the baseline model and the combination model developed with the additional biomarker. Improvement of AUROC from baseline was shown with 95% CI. The sensitivity and specificity of each model were also obtained at the best cutoff point, defined by the Youden index (25). To assess optimism of the combination model using the additional biomarker, a corrected AUROC was calculated with bootstrap analysis (resampling the model 1,000 times) (26). The combination models with additional biomarkers were also examined in linear regression analyses to predict ICU-free days. The clinical applicability of each biomarker was then assessed by calculating observed 28-day mortalities in subgroups classified by SOFA score and the biomarker dichotomized at the median value.
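The cutoff selection by the Youden index can be sketched as follows. The index is J = sensitivity + specificity − 1, and the best cutoff is the threshold that maximizes J. The function name is hypothetical, and the rule that a patient is classified as predicted to die when the model output is at or above the cutoff is an assumption of this sketch.

```python
from typing import Sequence, Tuple


def youden_cutoff(scores: Sequence[float],
                  died: Sequence[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    ASSUMPTION: a patient is predicted to die when score >= cutoff.
    On ties in J, the lowest cutoff is kept.
    """
    n_pos = sum(died)              # nonsurvivors
    n_neg = len(died) - n_pos      # survivors
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one survivor and one nonsurvivor")
    best = (float("-inf"), 0.0, 0.0, 0.0)  # (J, cutoff, sens, spec)
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if d == 1 and s >= cutoff)
        tn = sum(1 for s, d in zip(scores, died) if d == 0 and s < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]
```

Here `scores` would be the predicted probabilities (or logits) from a fitted model; the Youden index weights sensitivity and specificity equally, which may not match every clinical use.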
Descriptive statistics are presented as the mean (sd), median (interquartile range), or number (percentage). No imputation was used to estimate missing data. Because the improvement of predictive ability for mortality by adding biomarkers was unknown before the study, sample size estimation was not performed for the main analysis. Sample size estimation for ROC analysis in which an AUROC of 0.7 was expected for an event with an incidence rate of 15% indicated that 150 cases were needed with a power of 80% and an α error of 0.05 (27). Results were compared using Mann-Whitney U tests, chi-square tests, or Fisher exact tests, as appropriate. For testing all hypotheses, a two-sided α threshold of 0.05 was considered statistically significant. All statistical analyses were conducted using SPSS, Version 26.0 (IBM, Armonk, NY), and R Version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria).
Among 199 patients who met all inclusion criteria, the following patients were excluded: four who took corticosteroids before inclusion, seven who died within 48 hours post admission, and 21 who deviated from the protocol for biomarker tests and SOFA score calculation (Fig. 1). Among the 161 eligible patients, 18 (11.2%) did not survive at day 28. Table 1 shows the characteristics of the participant population. The most common etiology at admission was infectious disease (105 [65.2%]); greater than 50% required mechanical ventilation (86 [53.4%]), and about 33% underwent renal replacement therapy (51 [31.7%]).
TABLE 1. Characteristics of Study Population
|Age, years, mean (sd)
|Sex, male, n (%)
|Type of admission, n (%)
| Medical, infectious disease
| Medical, noninfectious disease
| Surgical, trauma/burn
| Surgical, nontrauma/burn
|Comorbidity, n (%)
| Cerebrovascular disease
| Cardiovascular disease
| Chronic lung disease
| Chronic kidney disease
| Liver disease
|Acute Physiology and Chronic Health Evaluation II score, median (IQR)
|Sequential Organ Failure Assessment score, median (IQR)
| Day 0
| Day 1
| Day 2
|Cardiac arrest prior to admission, n (%)
|Hemodynamic instability (a), n (%)
|Mechanical ventilation, n (%)
|Renal replacement therapy, n (%)
|Length of ICU stay, d, median (IQR)
|Mortality, n (%)
| 7 d mortality
| 28 d mortality
IQR = interquartile range.
(a) Hemodynamic instability was defined as vasopressor requirement or persistent hypotension despite fluid resuscitation.
Univariate analyses for each biomarker identified that the median IL-6 concentration at days 1–3 was higher among nonsurvivors than among survivors at day 28 (Fig. 2) (Table S1, https://links.lww.com/CCX/A575). Similarly, the median IL-8 serum concentration at days 0–3 and the median IL-10 concentration at days 1–3 were higher among nonsurvivors than among survivors at day 28. Conversely, serum CRP, procalcitonin, and TNF-α concentrations were comparable between nonsurvivors and survivors until day 3. Mortality prediction by a single biomarker on ROC analyses showed that the serum IL-6, IL-8, and IL-10 concentrations at days 1–3 had a significant discrimination power to predict 28-day mortality. The best time point for mortality prediction was day 3 for IL-6, day 1 for IL-8, and day 2 for IL-10. The IL-6 concentration at day 3 had the highest discrimination power (AUROC = 0.766; 95% CI = 0.656–0.876) (Table S1, https://links.lww.com/CCX/A575).
Accuracy in predicting 28-day mortality with SOFA score at day 2 was assessed as the baseline model using a logistic regression analysis (AUROC = 0.776; 95% CI = 0.672–0.880). Multivariate logistic regression analyses were performed to derive combination models with SOFA score and additional biomarkers (IL-6, CRP, procalcitonin, TNF-α at day 3; IL-8 at day 1; IL-10 at day 2). On AUROC comparison, the AUROC of the combination model with additional serum IL-6 concentration at day 3 was significantly higher than that of the baseline model using only the SOFA score (AUROC = 0.844; AUROC improvement = 0.068; 95% CI = 0.002–0.133) (Table 2) (Fig. S1, https://links.lww.com/CCX/A574). Conversely, other combination models using CRP, procalcitonin, IL-8, IL-10, and TNF-α had discrimination power comparable to that of the baseline model. Furthermore, a combination model using all of IL-6, IL-8, and IL-10 showed discrimination power comparable to that of the baseline model. Improvement of accuracy for mortality prediction by adding serum IL-6 concentration to the SOFA score was maintained in bootstrap analysis that estimated optimism of the combination model (corrected AUROC = 0.815). The serum IL-6 concentrations were also associated with decreased ICU-free days in the combination model (coefficient = 2.4 d decrease; 95% CI = 0.1–4.7 d decrease; p = 0.042) (Table S2, https://links.lww.com/CCX/A576), whereas models with other biomarkers were not.
TABLE 2. Accuracy for Mortality Prediction With Additive Biomarkers
|Model|Improvement of AUROC (95% CI)
|SOFA score at day 2|
|SOFA with IL-6|0.068 (0.002 to 0.133)
|SOFA with C-reactive protein|0.008 (–0.038 to 0.055)
|SOFA with procalcitonin|0.000 (–0.012 to 0.012)
|SOFA with IL-8|0.031 (–0.028 to 0.089)
|SOFA with IL-10|0.061 (–0.057 to 0.179)
|SOFA with tumor necrosis factor–α|–0.002 (–0.010 to 0.007)
AUROC = area under the receiver operating characteristic curve, IL = interleukin, PCT = procalcitonin, SOFA = Sequential Organ Failure Assessment.
Logit-transformed predictive mortality rate was calculated in each model as follows (log-transformed values of biomarker levels were entered in the calculation):
SOFA score at day 2 (baseline model): SOFA score at day 2 × 0.190 − 3.871
SOFA with IL-6: SOFA score at day 2 × 0.102 + IL-6 at day 3 × 1.226 + male × 1.011 − 6.381
SOFA with C-reactive protein: SOFA score at day 2 × 0.211 − CRP at day 3 × 0.867 − 3.167
SOFA with PCT: SOFA score at day 2 × 0.181 + PCT at day 3 × 0.134 − 0.3862
SOFA with IL-8: SOFA score at day 2 × 0.146 + IL-8 at day 1 × 0.471 + male × 0.834 − 5.000
SOFA with IL-10: SOFA score at day 2 × 0.155 + IL-10 at day 2 × 2.180 + male × 0.372 − 6.096
SOFA with tumor necrosis factor (TNF)–α: SOFA score at day 2 × 0.189 − TNF-α at day 3 × 0.023 − 3.856
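As a hedged illustration of how these logit formulas translate into a predicted probability, the sketch below applies the inverse logit, p = 1 / (1 + exp(−logit)), to the baseline model and the SOFA with IL-6 model. The function names are hypothetical. Two points are assumptions rather than statements from the text: the base of the log transformation is not given (base 10 is used as the default here and can be swapped), and "male" is taken to be coded 1 for male and 0 for female.

```python
import math
from typing import Callable


def predicted_mortality(logit: float) -> float:
    """Inverse logit: convert a logit value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-logit))


def logit_baseline(sofa_day2: float) -> float:
    """Baseline model: SOFA score at day 2 only (coefficients from the table note)."""
    return sofa_day2 * 0.190 - 3.871


def logit_sofa_il6(sofa_day2: float, il6_day3: float, male: bool,
                   log: Callable[[float], float] = math.log10) -> float:
    """SOFA with IL-6 model (coefficients from the table note).

    ASSUMPTION: the log base is not stated in the text; log10 is a guess.
    ASSUMPTION: male is coded 1, female 0.
    """
    if il6_day3 <= 0:
        raise ValueError("IL-6 concentration must be positive")
    sex_term = 1.011 if male else 0.0
    return sofa_day2 * 0.102 + log(il6_day3) * 1.226 + sex_term - 6.381
```

For instance, `predicted_mortality(logit_baseline(10))` gives the baseline model's predicted 28-day mortality for a day 2 SOFA score of 10. These coefficients were derived in a single cohort and would need external validation before any clinical use.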
On subgroup analysis of observed 28-day mortalities classified with SOFA score at day 2 and the median serum IL-6 concentration at day 3, low IL-6 concentration (≤ 74 pg/dL) was associated with mortality less than or equal to 10% among patients with SOFA scores less than or equal to 11 (Fig. 3). For high IL-6 concentrations (> 74 pg/dL), the mortality rate averaged greater than 20% in patients with SOFA scores of 8–11 and 25.8% with SOFA scores of greater than 11.
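The subgroup tabulation can be sketched as below: patients are grouped by a SOFA band and by IL-6 dichotomized at the cohort median, and observed mortality is computed per cell. The function names, the tuple layout, and the exact SOFA bands are assumptions for illustration; the bands are chosen only to match the cut points mentioned in the text (7 and 11), and the exact grouping used in Fig. 3 may differ.

```python
from statistics import median
from typing import Dict, List, Tuple


def sofa_band(sofa: int) -> str:
    """Illustrative SOFA bands (ASSUMPTION: cut points at 7 and 11)."""
    if sofa <= 7:
        return "<=7"
    if sofa <= 11:
        return "8-11"
    return ">11"


def subgroup_mortality(
    patients: List[Tuple[int, float, int]]
) -> Dict[Tuple[str, str], Tuple[int, int, float]]:
    """Observed mortality per (SOFA band, IL-6 group).

    patients: list of (sofa_day2, il6_day3, died) with died coded 0/1.
    Returns {(band, "low" | "high"): (deaths, n, observed mortality)}.
    """
    cut = median(p[1] for p in patients)  # IL-6 dichotomized at the median
    counts: Dict[Tuple[str, str], Tuple[int, int]] = {}
    for sofa, il6, died in patients:
        key = (sofa_band(sofa), "high" if il6 > cut else "low")
        deaths, n = counts.get(key, (0, 0))
        counts[key] = (deaths + died, n + 1)
    return {k: (d, n, d / n) for k, (d, n) in counts.items()}
```

With few events per cell, as in a cohort of this size, the observed rates are imprecise and are best read alongside the counts that the function also returns.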
This multicenter observational study examined the accuracy of mortality prediction with additive biomarkers and found that serum IL-6 concentration had the highest discrimination power to predict 28-day mortality in critically ill patients. The improvement of accuracy for mortality prediction by adding serum IL-6 concentration at day 3 to SOFA score was shown by an increase in AUROC over the baseline model that used only the SOFA score. Higher serum IL-6 concentration was also associated with longer ICU stay when used as an additional inflammation biomarker with SOFA score.
Several pathophysiologic mechanisms for the high prediction ability of IL-6 may be considered based on previous studies (14,28,29). IL-6 activates target genes involved in host defense mechanisms and is a major player in pro- and anti-inflammatory responses to infection and injury (28). Because IL-6 is synthesized in a local lesion in the initial stage of inflammation or infection and then moves to the liver where acute phase proteins such as CRP are rapidly induced, elevation of serum IL-6 concentration usually precedes elevation of other inflammatory biomarkers and also clinical signs such as fever (14,29). In addition, removal of the inflammation source is quickly followed by cessation of IL-6–mediated cascade and degradation of IL-6 messenger RNA (29). Therefore, alteration of serum IL-6 concentration closely reflects the degree or severity of systemic inflammation. Furthermore, persistent IL-6 production with high serum concentration has been identified in patients with severe SIRS who experience cytokine storm (30), suggesting that dysregulated IL-6 abnormally accelerates inflammatory pathways and organ insult (14,30). Considering that serum IL-6 concentrations at day 3 versus days 0–2 had the highest discrimination in predicting mortality, persistent systemic inflammation would be detected by high IL-6 concentration at day 3 among critically ill patients.
Although several mortality prediction models have been developed, most of them used clinical and physiologic variables at admission or within the first 24 hours in the ICU (5–7). Although such models are useful to predict early consequences such as ICU adverse events, ignoring deteriorations and improvements of patient status as a result of initial responses to treatment limits their ability to project later clinical outcomes, such as 28-day mortality. Furthermore, some prediction models involving IL-6 concentrations did not include clinical variables (11,12,19). Given that the use of both clinical and biological variables would better capture patient status, the SOFA score at day 2 and the serum IL-6 concentration at day 3, representing the patient condition altered by early treatment, would be feasible to predict 28-day mortality. A very high sensitivity (94.1%) for mortality was detected with the combination model using SOFA score and serum IL-6 concentration.
The clinical applicability of serum IL-6 concentration was assessed, and survival at 28 days will likely be predicted in patients with a SOFA score of less than 7 at day 2 and IL-6 concentration of less than or equal to 74 pg/dL at day 3. This point is significant because a retrospective study reported patients with initial or highest SOFA score of 6–7 had an ICU mortality of about 20% (8). Among patients with SOFA scores of 8–11, 28-day mortality almost doubled when the IL-6 concentration was greater than 74 pg/dL at day 3, suggesting such persistent elevation of serum IL-6 concentration would warn of unfavorable clinical consequences.
The results of this study must be interpreted in the context of the design. First, the study neither developed a new scoring scale using serum IL-6 concentration nor identified a cutoff value of IL-6 concentration to predict 28-day mortality. Although results suggested that adding IL-6 concentration to SOFA score would be valuable to develop a better prediction system and that patients with an IL-6 concentration of less than or equal to 74 pg/dL at day 3 would be expected to survive even with a SOFA score of less than or equal to 7, more cases are required to derive and validate a new scale using IL-6.
Another limitation was that the study population included patients with various diseases. Given that another biomarker, such as procalcitonin, was extensively examined among patients with bacterial infection (31), the mortality of such a population would be better predicted by procalcitonin rather than by IL-6 concentration. Although the small sample size precluded subgroup analyses, a disease-specific prediction model should be further examined.
Finally, the study did not collect data regarding long-term mortality or functional outcomes, including physical impairment and cognitive function, which may be more important than 28-day mortality among critically ill patients. Although the serum IL-6 concentration at day 3 was associated with length of ICU stay, further study on long-term and/or functional outcomes should be performed.
In this multicenter observational study, accuracy for 28-day mortality prediction was improved by adding serum IL-6 concentration to the SOFA score. Persistent high IL-6 concentration until 3 days after admission would predict longer ICU stay and higher probability of mortality at day 28. Further study is needed to develop a new scoring scale using both SOFA score and serum IL-6 concentrations.
1. Muscedere J, Waters B, Varambally A, et al. The impact of frailty on intensive care unit outcomes: A systematic review and meta-analysis. Intensive Care Med. 2017; 43:1105–1122
2. Tipping CJ, Harrold M, Holland A, et al. The effects of active mobilisation and rehabilitation in ICU on mortality and function: A systematic review. Intensive Care Med. 2017; 43:171–183
3. Kim SY, Kim S, Cho J, et al. A deep learning model for real-time mortality prediction in critically ill children. Crit Care. 2019; 23:279
4. Vincent JL, Moreno R, Takala J, et al. The SOFA (sepsis-related organ failure assessment) score to describe organ dysfunction/failure. On behalf of the working group on sepsis-related problems of the european society of intensive care medicine. Intensive Care Med. 1996; 22:707–710
5. Le Gall JR, Lemeshow S, Saulnier F. A new simplified acute physiology score (SAPS II) based on a European/North American multicenter study. JAMA. 1993; 270:2957–2963
6. Knaus WA, Zimmerman JE, Wagner DP, et al. APACHE-acute physiology and chronic health evaluation: A physiologically based classification system. Crit Care Med. 1981; 9:591–597
7. Lemeshow S, Teres D, Avrunin JS, et al. Refining intensive care unit outcome prediction by using changing probabilities of mortality. Crit Care Med. 1988; 16:470–477
8. Ferreira FL, Bota DP, Bross A, et al. Serial evaluation of the SOFA score to predict outcome in critically ill patients. JAMA. 2001; 286:1754–1758
9. Singer M, Deutschman CS, Seymour CW, et al. The third international consensus definitions for sepsis and septic shock (sepsis-3). JAMA. 2016; 315:801–810
10. Rhodes A, Evans LE, Alhazzani W, et al. Surviving sepsis campaign: International guidelines for management of sepsis and septic shock: 2016. Crit Care Med. 2017; 45:486–552
11. Dieplinger B, Egger M, Leitner I, et al. Interleukin 6, galectin 3, growth differentiation factor 15, and soluble ST2 for mortality prediction in critically ill patients. J Crit Care. 2016; 34:38–45
12. Basile-Filho A, Lago AF, Menegueti MG, et al. The use of APACHE II, SOFA, SAPS 3, C-reactive protein/albumin ratio, and lactate to predict mortality of surgical critically ill patients: A retrospective cohort study. Medicine (Baltimore). 2019; 98:e16204
13. Molano Franco D, Arevalo-Rodriguez I, Roqué I Figuls M, et al. Plasma interleukin-6 concentration for the diagnosis of sepsis in critically ill adults. Cochrane Database Syst Rev. 2019; 4:CD011811
14. Tanaka T, Narazaki M, Kishimoto T. IL-6 in inflammation, immunity, and disease. Cold Spring Harb Perspect Biol. 2014; 6:a016295
15. Sun J, Axelsson J, Machowska A, et al. Biomarkers of cardiovascular disease and mortality risk in patients with advanced CKD. Clin J Am Soc Nephrol. 2016; 11:1163–1172
16. Song J, Park DW, Moon S, et al. Diagnostic and prognostic value of interleukin-6, pentraxin 3, and procalcitonin levels among sepsis and septic shock patients: A prospective controlled study according to the Sepsis-3 definitions. BMC Infect Dis. 2019; 19:968
17. Qiao Z, Wang W, Yin L, et al. Using IL-6 concentrations in the first 24 h following trauma to predict immunological complications and mortality in trauma patients: A meta-analysis. Eur J Trauma Emerg Surg. 2018; 44:679–687
18. Hager S, Foldenauer AC, Rennekampff H-O, et al. Interleukin-6 serum levels correlate with severity of burn injury but not with gender. J Burn Care Res. 2018; 39:379–386
19. Lin YH, Glei D, Weinstein M, et al. Additive value of interleukin-6 and C-reactive protein in risk prediction for all-cause and cardiovascular mortality among a representative adult cohort in Taiwan. J Formos Med Assoc. 2017; 116:982–992
20. Naffaa M, Makhoul BF, Tobia A, et al. Interleukin-6 at discharge predicts all-cause mortality in patients with sepsis. Am J Emerg Med. 2013; 31:1361–1364
21. Shukeri WFWM, Ralib AM, Abdulah NZ, et al. Sepsis mortality score for the prediction of mortality in septic patients. J Crit Care. 2018; 43:163–168
22. Bone RC, Balk RA, Cerra FB, et al. Definitions for sepsis and organ failure and guidelines for the use of innovative therapies in sepsis. The ACCP/SCCM Consensus Conference Committee. American College of Chest Physicians/Society of Critical Care Medicine. Chest. 1992; 101:1644–1655
23. Nasir N, Jamil B, Siddiqui S, et al. Mortality in Sepsis and its relationship with Gender. Pak J Med Sci. 2015; 31:1201–1206
24. Mörs K, Braun O, Wagner N, et al. Influence of gender on systemic IL-6 levels, complication rates and outcome after major trauma. Immunobiology. 2016; 221:904–910
25. Youden WJ. Index for rating diagnostic tests. Cancer. 1950; 3:32–35
26. Zemek R, Barrowman N, Freedman SB, et al.; Pediatric Emergency Research Canada (PERC) Concussion Team. Clinical risk score for persistent postconcussion symptoms among children with acute concussion in the ED. JAMA. 2016; 315:1014–1025
27. Obuchowski NA, Lieber ML, Wians FH Jr. ROC curves in clinical chemistry: Uses, misuses, and possible solutions. Clin Chem. 2004; 50:1118–1125
28. Rose-John S. Interleukin-6 family cytokines. Cold Spring Harb Perspect Biol. 2018; 10:1–18
29. Heinrich PC, Behrmann I, Haan S, et al. Principles of interleukin (IL)-6-type cytokine signalling and its regulation. Biochem J. 2003; 374:1–20
30. Tanaka T, Narazaki M, Kishimoto T. Immunotherapeutic implications of IL-6 blockade for cytokine storm. Immunotherapy. 2016; 8:959–970
31. Wacker C, Prkno A, Brunkhorst FM, et al. Procalcitonin as a diagnostic marker for sepsis: A systematic review and meta-analysis. Lancet Infect Dis. 2013; 13:426–435