Original Articles: Hepatology

Development of a Prognostic Score to Predict Mortality in Patients With Pediatric Acute Liver Failure

Lee, Eun Joo; Kim, Ju Whi; Moon, Jin Soo; Kim, Yu Bin; Oh, Seak Hee; Kim, Kyung Mo; Ko, Jae Sung

Journal of Pediatric Gastroenterology and Nutrition: June 2020 - Volume 70 - Issue 6 - p 777-782
doi: 10.1097/MPG.0000000000002625

Abstract

See “Prognostication in Paediatric Acute Liver Failure: Are We Dynamic Enough?” by Fitzpatrick on page 757.

What Is Known

  • Laboratory variables have been identified as prognostic markers in pediatric acute liver failure, and attempts have been made to incorporate these variables into scoring systems. Current prognostic scoring systems do not adequately predict the likelihood of death.

What Is New

  • Our pediatric acute liver failure-delta score, which uses changes in laboratory values, can accurately predict mortality in patients with pediatric acute liver failure.

Pediatric acute liver failure (PALF) is a rare but life-threatening disorder (1,2). Several prognostic laboratory values have been identified, and a number of attempts have been made to incorporate them into PALF prognostic scoring systems. These scoring systems include the King's College Hospital criteria (KCHC), the Pediatric End-Stage Liver Disease (PELD)/Model for End-Stage Liver Disease (MELD) score, and Liver Injury Units (LIU) (3,4). Most published studies using these prognostic models have regarded the combination of liver transplantation (LT) and death as poor outcome. As the natural course of PALF has not been fully characterized and patients who received LT might have recovered without it, treating both LT and death as poor outcome is a significant limitation of previous studies (5). Furthermore, prognostic scoring models using static clinical and biochemical parameters obtained either at admission or from peak values on subsequent days, including KCHC (6,7) and LIU (3), do not reliably predict mortality.

Acute liver failure (ALF) can follow a dramatic course and deteriorate rapidly. Several adult ALF studies aimed at predicting poor outcomes have focused on serial laboratory measurements and changes in hepatic encephalopathy (8–12). The change in serial laboratory values may be an ideal prognostic factor, but no studies in PALF patients have examined its prognostic accuracy.

The aim of this study was to develop a new prognostic score based on changes in serial laboratory values in nontransplanted PALF patients and to validate the prognostic accuracy of this new score through an independent cohort.

METHODS

Study Population

We reviewed the medical records of patients under 18 years of age with PALF who were seen from 2000 to 2018 at the Seoul National University Children Hospital (SNUCH) and the Asan Medical Center (AMC). PALF was defined as the acute onset of severe liver injury in the absence of preexisting liver disease, with uncorrectable coagulopathy (defined as an international normalized ratio for prothrombin time [INR] ≥2.0 irrespective of the presence or absence of clinical encephalopathy, or INR ≥1.5 in the presence of encephalopathy) (5). Patients with ischemic liver disease caused by multiorgan failure were excluded. This study was approved by the institutional review board at each hospital.
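
The coagulopathy criterion in this definition can be written as a simple predicate. The following is a minimal sketch, not the authors' screening procedure: the function name is hypothetical, and it assumes the INR passed in is already the uncorrectable value (the correction attempt itself is not modeled).

```python
def meets_palf_coagulopathy(inr: float, encephalopathy: bool) -> bool:
    """Coagulopathy criterion as stated in the text.

    INR >= 2.0 qualifies regardless of encephalopathy;
    INR >= 1.5 qualifies only when encephalopathy is present.
    Assumption: `inr` is the value that could not be corrected.
    """
    if inr >= 2.0:
        return True
    return encephalopathy and inr >= 1.5


# Hypothetical examples
print(meets_palf_coagulopathy(2.3, encephalopathy=False))  # True
print(meets_palf_coagulopathy(1.7, encephalopathy=False))  # False
print(meets_palf_coagulopathy(1.7, encephalopathy=True))   # True
```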

The derivation and validation cohorts were derived from consecutive nontransplant patients in SNUCH and AMC, respectively.

Laboratory Data

Daily morning laboratory records were obtained for up to 7 days after the diagnosis of ALF. The following data were included:

  • 1. Total bilirubin (TB) (mg/dL), INR, aspartate transaminase (AST) (IU/L), and alanine transaminase (ALT) (IU/L) at enrollment
  • 2. Peak TB, peak INR, peak ammonia (μmol/L)
    • a. The difference between the peak TB and TB at enrollment (ie, Δpeak TB)
    • b. The difference between the peak INR and INR at enrollment (ie, Δpeak INR)
    • c. The maximum change in serial TB (ie, Δdaily TB)
    • d. The maximum change in serial INR level (ie, Δdaily INR)
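
The delta variables listed above can be sketched from a patient's serial morning values. This is an illustration under stated assumptions rather than the authors' exact computation: the first element of each series is taken as the enrollment value, and "maximum change in serial values" is interpreted as the largest increase between consecutive daily measurements, which the text does not spell out. The function names and the sample values are hypothetical.

```python
from typing import Sequence


def delta_peak(values: Sequence[float]) -> float:
    """Peak value minus the enrollment value (first element of the series)."""
    return max(values) - values[0]


def delta_daily(values: Sequence[float]) -> float:
    """Largest increase between consecutive daily measurements.

    Assumption: this is how "maximum change in serial values" is read here;
    the source does not define it more precisely.
    """
    if len(values) < 2:
        return 0.0
    return max(later - earlier for earlier, later in zip(values, values[1:]))


# Hypothetical serial total bilirubin (mg/dL), enrollment first
tb_series = [5.0, 6.5, 9.0, 8.0]
print(delta_peak(tb_series))   # 4.0  (9.0 - 5.0)
print(delta_daily(tb_series))  # 2.5  (6.5 -> 9.0)
```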

Existing Prognostic Score

Patients in SNUCH were divided into 3 groups (spontaneous recovery vs nonsurvival vs LT), and these groups were evaluated for existing prognostic scores.

PELD/MELDs (PELDs for children under 12 years old, MELDs for older children) were assessed at enrollment with an online calculator. The original LIU score was calculated using the peak laboratory values (LIU = [3.507 × peak TB (mg/dL)] + [45.51 × peak INR] + [0.254 × peak ammonia (μmol/L)]), whereas the aLIU score was based on TB and INR at admission, without ammonia (aLIU = [8.4 × TB (mg/dL) at admission] + [50.5 × INR at admission]) (4). We derived the aLIU from the laboratory values at enrollment instead of values at admission. Daily PELD/MELDs and daily LIU scores were also obtained from the daily morning laboratory values, and the peak of each was chosen (peak PELD/MELDs, highest daily LIU [hdLIU] score). KCHC fulfillment was assessed over the course of 7 days after the diagnosis of ALF.
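
The LIU-based scores described above reduce to short linear formulas. The sketch below uses the coefficients quoted in the text; the function names and input values are hypothetical, the aLIU is assumed to take TB and INR from a single baseline time point, and the hdLIU is assumed to be the maximum of the LIU formula applied to each day's values.

```python
from typing import Iterable, Tuple


def liu(peak_tb: float, peak_inr: float, peak_ammonia: float) -> float:
    """Original LIU score from peak TB (mg/dL), peak INR, peak ammonia (umol/L)."""
    return 3.507 * peak_tb + 45.51 * peak_inr + 0.254 * peak_ammonia


def aliu(tb: float, inr: float) -> float:
    """Admission LIU (no ammonia term); here evaluated at a baseline time point."""
    return 8.4 * tb + 50.5 * inr


def highest_daily_liu(daily: Iterable[Tuple[float, float, float]]) -> float:
    """Assumed hdLIU: the largest LIU computed from each day's (TB, INR, ammonia)."""
    return max(liu(tb, inr, ammonia) for tb, inr, ammonia in daily)


# Hypothetical patient values
print(round(liu(10.0, 3.0, 100.0), 3))   # 197.0
print(round(aliu(10.0, 3.0), 3))         # 235.5
print(round(highest_daily_liu([(5.0, 2.0, 80.0), (10.0, 3.0, 100.0)]), 3))  # 197.0
```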

Statistical Analyses and New Score

Continuous variables were expressed as mean (SD) or median (interquartile range, IQR) according to their distribution. Data were analyzed using SPSS software V.22.0 and MedCalc V18.2.1. Categorical variables were compared using the chi-squared or Fisher exact test, and continuous variables were compared using the t-test or Mann-Whitney U test.

Laboratory variables that differed significantly by prognosis (spontaneous recovery versus nonsurvival) in univariate analysis were identified, and receiver operating characteristic (ROC) analysis was applied to these variables. The new score was developed with a multivariable logistic regression model using the laboratory variables that showed P <0.05 in the ROC analysis. The logistic model was fitted by forward selection based on the likelihood ratio test.

The Hosmer-Lemeshow test was used to evaluate the goodness of fit of the new model, with P >0.05 indicating an adequate fit. The accuracy of the new score was compared with that of existing scores using the area under the ROC curve (AUC), and the cut-off was determined by the Youden index. The performance of the new score was then validated in the cohort of nontransplanted patients from AMC.
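
To make the AUC and Youden index steps concrete, the following is a small self-contained sketch on toy data. It is not the software used in the study (the analyses above were run in SPSS and MedCalc); the function names, the rule "score ≥ cut-off means predicted death," and the data are all assumptions for illustration. The Youden index is J = sensitivity + specificity − 1, and the chosen cut-off is the one that maximizes J.

```python
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outranks a random negative
    (Mann-Whitney formulation; ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    A score >= cutoff is classified as positive. On ties in J, the lowest
    cutoff is kept.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Toy data: label 1 = nonsurvival, 0 = spontaneous recovery (hypothetical)
scores = [-3.0, -2.0, -1.0, 0.5, -0.5, 1.0, 2.0, 3.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(auc(scores, labels))            # 0.9375
print(youden_cutoff(scores, labels))  # (-0.5, 1.0, 0.75)
```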

RESULTS

A total of 146 medical charts of patients with PALF were reviewed in this study: 69 patients were admitted to SNUCH and 77 to AMC (Table 1). The 2 populations did not differ significantly in age, sex, or etiology. SNUCH had significantly lower ALT and higher Δpeak INR and Δdaily INR than AMC, whereas AMC had more patients who received LT (57.1% vs 36.2%, P = 0.012). The median time to LT after admission was 4 days (IQR 2–11) in AMC and 5 days (IQR 3–15) in SNUCH (P = 0.75). The incidences of spontaneous recovery and nonsurvival did not differ significantly between the 2 centers, and the median time to death after admission was 12 days in SNUCH (IQR 6–15) and 14 days in AMC (IQR 7–23) (P = 0.15). Patients of both centers who died within 7 days after admission (7/37, 18.9%) had significantly lower Δpeak TB than those who died afterward (0.25, IQR 0–7.4 vs 7.6, IQR 2.8–14.2). Their peak ammonia was also higher (254, IQR 135–283 vs 156, IQR 119–231), but this difference was not statistically significant (P = 0.06). There were no significant differences in age, sex, etiology, or the other laboratory values.

TABLE 1:
Baseline characteristics of patients at Seoul National University Children Hospital and Asan Medical Center

We reanalyzed the overall outcome of the 2 centers combined, dividing the study period into 3 periods: 2000–2008 (9 years), 2009–2014 (6 years), and 2015–2018 (4 years). The incidence of spontaneous recovery was relatively consistent over the first 2 periods (22% in 2000–2008 and 23% in 2009–2014) but increased significantly in the last period, reaching 53% in 2015–2018 (P = 0.01). Nonsurvival decreased across the 3 periods (33%, 22%, 20%), but this difference was not statistically significant. The cumulative incidence of LT initially increased (45%, 58%), but this was followed by a statistically significant decline in the last 4 years (58% vs 27%, P = 0.004).

Existing Prognostic Score

We excluded from the SNUCH cohort 2 patients who died on the day of diagnosis or the following day. Both patients showed an advanced state of multiorgan failure at admission and died within 12 hours of hospitalization (1 Epstein-Barr virus infection, 1 indeterminate). The remaining 67 consecutive patients of SNUCH were divided into 3 groups for further analysis (spontaneous recovery vs nonsurvival vs LT) (Table 2). Δdaily TB, peak INR, Δpeak INR, Δdaily INR, aLIU, LIU, hdLIU, and peak PELD/MELDs differed significantly between the spontaneous recovery group and the other 2 groups. Δpeak TB and peak ammonia were significantly higher in the nonsurvival group than in the spontaneous recovery group. The LT group had significantly higher PELD/MELDs at enrollment than the spontaneous recovery group but lower Δpeak TB and peak ammonia than the nonsurvival group. ALT at enrollment was lower in the nonsurvival group than in the other 2 groups, but the difference did not reach statistical significance (all P > 0.05).

TABLE 2:
Comparison of characteristics and existing prognostic models of Seoul National University Children Hospital cohort

The nonsurvival group took significantly longer to reach peak TB than the spontaneous recovery group (5.0 days vs 1.0 day, P < 0.01). The time to peak INR and peak ammonia was significantly shorter in the spontaneous recovery group than in the other 2 groups.

A significantly higher proportion of patients in the nonsurvival group required a ventilator (76.2%) and a vasopressor (76.2%) than in the other 2 groups. Renal replacement therapy was more frequent in the nonsurvival group (47.6%) than in the spontaneous recovery group. There was no difference in age, sex, height, weight, or etiology among the 3 groups.

Among 21 patients in the nonsurvival group, 17 (81%) patients were not listed for LT because of medically unstable conditions (n = 14) and underlying disease (n = 3). Four children were listed but died awaiting LT with waiting times ranging from 13 to 20 days.

aLIU, hdLIU, LIU score, and peak PELD/MELDs were significantly lower in patients with spontaneous recovery than in patients with nonsurvival and LT (Table 2). Three of these scores had a statistically significant ROC curve in nontransplanted patients, with AUCs of 0.800 for the hdLIU score (95% CI 0.648–0.908), 0.819 for the LIU score (95% CI 0.669–0.920), and 0.788 for peak PELD/MELDs (95% CI 0.634–0.899) (Table 3).

TABLE 3:
Predictive ability of significant laboratory values, existing prognostic models and PALF-Ds in non-transplanted patients (derivation cohort of SNUCH)

KCHC, PELD/MELDs at enrollment, and aLIU had poor discrimination for mortality: the AUC was 0.690 for fulfillment of KCHC, 0.641 for PELD/MELDs at enrollment, and 0.662 for the aLIU score (Table 3).

Development of New Score

Laboratory values significantly associated with death (peak TB, Δpeak TB, Δdaily TB, peak INR, Δpeak INR, Δdaily INR, and peak ammonia) were subjected to ROC analysis. All 7 values showed acceptable or good discrimination, and the best cut-off for each is shown in Table 3. Multivariable logistic regression with forward selection was conducted to predict death. A new score, the PALF-Delta score (PALF-Ds), was developed using the β coefficients of the laboratory values retained in the logistic regression.

PALF-Delta score (PALF-Ds) = [0.232 × Δpeak TB (mg/dL)] + [2.263 × Δdaily INR] + [0.013 × peak ammonia (μmol/L)] − 4.498

The Hosmer-Lemeshow test (P = 0.92) indicated a good fit of the new model. PALF-Ds showed excellent accuracy, with an AUC of 0.918 (95% CI 0.792–0.980, P < 0.0001). The best cut-off was 0.02, with balanced sensitivity and specificity (sensitivity 81%, specificity 91%). We applied the new model to the derivation cohort at different time points from day 3 to day 6 after enrollment, and it showed acceptable AUCs of 0.844 (95% CI 0.698–0.937) at day 3, 0.846 (95% CI 0.701–0.938) at day 4, 0.889 (95% CI 0.754–0.965) at day 5, and 0.909 (95% CI 0.780–0.976) at day 6.
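
As a worked illustration, the PALF-Ds formula and its reported cut-off of 0.02 can be sketched as follows. The coefficients and cut-off are taken from the text; everything else is an assumption: the function names and patient values are hypothetical, scores at or above the cut-off are taken to flag high risk, and the conversion to a probability treats the score as the log-odds of the logistic model, which the text implies by building the score from regression coefficients but does not state explicitly.

```python
import math

PALF_DS_CUTOFF = 0.02  # best cut-off reported in the text


def palf_ds(delta_peak_tb: float, delta_daily_inr: float, peak_ammonia: float) -> float:
    """PALF-Delta score from dpeak TB (mg/dL), ddaily INR, peak ammonia (umol/L)."""
    return (0.232 * delta_peak_tb
            + 2.263 * delta_daily_inr
            + 0.013 * peak_ammonia
            - 4.498)


def is_high_risk(score: float, cutoff: float = PALF_DS_CUTOFF) -> bool:
    """Assumption: scores at or above the cut-off indicate predicted nonsurvival."""
    return score >= cutoff


def implied_probability(score: float) -> float:
    """Assumption: the score is the logit of the fitted model, so the logistic
    function maps it to a probability. Not stated explicitly in the source."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical patients
high = palf_ds(delta_peak_tb=8.0, delta_daily_inr=1.0, peak_ammonia=150.0)
low = palf_ds(delta_peak_tb=1.0, delta_daily_inr=0.2, peak_ammonia=80.0)
print(round(high, 3), is_high_risk(high))  # 1.571 True
print(round(low, 4), is_high_risk(low))    # -2.7734 False
```

Under the log-odds assumption, a score of 0 corresponds to a probability of 0.5, so the reported cut-off of 0.02 sits very close to the even-odds point.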

Investigation of models using both “LT and non-survival groups,” or “LT group only” as poor outcome did not yield significant results, and no equation could be derived.

Comparison Between New Score and Existing Prognostic Score

The AUC of PALF-Ds (0.918) was higher than those of all 7 existing scores (Table 3). The prognostic accuracy of PALF-Ds was significantly superior to that of KCHC, PELD/MELDs at enrollment, aLIU, and hdLIU (all P < 0.05), although the differences from LIU and peak PELD/MELDs were not statistically significant (P = 0.09 and P = 0.08, respectively) (Fig. 1).

FIGURE 1:
Receiver operating characteristic curves for King's College Hospital criteria, Pediatric End-Stage Liver Disease/Model for End-Stage Liver Diseases at enrollment, aLIU at enrollment, Liver Injury Units, highest daily Liver Injury Units, peak Pediatric End-Stage Liver Disease/Model for End-Stage Liver Diseases and pediatric acute liver failure-Delta score (derivation cohort of Seoul National University Children Hospital).

Validation of the New Score

PALF-Ds was applied to the nontransplanted validation cohort of AMC and retained good discrimination, with an AUC of 0.947 (95% CI 0.809–0.995, sensitivity 100%, specificity 89%).

DISCUSSION

This study was performed in the 2 largest pediatric liver transplant centers in Korea. These centers did not differ much in patient age, sex, etiology, or rates of survival and death. The cumulative incidence of death and LT declined over the past decade in our cohorts, which may be related to recent improvements in intensive medical care, similar to findings reported in Western studies (5,21).

A number of prognostic scoring systems for adult ALF have been suggested, but their applicability to PALF is limited because of differences in cause, definition, and age (5).

An ideal prognostic model should reflect the dynamic nature of PALF (1). There is growing interest in developing new scores based on changes in laboratory values. The ALF early dynamic (ALFED) model for predicting death in adult ALF was based on the dynamics of arterial ammonia, TB, INR, and hepatic encephalopathy greater than grade II over 3 days (10). When the ALFED model was applied to PALF patients, its specificity was good, but its sensitivity was poor in predicting LT (13). A recent study of a cohort with paracetamol-induced adult ALF developed both a day 1 model using clinical variables on admission and a day 2 model that added the changes in those variables after admission. Both models showed good discrimination, but the day 2 model had better calibration than the day 1 model. These results demonstrated that the dynamic course of ALF is important for predicting mortality (14).

PALF is a rare and extremely heterogeneous disease; therefore, prognostic variables should be easy to measure and not open to biased interpretation. We hypothesized that changes in laboratory values yield more prognostic accuracy than static values, and our results showed that the change in TB and INR was significantly associated with mortality.

TB and INR have been studied as prognostic factors in PALF (15–20), but this is the first study to quantify the change in TB and INR as numerical values and to develop a new score using these values.

We compared our new PALF-Ds with existing scores widely used in PALF studies. KCHC has been extensively studied and applied to identify patients who need urgent LT. Most recently, a meta-analysis of the KCHC showed an AUC of 0.76 with 58% sensitivity and 74% specificity for predicting mortality in adult patients with nonacetaminophen ALF (7). Using a PALF study group database, Sundaram et al (6) reported that the sensitivity and specificity of KCHC in predicting death were 61% and 70%, respectively. The sensitivity of KCHC in our derivation cohort was 48%, and our finding supports the view that KCHC does not reliably predict mortality in patients with PALF.

PELD/MELD scores have also been used as prognostic scores for poor outcome in PALF patients. Sanchez and D’Agostino (16) reported that the PELD score on admission had excellent diagnostic accuracy, with an AUC of 0.88 (specificity of 81%, sensitivity of 86%). Rajanayagam et al (15) showed that a peak PELD/MELD score with a cut-off of 42 had an AUC of 0.856 for poor outcome, with 66% sensitivity and 92% specificity. Our cut-off for peak PELD/MELD was 31.4, which achieved an AUC of 0.788 (specificity of 62%, sensitivity of 90%). Whereas these 2 studies validating the PELD score included LT in the poor outcome, we excluded children who underwent LT from the poor outcome.

LIU and aLIU scores were designed by incorporating both LT and death into poor outcome (3). A recent large multinational study validating LIU showed that the LIU score predicted LT better than it predicted death (AUC for LT 0.84, AUC for death 0.76) (4). In our study, the AUCs for mortality of the LIU and hdLIU scores were 0.819 and 0.800, respectively, whereas the aLIU score predicted mortality less accurately.

In our derivation cohort, Δpeak TB, Δdaily TB, Δpeak INR, and Δdaily INR were significantly higher in patients who died than in patients with spontaneous recovery. Among them, Δpeak TB, Δdaily TB, and Δdaily INR had good discrimination (AUC >0.8). This result indicates that the change in laboratory values has an important prognostic role in predicting mortality.

We developed a new score to predict mortality more accurately than existing prognostic scores. The new PALF-Ds had the highest predictive accuracy among all prognostic scores in the derivation cohort, and this accuracy was maintained in the validation cohort (AUC 0.918 and 0.947, respectively).

As most decisions on LT listing in PALF occur within the first few days of admission (21), our new score may be less applicable in clinical settings. Our model was developed using data up to day 7 after enrollment and showed good discriminatory power, but diagnostic accuracy from day 3 to day 6 after enrollment was also acceptable (AUC 0.844–0.909). The cumulative incidence of LT is declining with the improvement of conservative medical therapies. Thus, the dynamic course of PALF and a sequential approach to it may become more important for LT decisions in the years ahead.

In conclusion, our finding that PALF-Ds had a higher predictive accuracy than other existing prognostic scores suggests that a prognostic scoring system using the change in TB and INR may be useful in predicting mortality in patients with PALF. However, our findings should be interpreted with caution because of the limited sample size. Further research focusing on the dynamics of PALF is needed. We also need to validate our new prognostic system in larger and more diverse populations.

REFERENCES

1. Li R, Belle SH, Horslen S, et al. Pediatric Acute Liver Failure Study Group. Clinical course among cases of acute liver failure of indeterminate diagnosis. J Pediatr 2016; 171:163.e1–170.e3.
2. Devictor D, Tissieres P, Durand P, et al. Acute liver failure in neonates, infants and children. Expert Rev Gastroenterol Hepatol 2011; 5:717–729.
3. Liu E, MacKenzie T, Dobyns EL, et al. Characterization of acute liver failure and development of a continuous risk of death staging system in children. J Hepatol 2006; 44:134–141.
4. Lu BR, Zhang S, Narkewicz MR, et al. Pediatric Acute Liver Failure Study Group. Evaluation of the liver injury unit scoring system to predict survival in a multinational study of pediatric acute liver failure. J Pediatr 2013; 162:1010.e1–1016.e4.
5. Jain V, Dhawan A. Prognostic modeling in pediatric acute liver failure. Liver Transpl 2016; 22:1418–1430.
6. Sundaram V, Shneider BL, Dhawan A, et al. King's College Hospital criteria for non-acetaminophen induced acute liver failure in an international cohort of children. J Pediatr 2013; 162:319.e1–323.e1.
7. McPhail MJ, Farne H, Senvar N, et al. Ability of King's College Criteria and model for end-stage liver disease scores to predict mortality of patients with acute liver failure: a meta-analysis. Clin Gastroenterol Hepatol 2016; 14:516.e5–525.e5.
8. Harrison PM, O’Grady JG, Keays RT, et al. Serial prothrombin time as prognostic indicator in paracetamol induced fulminant hepatic failure. BMJ 1990; 301:964–966.
9. Schiodt FV, Ostapowicz G, Murray N, et al. Alpha-fetoprotein and prognosis in acute liver failure. Liver Transpl 2006; 12:1776–1781.
10. Kumar R, Shalimar, Sharma H, et al. Prospective derivation and validation of early dynamic model for predicting outcome in patients with acute liver failure. Gut 2012; 61:1068–1075.
11. Kumar R, Shalimar, Sharma H, et al. Persistent hyperammonemia is associated with complications and poor outcomes in patients with acute liver failure. Clin Gastroenterol Hepatol 2012; 10:925–931.
12. Figorilli F, Putignano A, Roux O, et al. Development of an organ failure score in acute liver failure for transplant selection and identification of patients at high risk of futility. PLoS One 2017; 12:e0188151.
13. Uchida H, Sakamoto S, Fukuda A, et al. Sequential analysis of variable markers for predicting outcomes in pediatric patients with acute liver failure. Hepatol Res 2017; 47:1241–1251.
14. Bernal W, Wang Y, Maggs J, et al. Development and validation of a dynamic outcome prediction model for paracetamol-induced acute liver failure: a cohort study. Lancet Gastroenterol Hepatol 2016; 1:217–225.
15. Rajanayagam J, Coman D, Cartwright D, et al. Pediatric acute liver failure: etiology, outcomes, and the role of serial pediatric end-stage liver disease scores. Pediatr Transplant 2013; 17:362–368.
16. Sanchez MC, D’Agostino DE. Pediatric end-stage liver disease score in acute liver failure to assess poor prognosis. J Pediatr Gastroenterol Nutr 2012; 54:193–196.
17. Squires RH Jr, Shneider BL, Bucuvalas J, et al. Acute liver failure in children: the first 348 patients in the pediatric acute liver failure study group. J Pediatr 2006; 148:652–658.
18. Lee WS, McKiernan P, Kelly DA. Etiology, outcome and prognostic indicators of childhood fulminant hepatic failure in the United Kingdom. J Pediatr Gastroenterol Nutr 2005; 40:575–581.
19. Dhawan A, Cheeseman P, Mieli-Vergani G. Approaches to acute liver failure in children. Pediatr Transplant 2004; 8:584–588.
20. Durand P, Debray D, Mandel R, et al. Acute liver failure in infancy: a 14-year experience of a pediatric liver transplantation center. J Pediatr 2001; 139:871–876.
21. Squires RH, Squires JE, Rudnick DA, et al. Liver transplant listing in pediatric acute liver failure: practices and participant characteristics. Hepatology 2018; 68:2338–2347.
Keywords:

acute liver failure; pediatric; predictive model; prognosis

Copyright © 2020 by European Society for Pediatric Gastroenterology, Hepatology, and Nutrition and North American Society for Pediatric Gastroenterology, Hepatology, and Nutrition