Who’s at Risk? A Prognostic Model for Severity Prediction in Pediatric Acute Pancreatitis

Objectives: The aim of the study was to validate and optimize a severity prediction model for acute pancreatitis (AP) and to examine blood urea nitrogen (BUN) level changes from admission as a severity predictor. Study Design: Patients from 2 hospitals were included for the validation model (Children’s Hospital of the King’s Daughters and Children’s National Hospital). Children’s Hospital of the King’s Daughters and Cincinnati Children’s Hospital Medical Center data were used for analysis of BUN at 24 to 48 hours. Results: The validation cohort included 73 patients; 22 (30%) with either severe or moderately severe AP, combined into the all severe AP (SAP) group. Patients with SAP had higher BUN (P = 0.002) and lower albumin (P = 0.005). Admission BUN was confirmed as a significant predictor (P = 0.005) of SAP (area under the receiver operating characteristic [AUROC] 0.73, 95% confidence interval [CI] 0.60–0.86). Combining BUN (P = 0.005) and albumin (P = 0.004) resulted in better prediction for SAP (AUROC 0.83, 95% CI 0.72–0.94). A total of 176 AP patients were analyzed at 24–48 hours; 39 (22%) met criteria for SAP. Patients who developed SAP had a significantly higher BUN (P < 0.001) after 24 hours. Elevated BUN levels within 24 to 48 hours were independently predictive of developing SAP (AUROC: 0.76, 95% CI: 0.66–0.85). Patients who developed SAP had a significantly smaller percentage decrease in BUN from admission to 24 to 48 hours (P = 0.002). Conclusion: We externally validated the prior model with admission BUN levels and further optimized it by incorporating albumin. We also found that persistent elevation of BUN is associated with development of SAP. Our model can be used to risk stratify patients with AP on admission and again at 24 to 48 hours.

What Is Known
Severe acute pancreatitis is associated with persistent organ dysfunction and may require interventions not available in a community setting. In a single-center study, blood urea nitrogen on admission has been shown as a significant predictor of disease severity.

What Is New
In a multicenter study, the role of blood urea nitrogen obtained on admission in predicting progression to severe acute pancreatitis has been validated. This model was further optimized with the inclusion of albumin. Persistent elevation of blood urea nitrogen after 24 to 48 hours was further associated with progression to severe acute pancreatitis.

Acute pancreatitis (AP) represents a significant disease burden in the pediatric population. The estimated incidence increased over the past 2 decades and has now stabilized at greater than 1 in 10,000, with significant health and economic implications (1,2). Although there has been increased attention to pediatric AP in recent years, pediatric data regarding optimal management remain limited. The first management guidelines for pediatric AP were published by the North American Society for Pediatric Gastroenterology, Hepatology and Nutrition (NASPGHAN) in 2018 (3). These guidelines noted that several potential predictive models of pediatric AP severity had previously been published, some of which showed varying specificity and sensitivity upon validation (3). The authors concluded that further investigation was required to identify predictive markers of severity on admission with models that can be replicated at independent sites (3). In 2017, the NASPGHAN Pancreas Committee published severity classification guidelines for pediatric AP, providing a consensus definition of mild, moderately severe, and severe AP (SAP) in pediatrics for the first time (4). Now that a commonly shared definition of severity has been established, predictive models that use this classification are needed.
Following the publication of the 2017 severity guidelines, Vitale et al (5) developed a model to predict severity utilizing internal data from a prospective first attack-AP registry at Cincinnati Children's Hospital Medical Center (CCHMC) (5). The model characterized patients based on the severity of their disease and examined clinical parameters on admission in an attempt to identify predictors of AP severity (5). From their model-building efforts, they identified blood urea nitrogen (BUN) on admission as a significant predictor of the development of severe disease in the pediatric population (5). BUN change after fluid resuscitation has been shown to predict severity in prior adult studies, but to our knowledge has not been studied in children (6,7).
The primary aim of this study was to validate and optimize the previously reported model using a separate cohort of patients from 2 distinct children's hospitals (5). Through resuscitation and fluid management, BUN shifts may occur, and the change in BUN has not been studied as a predictor of severity in pediatric AP. Our secondary aim was therefore to reexamine the role of BUN change after admission and to evaluate whether BUN alone after 24 hours of resuscitation would be an independent predictor of AP severity.

METHODS
There were 3 distinct patient populations used to achieve the goals of the study.
For the external validation and optimization cohort, patients younger than 19 years were included from 2 sites between 2012 and 2018: the Children's Hospital of the King's Daughters (CHKD) in Norfolk, VA, and Children's National Hospital (CNH) in Washington, DC. Patients admitted to CHKD were identified retrospectively using billing codes from 2012 to 2017. Patients prospectively enrolled in an ongoing study of first-time AP (ClinicalTrials.gov: NCT03232473) at CNH from 2016 to 2018 were also included. This cohort was designed to externally validate the model previously constructed at CCHMC, and thus data from CCHMC were not included.
For the BUN change analysis, we used data collected from the CCHMC and CHKD cohorts. Data from CNH were not included because the patients were part of an ongoing randomized controlled trial and we did not wish to unintentionally prejudice the results of that study.
Permission was granted by each local institutional review board (IRB). The local measures for identifying patients at each institution are specified below.
The CHKD data represent a retrospective chart review of pediatric patients who presented to the emergency department or the inpatient ward diagnosed with AP. Permission was granted by the local IRB (CHKD IRB 17-04-WC-0096). Patients diagnosed with AP based on International Classification of Diseases, ninth revision (ICD-9) or ICD-10 codes in either the emergency department or inpatient hospital ward were included. The following codes were used: ICD-9 codes = 577.0-577.2.
The CNH data were obtained prospectively between 2016 and 2018 as part of an ongoing randomized controlled trial examining intravenous fluid choice in pediatric AP, which is registered at ClinicalTrials.gov (NCT03242473). Patients with a first episode of AP were identified and enrolled in the study, and the laboratory values on presentation, before randomization, and the ultimate development of mild AP, moderately severe AP, or SAP were included in the analysis. Permission was granted by the local IRB for the study (CNH IRB Pro00007698).
As previously described, the data from CCHMC were prospectively collected on patients who presented with their first episode of AP between March 2013 and January 2017 (CCHMC IRB 2012-4050) (5).
For all of the datasets, the first laboratory values for each patient obtained within 24 hours of presentation were included in the analysis. In all patients, the diagnosis of AP was confirmed using the International Study Group of Pediatric Pancreatitis: In Search for a Cure (INSPPIRE) group diagnostic criteria for pediatric AP originally published in 2012, which have subsequently been endorsed by the Pancreas Committee of NASPGHAN (3,8). Diagnosis required a patient to meet at least 2 of the following 3 criteria: characteristic abdominal pain, amylase or lipase ≥3 times the upper limit of normal for age, and imaging findings consistent with AP (eg, edema, necrosis, hemorrhage, abscess, pseudocyst) (3,8). Severity of AP was classified utilizing the 2017 NASPGHAN criteria, based on the presence or absence of local pancreatic complications or transient (<48 hours) versus persistent (≥48 hours) systemic organ dysfunction (4). Mild AP was defined as the absence of local pancreatic complications or systemic organ dysfunction and the absence of exacerbation of underlying disease (4). Moderately severe AP was defined as transient (<48 hours) organ dysfunction, local pancreatic complications, or exacerbation of underlying disease (4). SAP was defined as involving prolonged (≥48 hours) organ dysfunction (4). Moderately severe AP and SAP were included in a single group labeled SAP for the purposes of statistical analysis to separate the mild cases from the nonmild cases.
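The diagnostic and severity rules described above can be expressed as a short sketch. This is an illustrative encoding only, not the study's actual data pipeline: the function names, argument names, and the representation of organ dysfunction as a duration in hours are all assumptions made for the example.

```python
def meets_ap_diagnosis(characteristic_pain: bool,
                       enzyme_ratio_to_uln: float,
                       imaging_consistent: bool) -> bool:
    """Return True when at least 2 of the 3 diagnostic criteria are met.

    enzyme_ratio_to_uln is the higher of amylase or lipase divided by the
    age-specific upper limit of normal (hypothetical input format).
    """
    criteria = [characteristic_pain, enzyme_ratio_to_uln >= 3.0, imaging_consistent]
    return sum(criteria) >= 2


def classify_severity(organ_dysfunction_hours: float,
                      local_complications: bool,
                      underlying_disease_exacerbation: bool) -> str:
    """Classify severity following the definitions quoted in the text.

    organ_dysfunction_hours is 0 when no organ dysfunction occurred.
    """
    if organ_dysfunction_hours >= 48:
        return "severe"
    if (organ_dysfunction_hours > 0 or local_complications
            or underlying_disease_exacerbation):
        return "moderately severe"
    return "mild"


def analysis_group(severity: str) -> str:
    """Collapse moderately severe and severe cases into one SAP group."""
    return "mild" if severity == "mild" else "SAP"
```

For example, a patient with 12 hours of organ dysfunction and no other complications would be classified as moderately severe and would therefore fall into the combined SAP group for analysis.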

Statistical Analysis Methods
Data were analyzed using SAS, version 9.4 (SAS Institute, Cary, NC). Because of skewed distributions, continuous data were summarized as medians with interquartile ranges (IQR: 25th-75th percentiles), whereas categorical data were summarized as frequency counts and percentages. If laboratory values were reported as below the limit of detection, the values used for analysis were the limit of detection value divided by the square root of 2 (9). For continuous data, nonparametric Wilcoxon-Mann-Whitney tests were used to compare characteristics and laboratory values between groups. Chi-square and Fisher exact tests were used, as appropriate, for group comparisons of categorical data. Logistic regression was used to evaluate predictors of developing SAP. Stepwise selection was performed to identify variables that optimized prediction of developing SAP when combined in a multivariable logistic regression model, based on significant P values and the receiver operating characteristic (ROC) curve. A P value <0.05 was considered statistically significant.
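The substitution rule for values below the limit of detection can be illustrated with a minimal sketch. The study itself used SAS; the Python function below, its name, and the convention of passing `None` for a below-limit result are assumptions made purely for illustration.

```python
import math
from typing import Optional


def substitute_below_lod(value: Optional[float], limit_of_detection: float) -> float:
    """Return the value used for analysis.

    A result reported as below the limit of detection is passed as None
    (hypothetical convention) and replaced by LOD / sqrt(2); measured
    values are returned unchanged.
    """
    if value is None:
        return limit_of_detection / math.sqrt(2)
    return value
```

With a detection limit of 2.0, a below-limit result would be analyzed as approximately 1.41, while a measured value of 5.0 would be left as 5.0.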

Validation of the Severity Model
During the study period, a total of 57 patients at CHKD met the criteria for first episode of AP. There were a total of 16 patients recruited from CNH who met the diagnostic criteria for first episode of AP. For the validation of the previously described model, these 73 total patients were included for analysis (51 mild AP, 15 moderately severe AP, and 7 SAP cases). Moderately severe AP and SAP were included in a single group labeled SAP for this analysis. From CHKD, 16 of 57 patients were included in the SAP cohort, and from CNH, 6 of 16 were included in the SAP cohort.

Baseline Clinical and Biochemical Laboratory Values on Admission Between Mild Acute Pancreatitis and Severe Acute Pancreatitis Cases
Table 1 shows baseline characteristics of the patients in this cohort segregated by AP severity. Patients with mild AP had similar baseline characteristics compared to the SAP group. The SAP group had significantly longer length of stay (LOS), with a median LOS of 147.5 hours (IQR 70-348 hours), compared to the mild AP group, who had a median LOS of 94 hours (IQR 63-154 hours) (P = 0.0497, Table 1). Table 1 also includes a selected sample of laboratory values obtained on admission and the results of the univariate analyses performed to test for group differences. BUN and serum albumin significantly differed between the SAP and mild AP groups (Table 1). One hundred percent of the patients in the dataset had a BUN level measured at the time of admission, and there was a significant difference between BUN levels in the SAP (median 14.5 mg/dL, IQR 11-19 mg/dL) and mild AP (median 11 mg/dL, IQR 8-13 mg/dL) groups (P = 0.002, Table 1). More than 90% of the patients had a serum albumin level measured on admission, and the values were significantly higher in the mild AP (median 4.0 g/dL, IQR 3.6-4.6 g/dL) group when compared to the SAP (median 3.3 g/dL, IQR 3.1-4.2 g/dL) group (P = 0.005, Table 1). The complete list of biochemical parameters examined on admission is included as Supplemental

Higher Values for Blood Urea Nitrogen and Lower Values for Albumin Represent Increasing Levels of Severity
We explored the relationship of BUN and albumin levels across all of the severity classifications (mild AP, moderately severe AP, and SAP), and found a trend toward higher BUN levels and lower albumin levels from mild to moderately severe to the most severe AP group. We found that as the BUN value increased, so did the likelihood of developing SAP (Fig. 1A and B). In addition, as serum albumin decreased, the likelihood of developing SAP increased (Fig. 1C and D).

Validation of the Blood Urea Nitrogen Model
The previously reported model had identified BUN as a significant prognostic marker of severity (area under the receiver operating characteristic [AUROC] curve: 0.75, 95% confidence interval [CI] 0.61-0.89); therefore, one of our aims was to validate BUN as a predictor of severity using a separate cohort of patients (5). We were able to validate the previous model, as BUN was found to be a significant predictor of any form of SAP in our cohort (Fig. 2A

Optimization of the External Validation Model
To optimize the BUN model, we used stepwise selection to evaluate other potential predictor variables from the external validation cohort in a multivariable logistic regression model. Combining BUN (P = 0.005) with serum albumin (P = 0.004) created a better predictive model for SAP (AUROC curve: 0.83, 95% CI: 0.72-0.94, sensitivity 71%, specificity 79%, positive predictive value [PPV] 60%, negative predictive value [NPV] 86%) (Fig. 2B). Threshold admission values of BUN >13 mg/dL and serum albumin <3.6 g/dL were associated with an increased probability of developing SAP.
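For readers less familiar with the four performance measures reported for the combined model, the following sketch shows how each is derived from a 2-by-2 table of predicted versus observed outcomes. The function name is an assumption, and the counts in the usage note are hypothetical round numbers, not data from this study.

```python
def diagnostic_metrics(true_pos: int, false_pos: int,
                       false_neg: int, true_neg: int) -> dict:
    """Compute sensitivity, specificity, PPV, and NPV from 2x2 counts."""
    return {
        # Proportion of SAP cases that the model flags as SAP
        "sensitivity": true_pos / (true_pos + false_neg),
        # Proportion of mild cases that the model flags as mild
        "specificity": true_neg / (true_neg + false_pos),
        # Proportion of flagged patients who actually develop SAP
        "ppv": true_pos / (true_pos + false_pos),
        # Proportion of unflagged patients who remain mild
        "npv": true_neg / (true_neg + false_neg),
    }


# Hypothetical counts for illustration only
example = diagnostic_metrics(true_pos=8, false_pos=2, false_neg=2, true_neg=8)
```

Because PPV and NPV depend on how common SAP is in the cohort, the same sensitivity and specificity can yield different predictive values in a population with a different severity distribution.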
Akaike's Information Criterion (AIC) was calculated for the models and evaluated to help determine which model would perform the best, with a lower score suggesting a better model. For the BUN only model, the AIC was 79.4. For the albumin only model, the AIC was 79.1. For the combined BUN and albumin model, the AIC was 66.2. Therefore, the BUN and albumin combined had the best AIC score, superior AUROC, and significant P values for both BUN and albumin, suggesting that BUN and albumin on admission optimized the model predictive capability.
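The AIC comparison can be made concrete with a brief sketch. The AIC values are those reported above; the general formula AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood, is the standard definition. The variable and function names are assumptions for illustration.

```python
def aic(num_params: int, log_likelihood: float) -> float:
    """Standard definition: AIC = 2k - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood


# AIC values as reported in the text
reported_aic = {
    "BUN only": 79.4,
    "albumin only": 79.1,
    "BUN + albumin": 66.2,
}

# The preferred model is the one with the lowest AIC
best_model = min(reported_aic, key=reported_aic.get)

# Difference of each model from the best one, rounded to 1 decimal place
delta_aic = {name: round(value - reported_aic[best_model], 1)
             for name, value in reported_aic.items()}
```

Here the combined model is selected, and each single-marker model sits roughly 13 AIC units above it.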

Blood Urea Nitrogen Change at 24 to 48 Hours Exploration
A total of 176 primary patients with AP were included for the purpose of this analysis from the 2 study sites as specified in the Methods section. Thirty-nine patients (22%) met criteria for SAP. With respect to clinical presentation and management, there was no statistical difference in age, sex, or rate or type of fluid used (isotonic, hypotonic, or total parenteral nutrition [TPN]) in the first 24 hours between the SAP and mild AP groups (Supplemental Table 2) (Fig. 3). In our validation cohort, a BUN of >20 mg/dL resulted in a 98% specificity for developing SAP, with a PPV of 83% and an NPV of 75% (Supplemental Table 3A, Supplemental Digital Content, http://links.lww.com/MPG/B859), consistent with the previously reported adult findings (6,7). When we examined the BUN change data, we applied this same value to determine whether it would be useful as a cutoff value (Supplemental Table 2, Supplemental Table 3B, Supplemental Digital Content, http://links.lww.com/MPG/B859). In all cohorts and at all time points, applying the cutoff of >20 mg/dL as suggested by the adult literature resulted in a specificity of 98%. Elevated BUN levels within 24 to 48 hours were predictive of developing SAP (AUROC: 0.76, 95% CI: 0.66-0.85) (Supplemental Fig. 1, Supplemental Digital Content, http://links.lww.com/MPG/B859).
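The two quantities examined in this analysis, the percentage decrease in BUN from admission to 24 to 48 hours and the >20 mg/dL cutoff, can be sketched as follows. The function names and the example values are hypothetical and are not drawn from the study data.

```python
def percent_bun_decrease(admission_bun: float, followup_bun: float) -> float:
    """Percentage decrease in BUN from admission to the 24 to 48 hour value.

    A positive result means BUN fell; a negative result means BUN rose.
    """
    if admission_bun <= 0:
        raise ValueError("admission BUN must be positive")
    return (admission_bun - followup_bun) / admission_bun * 100.0


def exceeds_adult_cutoff(bun_mg_dl: float, cutoff_mg_dl: float = 20.0) -> bool:
    """True when BUN is above the >20 mg/dL cutoff discussed in the text."""
    return bun_mg_dl > cutoff_mg_dl
```

For instance, a hypothetical patient whose BUN falls from 20 mg/dL to 15 mg/dL has a 25% decrease, whereas a smaller decrease (or a rise) corresponds to the persistent elevation associated with SAP in this cohort.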

DISCUSSION
Utilizing the 2017 severity classification guidelines, Vitale et al (5) built a model that prospectively predicts the development of SAP based on BUN at admission. We were able to validate this model utilizing a different cohort of patients from 2 separate sites and found that elevated BUN levels on admission were associated with increased severity of disease. Our initial analysis suggested that the addition of albumin optimized the model. In addition, we found that the increases in BUN and decreases in albumin values predicted increasing severity of the disease, from mild to moderately severe AP to SAP.
This model is easy to apply to patient care, involves commonly obtained laboratory samples that can be (and often are) acquired at the time of presentation, and is clinically useful early in presentation (5). These are all characteristics that are generally considered requirements for the development of a strong predictive model (10-12). Optimal validation of the model generally requires an external sample from a similar population to replicate the previously generated results, which we were able to identify at the 2 additional clinical sites, and to generate a discriminating model with a concordance statistic performance that is represented by the area under the ROC curve (10,13,14). An area under the ROC curve of between 0.70 and 0.80 is considered to represent adequate discrimination, whereas >0.80 is considered excellent (10,14). Our BUN only model had an area under the ROC curve of 0.73, very similar to the previously reported result of 0.75, thereby confirming the robustness of the model in replicating the previous values (5). Our model incorporating serum albumin had better prediction, with an area under the curve of 0.83.
Several adult studies have linked elevated BUN levels to increased severity in AP, whether as part of the bedside index for severity of AP scoring or as a stand-alone predictor of severity (6,7,15-21). The proposed mechanism is that the elevated BUN reflects intravascular volume depletion as opposed to renal involvement, given the lack of observed change in the creatinine between the 2 groups (6,17,21). There are also adult data to suggest that decreased albumin on admission is an independent predictor of disease severity in AP as a marker of persistent organ failure, both at admission and then again at 48 hours after admission, likely reflecting the underlying inflammatory state, although the precise mechanism remains unclear (22). Persistent BUN elevation in our population was also predictive of severity, and BUN elevation at 24-48 hours was an independent predictor of severity in this population. This is a novel finding not previously reported in the pediatric literature. Although it showed trends similar to those in the adult literature, previous studies did not address the management effects on the BUN change (6,7). We have shown that persistent elevation of BUN was a predictor of SAP independent of the rates and types of fluids the patient had received.
Multiple attempts have been made to develop prognostic models for severity in pediatric AP. DeBanto et al (23) developed the Pediatric Acute Pancreatitis Score, which was modeled after adult scoring metrics and involves demographic and biochemical data at admission and then again at 48 hours to calculate the score. In 2013, it was reported that a significantly elevated lipase (≥7× upper limit of normal [ULN]) at time of admission was predictive of severity, but validation of the model had varying sensitivities and specificities (24). Thus, previous attempts to develop a predictive tool in the pediatric population either lacked a common definition of severity, required up to 48 hours from admission to calculate the predictive score, or had varying specificity and sensitivities when replicated in validation studies (23-27). To our knowledge, the first model to use the severity classification consensus published in 2017 and laboratory values on presentation is the BUN model in the derivation cohort, and we have now replicated that model successfully (5).
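The concordance statistic referred to above can be sketched directly from its definition: the probability that a randomly chosen SAP patient has a higher predictor value than a randomly chosen mild patient, with ties counted as one half. This is a generic illustration, not the SAS procedure used in the study; the function names and the wording of the labels are assumptions, while the 0.70 and 0.80 bands follow the text.

```python
from typing import Sequence


def auroc(sap_values: Sequence[float], mild_values: Sequence[float]) -> float:
    """Area under the ROC curve computed as a pairwise concordance statistic."""
    concordant = 0.0
    for sap in sap_values:
        for mild in mild_values:
            if sap > mild:
                concordant += 1.0   # SAP patient ranked higher: concordant pair
            elif sap == mild:
                concordant += 0.5   # tie counts as half
    return concordant / (len(sap_values) * len(mild_values))


def discrimination_label(auc: float) -> str:
    """Map an AUROC to the bands described in the text."""
    if auc > 0.80:
        return "excellent"
    if auc >= 0.70:
        return "adequate"
    return "below the adequate range"
```

Under these bands, the BUN only model (0.73) would be labeled adequate and the combined BUN and albumin model (0.83) excellent.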
Because of the increased specificity of the predictive model as the BUN level increases, frontline providers should be aware that a higher initial BUN on admission or after initial fluid resuscitation is associated with a higher probability that the patient will develop SAP. In addition, the NPV and PPV can be used to support clinical decision making. In all of the models presented, the higher the BUN, the higher the specificity and PPV. The NPV is much higher than the PPV at a lower BUN because of the severity distribution, in which mild cases predominate. When using our model to predict severity, we chose cutoff values that optimize both sensitivity and specificity and both PPV and NPV. For instance, in our validation cohort, the model performs optimally at a BUN of 13 mg/dL (sensitivity 68%, specificity 73%). By using a cutoff of 13 mg/dL, we may, however, include some patients who will not develop severe disease. Thus, to interpret BUN as a predictor at admission in our population, it may be helpful to think of it in groupings: at <13 mg/dL, patients will most likely have mild AP; at 13 to 20 mg/dL, there is a high chance that they will develop SAP, so intervention should be considered; and at >20 mg/dL, they will almost surely develop SAP, so intervention is warranted. This information will allow providers to use commonly performed laboratory tests to augment their clinical decision-making to determine which patients may require an escalation of care or transfer to a specialized facility.
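The three admission BUN groupings described above can be written as a simple lookup. This is a sketch for illustration only, not a validated clinical tool: the function name and the group labels are assumptions, and only the 13 mg/dL and 20 mg/dL boundaries come from the text.

```python
def bun_risk_group(admission_bun_mg_dl: float) -> str:
    """Assign an admission BUN value to one of the three groupings in the text."""
    if admission_bun_mg_dl < 13:
        return "most likely mild AP"
    if admission_bun_mg_dl <= 20:
        return "high chance of SAP, consider intervention"
    return "SAP very likely, intervention warranted"
```

A hypothetical admission BUN of 15 mg/dL would fall in the middle group, where the text suggests that intervention should be considered.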
Although our study was successful in validating the utility of BUN as a prognostic indicator of severity in pediatric AP, it is not without limitations. The validation component of this study is limited in size, with only 73 patients across the 2 sites who met the criteria for first episode of AP. We do not have data regarding how long before presentation the patients may have been experiencing symptoms, which may have affected the initial laboratory values for some patients, but this was a "real-world" application of this model. All patients were managed per their respective hospitals' protocols, which may have some institutional variation but are nevertheless similar regarding fluid choices and nutrition. Previous authors have commented on the benefits of avoiding estimation of treatment effects when building prognostic risk prediction models (14).
In conclusion, this study validates a previously generated model, showing that initial BUN is a significant predictor of SAP in the pediatric population. It further optimizes prediction by adding albumin as a significant variable at admission. It also gives another time point for evaluation, allowing clinicians to check BUN at 24 to 48 hours to monitor progression of the disease. Timely identification of high-risk patients will allow referrals to a center that has access to pediatric gastroenterologists with affiliations to a pancreatic center of excellence or a higher level pediatric intensive care unit. Future efforts should focus on combining the predictors of AP into a clinical tool that providers can use to identify the patients at highest risk of progression to severe disease at the time of presentation.