Hospitalization is the primary driver of healthcare costs and morbidity related to inflammatory bowel disease (IBD). Traditional prediction models perform poorly at identifying the patients at highest risk of unplanned healthcare utilization. Identifying patients who are high-need, high-cost (HNHC) could help reduce unplanned healthcare utilization and healthcare costs.
We conducted a retrospective cohort study of adult patients hospitalized with IBD using the Nationwide Readmissions Database (NRD), with model derivation in NRD 2013 and validation in NRD 2017. We built 2 tree-based algorithms (a decision tree classifier [DTC] and decision trees within a gradient boosting framework [XGBoost]) and compared them with traditional logistic regression for identifying patients at risk of becoming HNHC, defined as patients in the highest decile of total days spent in hospital in a calendar year.
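The HNHC definition above (highest decile of total hospital days in a calendar year) can be illustrated with a minimal sketch. The function name `label_hnhc`, the `(patient_id, length_of_stay_days)` input shape, and the tie handling at the cutoff are assumptions made for illustration; they are not taken from the study or from the NRD file layout.

```python
from collections import defaultdict


def label_hnhc(stays, top_fraction=0.10):
    """Flag patients in the top fraction of total hospital days.

    stays: iterable of (patient_id, length_of_stay_days) pairs covering
           one calendar year (hypothetical input shape).
    Returns a dict mapping patient_id -> True if flagged as HNHC.
    """
    # Sum days across all admissions for each patient.
    total_days = defaultdict(int)
    for patient_id, los_days in stays:
        total_days[patient_id] += los_days

    # Rank patients from most to fewest total days.
    ranked = sorted(total_days, key=lambda pid: total_days[pid], reverse=True)

    # Size of the top group; at least one patient is always flagged.
    n_top = max(1, round(len(ranked) * top_fraction))

    # Assumption: ties at the cutoff are broken by sort order. A real
    # analysis would need an explicit rule for tied patients.
    top = set(ranked[:n_top])
    return {pid: pid in top for pid in total_days}


# Ten hypothetical patients; p10 has two admissions (10 + 5 days).
stays = [(f"p{i}", i) for i in range(1, 11)] + [("p10", 5)]
labels = label_hnhc(stays)
```

In this toy input only `p10` is flagged, since one patient makes up the top decile of ten.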
Of 47,402 adults hospitalized with IBD, we identified 4,717 HNHC patients. The DTC model (length of stay [LOS], Charlson comorbidity index, procedure, frailty risk score [FRS], and age) had a mean area under the receiver operating characteristic curve (AUC) of 0.78±0.01 in the derivation dataset and 0.78±0.02 in the validation dataset. XGBoost (LOS, procedure, chronic pain, drug abuse, diabetic complication) had a mean AUC of 0.79±0.01 in the derivation dataset and 0.75±0.02 in the validation dataset. By comparison, traditional logistic regression (peptic ulcer disease, paresthesia, admission for osteomyelitis, renal failure, and lymphoma) had an AUC of 0.55±0.01 in the derivation dataset and 0.56±0.01 in the validation dataset.
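For readers less familiar with the AUC values reported above, the metric can be read as the probability that a randomly chosen HNHC patient receives a higher predicted risk than a randomly chosen non-HNHC patient, so 0.5 corresponds to chance. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy scores are illustrative assumptions and do not reproduce the study's evaluation pipeline.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: sequence of 0/1 outcomes (1 = HNHC).
    scores: sequence of predicted risks, same length as labels.
    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
auc = roc_auc([1, 1, 0, 0], [0.8, 0.3, 0.5, 0.1])  # 0.75
```

The pairwise loop is quadratic in the number of patients, so it suits small illustrations only; a rank-based implementation would be preferable for a cohort of this size.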
In hospitalized patients with IBD, simplified tree-based machine learning algorithms using administrative claims data can identify patients at risk of progressing to HNHC status more accurately than traditional logistic regression.