Acute respiratory distress syndrome is frequently underrecognized and associated with increased mortality. Previously, we developed a model that used machine learning and natural language processing of text from radiology reports to identify acute respiratory distress syndrome. The model showed improved performance in diagnosing acute respiratory distress syndrome when compared with a rule-based method. In this study, our objective was to externally validate the natural language processing model in patients from an independent hospital setting.
Secondary analysis of data across five prospective clinical studies.
An urban, tertiary care, academic hospital.
Adult patients admitted to the medical ICU and at risk for acute respiratory distress syndrome.
Measurements and Main Results:
The natural language processing model was previously derived and internally validated in burn, trauma, and medical patients at Loyola University Medical Center. Two machine learning models were examined with the following text features from qualifying radiology reports: 1) word representations (n-grams) and 2) standardized clinical named entity mentions mapped from the National Library of Medicine Unified Medical Language System. The models were externally validated in a cohort of 235 patients at the University of Chicago Medicine, of whom 110 (47%) were diagnosed with acute respiratory distress syndrome by expert annotation. During external validation, the n-gram model demonstrated good discrimination between patients with and without acute respiratory distress syndrome (C-statistic, 0.78; 95% CI, 0.72–0.84). The n-gram model had higher discrimination for acute respiratory distress syndrome than the standardized named entity model, although the difference was not statistically significant (C-statistic, 0.78 vs 0.72; p = 0.09). The most important features in the model had good face validity for acute respiratory distress syndrome characteristics, but their frequencies differed between hospital settings.
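For readers less familiar with the two quantities reported above, the following is a minimal illustrative sketch, not the study's implementation. It shows how word n-grams can be counted from report text and how a C-statistic with a 95% interval can be computed from model scores. The whitespace tokenization, the percentile bootstrap, and all function names (`word_ngrams`, `c_statistic`, `bootstrap_ci`) are assumptions made for illustration; the abstract does not state how the features were tokenized or how the confidence interval was derived.

```python
import random
from collections import Counter


def word_ngrams(text, n_max=2):
    """Count word n-grams (n = 1..n_max) in a report.

    Assumption: lowercasing and whitespace splitting stand in for
    whatever tokenization the original pipeline used.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def c_statistic(labels, scores):
    """Probability that a random positive case outscores a random negative one.

    Ties count as 0.5, so the value equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-statistic (an assumed method)."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("need at least one positive and one negative case")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples containing only one class
            stats.append(c_statistic(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data, invented for illustration only: 1 = ARDS, 0 = no ARDS.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.1]
toy_c = c_statistic(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores, n_boot=200)
```

On the toy data, eight of the nine positive-negative pairs are ranked correctly, so `toy_c` is 8/9 (about 0.89). The pairwise definition is quadratic in cohort size, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would be preferable for much larger cohorts.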
Our computable phenotype for acute respiratory distress syndrome had good discrimination in external validation and may be used by other health systems for case identification. Discrepancies in feature representation are likely due to differences in the characteristics of the patient cohorts.