To explore whether machine learning applied to pediatric critical care data could discover medically pertinent information, we extracted and prepared clinically collected electronic medical record data and analyzed them using k-means clustering.
Retrospective analysis of electronic medical record ICU data.
Tertiary Children’s Hospital PICU.
Anonymized electronic medical record data from PICU admissions over 10 years.
Measurements and Main Results:
Data from 11,384 PICU episodes were cleaned, and specific features were generated. A k-means clustering algorithm was applied, and the stability and medical validity of the resulting 10 clusters were determined. The distribution of mortality, length of stay, use of ventilation and pressors, and diagnostic categories among resulting clusters was analyzed. Clusters had significant prognostic information (p < 0.0001). Cluster membership predicted mortality (area under the curve of the receiver operating characteristic = 0.77). Length of stay, the use of inotropes and intubation, and diagnostic categories were nonrandomly distributed among the clusters (p < 0.0001).
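As an illustration of the kind of analysis summarized above, the following minimal sketch clusters episodes with a plain k-means (Lloyd's algorithm) and then scores how well cluster membership separates deaths from survivors. It is not the study's pipeline: the feature vectors and outcomes are made-up toy values, the function names (`kmeans`, `cluster_risk_scores`, `auc`) are hypothetical, and using each cluster's observed mortality rate as the risk score for its members is an assumption about how an area under the curve could be derived from cluster membership.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on tuples of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centroids, labels


def cluster_risk_scores(labels, outcomes):
    """Assumed scoring rule: each episode gets its cluster's observed mortality rate."""
    rate = {}
    for c in set(labels):
        ys = [y for lab, y in zip(labels, outcomes) if lab == c]
        rate[c] = sum(ys) / len(ys)
    return [rate[lab] for lab in labels]


def auc(scores, outcomes):
    """Chance that a random death has a higher score than a random survivor (ties count 0.5)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two features per episode, 1 = died, 0 = survived.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
outcomes = [0, 0, 0, 1, 1, 0]

centroids, labels = kmeans(points, k=2)
scores = cluster_risk_scores(labels, outcomes)
value = auc(scores, outcomes)
print(labels, round(value, 3))
```

In practice the same steps would run on standardized features from real episodes with an established library implementation, and the stability of the clusters would be checked across repeated initializations before interpreting them.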
A standard unsupervised machine learning methodology was able to extract significant, medically relevant information about prognosis, diagnosis, and therapy from PICU electronic medical record data. Further development and application of machine learning to critical care data may provide insights into how critical illness arises in children.