The aim of this study was to build electronic algorithms using a combination of structured data and natural language processing (NLP) of text notes for potential safety surveillance of 9 postoperative complications.
Postoperative complications from 6 medical centers in the Southeastern United States were obtained from the Veterans Affairs Surgical Quality Improvement Program (VASQIP) registry. Development and test datasets were constructed using stratification by facility and date of procedure for patients with and without complications. Algorithms were developed from VASQIP outcome definitions using NLP-coded concepts, regular expressions, and structured data. The VASQIP nurse reviewer served as the reference standard for evaluating sensitivity and specificity. The algorithms were designed in the development dataset and evaluated in the test dataset.
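As a minimal sketch of the evaluation described above, the following Python shows how a rule that combines a regular expression over note text with a structured-data flag could be scored against the nurse reviewer's determination. The pattern, the `Case` fields, and the function names are hypothetical illustrations, not the study's actual rules or data model, and the sketch does not handle negated or historical mentions.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for illustration only; not the study's actual rule.
PE_PATTERN = re.compile(r"\bpulmonary\s+embol(?:ism|us|i)\b", re.IGNORECASE)


@dataclass
class Case:
    note_text: str            # free-text clinical note (assumed field)
    structured_flag: bool     # e.g., a coded diagnosis is present (assumed field)
    reference_positive: bool  # reference standard: nurse reviewer determination


def algorithm_flags(case: Case) -> bool:
    """Flag a case if either the structured data or the note text suggests the complication."""
    return case.structured_flag or bool(PE_PATTERN.search(case.note_text))


def sensitivity_specificity(cases: list[Case]) -> tuple[float, float]:
    """Compare algorithm flags with the reference standard."""
    tp = fp = tn = fn = 0
    for case in cases:
        predicted = algorithm_flags(case)
        if predicted and case.reference_positive:
            tp += 1
        elif predicted and not case.reference_positive:
            fp += 1
        elif not predicted and case.reference_positive:
            fn += 1
        else:
            tn += 1
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

In this sketch a mention such as "history of pulmonary embolus" would count as a false positive, which is one reason a real pipeline would likely rely on NLP-coded concepts with context handling rather than a bare pattern match.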
Sensitivity and specificity in the test set were 85% and 92% for acute renal failure, 80% and 93% for sepsis, 56% and 94% for deep vein thrombosis, 80% and 97% for pulmonary embolism, 88% and 89% for acute myocardial infarction, 88% and 92% for cardiac arrest, 80% and 90% for pneumonia, 95% and 80% for urinary tract infection, and 77% and 63% for wound infection, respectively. A third of the complications occurred outside of the hospital setting.
Computer algorithms applied to data extracted from the electronic health record produced respectable sensitivity (56% to 95%) and specificity (63% to 97%) across a large sample of patients seen in 6 different medical centers. This study demonstrates the utility of combining NLP with structured data for mining the information contained within the electronic health record.