Predictive Risk Models for Wound Infection-Related Hospitalization or ED Visits in Home Health Care Using Machine-Learning Algorithms

Song, Jiyoun PhD, RN, AGACNP-BC; Woo, Kyungmi PhD, RN; Shang, Jingjing PhD, RN; Ojo, Marietta MPH; Topaz, Maxim PhD, RN

Advances in Skin & Wound Care: August 2021 - Volume 34 - Issue 8 - p 1-12
doi: 10.1097/01.ASW.0000755928.30524.22



In the US, the number of individuals who receive home healthcare (HHC) has increased annually.1 As life expectancy and the prevalence of chronic diseases among older adults increase, the demand for HHC is expected to continue to grow. Patients receiving HHC are at high risk of infections2 because of underlying disease or lack of informed caregivers providing needed care.3,4 Wound infections can lead to hospitalizations, ED visits, and even death, resulting in increased healthcare costs and a burden to families and patients.5 Any delay in the diagnosis of wound infections may result in late initiation of antibiotic treatment, leading to negative outcomes such as sepsis and mortality.6,7 Therefore, healthcare practitioners should comprehensively assess each patient’s risk of wound infection.8

A diverse range of predictive analytics approaches can assist with early detection of infection risk and reduce negative outcomes.9 For example, machine-learning algorithms have been used to predict infections in patients.10 These algorithms can find patterns in data, effectively linking risk factors and the outcome of interest in a large dataset. Machine-learning algorithms can help determine the severity of infection, enhancing the quality of care provided to the patient.11 Natural language processing (NLP), a technique that allows healthcare practitioners to automatically uncover valuable information in free-text clinical notes, can support machine-learning algorithms and extract words and expressions of interest from informative clinical notes (eg, medical chart or nursing notes). In previous studies, NLP approaches helped identify infection cases that might have been missed by diagnosis codes.12–14

Most of the information about wounds in HHC is documented in clinical notes (unstructured data format) and a few multiple-choice questions (structured data format).15 One study found that up to 50% of important wound information is stored in clinical notes.16 None of the previous studies of wounds in HHC used data from clinical notes, creating a gap in the understanding of wound infection risk. This gap can be filled via an application of NLP, which allows extracting risk factors from clinical notes, thus creating a more accurate description of a patient’s condition.

An NLP algorithm to extract wound infection-related information from clinical notes in HHC was created and validated (Table 1).17 The researchers demonstrated that the NLP algorithm achieved overall good performance in detecting wound infection-related information: positive predictive value and sensitivity were high (0.87 and 0.91, respectively), and the F score, which takes into account false positives and false negatives, was 0.88.17 In this study, the hypothesis was that the availability of wound information extracted from clinical notes would improve the predictive ability of risk models. Specifically, this study aimed to (1) identify risk factors for wound infection-related hospitalization or ED visits among patients who received HHC and (2) evaluate whether infection risk predictive performance of machine-learning algorithms can be improved with data extracted with NLP.

The natural language processing (NLP) algorithm used in the study was developed using multiple methods. Initially, a literature search was conducted in various health research databases to identify relevant studies of wound infection. Given synthesized literature and clinical expertise of home healthcare (HHC), a list of candidates for wound infection-related categories was generated (wound type, wound infection, exudate, foul odor, periwound skin, wound bed tissue, spreading systemic signs, possible wound infection name, possible wound infection treatment). Next, a large standardized health terminology database (Unified Medical Language System [UMLS])2 was used to identify a preliminary list of terms for each wound infection category. In addition, a list of UMLS synonyms was extracted to enhance the information schema. Two of the research team members (a certified wound, ostomy, and continence nurse and a nursing PhD student) reviewed the preliminary list independently and then validated and finalized a list of specific and nonspecific wound infection symptoms and treatment categories (n = 9).
Steps: Description
Initial concept identification: After conducting a literature search to identify relevant studies, authors created a preliminary list of wound infection categories.
Face validation of concepts: Next, authors matched each wound infection category with similar terms and synonyms from the UMLS and conducted expert validation of the lists to produce finalized categories.
Interactive rapid vocabulary explorer (see following section): Using the NLP software NimbleMiner, authors identified words and expressions related to the wound infection categories from the sample of nursing clinical notes. Then two users (HHC experts) entered the query of interest, and the software generated a large vocabulary of related items. The users then selected and saved relevant terms.
Label assignment and review: In the next stage, the system used similar terms discovered by the users to assign positive or negative labels to the nursing clinical notes in the presence or absence of the terms. The users reviewed the assigned labels for accuracy.
Algorithm testing: The authors developed a “gold standard” test set of clinical notes using a high-likelihood sampling approach. Two reviewers annotated 200 randomly selected clinical notes for the presence of one or more of the wound infection-related information categories. The NLP system was then applied to the test set; precision, recall, and F score were calculated.
Interactive Rapid Vocabulary Explorer
The first stage in developing the algorithm is creating a language model. Language models are statistical representations of a certain body of text (in this study, nursing clinical notes). To create the language model in NimbleMiner, a large corpus of clinical notes (a file that includes all the notes) was identified, and a specific type of language model (word embedding model) was used.3 A word embedding model enables the researchers to identify similar terms in the clinical notes and build a vocabulary based on the topic of interest (in this case, “wound infections”).
The next stage is aimed at helping users rapidly discover large vocabularies of relevant terms and expressions. In this study, the interactive rapid vocabulary explorer was used by two nurses who are experts in HHC. The user enters a query term of interest (eg, “wound infection”), and the system returns a list of similar terms it identified as relevant (eg, “infected ulcer,” “infected wounds,” “wd infect”). For this study, lists of similar expressions for each of the wound infection-related information categories extracted from UMLS were prepopulated. The user selects and saves the relevant terms by clicking on them in the interactive vocabulary explorer screen. Negated terms (eg, “no wound infection,” “wd infection ruled out”) or other irrelevant terms that are not selected by the user are also saved in the system for further tasks, such as negation detection.
 In the final stage, the system uses similar terms discovered by the user during stage 2 to assign labels to clinical notes (while excluding notes with negations and other irrelevant terms). Assigning a positive label means that a concept of interest is present in the clinical note. When needed, the user reviews and updates the list of similar terms and negated terms. The user reviews the clinical notes with assigned labels for accuracy. This weakly supervised rapid labeling approach is based on a positive-labels learning framework validated in previous research.4,5
Further details about NimbleMiner’s architecture are described elsewhere,6 and the system is available for download under General Public License v3.0.
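The label-assignment stage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NimbleMiner's implementation: the term lists are copied from the examples in the text, the function name `label_note` is hypothetical, and negation is handled by simply blanking out negated phrases before matching.

```python
# Hypothetical vocabularies built from the examples quoted in the text.
# In the study these lists came from the interactive vocabulary explorer.
POSITIVE_TERMS = ["wound infection", "infected ulcer", "infected wounds", "wd infect"]
NEGATED_TERMS = ["no wound infection", "wd infection ruled out"]


def label_note(note: str) -> bool:
    """Return True when a positive term remains after negated phrases are removed."""
    text = note.lower()
    # Blank out negated expressions first so they cannot trigger a positive match.
    for negated in NEGATED_TERMS:
        text = text.replace(negated, " ")
    return any(term in text for term in POSITIVE_TERMS)


# Made-up notes for illustration only.
notes = [
    "Pt seen today, infected ulcer on left heel, MD notified.",
    "Dressing changed, no wound infection noted.",
    "Vitals stable, ambulating with walker.",
]
labels = [label_note(n) for n in notes]
```

Only the first note receives a positive label here; the second contains a positive term solely inside a negated phrase.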
NLP Algorithm Testing
To test algorithm accuracy, a “gold standard” testing set of clinical notes was created using a high-likelihood sampling approach as follows. First, a subset of patients admitted to a hospital for a wound infection during an HHC episode was identified, as indicated in the structured data. Among these patients, a random subset of 200 clinical notes (50% visit notes and 50% care coordination notes) was extracted. Each note was annotated by two human expert reviewers, both with more than 5 years of experience in HHC, for the presence of one or more of the nine wound infection-related information categories. The interrater reliability was relatively high (κ = 0.72), indicating good agreement between reviewers.7 All disagreements were discussed until a final consensus was reached. Next, the research team’s NLP system was applied to the “gold standard” testing set, and precision (the number of true positives out of the total number of predicted positives), recall (the number of true positives out of the actual number of positives), and F score (the weighted harmonic mean of the precision and recall) were calculated for each category.
1. Woo K, Song J, Adams V, et al. Exploring prevalence of wound infections and related patient characteristics in homecare using natural language processing. Int Wound J 2021.
2. Kleinsorge R, Tilley C, Willis J. Unified Medical Language System (UMLS). In: Encyclopedia of Library and Information Sciences 2002:369-78.
3. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. Adv Neural Inform Process Syst 2013:3111-9.
4. Elkan C, Noto K. Learning classifiers from only positive and unlabeled data. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2008; Las Vegas, Nevada.
5. Halpern Y, Horng S, Choi Y, Sontag D. Electronic medical record phenotyping using the anchor and learn framework. J Am Med Inform Assoc 2016;23(4):731-40.
6. Topaz M, Murga L, Bar-Bachar O, McDonald M, Bowles K. NimbleMiner: an open-source nursing-sensitive natural language processing system based on word embedding. CIN Comput Inform Nurs 2019;37(11):583-90.
7. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med 2012;22(3):276-82.


This study was approved by the Columbia University Irving Medical Center and Visiting Nurse Service of New York home care agency institutional review boards.

Description of Dataset

The dataset included all adult patients served by a large HHC agency from January 1, 2014, to December 31, 2014. The data from the HHC agency’s electronic health record were extracted and linked to the Outcome and Assessment Information Set (OASIS) version C dataset. The OASIS is a standardized assessment used to measure patient outcomes in HHC and contains nearly 100 patient characteristics in the domains of sociodemographic variables; information on the patient home environment and informal caregivers; and health status, including diagnosis codes, functional status, psychosocial status, and health service utilization.18 The unit of analysis for this study was HHC episodes that began with initial patient admission assessment and lasted for 30 to 60 days until the patient was (1) discharged from HHC or (2) admitted to the hospital or ED.

NLP Method

The NLP method was used for processing the clinical notes data (Table 1).17 For all participants, the authors extracted all clinical notes written during their HHC episode, including visit notes (n = 1,149,586) and care coordination notes (n = 1,461,171).17 Next, the authors applied an open-source NLP system (NimbleMiner v. 3.0; GitHub, San Francisco, California)19 to automatically identify nine categories of wound infection-related information in the clinical notes: wound type (eg, open blister, pressure injury), general wound infection (eg, inflamed ulcer, cellulitis), exudate (eg, scant purulent drainage, serous drainage), foul odor (eg, bad odor, bad smell), periwound skin (eg, swollen wound, edematous), wound bed tissue (eg, hypergranulated tissue, nongranulating), spreading systemic signs (eg, vomiting, confusion), possible wound infection name (eg, gangrene, folliculitis), and possible wound infection treatment (eg, IV antibiotics, antibiotic ointment; Table 2).

Category: Example Words and Expressions
Wound type: Open blister; Venous ulcer; Surgical wound; Pressure injury
General wound infection: Inflamed ulcer; Local infection of wound; Surgical site infection; Incision infection
Exudate: Scant purulent drainage; Draining large amounts; White slough; Serous drainage
Foul odor: Bad odor; Bad smell; Foul odor; Offensive odor
Periwound skin: Swollen wound; Granulated slough; Erythema noted
Wound bed tissue: Hypergranulated tissue; New necrotic tissue
Spreading systemic signs: Vomiting
Possible wound infection name: Gangrene; Necrotizing fasciitis; Skin necrosis
Possible wound infection treatment: IV antibiotics; Antibiotic ointment; Apply silver sulfadiazine (Silvadene) cream; Surgical debridement

The wound vocabularies for NLP were developed through the following steps: (1) initial concept identification from the literature search; (2) validation from large standardized medical terminologies and clinical experts; (3) interactive rapid vocabulary explorer to identify words and expressions related to wound description using NimbleMiner; and (4) NLP testing by developing an expert-reviewed set of clinical notes. Overall, it was found that the NLP system’s accuracy in identifying wound-related concepts in the clinical notes was high (>80%).
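As a rough illustration of the testing step, precision, recall, and F score can be computed from counts of true positives, false positives, and false negatives against the expert-annotated notes. The function name `precision_recall_f` and the counts below are hypothetical; they are not the study's data.

```python
def precision_recall_f(tp: int, fp: int, fn: int, beta: float = 1.0):
    """Return (precision, recall, F score) from annotation counts.

    beta = 1 gives the usual F1 score, the harmonic mean of precision and recall.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta ** 2
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score


# Hypothetical counts for one category: 40 notes correctly flagged,
# 10 flagged in error, 5 missed.
p, r, f = precision_recall_f(tp=40, fp=10, fn=5)
```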

Study Outcome

The outcome of interest was wound infection-related hospitalization or ED visits. From the OASIS dataset, the information on wound status assessed at admission and a reason for hospitalization or ED visits at discharge from home care were extracted. Specifically, wound status was identified through OASIS items including M1306, M1330, M1340, and M1350; in addition, hospitalization or ED visits were identified through items including M2430 and M2310. The information collected at admission with OASIS assessment, which included the patient’s characteristics, diagnosis codes, functional status, psychosocial status, and health service utilization, was linked with the outcome variable that was captured at the end of the HHC episode.

Variable Selection

The initial dataset had 501 sociodemographic, clinical, and functional status variables. Categories with more than 15% missing values and redundant variables were removed first; the remaining variables were then screened with the Student t test or Fisher exact test at an α level of .05. In addition, for every pair of variables correlated above 0.5 or below −0.5 (representing strong correlations), the variable with the lower frequency was excluded to avoid linear dependency (eg, history of diabetes and medications for diabetes). Among the 67 remaining potential risk factors, 35 variables were finalized through stepwise selection at an α level of .05 for machine learning analysis: 26 variables derived from the OASIS dataset and 9 variables from clinical notes through NLP (Table 3).
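The correlation-based exclusion step might look like the following sketch. It assumes binary (0/1) variables, Pearson correlation, and "frequency" meaning the number of patients with the characteristic; the function names and the toy data are hypothetical, and the study's actual procedure may differ in detail.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.5):
    """For each pair with |r| > threshold, drop the variable with fewer 1s."""
    names = list(columns)
    dropped = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in dropped or b in dropped:
                continue
            if abs(pearson(columns[a], columns[b])) > threshold:
                # Ties keep the earlier variable (an arbitrary choice in this sketch).
                dropped.add(a if sum(columns[a]) < sum(columns[b]) else b)
    return [name for name in names if name not in dropped]


# Toy data for eight hypothetical patients.
columns = {
    "diabetes_history": [1, 1, 1, 1, 0, 0, 0, 0],
    "diabetes_medication": [1, 1, 1, 0, 0, 0, 0, 0],
    "female": [1, 0, 1, 0, 1, 0, 1, 0],
}
kept = drop_correlated(columns)
```

In this toy example the two diabetes variables are strongly correlated, so the less frequent one (medication) is removed.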

Category Detail Operationalization
Sex Female Binary: yes/no
Race Asian or Pacific Islandera Binary: yes/no
Black Binary: yes/no
Hispanic Binary: yes/no
White Binary: yes/no
Type of insurance Medicaid Binary: yes/no
Prior conditions Indwelling/suprapubic catheter Binary: yes/no
Memory deficit Binary: yes/no
Diagnosis Arthritisa Binary: yes/no
Cancer Binary: yes/no
Diabetesa Binary: yes/no
Neurologic disorder Binary: yes/no
Peripheral artery diseasea Binary: yes/no
Skin ulcera Binary: yes/no
Strokea Binary: yes/no
Therapies at home IV or infusion therapy Binary: yes/no
Risk for hospitalization
Risk for hospitalization Multiple hospitalizations in the past 6 moa Binary: yes/no
Currently taking five or more medications Binary: yes/no
Risk factors Smokinga Binary: yes/no
Obesitya Binary: yes/no
Sensory status Frequency of paina Four categories
Integumentary status
Risk of developing pressure injury Binary: yes/no
Having at least one unhealed pressure injury stage 2 or higher Binary: yes/no
Having stasis ulcera Binary: yes/no
Having skin lesion or open wound Binary: yes/no
Respiratory status
Shortness of breatha 4 categories
Respiratory treatments at home (any)a Binary: yes/no
Elimination status
Urinary cathetera Binary: yes/no
Urinary incontinent Binary: yes/no
Ostomy for bowel elimination Binary: yes/no
Neurologic/emotional/behavioral status
Cognitive functioning Binary: alert/oriented or requires prompting vs requires assistance or totally dependent
Activities of daily living/Instrumental activities of daily living
Grooming Four categories
Dress upper body Four categories
Dress lower body Four categories
Bathing Four categories
Toilet transferringa Three categories
Toileting hygiene a Four categories
Transferring Four categories
Ambulation/locomotiona Four categories
Feeding or eating Four categories
Medication and medication management
Drug regimen review Did a complete review identify potential clinically significant issues? Binary: yes/no
Care management Assist—procedures/treatmentsa Binary categories—assistance (yes/no)
Analgesics, narcotic Binary: yes/no
Antibiotics, othera Binary: yes/no
Anticoagulantsa Binary: yes/no
Antibacterials, urinarya Binary: yes/no
Anticonvulsants Binary: yes/no
Antiparasitics Binary: yes/no
Antiulcer and other gastrointestinal drugsa Binary: yes/no
Cardiovascular preparations, other Binary: yes/no
Diabetic therapy Binary: yes/no
Emollients, protectives Binary: yes/no
Enzymes Binary: yes/no
Fungicides Binary: yes/no
Hypotensives, other Binary: yes/no
Laxativesa Binary: yes/no
Penicillinsa Binary: yes/no
Streptomycins Binary: yes/no
Vitamins, fat-solublea Binary: yes/no
Vitamin K preparationsa Binary: yes/no
Narrative-free text charting (extracted through natural language processing)
Wound typea Binary: yes/no
General wound infectiona Binary: yes/no
Exudatea Binary: yes/no
Foul odora Binary: yes/no
Periwound skina Binary: yes/no
Wound bed tissuea Binary: yes/no
Spreading systemic signsa Binary: yes/no
Possible wound infection namea Binary: yes/no
Possible wound infection treatmenta Binary: yes/no
aThe finalized 35 variables that were identified through the stepwise selection method at α level of .05 were included for machine learning.

Applying Machine-Learning Algorithms

After variable selection, the data were randomly divided into 80% for the training set and 20% for testing to verify how well the model was built. In addition, fivefold cross-validation was used to increase the model’s generalizability.20
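A minimal sketch of an 80/20 split followed by fivefold cross-validation is shown below. It assumes that the folds are formed within the training portion, which the text does not state explicitly; the function names, the seed, and the 100-row toy index are illustrative only.

```python
import random


def train_test_split(indices, test_fraction=0.2, seed=0):
    """Shuffle the indices and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(indices, k=5):
    """Yield (training, validation) index lists, one pair per fold."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Toy example with 100 episode indices.
train, test = train_test_split(range(100))
folds = list(k_fold(train, k=5))
```

Each training index appears in exactly one validation fold, so every episode in the training set is used for validation once.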

Logistic Regression

Widely used as an interpretable statistical method, logistic regression models quantify the relationship between a binary outcome and one or more independent variables (ie, predictors).21 Because logistic regression, a traditional method of predictive modeling, is well suited to assessing the reliability of risk predictions within real-world data,22 multivariable logistic regression was used in this study. The relationship was presented as an adjusted odds ratio (OR) at a significance level of P < .05.

Random Forest

As one of the ensemble classification methods (a technique that combines several base models), the random forest combines several decision tree classifiers. Its final prediction is generated by aggregating the predictions of the individual trees, each of which is trained on a different subset of the dataset and splits its branch nodes differently.23 As a result, the random forest is better at improving predictive accuracy and controlling overfitting than a single decision tree. In addition, as in a single decision tree, the strongest feature or variable for discriminating the classes is located at the top of each tree. The variable importance of a random forest ranks the variables by how well they differentiate the classes, which is useful within very high dimensional data.

Two metrics are commonly used for determining variable importance: (1) the mean decrease in accuracy (MDA), which measures the change in prediction accuracy when the values of the variable are randomly permuted, and (2) the mean decrease in Gini (MDG), which adds all decreases in Gini impurity attributable to a given variable.23 In this study, both metrics were used to explore ranking stability, and the randomForest package in R software (version 4.1.0; R Foundation for Statistical Computing, Vienna, Austria) was used to build the random forest model.24
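The two ideas behind these metrics can be illustrated on a toy scale. The sketch below computes the Gini impurity decrease of a single split and the drop in accuracy after permuting one column; it is a simplification, not the randomForest package's procedure (which averages over trees and uses out-of-bag samples), and all names and data are hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)  # equals 1 - p^2 - (1 - p)^2 for two classes


def gini_decrease(parent, left, right):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


def accuracy(predict, rows, labels):
    return sum(predict(row) == y for row, y in zip(rows, labels)) / len(labels)


def permutation_accuracy_drop(predict, rows, labels, column, seed=0):
    """Baseline accuracy minus accuracy after shuffling one column across rows."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)
    permuted = [row[:column] + [v] + row[column + 1:] for row, v in zip(rows, values)]
    return accuracy(predict, rows, labels) - accuracy(predict, permuted, labels)


def first_column_rule(row):
    """Toy classifier that predicts the outcome from column 0 only."""
    return row[0]


decrease = gini_decrease([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 1, 0], [0, 0, 0, 0])
rows = [[1, 0], [1, 1], [0, 0], [0, 1]]
labels = [1, 1, 0, 0]
unused_drop = permutation_accuracy_drop(first_column_rule, rows, labels, column=1)
used_drop = permutation_accuracy_drop(first_column_rule, rows, labels, column=0)
```

Permuting a column the classifier ignores leaves accuracy unchanged, whereas permuting the column it relies on can only lower it.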

Artificial Neural Networks

Similar to the complex structure of the human brain, an artificial neural network (ANN) is an information-processing algorithm used to identify and learn the structure of a system based on the measurement of input, output, and state variables. In ANN, artificial neurons are organized in layers: the input layer is the information fed into the network from outside; the output layer is an outcome where information is processed in the network; and the layer between the input and output layers is the hidden layer (ie, state variables). The layers are composed of multiple predictive functions of connection weights that are determined by repeatedly calculating and adjusting the output of the network to the training dataset to minimize error.25 These predictive functions are similar to the way that neurons in the human brain make complex decisions, such as receiving multiple inputs multiplied by weights and providing outputs with the sum of the inputs. In this study, nnet package in R was used to build the neural network models.26
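The weighted-sum behavior described above can be sketched as a forward pass through a network with one hidden layer. The sigmoid activation, the function names, and the weights are assumptions chosen for illustration; they are not the fitted model from the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer: each node takes a weighted sum of its inputs, then a sigmoid."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The single output node combines the hidden activations into a risk score in (0, 1).
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Three hypothetical binary inputs and two hidden nodes with made-up weights.
risk = forward(
    inputs=[1, 0, 1],
    hidden_weights=[[0.8, -0.4, 0.3], [-0.5, 0.9, 0.1]],
    hidden_biases=[0.0, 0.1],
    output_weights=[1.2, -0.7],
    output_bias=-0.3,
)
# With all weights and biases at zero, every node outputs 0.5.
neutral = forward([1, 0, 1], [[0, 0, 0], [0, 0, 0]], [0, 0], [0, 0], 0)
```

Training consists of repeatedly adjusting these weights so that the output moves closer to the observed outcomes in the training data.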

Model Evaluation

To evaluate the models’ risk prediction performance, the following estimates were calculated: (1) sensitivity, which refers to the proportion of people with the disease who are correctly identified; (2) specificity, which refers to the proportion of people without the disease who are correctly identified as negative cases; (3) positive predictive value, which refers to the probability that a patient identified as a positive case truly has the disease; (4) negative predictive value, which refers to the probability that a patient identified as a negative case truly does not have the disease; (5) accuracy, which refers to the proportion of all records that are correctly classified by the model; and (6) the area under the receiver operating characteristic curve (AUC), which summarizes how well the predicted probabilities separate positive from negative cases across all possible decision thresholds. Predictive ability was compared not only among the three predictive models, but also between the three models built on variables derived from structured data only and the three models built on variables derived from both structured data and unstructured clinical notes.
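The estimates listed above can be computed from a confusion matrix and a set of predicted probabilities, as in this sketch. The counts and scores are hypothetical, and the AUC is computed by its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case), which is one of several equivalent ways to obtain it.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Threshold-dependent metrics from the four cells of a confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def auc(scores, labels):
    """Share of positive/negative pairs ranked correctly; ties count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical counts and scores for illustration.
metrics = confusion_metrics(tp=8, fp=20, tn=70, fn=2)
area = auc(scores=[0.9, 0.8, 0.3, 0.2], labels=[1, 0, 1, 0])
```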


Of 112,789 patients who received HHC during the study period, 54,316 (48.2%) had wounds. Of these, 754 (1.39%) were hospitalized or had ED visits related to wound infection. The median participant age was 67.6 years, and 56.7% of the patients were female. More than 30%, 19%, and 4% of patients overall were diagnosed with diabetes, skin ulcer, and peripheral vascular disease, respectively. However, patients with wound infection-related hospitalization or ED visits were more likely to have these diagnoses than those without (49.5% vs 31.1%, 45.2% vs 18.9%, and 13.8% vs 4.2%, respectively; all P < .05). Although 32.3% of patients had multiple hospitalizations in the past 6 months, and 71.5% were taking five or more medications, both were more common in the patients with hospitalization or ED visits related to wound infection than those without (45% vs 32.2%, and 75.7% vs 71.4%, respectively; all P < .05).

Frequency of Words Related to Wound Infection

Table 4 shows the frequency of words related to wound infection extracted through NLP from clinical notes. Overall, wound type (eg, stasis ulcer, pressure injury, nonspecific wound, etc) was mentioned in 78.1% of patients’ clinical notes. Signs of infection such as exudate, spreading systemic signs, and foul odor were mentioned in 21.5%, 26.5%, and 3.3% of clinical notes, respectively. In addition, treatments for wound infections were described in 34.5% of patients’ clinical notes.

Characteristic Total (n = 54,316) Patients Without Wound Infection-Related Hospitalization/ED Visit (n = 53,562) Patients With Wound Infection-Related Hospitalization/ED Visit (n = 754)
Age (interquartile range) 67.6 (56.2–79.7) 67.6 (56.2–79.8) 66.5 (54.7–78.6)
Female sex, % 56.7 53.9 56.8
Race, %
 Asian or Pacific Islander 7.0 7.0 4.8a
 Black 23.5 23.4 28.0a
 Hispanic 20.6 20.5 24.9a
 White 48.9 48.9 42.2a
Type of insurance
 Medicare 41.6 41.6 41.5
 Medicaid 19.0 19.0 21.6a
Prior condition, %
 Urinary incontinence 15.3 15.3 17.6
 Indwelling/suprapubic  catheter 1.52 1.5 2.9a
 Intractable pain 14.8 14.8 15.7
 Impaired  decision-making 6.7 6.7 7.6
 Disruptive, infantile, or  socially inappropriate  behavior 0.5 0.5 0.8
 Memory deficit 4.3 4.3 5.8a
Diagnosis, %
 Acute myocardial  infarction 14.3 14.3 15.7
 AIDS 2.1 1.9 2.1
 Arthritis 19.8 19.9 10.0a
 Cancer 6.5 6.6 3.9a
 Cardiac dysrhythmias 8.8 8.9 7.6
 Cerebral degeneration 1.9 1.9 2.0
 Dementia 5.7 5.7 6.4
 Depression 9.4 9.4 8.5
 Diabetes 31.4 31.1 49.5a
 Heart failure 9.8 9.8 8.1
 Hypertension 56.6 56.5 59.8
 Neurologic disorder 4.0 3.9 6.4a
 Pulmonary disease 12.3 12.3 10.3
 Peripheral vascular  disease 4.4 4.2 13.8a
 Renal disease 10.1 10.1 10.5
 Skin ulcer 19.2 18.9 45.2a
 Stroke 4.6 4.6 3.0a
Therapies at home, %
 IV or infusion therapy  (excludes TPN) 3.7 3.7 5.8a
 Parenteral nutrition  (TPN or lipids) 0.23 0.23 0.13
 Enteral nutrition 2.1 2.1 1.5
Risk for hospitalization, %
 History of falls 12.7 12.7 12.0
 Multiple hospitalizations  in the past 6 mo 32.3 32.2 45.0a
 Currently taking five  or more medications 71.5 71.4 75.7a
 Decline in mental, emotional, or behavioral status in the past 3 mo 11.1 11.1 11.1
Overall, %a
 Stable 11.7 11.8 9.1
 Likely to be stable 75.2 75.1 77.0
 Fragile 12.2 12.2 12.8
 Serious 1.0 0.9 1.1
Narrative-free text charting (extracted through NLP), %
 Wound type 78.1 77.8 97.2a
 General wound infection 6.0 5.8 18.7a
 Exudate 21.5 21.1 48.7a
 Foul odor 3.3 3.1 16.6a
 Periwound skin 29.2 29.0 46.2a
 Wound bed tissue 4.9 4.8 9.3a
 Spreading systemic  signs 26.5 26.3 38.9a
 Possible wound  infection name 2.5 2.4 10.5a
 Possible wound  infection treatment 34.5 34.0 71.0a
Abbreviations: NLP, natural language processing; TPN, total parenteral nutrition.
aP < .05, t test or Fisher exact test, as appropriate.

Building Predictive Models

Logistic Regression

Table 5 presents the results of bivariate and multivariable logistic regression for wound infection-related hospitalization or ED visits. Presence of diabetes, peripheral vascular disease, or skin ulcer was associated with increased risk of wound infection-related hospitalization or ED visits in the bivariate analysis (OR, 2.17 [95% confidence interval {CI}, 1.88–2.5]; OR, 3.62 [95% CI, 2.93–4.47]; and OR, 3.55 [95% CI, 3.07–4.11], respectively). However, in the multivariable logistic regression, these diagnoses were associated with decreased risk (OR, 0.64 [95% CI, 0.55–0.75]; OR, 0.52 [95% CI, 0.41–0.65]; and OR, 0.52 [95% CI, 0.43–0.62], respectively). For integumentary status, patients who had stasis ulcers were more likely to be hospitalized or have ED visits related to wound infection in the bivariate analysis (OR, 3.08 [95% CI, 2.43–3.9]), but this association was no longer significant in the multivariable logistic regression. Regarding elimination status, patients with an indwelling urinary catheter were more likely to be hospitalized or have ED visits in the bivariate analysis (OR, 1.12 [95% CI, 1.03–1.4]). From the narrative-free text charting, all categories of expression were associated with the risk of wound infection-related hospitalization or ED visits. Documentation of wound type and foul odor in the patient’s clinical notes was highly associated with increased risk of wound infection-related hospitalization or ED visits in the bivariate analysis (OR, 9.94 [95% CI, 6.44–15.34]; and OR, 6.28 [95% CI, 5.15–7.66], respectively; all P < .05).
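As a worked check, the bivariate OR for diabetes can be approximately reproduced from the group sizes and percentages in Table 4 (49.5% of 754 patients with the outcome vs 31.1% of 53,562 without). The sketch below assumes that the counts can be recovered by rounding those percentages and uses the standard large-sample (Woolf) interval on the log odds ratio; the function name is hypothetical, and the study's own estimates came from logistic regression.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald-type 95% CI from a 2x2 table.

    a, b: exposed / unexposed among patients with the outcome
    c, d: exposed / unexposed among patients without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts reconstructed from rounded percentages (an approximation).
cases, non_cases = 754, 53_562
a = round(0.495 * cases)       # patients with the outcome and diabetes
b = cases - a
c = round(0.311 * non_cases)   # patients without the outcome but with diabetes
d = non_cases - c

diabetes_or, ci_low, ci_high = odds_ratio_ci(a, b, c, d)
```

Rounded to two decimals, this gives an OR of 2.17 with an interval of 1.88 to 2.50, in line with the bivariate estimate reported for diabetes.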

Description Bivariate Logistic Regression Multivariable Logistic Regression
OR 95% CI Adjusted OR 95% CI
 Race: Asian (ref: non-Asian) 0.67a 0.47–0.95 1.33 0.93–1.91
Previous history and diagnosis
 Arthritis (ref: no) 0.44b 0.35–0.57 1.33a 1.03–1.71
 Diabetes (ref: no) 2.17b 1.88–2.5 0.64b 0.55–0.75
 Peripheral vascular disease (ref: no) 3.62b 2.93–4.47 0.52b 0.41–0.65
 Skin ulcer (ref: no) 3.55b 3.07–4.11 0.52b 0.43–0.62
 Stroke (ref: no) 0.62a 0.4–0.95 1.85a 1.19–2.85
Risk for hospitalization
 Multiple hospitalizations (ref: no) 1.72b 1.49–1.99 0.7b 0.6–0.81
Risk factor
 Obesity (ref: no) 1.62b 1.36–1.93 0.72a 0.6–0.86
 Smoking (ref: no) 1.43a 1.13–1.8 0.81 0.63–1.03
Sensory status
 Frequency of pain (ref: no pain)
 Patient has pain that does not interfere or less often than daily 1.16 0.91–1.47 0.78a 0.61–0.99
 Daily, but not constantly 1.26a 1.03–1.55 0.65b 0.52–0.80
 All of the time 1.76a 1.28–2.42 0.44b 0.31–0.61
Integumentary status
 Stasis ulcer (ref: no) 3.08b 2.43–3.9 0.8 0.62–1.05
Respiratory status
 Short of breath (ref: no)
 When walking more than 20 ft, climbing stairs 0.72a 0.59–0.87 1.49b 1.22–1.83
 With moderate exertion 0.85 0.68–1.06 1.46a 1.15–1.85
 With minimal exertion or at rest 0.68 0.45–1.02 1.76a 1.14–2.72
 Respiratory treatments utilized at home
 Oxygen (ref: no) 0.52a 0.32–0.85 2.01a 1.19–3.39
Elimination status
 Urinary catheter (ref: no) 1.12a 1.03–1.4 1.19 1.0–1.42
Activities of daily living
 Toilet transferring (ref: independent)
 Assistance 1.12 0.95–1.32 0.79a 0.64–0.98
 Cannot get to toilet or dependent 1.98b 1.63–2.41 0.63a 0.45–0.89
 Toileting hygiene (ref: independent)
 Independent if supplies/implements are laid out 0.96 0.8–1.16 1.26 1.0–1.58
 Assistance 1.11 0.9–1.37 1.38a 1.04–1.85
 Dependent 1.63b 1.28–2.07 2.03a 1.35–3.06
Ambulation (ref: independent or one-handed device)
 Requires a two-handed device to walk alone on a level surface and/or supervision/assistance to negotiate stairs, steps, or uneven surfaces. 0.99 0.81–1.21 0.87 0.7–1.07
 Requires supervision or assistance at all times 1.1 0.88–1.37 0.79 0.61–1.02
 Chairfast (cannot ambulate but can wheel self independently) or bedfast (cannot ambulate or sit up in a chair) 2.51b 2.02–3.13 0.5b 0.36–0.69
Care management
 Medical procedures/treatments (eg, changing wound dressing) (ref: no assistance needed) 4.33b 3.56–5.26 0.36b 0.3–0.45
 Gastrointestinal drugs (ref: no) 0.73b 0.63–0.85 1.21a 1.03–1.42
 Laxative (ref: no) 0.72b 0.62–0.84 1.16 0.99–1.35
 Penicillin (ref: no) 2.27b 1.78–2.86 0.62b 0.49–0.79
 Antibiotics (ref: no) 1.63b 1.29–2.04 0.82 0.65–1.03
 Antibacterial, urinary (ref: no) 1.74a 1.25–2.42 0.71a 0.51–0.99
 Anticoagulants (ref: no) 1.24a 1.06–1.44 0.85 0.72–1.0
 Vitamins, fat soluble (ref: no) 0.75a 0.58–0.97 1.31a 1.02–1.7
 Vitamin K preparations (ref: no) 0.076a 0.01–0.54 6.36 0.89–45.7
Narrative-free text charting (extracted through NLP)
 Wound type (ref: no) 9.94b 6.44–15.34 3.45b 2.26–5.57
 General wound infection (ref: no) 3.76b 3.12–4.53 1.85b 1.51–2.25
 Exudate (ref: no) 3.54b 3.07–4.09 1.64b 1.39–1.93
 Foul odor (ref: no) 6.28b 5.15–7.66 2.4b 1.91–2.99
 Periwound skin (ref: no) 2.1b 1.82–2.42 1.45b 1.2–1.71
 Wound bed tissue (ref: no) 2.03b 1.58–2.6 0.63b 0.48–0.82
 Spreading systemic signs (ref: no) 1.18b 1.53–2.06 1.24b 1.05–1.46
 Possible wound infection name (ref: no) 4.85b 3.81–6.16 1.34b 1.03–1.73
 Possible wound infection treatment (ref: no) 4.75b 4.05–5.56 1.92b 1.61–2.01
Abbreviations: CI, confidence interval; NLP, natural language processing; OR, odds ratio; ref, reference.
aP < .05.
bP < .0001.
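
For readers less familiar with how an odds ratio and its confidence interval arise, the following is a minimal sketch for a single binary risk factor. It uses the standard 2x2-table odds ratio with a Wald interval on the log scale, which matches a logistic regression containing only that one binary predictor. The function name and the counts are hypothetical and are not taken from the study; the multivariable estimates in the table cannot be reproduced this way.

```python
import math

def odds_ratio_with_ci(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table."""
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases)
    # The standard error of log(OR) is the square root of the summed
    # reciprocal cell counts.
    se_log_or = math.sqrt(1 / exposed_cases + 1 / exposed_noncases
                          + 1 / unexposed_cases + 1 / unexposed_noncases)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only (not study data):
# 40 of 1,000 patients with the risk factor had the outcome,
# versus 20 of 1,000 patients without it.
or_, lo, hi = odds_ratio_with_ci(40, 960, 20, 980)
print(f"OR = {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

A confidence interval that excludes 1 corresponds to a statistically significant association at the chosen level.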

Random Forest

The importance of variables for predicting wound infection-related hospitalization or ED visits is shown in the Figure. The 10 variables with the greatest mean decrease in accuracy (MDA), that is, those contributing most to predictive accuracy, were toileting hygiene (activities of daily living [ADLs] and instrumental activities of daily living [IADLs]), toilet transferring (ADLs/IADLs), presence of a urinary catheter, arthritis, ambulation (ADLs/IADLs), “periwound skin” (extracted through NLP), shortness of breath, stasis ulcer, “exudate” (extracted through NLP), and smoking. Four types of word expressions related to wound infection (ie, wound type, possible wound infection treatment, foul odor, and general wound infection) had negative MDA values, indicating that they did not provide additional accuracy. Five features were associated with the largest mean decrease in Gini (MDG): frequency of pain, ambulation (ADLs/IADLs), toileting hygiene (ADLs/IADLs), shortness of breath, and toilet transferring (ADLs/IADLs). The word expressions related to wound infection, except for spreading systemic signs and periwound skin, were ranked at or below 60% in the MDG ranking.
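
The idea behind MDA can be illustrated with a small permutation sketch: shuffle one variable, re-score the model, and record how much accuracy drops. This is only an illustration under stated assumptions. The toy classifier, the data, and the function names are hypothetical, and a random forest implementation typically computes this measure on out-of-bag samples for each tree rather than on a single dataset as shown here.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)

def mean_decrease_in_accuracy(predict, rows, labels, feature,
                              repeats=20, seed=0):
    """Average drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with the shuffled value in place of the original.
        permuted = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(rows, column)]
        drops.append(baseline - accuracy(predict, permuted, labels))
    return sum(drops) / repeats

# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [int(r[0] > 0.5) for r in rows]

def toy_classifier(row):
    # Stand-in for a fitted model; it only looks at feature 0.
    return int(row[0] > 0.5)

mda_informative = mean_decrease_in_accuracy(toy_classifier, rows, labels, 0)
mda_noise = mean_decrease_in_accuracy(toy_classifier, rows, labels, 1)
print(mda_informative, mda_noise)
```

A value at or below zero, as reported for four of the NLP-derived variables, means that shuffling the variable did not reduce accuracy.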


Artificial Neural Networks

The output variable of the prediction models was a binary indicator of wound infection-related hospitalization or ED visits, and the other 35 variables were used as input variables. The best-performing model was a two-layer neural network with one hidden layer of 24 nodes and an output layer of 1 node. That is, the 24 hidden nodes, each weighting the input variables differently, helped capture correlations that are not clearly causative.
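
The architecture described above can be sketched as a single forward pass. This is a minimal illustration, assuming that each of the 35 variables enters as one numeric input and that both layers use a logistic (sigmoid) activation; neither detail is stated here. The weights are random placeholders, not the fitted model, and all names are hypothetical.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 35, 24, 1

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_layer(n_in, n_out, rng):
    """Random placeholder weights (one row per node) and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def layer_forward(inputs, weights, biases):
    """Weighted sum plus bias for each node, passed through a sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def predict_risk(features, hidden, output):
    hidden_activations = layer_forward(features, *hidden)
    return layer_forward(hidden_activations, *output)[0]

rng = random.Random(0)
hidden = init_layer(N_INPUTS, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUTS, rng)

# Weights plus biases for both layers.
n_parameters = (N_INPUTS * N_HIDDEN + N_HIDDEN
                + N_HIDDEN * N_OUTPUTS + N_OUTPUTS)

# Hypothetical patient with 35 binary-coded variables.
patient = [rng.choice([0.0, 1.0]) for _ in range(N_INPUTS)]
risk = predict_risk(patient, hidden, output)
print(n_parameters, round(risk, 3))
```

Under these assumptions the network has 889 trainable parameters, which is worth keeping in mind given how few positive cases are available to fit them.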

Model Evaluation

A comparison of the wound infection-related hospitalization or ED visit risk prediction ability among the three models is presented in Table 6. Overall, all three machine-learning models with variables derived from both structured data and clinical notes showed acceptable risk prediction performance based on the AUC score: logistic regression = 0.82, random forest = 0.75, and ANN = 0.78. The logistic regression had the highest sensitivity at 87.6%, and the ANN had the highest positive predictive value at 4.6%. Compared with the predictive models with variables derived from only structured data, the predictive ability of the models with variables from both structured data and clinical notes improved markedly: the AUC increased, in relative terms, by 6% for logistic regression, 13.2% for random forest, and 6.4% for ANN.

| Evaluation | Sensitivity | Specificity | PPV | NPV | Accuracy | AUC |
|---|---|---|---|---|---|---|
| Logistic regression | | | | | | |
| Model without NLP variables^a | 0.765 | 0.677 | 0.033 | 0.995 | 0.678 | 0.772 |
| Model with NLP variables^b | 0.876 | 0.641 | 0.034 | 0.997 | 0.644 | 0.818 |
| Random forest | | | | | | |
| Model without NLP variables | 0.569 | 0.710 | 0.027 | 0.991 | 0.708 | 0.658 |
| Model with NLP variables | 0.680 | 0.730 | 0.035 | 0.994 | 0.729 | 0.745 |
| Artificial neural network | | | | | | |
| Model without NLP variables | 0.732 | 0.740 | 0.038 | 0.995 | 0.740 | 0.731 |
| Model with NLP variables | 0.861 | 0.749 | 0.046 | 0.997 | 0.750 | 0.778 |

Abbreviations: AUC, area under the curve; NLP, natural language processing; NPV, negative predictive value; PPV, positive predictive value.
^a Model without NLP variables is the model with variables that came from only structured data.
^b Model with NLP variables is the model with variables that came from both structured data and unstructured clinical notes.
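
The percentage improvements quoted in the text appear to be relative changes in AUC rather than absolute differences. A short check, using the AUC values from Table 6 and a hypothetical helper function, reproduces the reported figures under that reading.

```python
# AUC values from Table 6: (model without NLP variables, model with NLP variables).
auc = {
    "logistic regression": (0.772, 0.818),
    "random forest": (0.658, 0.745),
    "artificial neural network": (0.731, 0.778),
}

def relative_gain_percent(without_nlp, with_nlp):
    """Relative change in AUC, expressed as a percentage of the baseline."""
    return (with_nlp - without_nlp) / without_nlp * 100

gains = {model: round(relative_gain_percent(*pair), 1)
         for model, pair in auc.items()}
for model, gain in gains.items():
    print(f"{model}: +{gain}% AUC")
```

In absolute terms the corresponding AUC differences are 0.046, 0.087, and 0.047.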


Given that wound infection is one of the top five reasons for unplanned hospitalization in patients who receive HHC services,2 risk prediction for wound infection-related hospitalization or ED visits is warranted. Because early and accurate risk prediction can help prevent adverse outcomes, this study identified risk factors for wound infection-related hospitalization or ED visits among patients who received HHC, using both the structured OASIS-C dataset and unstructured information extracted from clinical notes via NLP, and then compared the risk prediction performance of three machine-learning algorithms: logistic regression, random forest, and ANN.

The potential risk factors identified in the bivariate logistic regression were consistent with previous studies that found that chronic diseases such as diabetes, peripheral vascular disease or skin ulcer, obesity, smoking, a history of multiple hospitalizations, the presence of a urinary catheter, and the need for caregiver assistance with ADLs/IADLs were associated with an increased risk of wound infection or unplanned hospitalization.27,28 However, after controlling for other factors in the multivariable logistic regression model, some risk factors became nonsignificant or less strongly associated with wound infection-related hospitalization or ED visits. This is probably because these associations were affected by other measured or unmeasured factors, or because other factors, such as information extracted from clinical notes, dominated the association. Therefore, further studies should assess these associations using a more conservative variable selection method such as LASSO (least absolute shrinkage and selection operator) regression.29

Of the wound status variables that would directly affect wound infection, the odds of having a stasis ulcer were 3.1 times greater among patients with wound infection-related hospitalization or ED visits than among those without. However, the association was not significant in the multivariable logistic regression. This is probably because the narrative free-text charting variables (extracted through NLP) reflect wound status more clearly than the binary coded variable. Therefore, additional post hoc analysis is warranted in future studies to examine these associations more accurately.

In addition, among the respiratory status-related characteristics, being short of breath and receiving oxygen treatment at home were each associated with an increased risk of wound infection-related hospitalization or ED visits (OR, 1.76 [95% CI, 1.14–2.72] and 2.01 [95% CI, 1.19–3.39], respectively). Although the authors identified wound infection-related hospitalization or ED visits through wound status assessed at admission and a reason for hospitalization or ED visits at discharge from home care, it is possible that the existing respiratory problem itself, or worsening symptoms caused by wound infection (eg, a symptom of sepsis), was another reason for unplanned hospitalization or ED visits.

In this study, various machine-learning algorithms were used to predict wound infection-related hospitalization or ED visits. Although all three models showed acceptable discrimination when built on the identified risk factors, the logistic regression model showed the highest risk prediction performance (AUC = 0.818). Even though more complex machine-learning algorithms such as random forest (an ensemble method) and ANN were applied to develop accurate predictive models and overcome the weakness of a single model, traditional logistic regression performed better at predicting a low-prevalence condition such as wound infection-related hospitalization or ED visits in HHC settings. This was consistent with a previous study in which logistic regression was better at predicting low-prevalence conditions such as healthcare-associated infections.30 Although the ANN, which consists of interconnected nodes arranged in layers that compute the output of the network, showed better predictive ability than the random forest, it might be improved by exploratory analysis of the accuracy of backpropagation neural networks with single or multiple hidden layers.31

Predictive values describe how often a model's predictions are correct when the model, built on the training dataset, is applied to the test dataset: the positive predictive value is the proportion of patients predicted to be at risk who actually experienced the outcome, and the negative predictive value is the proportion of patients predicted not to be at risk who did not. Therefore, positive and negative predictive values are important parameters for decision support in clinical settings. In this study, the clinical implications of the three models were limited because the positive predictive values of the three models were very low (3.3%–4.6%). However, because both negative and positive predictive values depend on the prevalence of the outcome, it is not appropriate to judge predictive ability on these values alone given the low prevalence of wound infection-related hospitalizations or ED visits (1.39%).
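
The dependence of predictive values on prevalence follows directly from Bayes' rule and can be sketched as below. The function name is hypothetical. The sensitivity and specificity are those reported in Table 6 for the ANN with NLP variables, and the prevalence is the 1.39% quoted above; the 10% prevalence in the second call is a purely illustrative value.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# ANN with NLP variables (Table 6) at the study prevalence of 1.39%.
ppv, npv = predictive_values(0.861, 0.749, 0.0139)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")

# Same sensitivity and specificity at a hypothetical 10% prevalence.
ppv_high, npv_high = predictive_values(0.861, 0.749, 0.10)
print(f"PPV = {ppv_high:.3f}, NPV = {npv_high:.3f}")
```

At the study prevalence the calculation gives values close to those in Table 6, and the same model would yield a much higher positive predictive value in a population where the outcome is more common.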

In addition, this study provided important insights into the value of mining information from unstructured clinical notes. The authors demonstrated that the models improved when both structured data and unstructured clinical notes were incorporated. In the process of building predictive models, issues such as redundancy and overfitting commonly arise. If many candidate independent variables are investigated, the chance of redundancy rises because of hidden relationships between variables or the presence of other measured or unmeasured relevant features.32 In addition, even when the independent variables are unrelated, a predictive model with numerous independent variables (ie, a complex model) might overfit.32 That is, its predictive ability would be high on training datasets but poor on test datasets. Therefore, a simpler model with fewer independent variables is generally preferred for building an accurate model.

Despite these concerns, this study showed that the three models using both structured and unstructured datasets had better risk prediction performance than the models with only structured datasets (35 vs 26 independent variables). Further, as seen in the feature importance in the random forest model (Figure), the independent variables of word expressions related to wound infection were not located in the top five rankings; rather, they were mostly located in the middle or lower rankings. This means that even though the word expressions related to wound infection (extracted through NLP) were not top-ranked features for the classification, they contributed to improving the predictive ability of the three models (range, 6%–13.2%).

For a study with a large dataset, the quality of the data is more important than its quantity. Unstructured clinical notes can add to that quality because they contain richer information, such as patients' subjective expressions or healthcare providers' narrative descriptions of objective assessments. Therefore, information from both structured standardized tools for routine assessments and unstructured narrative text charting should continue to be used in future studies to investigate associations accurately and to improve the predictive ability of predictive models.


In theory, early risk prediction can help healthcare providers take quick action, thus preventing negative outcomes.33–36 Although predictive risk modeling can help identify infections earlier, further studies should test the impact of risk identification on reducing negative outcomes in clinical trials. Even though the sample was large enough to fulfill the objectives of the study, the findings and the predictive models might not be widely generalizable because this study was conducted using data from a single HHC agency that serves mostly urban populations. Further, because a single year of data was used (2014), it is possible that the circumstances of that particular year might have affected the study outcomes.

Because of the high number of independent variables and the low prevalence of the negative outcome, it is plausible that the model performance was artificially inflated by small-sample bias. Therefore, a weighting method, such as propensity scoring, might be useful in future studies to adjust for these types of biases.37

Further limitations are attributable to the OASIS dataset because it captures only a snapshot on admission to HHC, not the fluctuations of conditions during the HHC episode. Many variables in the dataset were excluded from the initial analysis because of missing values. However, most of the missing values resulted from the question design; follow-up questions did not always have to be completed (eg, skip the following question if you answered “No” and go to “Question #”). Therefore, in future studies, post hoc analysis of the variables with many missing values, rather than their exclusion, might prevent information loss. Further, although patients who needed hospitalization or ED visits were identified with OASIS dataset codes in this study, the content of free-text clinical notes could also be used to identify these patients.38 Last, unmeasured factors such as environmental barriers or psychological aspects of care adherence that could not be assessed within the dataset might have acted as confounders.


The risk of wound infection-related hospitalization or ED visits was identified using both a structured dataset and unstructured information extracted from clinical notes via NLP. Notably, word expressions of wound type and foul odor in the patient’s clinical notes were highly associated with increased risk of wound infection-related hospitalization or ED visits. In comparing different models, logistic regression showed the best risk prediction performance for a low-prevalence event such as wound infection-related hospitalization or ED visits in HHC. In addition, incorporating text data from clinical notes through NLP alongside structured data improved the performance of the predictive risk models.


1. Medicare Payment Advisory Commission. Report to the Congress: Medicare Payment Policy. Chapter 9: Home Health Care Services. March 2019. Last accessed May 4, 2021.
2. Shang J, Larson E, Liu J, Stone P. Infection in home health care: results from national outcome and assessment information set data. Am J Infect Control 2015;43(5):454–9.
3. Shang J, Russell D, Dowding D, et al. A predictive risk model for infection-related hospitalization among home healthcare patients. J Healthc Qual 2020;42(3):136–47.
4. Dowding D, Russell D, Trifilio M, McDonald MV, Shang J. Home care nurses' identification of patients at risk of infection and their risk mitigation strategies: a qualitative interview study. Int J Nurs Stud 2020;107:103617.
5. Perencevich EN, Sands KE, Cosgrove SE, Guadagnoli E, Meara E, Platt R. Health and economic impact of surgical site infections diagnosed after hospital discharge. Emerg Infect Dis 2003;9(2):196–203.
6. Andersson M, Östholm-Balkhed Å, Fredrikson M, et al. Delay of appropriate antibiotic treatment is associated with high mortality in patients with community-onset sepsis in a Swedish setting. Eur J Clin Microbiol Infect Dis 2019;38(7):1223–34.
7. Nauclér P, Huttner A, van Werkhoven CH, et al. Impact of time to antibiotic therapy on clinical outcome in patients with bacterial infections in the emergency department: implications for antimicrobial stewardship. Clin Microbiol Infect 2021;27(2):175–81.
8. Boga SM. Nursing practices in the prevention of post-operative wound infection in accordance with evidence-based approach. Int J Caring Sci 2019;12(2):1228.
9. Cadogan J, Baldwin D, Carpenter S, et al. Identification, diagnosis and treatment of wound infection. Nurs Stand 2011;26(11):44–8.
10. Scardoni A, Balzarini F, Signorelli C, Cabitza F, Odone A. Artificial intelligence-based tools to control healthcare associated infections: a systematic review of the literature. J Infect Public Health 2020;13(8):1061–77.
11. Goto T, Camargo CA Jr, Faridi MK, Freishtat RJ, Hasegawa K. Machine learning-based prediction of clinical outcomes for children during emergency department triage. JAMA Netw Open 2019;2(1):e186937.
12. Dublin S, Baldwin E, Walker RL, et al. Natural language processing to identify pneumonia from radiology reports. Pharmacoepidemiol Drug Saf 2013;22(8):834–41.
13. Liu V, Clark MP, Mendoza M, et al. Automated identification of pneumonia in chest radiograph reports in critically ill patients. BMC Med Inform Decis Mak 2013;13:90.
14. Kang CM, Chang SC, Chen PL, et al. Comparison of family partnership intervention care vs. conventional care in adult patients with poorly controlled type 2 diabetes in a community hospital: a randomized controlled trial. Int J Nurs Stud 2010;47(11):1363–73.
15. Brown P. Quick Reference Guide to Wound Care: Palliative, Home and Clinical Practices. Burlington, MA: Jones & Bartlett Learning; 2013.
16. Mirto IM, Monteleone M, Silberztein M. Formalizing natural languages with NooJ 2018 and its natural language processing applications. In: 12th International Conference, NooJ 2018; Palermo, Italy; June 20-22, 2018; Revised Selected Papers. Vol 987. Springer; 2018.
17. Woo K, Song J, Adams V, et al. Exploring prevalence of wound infections and related patient characteristics in homecare using natural language processing. Int Wound J 2021.
18. Centers for Medicare & Medicaid Services. OASIS User Manuals. May 2019. Last accessed May 4, 2021.
19. Topaz M, Murga L, Bar-Bachar O, McDonald M, Bowles K. NimbleMiner: an open-source nursing-sensitive natural language processing system based on word embedding. CIN Comput Inform Nurs 2019;37(11):583–90.
20. Schaffer C. Selecting a classification method by cross-validation. Mach Learn 1993;13(1):135–43.
21. Hosmer DW Jr, Lemeshow S, Sturdivant RX. Applied Logistic Regression. Vol 398. John Wiley & Sons; 2013.
22. Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol 2019;110:12–22.
23. Breiman L. Random forests. Mach Learn 2001;45(1):5–32.
24. Liaw A, Wiener M. Classification and regression by randomForest. R News 2002;2(3):18–22.
25. Hill T, Marquez L, O'Connor M, Remus W. Artificial neural network models for forecasting and decision making. Int J Forecast 1994;10(1):5–15.
26. Venables WN, Ripley BD. Modern Applied Statistics With S-PLUS. Springer Science & Business Media; 2013.
27. Lohman MC, Scherer EA, Whiteman KL, Greenberg RL, Bruce ML. Factors associated with accelerated hospitalization and re-hospitalization among Medicare home health patients. J Gerontol Ser A 2017;73(9):1280–6.
28. Shang J, Wang J, Adams V, Ma C. Risk factors for infection in home health care: analysis of national outcome and assessment information set data. Res Nurs Health 2020;43(4):373–86.
29. Li Q, Shao J. Regularizing LASSO: a consistent variable selection method. Stat Sin 2015;25(3):975–92.
30. Nusinovici S, Tham YC, Chak Yan MY, et al. Logistic regression was as good as machine learning for predicting major chronic diseases. J Clin Epidemiol 2020;122:56–69.
31. Liu MC, Kuo W, Sastri T. An exploratory study of a neural network approach for reliability data analysis. Qual Reliab Eng Int 1995;11(2):107–12.
32. Shi L, Westerhuis JA, Rosén J, Landberg R, Brunius C. Variable selection and validation in multivariate modelling. Bioinformatics 2018;35(6):972–80.
33. Bates DW, Saria S, Ohno-Machado L, Shah A, Escobar G. Big data in health care: using analytics to identify and manage high-risk and high-cost patients. Health Aff (Millwood) 2014;33(7):1123–31.
34. Neuberger L, Silk KJ. Uncertainty and information-seeking patterns: a test of competing hypotheses in the context of health care reform. Health Commun 2016;31(7):892–902.
35. Crockett DK. Why predictive modeling in healthcare requires a data warehouse. 2017. Last accessed May 4, 2021.
36. Watson K. Predictive Analytics in Health Care: Emerging Value and Risks. Deloitte Development LLC; 2019.
37. Pirracchio R, Resche-Rigon M, Chevret S. Evaluation of the propensity score methods for estimating marginal odds ratios in case of small sample size. BMC Med Res Methodol 2012;12:70.
38. Topaz M, Woo K, Ryvicker M, Zolnoori M, Cato K. Home healthcare clinical notes predict patient hospitalization and emergency department visits. Nurs Res 2020;69(6):448–54.

Keywords: home healthcare; machine learning; natural language processing; predictive risk model; wound infection

Copyright © 2021 Wolters Kluwer Health, Inc. All rights reserved.