Machine Learning for Prediction of Patients on Hemodialysis with an Undetected SARS-CoV-2 Infection : Kidney360

Original Investigations: Dialysis

Machine Learning for Prediction of Patients on Hemodialysis with an Undetected SARS-CoV-2 Infection

Monaghan, Caitlin K.1; Larkin, John W.1; Chaudhuri, Sheetal1,2; Han, Hao1; Jiao, Yue1; Bermudez, Kristine M.3; Weinhandl, Eric D.3; Dahne-Steuber, Ines A.3; Belmonte, Kathleen4; Neri, Luca5; Kotanko, Peter6,7; Kooman, Jeroen P.2; Hymes, Jeffrey L.3; Kossmann, Robert J.3; Usvyat, Len A.1; Maddux, Franklin W.8

Kidney360 2(3):p 456-468, March 2021. | DOI: 10.34067/KID.0003802020

Abstract

Key Points

  • We developed a machine learning predictive model to detect patients on dialysis with a SARS-CoV-2 infection 3 days before symptom onset.
  • Changes in physiologic markers were subtle independently; model appeared to detect important combinations for each patient’s prediction.
  • We proposed a conceptual workflow for application of model-directed mitigation and testing within the standard practices of a provider.

Introduction

The coronavirus disease 2019 (COVID-19) pandemic is challenging the world’s health care systems, including bringing complexities to the maintenance of dialysis in people with ESKD (1–5). In the United States, most patients with ESKD are treated by outpatient hemodialysis (HD), where social distancing can be difficult and heightened infection control measures are required (e.g., temperature screenings, universal masking, isolation treatments/shifts/clinics) (1–5). Patients with ESKD are typically older and have multiple comorbidities, placing the population at higher risk for requiring intensive care and dying if affected by COVID-19 (6,8–12).

Early reports from the United States show an 11% COVID-19 mortality in ESKD (13), which is higher than the 3% COVID-19 mortality shown in the national population (14,15). This is not unexpected, with reports from Asia and Europe suggesting a 16% to 23% COVID-19 mortality in ESKD (16–19). Despite the high mortality rate, an impaired immune response may render patients on dialysis more frequently asymptomatic when infected by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (16,17). In both the general and ESKD populations, the most prevalent symptoms of COVID-19 at presentation are fever (11%–66% in dialysis; 82% in the general population) and cough (37%–57% in dialysis; 62% in the general population) (16,20–22). The less frequent occurrence of signs and symptoms indicative of COVID-19 in patients on dialysis could be making the outbreak even more challenging to manage.

Dialysis providers routinely capture patient/clinical data during care. The robust data collected during HD treatments (generally thrice weekly) provide unique opportunities to leverage artificial intelligence (AI) in predicting COVID-19 outcomes. AI modeling helped identify the onset of the outbreak in China (23,24), and is being used to help with early detection of areas and individuals in the general population at risk for COVID-19 (25–27).

As part of a health care operations effort in response to the COVID-19 outbreak, an integrated kidney disease health care company aimed to develop a machine learning (ML) prediction model that identifies the risk of patients on HD having an undetected SARS-CoV-2 infection. We analyzed the model performance to determine the possible utility for testing in the HD population.

Materials and Methods

General

An integrated kidney disease health care company (Fresenius Medical Care, Waltham, MA) used retrospective real world data from its national network of dialysis clinics to develop an ML model that predicts the risk of an adult patient on HD having an undetected SARS-CoV-2 infection that is identified after the following ≥3 days.

This analysis was performed in adherence with the Declaration of Helsinki under an initial and revised protocol reviewed by the New England Independent Review Board (NEIRB). This retrospective analysis was determined to be exempt and did not require patient consent (Protocol version 1.0 NEIRB#1–17–1302368–1; Protocol revision version 1.1 NEIRB#17–1348994–1; Needham Heights, MA).

COVID-19 Mitigation and Testing Practices

The national network of dialysis clinics (Fresenius Kidney Care, Waltham, MA) started implementing modified infection control measures in late February 2020, in response to the COVID-19 outbreak in the general population. Universal mitigation efforts at the provider included screening patients and staff before entry into the dialysis facility for high body temperature, signs or symptoms of flu-like illness, exposure to others with COVID-19, or a known infection diagnosed elsewhere (28). Patients and staff were required to thoroughly wash their hands on entering and leaving the facility. Patients were provided surgical masks and were required to wear them when in any area of the facility. Staff were required to wear enhanced personal protective equipment, including masks, face shields, gowns, and gloves, when in the proximity of patients in any area. The first patients on dialysis (n=2) at the provider were identified as COVID-19 positive on March 3, 2020.

All patients and staff with an elevated body temperature or symptoms of a flu-like illness were considered under investigation, and had RT-PCR laboratory testing for SARS-CoV-2 performed at a laboratory contracted by the dialysis provider. Patients under laboratory investigation for a SARS-CoV-2 infection were treated in dedicated isolation areas (rooms, shifts, or clinics) for patients who were suspected of being infected, until confirmed negative by two RT-PCR tests that were more than 24 hours apart. Patients who had been exposed to others with COVID-19 were moved to unique isolation areas for patients who had been exposed under investigation for 14 days, and received RT-PCR testing if they presented with signs or symptoms of a flu-like illness. Patients with RT-PCR–confirmed COVID-19 were treated in dedicated isolation areas for patients who were infected until two negative RT-PCR tests more than 24 hours apart were documented.

Population and Outcome

We considered data from adult (age ≥18 years) patients on HD treated throughout the national network for development of a model to predict individuals with an undetected SARS-CoV-2 infection. The observation period started on February 27, 2020. The positive arm included data from patients who had ≥1 confirmed positive RT-PCR COVID-19 test as of the end of the observation period (September 8, 2020, n=11,166). The negative arm included data from patients who: (1) were found COVID-19 negative (n=7959), or (2) were randomly sampled from all active patients at the dialysis provider without a reported suspicion of COVID-19 as of the end of the observation period (n=21,365). The random sampling was performed using the “sample” function from the “pandas” Python package.

We defined the index date of a patient on HD having a SARS-CoV-2 infection as the date of the COVID-19–positive test. In control patients with a negative COVID-19 test result, the test date was used as the index date. In controls without a test, the index date was randomly sampled from the positive patients’ index dates occurring before August 25, 2020, 2 weeks before the end of the observation period. This cutoff was chosen to minimize the possibility that patients in the control group were infected but had not displayed signs or symptoms leading to testing before the end of the observation period. We included data from patients with (1) ≥1 hemoglobin sample collected both 1–14 days and 31–60 days before the individual’s prediction date (3 days before the index date, further defined below), and (2) ≥1 HD treatment both 1–7 days and 31–60 days preceding the prediction date. This was done to ensure we included only patients who were active, because hemoglobin draws are conducted weekly for in-center HD (typically thrice-weekly treatments). We excluded data from patients suspected to have COVID-19 who were pending laboratory testing, or were classified as a person under investigation where no laboratory testing was performed or documented.
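The eligibility rule described above can be illustrated with a short sketch. It is not the authors' code: the function names, the inclusive treatment of window endpoints, and the example dates are assumptions made for illustration only.

```python
from datetime import date, timedelta

LEAD_DAYS = 3  # the prediction date is 3 days before the index date (from the text)


def prediction_date(index_date: date) -> date:
    """Prediction date for a given index date."""
    return index_date - timedelta(days=LEAD_DAYS)


def has_event_in_window(event_dates, pred_date, start_days, end_days):
    """True if any event falls start_days..end_days before pred_date.

    Inclusive endpoints are an assumption; the article only gives the ranges.
    """
    return any(start_days <= (pred_date - d).days <= end_days for d in event_dates)


def is_eligible(index_date, hgb_dates, hd_dates):
    """Apply the two inclusion criteria: hemoglobin samples and HD treatments."""
    pred = prediction_date(index_date)
    return (
        has_event_in_window(hgb_dates, pred, 1, 14)
        and has_event_in_window(hgb_dates, pred, 31, 60)
        and has_event_in_window(hd_dates, pred, 1, 7)
        and has_event_in_window(hd_dates, pred, 31, 60)
    )


# Hypothetical patient: index date 2020-06-30, so the prediction date is 2020-06-27.
index = date(2020, 6, 30)
hgb = [date(2020, 6, 20), date(2020, 5, 20)]  # 7 and 38 days before the prediction date
hd = [date(2020, 6, 25), date(2020, 5, 15)]   # 2 and 43 days before the prediction date
eligible = is_eligible(index, hgb, hd)
```

In this hypothetical case the patient has a measurement in every required window and is therefore retained; dropping the May hemoglobin sample would exclude the patient.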

AI Model Development

Software and ML Model Logic

We used Python version 3.7.7 (Python Software Foundation, Delaware) to build the ML model utilizing the XGBoost package (29). The XGBoost Python package used input variables from the training dataset to construct multiple decision trees, giving each a random sample, and established a series of thresholds that split variables to maximize the information gain. Decision trees were constructed iteratively, and new decision trees were added to predict prior errors. The decision trees made by the XGBoost ML model are inherently able to handle missing values without imputation, by including their presence when determining the splits (e.g., splitting observations with temperatures ≥98.0°F (≥36.7°C) from temperatures <98.0°F (<36.7°C), or missing temperatures). After no further improvements in performance were achieved using the validation dataset (also used for hyperparameter tuning), the ensemble of decision trees produced the final ML model that was assessed with the testing dataset.
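To make the handling of missing values concrete, the following is a simplified, pure-Python sketch of a single decision split that also chooses a direction for missing values. It is only an illustration of the idea: XGBoost scores candidate splits with a gradient-based gain, whereas this sketch swaps in weighted Gini impurity, and the temperatures and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def weighted_impurity(left, right):
    """Size-weighted Gini impurity of the two child nodes."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def best_split_with_missing(values, labels, thresholds):
    """Try every threshold and both directions for missing values (None).

    Returns (threshold, missing_goes_left, impurity) with the lowest impurity.
    """
    best = None
    for t in thresholds:
        for missing_left in (True, False):
            left, right = [], []
            for v, y in zip(values, labels):
                goes_left = missing_left if v is None else v < t
                (left if goes_left else right).append(y)
            score = weighted_impurity(left, right)
            if best is None or score < best[2]:
                best = (t, missing_left, score)
    return best


# Hypothetical pre-HD temperatures (°F); None marks a missing measurement.
temps = [97.2, 97.5, 97.9, 98.4, 98.8, 99.1, None, None]
labels = [0, 0, 0, 1, 1, 1, 1, 1]
threshold, missing_left, impurity = best_split_with_missing(
    temps, labels, thresholds=[97.0, 98.0, 99.0]
)
```

In this toy data, the best split sends temperatures below 98.0°F to one branch and both higher and missing temperatures to the other, which is the kind of grouping the text describes.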

Undetected SARS-CoV-2 Prediction Model

We used 81 a priori selected treatment/laboratory variables up to the individually defined prediction date (3 days before the index date defined above) to predict the risk of a SARS-CoV-2 infection being identified in the following ≥3 days (Figure 1). This is intended to yield individual predictions at least 3 days in advance of symptoms that warranted testing. We used a 60%:20%:20% randomized split of COVID-19–positive samples for the training, validation, and testing datasets, and added the same number of patients who were COVID-19 negative to only the training and validation datasets. The testing dataset used to evaluate final model performance had a higher number of COVID-19–negative samples added to more closely match the prevalence observed in the overall national HD population (30,31).
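The split described above can be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: the function names and the seed are hypothetical, and the sketch assumes that every negative patient not used for training or validation is placed in the testing dataset, which the article does not state explicitly.

```python
import random


def split_cohort(positive_ids, negative_ids, seed=0):
    """60/20/20 split of positives, with balanced negatives in train/validation."""
    rng = random.Random(seed)
    pos, neg = list(positive_ids), list(negative_ids)
    rng.shuffle(pos)
    rng.shuffle(neg)

    n_train = round(0.6 * len(pos))
    n_val = round(0.2 * len(pos))
    pos_train = pos[:n_train]
    pos_val = pos[n_train:n_train + n_val]
    pos_test = pos[n_train + n_val:]

    # Same number of negatives as positives in training and validation.
    neg_train = neg[:len(pos_train)]
    neg_val = neg[len(pos_train):len(pos_train) + len(pos_val)]
    # Assumption: all remaining negatives go to the testing dataset.
    neg_test = neg[len(pos_train) + len(pos_val):]

    return {
        "train": (pos_train, neg_train),
        "validation": (pos_val, neg_val),
        "test": (pos_test, neg_test),
    }


def prevalence(split):
    """Proportion of positives in a (positives, negatives) pair."""
    pos, neg = split
    return len(pos) / (len(pos) + len(neg))


# Cohort sizes reported in the article; the two ranges are separate ID pools.
splits = split_cohort(range(11_166), range(29_324))
test_prevalence = prevalence(splits["test"])
```

Under these assumptions the training and validation datasets are balanced at 50% and the testing dataset lands near a 10% prevalence, in line with the design described in the text.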

Figure 1.
Prediction timeline for data ascertainment and prediction of patients on hemodialysis (HD) with and without severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection identified in the subsequent ≥3 days. Machine learning (ML) model used HD treatment variables (†mean values 1–7 days before the prediction date; ‡difference in mean values 31–60 days to 1–7 days before the prediction date) and laboratory variables (◊mean values 1–14 days before the prediction date; ○difference in mean values 31–60 days to 1–14 days before the prediction date) for prediction of SARS-CoV-2 infection.
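The two kinds of treatment inputs named in the caption (a recent mean and a change from the prior month) could be derived along these lines. The sketch assumes the change is computed as the recent mean minus the 31–60 day mean, so that a decrease appears as a negative value; the function names and temperatures are hypothetical.

```python
from statistics import mean


def window_mean(measurements, start_days, end_days):
    """Mean of values measured start_days..end_days before the prediction date.

    measurements is a list of (days_before_prediction, value) pairs.
    Returns None when the window is empty, leaving the feature missing.
    """
    vals = [v for d, v in measurements if start_days <= d <= end_days]
    return mean(vals) if vals else None


def treatment_features(measurements):
    """Recent mean (1-7 days) and change from the 31-60 day baseline."""
    recent = window_mean(measurements, 1, 7)
    baseline = window_mean(measurements, 31, 60)
    # Assumed sign convention: recent minus baseline.
    change = None if recent is None or baseline is None else recent - baseline
    return {"recent_mean": recent, "change": change}


# Hypothetical pre-HD body temperatures (°F) as (days before prediction date, value).
temps = [(2, 97.9), (4, 98.1), (6, 98.0), (35, 97.6), (42, 97.8), (50, 97.7)]
features = treatment_features(temps)
```

For laboratory variables the same helper would apply with a 1–14 day recent window instead of 1–7 days.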

Statistical Methods

Descriptive Statistics

Descriptive statistics for patients on HD were tabulated for demographics and variables at the time of the prediction for an undetected SARS-CoV-2 infection. Data are stratified by patients on HD who did, or did not, have laboratory confirmation of COVID-19 after the date of prediction.

Analysis of ML Model Feature Importance

Shapley values (32,33) were calculated using the SHAP Python package to determine the influence of each variable on the predictions (34,35). SHAP values are calculated for each variable and each observation, representing a measure of effect (positive or negative value) of the observed value on each individual prediction. SHAP methods withhold and include individual inputs in all possible combinations, and compare differences between withheld and included data, to compute the mean value of all possible differences for attributing the feature importance. SHAP values are output as log odds (i.e., the logarithm of the odds), meaning they are additive explanations of feature importance. SHAP values for each variable are summed for each set of observations (in this case, for each patient), and converted from log odds to probability, which is then output by the model as the prediction. Thus, the more positive SHAP values increase the predicted probability, whereas more negative SHAP values decrease it. Overall feature importance for individual variables in the model was calculated from the SHAP values using the mean absolute values for each variable across all observations.
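The additive property described above can be checked on a toy example. The sketch below computes exact Shapley values by enumerating all feature subsets for a hypothetical three-variable log-odds function; it is not the study model, and "withholding" a feature is simplified here to replacing it with a fixed baseline value, whereas the SHAP package uses more refined estimators for tree models.

```python
from itertools import combinations
from math import exp, factorial


def sigmoid(z):
    """Convert log odds to probability."""
    return 1.0 / (1.0 + exp(-z))


def toy_log_odds(x):
    """Hypothetical log-odds model with one interaction; NOT the study model."""
    return (
        -2.0
        + 1.5 * x["temp_change"]
        - 1.0 * x["idwg_change"]
        + 0.5 * x["temp_change"] * x["pulse_change"]
    )


def shapley_values(model, x, baseline):
    """Exact Shapley values: weighted mean marginal effect over all subsets."""
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for k in range(n):  # subset sizes 0..n-1 of the other features
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Features outside the subset are "withheld" (set to baseline).
                without = {m: (x[m] if m in subset else baseline[m]) for m in names}
                with_feature = dict(without, **{name: x[name]})
                total += weight * (model(with_feature) - model(without))
        phi[name] = total
    return phi


baseline = {"temp_change": 0.0, "idwg_change": 0.0, "pulse_change": 0.0}
patient = {"temp_change": 0.4, "idwg_change": -0.5, "pulse_change": 2.0}

phi = shapley_values(toy_log_odds, patient, baseline)
predicted_log_odds = toy_log_odds(baseline) + sum(phi.values())
predicted_probability = sigmoid(predicted_log_odds)
```

The baseline log odds plus the summed contributions reproduces the model output for the patient, and the sigmoid converts that sum to a probability. In this toy case the decrease in interdialytic weight gain contributes a positive value, pushing the predicted probability upward.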

Analysis of ML Model Performance

Performance of the ML model was measured by the area under the receiver operating characteristic curve (AUROC) in the training, validation, and testing datasets, and the recall, precision, and lift in the testing dataset. Additionally, we evaluated the area under the precision-recall curve (AUPRC) in the testing dataset.

AUROC measures the rate of true and false positives classified by the prediction model across probability thresholds. The definition of true/false positives and negatives is shown in Table 1.

Table 1. - Definition of true/false positive and negative predictions classified by the model in the assessment of performance in the testing dataset
Classification: Group
True positives: Patients classified as COVID-19 positive by the model who were in the COVID-19–positive group
False positives: Patients classified as COVID-19 positive by the model who were in the COVID-19–negative group
True negatives: Patients classified as COVID-19 negative by the model who were in the COVID-19–negative group
False negatives: Patients classified as COVID-19 negative by the model who were in the COVID-19–positive group
COVID-19, coronavirus disease 2019.

Recall (sensitivity) measures the rate of true positives classified by the model at a specified threshold and is calculated as follows:

Recall = number of true positives classified by model / (number of true positives classified by model + number of false negatives classified by model)

Precision measures the positive predictive value for the model at a specified threshold and is calculated as follows:

Precision = number of true positives classified by model / (number of true positives classified by model + number of false positives classified by model)

Lift measures the effectiveness of the model compared with random sampling and is calculated as follows:

Lift = model precision / proportion of positives in dataset

AUPRC measures the ratio of precision for corresponding recall values across probability thresholds (36).
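The recall, precision, and lift definitions above translate directly into a few lines of code. The following is a minimal sketch with hypothetical labels and predicted probabilities; treating a probability equal to the threshold as positive is an assumption.

```python
def classification_metrics(y_true, y_prob, threshold):
    """Recall, precision, and lift at a given probability threshold."""
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if p >= threshold and y == 1)
    fp = sum(1 for y, p in pairs if p >= threshold and y == 0)
    fn = sum(1 for y, p in pairs if p < threshold and y == 1)

    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    prevalence = sum(y_true) / len(y_true)  # proportion of positives in dataset
    lift = precision / prevalence if prevalence else 0.0
    return {"recall": recall, "precision": precision, "lift": lift}


# Hypothetical outcomes (1 = COVID-19 positive) and model probabilities.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
y_prob = [0.90, 0.85, 0.20, 0.40, 0.10, 0.30, 0.05, 0.60, 0.82, 0.15]
metrics = classification_metrics(y_true, y_prob, threshold=0.80)
```

Here three patients are flagged, two of them correctly, and one positive patient is missed, so recall and precision are both 2/3 and lift is about 2.2 against a 30% prevalence.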

AUROC, AUPRC, recall, and precision metrics yield scores on a scale of 0 (lowest) to 1 (highest). A model performing at chance would yield an AUROC of 0.5, an AUPRC equal to the proportion of positives in the dataset, and a lift value of 1. The cutoff threshold for classifying predictions was selected to optimize recall, precision, and lift according to the use case.
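The chance-level AUROC of 0.5 can be seen from the pairwise interpretation of the metric: AUROC equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half. The sketch below uses that interpretation on hypothetical scores; it is a small illustration, not the evaluation code used in the study.

```python
def auroc(y_true, y_score):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 0, 0, 0]
perfect = auroc(y_true, [0.9, 0.8, 0.3, 0.2, 0.1])        # every positive outranks every negative
uninformative = auroc(y_true, [0.5, 0.5, 0.5, 0.5, 0.5])  # identical scores for everyone
partial = auroc(y_true, [0.9, 0.2, 0.8, 0.3, 0.1])        # 4 of 6 pairs ranked correctly
```

A model that assigns the same score to every patient cannot rank any pair, which yields 0.5.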

Results

Patient Characteristics

We identified data from a select cohort of 40,490 patients on HD meeting eligibility criteria (11,166 patients who were COVID-19 positive and 29,324 who were unaffected and served as the control group). The prevalence of COVID-19 in the cohort (28% COVID-19 positive) was by design higher than the HD population. The prevalence of patients who were COVID-19 positive (about 50% COVID-19 positive) in the training and validation datasets was balanced by design for model building purposes. For the testing dataset used to evaluate final model performance, there was a 10% prevalence of patients who were COVID-19 positive on the basis of the designed data split that was made to estimate the prevalence observed in the national HD population (30,31).

In the cohort, there was a higher proportion of patients on HD with a SARS-CoV-2 infection who were of Black race, Hispanic ethnicity, and had diabetes (Table 2). Mean values for the 81 treatment and laboratory variables before a SARS-CoV-2 infection being identified in the subsequent ≥3 days (or concurrent index date in controls) are shown in Tables 3 and 4.

Table 2. - Demographics and comorbidities of patients on hemodialysis with and without an undetected severe acute respiratory syndrome coronavirus 2 infection identified in the subsequent ≥3 d
Variable Unaffected Patients Coronavirus Disease 2019+
Number of patients on HD 29,324 11,166
Age (yr), mean±SD 62.66±14.25 62.62±13.92
Male, n (%) 16,614 (57) 6149 (55)
White race, n (%) 12,021 (41) 4338 (39)
Black race, n (%) 7838 (27) 3354 (30)
Other race, n (%) 1223 (4) 372 (3)
Unknown race, n (%) 8242 (28) 3102 (28)
Hispanic ethnicity, n (%) 2849 (14) 1831 (23)
BMI (kg/m²), mean±SD 29.26±7.71 29.45±7.83
Dialysis vintage (yr), mean±SD 3.75±4.11 3.96±4.09
Diabetes, n (%) 19,186 (66) 8085 (73)
CHF, n (%) 6710 (23) 2595 (23)
Ischemic heart disease, n (%) 7647 (26) 2830 (26)
Central venous catheter access, n (%) 6799 (23) 2738 (25)
Age, sex, and catheter access variables were included in the ML prediction model to classify the risk of an individual HD patient having a SARS-CoV-2 infection being identified in the following ≥3 d. HD, hemodialysis; BMI, body mass index; CHF, congestive heart failure; ML, machine learning; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.

Table 3. - Clinical and treatment characteristics of patients on hemodialysis with and without an undetected severe acute respiratory syndrome coronavirus 2 infection identified in the subsequent ≥3 d
Variable Unaffected Patients, Mean±SD; N Coronavirus Disease 2019+ Patients, Mean±SD; N
Number of patients on HD 29,324 11,166
Pre-HD sitting SBP (mm Hg) a 148.31±22.83; 29,324 146.03±23.03; 11,166
Change in pre-HD sitting SBP (mm Hg) b −0.40±15.78; 29,324 −1.95±16.72; 11,166
Pre-HD sitting DBP (mm Hg) a 76.87±13.86; 29,322 75.44±13.58; 11,166
Change in pre-HD sitting DBP (mm Hg) b −0.32±9.03; 29,322 −0.88±9.48; 11,166
Pre-HD weight (kg) a 85.71±24.51; 29,323 85.09±24.51; 11,165
Change in pre-HD weight (kg) b −0.17±2.24; 29,323 −0.66±2.73; 11,165
Pre-HD body temperature (°F) a 97.56±0.61; 29,324 97.76±0.66; 11,166
Change in pre-HD body temperature (°F) b 0.07±0.56; 29,324 0.22±0.65; 11,166
Post-HD sitting SBP (mm Hg) a 140.40±21.60; 29,321 144.44±21.62; 11,166
Change in post-HD sitting SBP (mm Hg) b 0.43±14.98; 29,320 1.55±15.74; 11,166
Post-HD sitting DBP (mm Hg) a 73.91±12.58; 29,319 73.56±12.33; 11,166
Change in post-HD sitting DBP (mm Hg) b 0.15±8.49; 29,318 0.41±8.79; 11,166
Post-HD body temperature (°F) a 97.58±0.56; 29,318 97.70±0.62; 11,166
Change in post-HD body temperature (°F) b 0.03±0.50; 29,317 0.14±0.57; 11,165
Pre-HD respirations per min a 17.64±1.16; 29,324 17.72±1.15; 11,166
Change in pre-HD respirations per min b −0.001±0.97; 29,324 0.01±1.02; 11,166
Pre-HD pulse (BPM) a 79.00±12.11; 29,324 79.02±11.90; 11,166
Change in pre-HD pulse (BPM) b 0.11±7.26; 29,324 1.06±7.56; 11,166
Post-HD respirations per min a 17.56±1.15; 29,320 17.65±1.13; 11,165
Change in post-HD respirations per min b −0.007±0.95; 29,319 0.0004±0.99; 11,165
Post-HD pulse (BPM) a 75.80±11.23; 29,321 77.23±11.16; 11,166
Change in post-HD pulse (BPM) b −0.32±7.16; 29,320 1.30±7.87; 11,166
IDWG (kg) a 2.24±1.21; 29,083 1.95±1.29; 11,039
Change in IDWG (kg) b 0.01±0.90; 29,004 −0.26±1.09; 10,991
Post-HD weight loss (kg) a −2.26±1.07; 29,317 −2.06±1.07; 11,160
Change in post-HD weight loss (kg) b −0.01±0.68; 29,316 0.18±0.77; 11,159
Post-HD body temperature change a 0.01±0.66; 29,318 −0.06±0.70; 11,165
Change in post-HD body temperature change b −0.04±0.66; 29,317 −0.07±0.71; 11,165
Post-HD respirations per min change a −0.08±0.97; 29,320 −0.07±0.97; 11,165
Change in post-HD respirations per min change b −0.01±1.04; 29,319 −0.01±1.07; 11,165
Post-HD pulse change (BPM) a −3.20±8.86; 29,321 −1.79±8.77; 11,166
Change in post-HD pulse change (BPM) b −0.43±7.75; 29,320 0.24±8.06; 11,166
% HD treatments with nasal oxygen administered a 5.23±18.52; 29,324 5.67±19.16; 11,166
Change in % HD treatments with nasal oxygen administered b 0.37±13.40; 29,324 0.72±14.12; 11,166
All variables were included in the ML prediction model to classify the risk of an individual HD patient having a SARS-CoV-2 infection being identified in the following ≥3 d. (100°F−32) ×5/9=37.8°C. HD, hemodialysis; SBP, systolic blood pressure; DBP, diastolic blood pressure; BPM, beats per minute; IDWG, interdialytic weight gain; post-HD weight loss, post-HD minus pre-HD weight (kg); ML, machine learning; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
aMean values of HD treatment variables 1–7 d before the prediction date (i.e., 3 d before suspicion of SARS-CoV-2 infection in standard clinical practice).
bMean values of the difference in HD treatment variables 31–60 d to 1–7 d before the prediction date.

Table 4. - Laboratory characteristics of patients on hemodialysis with and without an undetected severe acute respiratory syndrome coronavirus 2 infection identified in the subsequent ≥3 d
Variable Unaffected Patients, Mean±SD; N Coronavirus Disease 2019+ Patients, Mean±SD; N
Number of patients on HD 29,324 11,166
Albumin (g/dl) a 3.79±0.40; 13,723 3.69±0.46; 5252
Change in albumin (g/dl) b −0.002±0.25; 13,139 −0.03±0.27; 5012
Creatinine (mg/dl) a 8.42±3.06; 13,323 8.41±3.14; 5113
Change in creatinine (mg/dl) b 0.08±1.40; 12,711 0.16±1.52; 4860
Bicarbonate (mmol/L) a 24.24±3.05; 13,395 24.22±3.22; 5137
Change in bicarbonate (mmol/L) b 0.02±2.97; 12,772 −0.16±3.12; 4864
BUN (mg/dl) a 56.21±18.53; 14,941 56.17±19.27; 5631
Change in BUN (mg/dl) b −0.21±15.50; 14,400 −0.13±16.55; 5416
URR a 74.92±6.52; 14,273 75.05±6.61; 5348
Change in URR b 0.09±5.89; 13,548 0.07±6.08; 5054
Sodium (mmol/L) a 137.50±3.37; 13,139 137.08±3.52; 5046
Change in sodium (mmol/L) b −0.10±2.83; 29,324 −0.25±3.10; 4772
Potassium (mmol/L) a 4.80±0.68; 16,051 4.78±0.70; 6217
Change in potassium (mmol/L) b 0.01±0.60; 15,499 −0.01±0.63; 6003
Phosphate (mg/dl) a 5.55±1.74; 15,489 5.37±1.71; 5913
Change in phosphate (mg/dl) b 0.01±1.48; 14,918 −0.03±1.46; 5692
Chloride (meq/L) a 98.66±4.14; 12,602 98.33±4.13; 4702
Change in chloride (meq/L) b −0.19±3.35; 11,708 −0.24±3.50; 4450
Calcium (mg/dl) a 8.89±0.69; 15,420 8.78±0.73; 5878
Change in calcium (mg/dl) b 0.02±0.58; 14,882 −0.07±0.60; 5659
Corrected calcium (mg/dl) a 9.06±0.66; 12,865 9.04±0.71; 4903
Change in corrected calcium (mg/dl) b 0.01±0.54; 12,148 −0.03±0.59; 4608
iPTH (pg/ml) a 489.46±454.13; 10,090 497.22±490.12; 3801
Change in iPTH (pg/ml) b −21.39±280.41; 7245 −21.84±296.17; 2734
Ferritin (ng/ml) a 1029.94±576.07; 8229 1197.32±900.22; 3138
Change in ferritin (ng/ml) b 52.90±505.99; 4400 142.00±739.89; 1589
TSAT (%) a 33.07±14.10; 13,051 31.29±14.42; 5008
Change in TSAT (%) b 0.17±15.33; 12,310 −1.59±16.54; 4689
Hgb (g/dl) a 10.76±1.24; 29,324 10.61±1.26; 11,166
Change in Hgb (g/dl) b 0.05±1.07; 29,324 0.01±1.13; 11,166
Platelet count (×10⁹/L) a 195.49±72.47; 11,378 192.35±77.10; 4293
Change in platelet count (×10⁹/L) b −1.93±49.23; 10,595 −7.82±55.06; 3963
WBC count (×10⁹/L) a 6.93±2.36; 13,043 6.55±2.39; 5027
Change in WBC count (×10⁹/L) b 0.03±1.76; 12,344 −0.36±1.93; 4733
% of neutrophils a 66.11±9.50; 17,215 66.59±9.53; 6941
Change in % of neutrophils b 0.06±6.72; 14,931 0.47±7.37; 5997
% of lymphocytes a 20.22±7.98; 17,215 19.76±7.96; 6941
Change in % of lymphocytes b −0.04±4.99; 14,931 −0.53±5.57; 5997
% of monocytes a 6.38±1.90; 17,215 6.69±2.14; 6941
Change in % of monocytes b 0.02±1.48; 14,931 0.37±1.82; 5997
% of eosinophils a 4.29±2.88; 17,212 3.95±2.84; 6939
Change in % of eosinophils b −0.10±2.03; 14,927 −0.40±2.28; 5995
% of basophils a 0.75±0.47; 17,206 0.73±0.45; 6934
Change in % of basophils b 0.05±0.54; 14,917 0.03±0.52; 5988
All variables were included in the ML prediction model to classify the risk of an individual HD patient having a SARS-CoV-2 infection being identified in the following ≥3 d. HD, hemodialysis; Hgb, hemoglobin; WBC, white blood cell; TSAT, transferrin saturation; URR, urea reduction ratio; iPTH, intact parathyroid hormone; ML, machine learning; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
aMean values of laboratory variables 1–14 d before the prediction date (i.e., 3 d before suspicion of SARS-CoV-2 infection in standard clinical practice).
bMean values of the difference in laboratory variables 31–60 d to 1–14 d before the prediction date.

Patients on HD who contracted COVID-19 had only subtle, clinically unremarkable distinctions in treatment and laboratory characteristics before being suspected to have a SARS-CoV-2 infection, compared with patients who were unaffected. Mean pre-/post-HD body temperatures (Table 3) and inflammatory markers (white blood cell count and differentials) (Table 4) before a SARS-CoV-2 infection being identified did not show a clinically relevant difference between groups. Patients on HD who had a SARS-CoV-2 infection identified in the following 3 days did appear to have somewhat higher ferritin levels compared with patients who were unaffected.

Prediction Model Feature Importance

Calculation of variable feature importance with SHAP values found the top three predictors of patients on HD having a SARS-CoV-2 infection were the change in interdialytic weight gain from the previous month, mean pre-HD body temperature in the prior week, and the change in post-HD pulse from the previous month (Figure 2A).

Figure 2.
SHAP value plots for the machine learning (ML) model showing the extent each predictor contributes (positively or negatively) to each individual prediction. (A) Bar plot of the mean absolute SHAP values for the top 10 predictors in descending order. (B) SHAP value plot for the degree of the positive or negative effect of each individual measurement on the prediction (x-axis), with warmer colors representing higher observed values for that measurement, cooler colors indicating lower values for that measurement, and gray representing a missing value for that measurement. HD, hemodialysis.

The SHAP value plot in Figure 2B further shows the degree of positive or negative effect of each individual measurement for each individual prediction. Each dot corresponds to an individual patient, where the dot’s position on the x-axis represents that feature’s effect on the model prediction; in addition, the color indicates how high or low that feature’s value was. Features with missing values are indicated in gray.

For the top predictor of the change in interdialytic weight gain in the week before compared with the month before a SARS-CoV-2 infection, smaller (negative) values (cooler colors) were associated with a positive SHAP value, whereas larger values (warmer colors) were associated with a negative SHAP value. These results showed for each individual prediction, the model generally considered decreases in interdialytic weight gain from the previous month to be associated with a greater probability of an undetected SARS-CoV-2 infection, and an increase in interdialytic weight gain to be associated with a lower likelihood of an undetected SARS-CoV-2 infection. In other words, patients who do not gain as much weight as usual in between dialysis treatments are deemed more likely to have an undetected SARS-CoV-2 infection by the model.

Along with highlighting directional effects as previously stated, Figure 2B also highlights different distributions of effects that might not be apparent when viewing the mean absolute values as in Figure 2A. For example, the eighth most important variable, change in monocytes from the previous month, produces the largest (most positive) SHAP values out of all of the variables shown. This long, rightward tail along the x-axis indicates that, despite having a lower mean absolute value in comparison to other variables, for some individuals this is very important. Specifically, the model deemed patients with increased monocyte levels from the previous month more likely to have a SARS-CoV-2 infection, whereas the SHAP values for those with similar or lower levels of monocytes do not substantially decrease the prediction.

Prediction Model Performance

The ML model had adequate performance in prediction of the 3-day risk for having an undetected SARS-CoV-2 infection. The ML model had an AUROC of 0.77, 0.67, and 0.68 in the training, validation, and testing datasets, respectively (Figure 3). The ML model had an AUPRC of 0.24 in the testing dataset (Figure 4).

Figure 3.
Area under the receiver operating characteristic curve (AUROC) plot for the machine learning (ML) model, showing the rate of true and false positives classified by the prediction model across probability thresholds. AUC, area under the curve.

Figure 4.
Area under the precision-recall curve (AUPRC) plot for the machine learning (ML) model, showing the ratio of precision for corresponding recall values across probability thresholds.

Setting the threshold for classifying observations as positive or negative at 0.80 to minimize false positives, the precision for the ML model in the testing dataset was 0.52, showing 52% of patients predicted to have a SARS-CoV-2 infection actually had symptoms in the subsequent ≥3 days and were confirmed to have COVID-19. Given the high threshold, recall was 0.07, showing the model correctly identified 7% of patients on HD who were positive for a SARS-CoV-2 infection. The lift was 5.3, suggesting model use is 5.3 times more effective in predicting a patient on HD who contracts COVID-19, compared with random sampling (Figure 5).
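As a rough plausibility check, the reported figures can be related to one another with simple arithmetic. The sketch below uses the rounded values from the text (precision 0.52, recall 0.07, about 10% prevalence in the testing dataset) and a hypothetical population of 10,000 patients; because the exact prevalence is not given, the result is approximate.

```python
precision = 0.52          # reported at the 0.80 threshold
recall = 0.07             # reported at the 0.80 threshold
approx_prevalence = 0.10  # rounded prevalence stated for the testing dataset

# Lift = precision / proportion of positives in the dataset.
approx_lift = precision / approx_prevalence

# Hypothetical population used only to make the proportions tangible.
n_patients = 10_000
n_positive = n_patients * approx_prevalence  # patients with an infection
true_positives = recall * n_positive         # positives the model would catch
flagged = true_positives / precision         # total patients the model would flag
```

With the rounded prevalence the lift comes out at about 5.2, close to the reported 5.3; the small gap is consistent with a true testing prevalence slightly below 10%. In the hypothetical population, roughly 135 patients would be flagged to find about 70 of the 1000 who are infected.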

Figure 5.
Lift curve for the machine learning (ML) model showing the lift value (y-axis) by the proportion of the population predicted to have an undetected severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection (x-axis).

Discussion

We successfully developed an ML prediction model using retrospective data, which appears to have suitable performance in identifying patients on HD at risk of having an undetected SARS-CoV-2 infection that is identified in the following ≥3 days. The top predictors of a patient having a SARS-CoV-2 infection were the change in interdialytic weight gain from the previous month, mean pre-HD body temperature in the prior week, and the change in post-HD pulse from the previous month.

Although some top predictors are not surprising, the observed distinctions were subtle. Without insights from the model considering an array of variables, it would not be clear where one should classify a higher or lower risk for an individual patient that is meaningful. For instance, assessing for a decrease in weekly interdialytic weight gain of about 0.3 kg alone may not be considered actionable, and the same is true for assessing for an increase of about 0.2°F (0.1°C) in weekly pre-HD body temperature, or an increase in pulse of about 1 beat per minute. Notably, the average pre-HD body temperature was 97.6°F (36.4°C) (primarily oral measurements) in our analysis and has been previously reported as 98.2°F (36.7°C) (37). Given 98.6°F (37°C) is the expected average in healthy populations, the lower body temperature of patients on HD is of importance with the rather low incidence of fever presenting in patients on dialysis with COVID-19 (11%–66% with fever [16,20,22]). Overall, the small changes observed for each individual variable suggest any one parameter alone has minimal value for detecting a patient’s risk of having COVID-19, especially because every affected patient will not have every symptom of COVID-19 consistently. However, the combinations of minor changes appear to be meaningful in the individualized ML model we developed, with each small change being one piece of the puzzle for each patient’s unique prediction.

Individual predictions can be further used to identify the risk level for dialysis clinics through the proportion of patients classified with an undetected SARS-CoV-2 infection. We anticipate using a combination of individual predictions along with reporting of the percent of patients at risk in each clinic may yield the greatest early insights on: (1) which otherwise asymptomatic patients on HD might be most appropriate for enhanced screening, COVID-19 testing, and triage to an isolation area, and (2) where providers can focus additional resource allocations to combat COVID-19. Furthermore, flagging patients as potentially infectious may cut through some of the “COVID fatigue” occurring during this prolonged pandemic. By adding this novelty and warning, the hope is that additional care may be given in identifying potential symptoms during screening. Prospective evaluation of ML model–directed mitigation is being piloted at the national network of dialysis clinics.

The authors propose a conceptual workflow for the application of the ML model predictions to assist with directing care to individual patients and with directing resource allocations to clinics (Figure 6). The model was trained using a target date of 3 days before patients presented with COVID-19 symptoms to alert clinicians at least one dialysis treatment earlier. Given this timeline, we believe it is prudent to run the prediction model on a per-treatment basis. Reports on individual patient predictions would optimally be delivered to clinic staff on interdialytic days, to provide the care team time to prepare for a more comprehensive screening by an advanced clinician at the next encounter and potential isolation of subsequent HD treatments. Reports on the percent of patients at risk in each clinic can be delivered on a weekly basis to allow leadership and regional managers to meet with clinical managers and prepare for allocation of resources including additional staff, protective equipment, and isolation areas. We propose categorizing clinic-level reports to detail facilities with more than 5% of patients at risk for undetected SARS-CoV-2 infection.
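The clinic-level report described above can be illustrated with a minimal sketch. It assumes each prediction has already been reduced to a yes/no "at risk" classification per patient; the function name, the tuple layout, and the example clinics are hypothetical and are not part of the provider's actual reporting system. Only the 5% cutoff comes from the text.

```python
from collections import defaultdict

# Cutoff proposed in the text: clinics with more than 5% of patients at risk.
CLINIC_RISK_THRESHOLD = 0.05


def clinic_risk_report(predictions, threshold=CLINIC_RISK_THRESHOLD):
    """Summarize the share of patients classified at risk in each clinic.

    predictions: iterable of (clinic_id, patient_id, at_risk) tuples, where
    at_risk is a boolean model classification (hypothetical input layout).
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for clinic_id, _patient_id, at_risk in predictions:
        totals[clinic_id] += 1
        if at_risk:
            flagged[clinic_id] += 1

    report = {}
    for clinic_id, n_patients in totals.items():
        share = flagged[clinic_id] / n_patients
        report[clinic_id] = {
            "patients": n_patients,
            "at_risk": flagged[clinic_id],
            "share": share,
            # Strictly "more than" the threshold, as worded in the text.
            "flag": share > threshold,
        }
    return report


# Made-up example: clinic A has 3 of 40 patients at risk, clinic B has 1 of 40.
example = [("A", i, i < 3) for i in range(40)] + [("B", i, i < 1) for i in range(40)]
report = clinic_risk_report(example)
for clinic_id, row in sorted(report.items()):
    print(clinic_id, row)
```

In this made-up example, clinic A (7.5% at risk) would appear on the weekly report and clinic B (2.5%) would not.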

Figure 6. Conceptual workflow for application of machine learning (ML) model predictions within current mitigation and testing practices at the provider. HD, hemodialysis; COVID-19, coronavirus disease 2019; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.

Mitigation efforts at the national dialysis network include universal RT-PCR testing of patients with symptoms of a flu-like illness, along with distinct isolation areas (rooms, shifts, clinics) for patients who are suspected to be infected and under investigation, and patients who are COVID-19 positive. We propose patients predicted to be at risk receive a comprehensive screening for signs and symptoms of a flu-like illness by an advanced practitioner (e.g., physician, physician assistant, nurse practitioner, experienced dialysis nurse) because there is a possibility of false positives. However, the comprehensive assessments should consider any minor sign or symptom of a flu-like illness that may otherwise be considered normal on the basis of the patient’s uremia and medical history (38,39) to be a reason for suspicion of COVID-19. In addition to the prediction itself, the top reasons increasing the risk score can be provided by calculating the SHAP values (Figure 2B). This may help to provide additional insight into what a more comprehensive screening assessment should focus on for each individual patient. For example, if a patient is classified by the model at risk, with the top reason related to a decrease in interdialytic weight gain, the next screening before entry to the clinic could include assessment of any change in appetite or fluid intake. Patients who are high risk and are suspected of infection because of any mild sign of a flu-like illness could be triaged to unique isolation areas for patients under investigation and receive RT-PCR testing. HD would be continued in a distinct isolation area until COVID-19 is diagnosed or ruled out (the latter determined by two negative RT-PCR tests >24 hours apart), whereby patients who are laboratory positive would be triaged to unique isolation areas for COVID-19, and patients who are negative would return to be treated with the general HD population (Figure 6), which is consistent with the providers’ practices without the model.
Patients diagnosed with COVID-19 at the provider are treated in distinct isolation areas until they have two negative RT-PCR tests >24 hours apart, after which patients who have recovered are transferred back to receive HD with the unaffected HD population.
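The idea of reporting the top reasons behind an individual risk score can be sketched as follows. This is a minimal illustration that assumes the per-feature contributions (for example, SHAP values) for one patient have already been computed elsewhere; the function name, the feature names, and the numeric values are hypothetical and do not come from the study data.

```python
def top_risk_reasons(contributions, k=3):
    """Return up to k features that push a patient's score toward "at risk".

    contributions: dict mapping a feature name to its signed contribution
    for one patient (e.g., a SHAP value). Positive values raise the risk
    score and negative values lower it.
    """
    increasing = [(name, value) for name, value in contributions.items() if value > 0]
    # Largest positive contribution first.
    increasing.sort(key=lambda item: item[1], reverse=True)
    return increasing[:k]


# Made-up contributions for a single hypothetical patient.
patient_contributions = {
    "interdialytic_weight_gain_change": 0.42,
    "pre_hd_temperature_change": 0.31,
    "pulse_change": 0.08,
    "other_variable": -0.15,
}

reasons = top_risk_reasons(patient_contributions, k=2)
for name, value in reasons:
    print(f"{name}: +{value:.2f}")
```

A screening clinician who sees interdialytic weight gain listed first could, as suggested above, ask about changes in appetite or fluid intake.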

The developed model has the potential to provide a data-driven way for providers to identify individuals with undetected SARS-CoV-2 infections. The conceptual workflow provides a hypothetical strategy that can be adapted within the practice patterns of other providers, which may not include universal testing and require periods of isolation. Different strategies could utilize different thresholds for flagging patients, depending on the intervention and implications of false positives and false negatives. Considering the possibility of prolonged viral shedding observed in the general and dialysis populations (40–42), the optimal period for isolation of patients on dialysis affected by COVID-19 appears to be longer than 14 days (42). In countries or areas with testing limitations, especially those with a high positive-to-negative testing ratio (e.g., >25% positive test rate), it may be reasonable to consider having separate isolation areas for patients predicted at risk, in addition to isolation areas for patients with symptoms of a flu-like illness. In this scenario, the 14-day timeframe for isolation of patients predicted to be at risk is anticipated to be appropriate if no signs or symptoms of a flu-like illness arise.
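The trade-off between false positives and false negatives at different flagging thresholds can be made concrete with a small sketch. The scores, labels, and threshold values below are invented for illustration only and do not reflect the study's model output or its chosen cutoff; the function name is likewise hypothetical.

```python
def confusion_at_threshold(scores, labels, threshold):
    """Count outcomes when patients with score >= threshold are flagged.

    scores: predicted risk scores between 0 and 1.
    labels: 1 if the patient was truly infected, 0 otherwise.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label:
            tp += 1
        elif flagged and not label:
            fp += 1
        elif not flagged and label:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "sensitivity": sensitivity, "specificity": specificity}


# Made-up scores and outcomes for eight hypothetical patients.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

low = confusion_at_threshold(scores, labels, threshold=0.3)   # flags more patients
high = confusion_at_threshold(scores, labels, threshold=0.7)  # flags fewer patients
print("threshold 0.3:", low)
print("threshold 0.7:", high)
```

With these invented numbers, the lower threshold misses no infected patient but flags two uninfected ones, whereas the higher threshold flags no uninfected patient but misses one infection. A low-burden intervention such as enhanced screening might tolerate the former, whereas a costly one such as isolation might favor the latter.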

As more data are captured in the COVID-19 outbreak, further prediction models that can classify the risk of morbid/mortal outcomes in patients on dialysis affected by COVID-19 need to be developed. The potential applications of AI for COVID-19 have been previously detailed (43); the first priority was suggested as “early detection and diagnosis of the infection.” The robustness of data and an a priori selection of variables to be included in our ML model bring value through assessment of feature importance; this allows for interpretation of meaningfulness of predictors, although it does not determine causality. The selection of input variables was focused on biologic changes reflected in clinical presentations and biomarkers, allowing the model to be generalizable to all individual patients on HD in the overall population, and not specific to the characteristics of outbreaks or the local population where patients reside. Although this approach yields more generalizability for the model to be used in the HD populations worldwide, external factors such as local incidence rates or social determinants of health are anticipated to affect the likelihood of a patient contracting COVID-19 and can be considered as appropriate. Ultimately, this strategy has the potential to allow for COVID-19 to be detected sooner than patients on HD show symptoms, and for a localized HD population, earlier than it would be reported by national authorities.

A systematic review identified several models developed using data from China for early detection of COVID-19 in suspected individuals in the general population (27). One is an externally validated ML model that predicts COVID-19 in suspected asymptomatic patients (AUROC validation 0.872). Another effort used a prediction model (AUROC validation 0.966) to develop logic for an eight-variable COVID-19 risk chart. A further model with an AUROC of 0.938 was created to detect COVID-19 pneumonia in patients admitted to a fever clinic (44). Other models used genomic/computed tomography data to diagnose COVID-19 (27). An effort using data from China not included in prior reviews developed various ML models to predict (AUROC testing 0.87–0.95) and identify features indicative of COVID-19 status across age categories among people in the general population presenting to a clinic/hospital (45). This model found the most important features for prediction of COVID-19 at presentation were lung infection, cough, and pneumonia. Consistent variables used across models for predictions included age, body temperature, and flu-like illness symptoms (27,45). Another distinct effort reported in the literature included the development of ML and traditional models using only full blood count data to predict the likelihood of COVID-19 among people in the general population presenting to the emergency department of (AUROC training 0.80–0.86), or admitted to (AUROC training 0.94–0.95), a large hospital in Brazil. Although these models were all reported to have suitable performance, all were subject to bias due to nongeneralizable sampling of controls without COVID-19 and possible overfitting. We cannot rule out that our ML model may have similar bias, although it included a large sample and the testing dataset had relatively generalizable sampling for the dialysis population with respect to positives and negatives (30,31).
Also, because we randomly selected a subset of patients for the negative arm who never had symptoms of COVID-19 and did not receive PCR testing, it is possible we might have unintentionally included a small number of patients who were asymptomatic. However, this would have required patients to have had an asymptomatic SARS-CoV-2 infection that aligned with the randomly sampled time window. Given the balanced class design of the training and validation data splits, it is unlikely this introduced a remarkable bias in the model during training and validation. Yet, there is a possibility this could have introduced a minimal bias in evaluation of performance in the testing data because there were fewer patients who were positive to identify to offset any impact of a patient incorrectly labeled negative when positive. Additionally, the reported model performance may be on the conservative side when considering the constraints of the “ground truth” labels, because they relate to how patients who are positive are identified by conventional screening. The extent of this depends on how well the model identifies individuals who were not included in the training sample but might show similar patterns, and also depends on the intervention design. In any case, our model is unique in its ability to identify the risk of SARS-CoV-2 infection in patients without any suspicion of being affected with the disease.

The developed model holds promise to help providers through the COVID-19 pandemic and subsequent wave(s) of outbreak (44,45). We recommend model use as augmentation and not replacement of symptom screening, as AI modeling is never 100% accurate and model risk classifications need to be interpreted within the extent of the model’s performance. The developed AI model showed a clinically meaningful performance in prediction of individual patients on HD at risk of having an undetected SARS-CoV-2 infection ≥3 days before there would be any suspicion of the disease. Prospective testing is needed and underway at the national network of dialysis clinics. We proposed a conceptual workflow for application of ML model–directed mitigation and testing. These efforts should provide key insights for consideration by health care providers.

Disclosures

C. Monaghan, F. Maddux, H. Han, J. Larkin, L. Usvyat, S. Chaudhuri, and Y. Jiao are employees of Fresenius Medical Care in the Global Medical Office. E. Weinhandl, I. Dahne-Steuber, J. Hymes, K. Belmonte, K. Bermudez, and R. Kossmann are employees of Fresenius Medical Care North America. F. Maddux has directorships in the Fresenius Medical Care Management Board, Goldfinch Bio, and Vifor Fresenius Medical Care Renal Pharma. F. Maddux, I. Dahne-Steuber, J. Hymes, K. Belmonte, L. Usvyat, P. Kotanko, and R. Kossmann have share options/ownership in Fresenius Medical Care. L. Neri is an employee of Fresenius Medical Care Deutschland GmbH in the Europe, the Middle East, and Africa Medical Office. P. Kotanko is an employee of Renal Research Institute, a wholly owned subsidiary of Fresenius Medical Care; reports receiving honorarium from Up-To-Date; and is on the Editorial Board of Blood Purification and Kidney and Blood Pressure Research. All remaining authors have nothing to disclose.

Funding

Project and manuscript composition were supported internally by Fresenius Medical Care.

Acknowledgments

The authors would like to acknowledge Mr. Vladimir M. Rigodon for assistance with the composition of the regulatory protocol for this analysis. A previous version of this manuscript appeared on the preprint server medRxiv, https://www.medrxiv.org/content/10.1101/2020.06.15.20131680v1.

Author Contributions

K. Belmonte, I.A. Dahne-Steuber, R.J. Kossmann, P. Kotanko, F.W. Maddux, C.K. Monaghan, and L.A. Usvyat conceptualized the study; S. Chaudhuri, H. Han, Y. Jiao, J.W. Larkin, C.K. Monaghan, and L.A. Usvyat were responsible for the data curation; C.K. Monaghan and L.A. Usvyat were responsible for the formal analysis; K. Belmonte, I.A. Dahne-Steuber, J.L. Hymes, R.J. Kossmann, P. Kotanko, F.W. Maddux, C.K. Monaghan, and L.A. Usvyat were responsible for the methodology; C.K. Monaghan was responsible for the validation; J.W. Larkin, C.K. Monaghan, and L.A. Usvyat were responsible for the visualization; J.W. Larkin, C.K. Monaghan, and L.A. Usvyat wrote the original draft; S. Chaudhuri, R.J. Kossmann, J.W. Larkin, and L.A. Usvyat were responsible for the resources; S. Chaudhuri, J.L. Hymes, P. Kotanko, J.P. Kooman, R.J. Kossmann, J.W. Larkin, F.W. Maddux, and L.A. Usvyat provided supervision; and all authors reviewed and edited the manuscript. The interpretation, drafting, and revision of this manuscript were conducted by all authors. The decision to submit this manuscript for publication was jointly made by all authors, and the manuscript was confirmed to be accurate and approved by all authors.

C.K.M. and J.W.L. are cofirst authors.

References

1. Kliger AS, Silberzweig J: Mitigating risk of COVID-19 in dialysis facilities. Clin J Am Soc Nephrol 15: 707–709, 2020 https://doi.org/10.2215/CJN.03340320
2. Ikizler TA: COVID-19 and dialysis units: What do we know now and what should we do? Am J Kidney Dis 76: 1–3, 2020 https://doi.org/10.1053/j.ajkd.2020.03.008
3. Basile C, Combe C, Pizzarelli F, Covic A, Davenport A, Kanbay M, Kirmizis D, Schneditz D, van der Sande F, Mitra S: Recommendations for the prevention, mitigation and containment of the emerging SARS-CoV-2 (COVID-19) pandemic in haemodialysis centres. Nephrol Dial Transplant 35: 737–741, 2020 https://doi.org/10.1093/ndt/gfaa069
4. Mokrzycki MH, Coco M: Management of hemodialysis patients with suspected or confirmed COVID-19 infection: Perspective of two nephrologists in the United States. Kidney360 1: 273–278, 2020 https://doi.org/10.34067/KID.0001452020
5. Gallieni M, Sabiu G, Scorza D: Delivering safe and effective hemodialysis in patients with suspected or confirmed COVID-19 infection: a single-center perspective from Italy. Kidney360 1: 403–409, 2020 https://doi.org/10.34067/KID.0001782020
6. Roncon L, Zuin M, Rigatelli G, Zuliani G: Diabetic patients with COVID-19 infection are at higher risk of ICU admission and poor short-term outcome. J Clin Virol 127: 104354, 2020 https://doi.org/10.1016/j.jcv.2020.104354
7. Guo T, Fan Y, Chen M, Wu X, Zhang L, He T, Wang H, Wan J, Wang X, Lu Z: Cardiovascular implications of fatal outcomes of patients with Coronavirus Disease 2019 (COVID-19). JAMA Cardiol 5: 811–818, 2020 https://doi.org/10.1001/jamacardio.2020.1017
8. Li X, Xu S, Yu M, Wang K, Tao Y, Zhou Y, Shi J, Zhou M, Wu B, Yang Z, Zhang C, Yue J, Zhang Z, Renz H, Liu X, Xie J, Xie M, Zhao J: Risk factors for severity and mortality in adult COVID-19 inpatients in Wuhan. J Allergy Clin Immunol 146: 110–118, 2020 https://doi.org/10.1016/j.jaci.2020.04.006
9. Cheng Y, Luo R, Wang K, Zhang M, Wang Z, Dong L, Li J, Yao Y, Ge S, Xu G: Kidney disease is associated with in-hospital death of patients with COVID-19. Kidney Int 97: 829–838, 2020 https://doi.org/10.1016/j.kint.2020.03.005
10. Du RH, Liang LR, Yang CQ, Wang W, Cao TZ, Li M, Guo GY, Du J, Zheng CL, Zhu Q, Hu M, Li XY, Peng P, Shi HZ: Predictors of mortality for patients with COVID-19 pneumonia caused by SARS-CoV-2: A prospective cohort study. Eur Respir J 55: 2000524, 2020 https://doi.org/10.1183/13993003.00524-2020
11. Adams ML, Katz DL, Grandpre J: Population-based estimates of chronic conditions affecting risk for complications from coronavirus disease, United States. Emerg Infect Dis 26: 1831–1833, 2020 https://doi.org/10.3201/eid2608.200679
12. United States Renal Data System: 2019 USRDS Annual Data Report: Epidemiology of Kidney Disease in the United States, Bethesda, MD, National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases, 2019. Available at: https://www.usrds.org/annual-data-report/previous-adrs/. Accessed June 10, 2020
13. Neumann ME: Latest data show 305 dialysis patient deaths due to COVID-19 in the US. Nephrology News & Issues. Available at: https://www.healio.com/nephrology/infection-control/news/online/%7B3a263aa9-ad59-4c3f-aab7-07b8395508e5%7D/latest-data-show-305-dialysis-patient-deaths-due-to-covid-19-in-the-us. Accessed April 22, 2020
14. CDC COVID-19 Response Team: Geographic differences in COVID-19 cases, deaths, and incidence - United States, February 12-April 7, 2020. MMWR Morb Mortal Wkly Rep 69: 465–471, 2020 https://doi.org/10.15585/mmwr.mm6915e4
15. Johns Hopkins Coronavirus Resource Center Johns Hopkins University School of Medicine. 2020. Available at: https://coronavirus.jhu.edu/data/mortality. Accessed October 15, 2020
16. ERA-EDTA: ERACODA - The ERA-EDTA COVID-19 database for patients on kidney replacement therapy, 2020. Available at: https://www.era-edta.org/en/wp-content/uploads/2020/04/ERACODA-Study-Report-2020-04-29.pdf. Accessed October 15, 2020
17. Wang H: Maintenance hemodialysis and coronavirus disease 2019 (COVID-19): Saving lives with caution, care, and courage. Kidney Med 2: 365–366, 2020
18. Jager KJ, Kramer A, Chesnaye NC, Couchoud C, Sánchez-Álvarez JE, Garneata L, Collart F, Hemmelder MH, Ambühl P, Kerschbaum J, Legeai C, Del Pino Y Pino MD, Mircescu G, Mazzoleni L, Hoekstra T, Winzeler R, Mayer G, Stel VS, Wanner C, Zoccali C, Massy ZA: Results from the ERA-EDTA Registry indicate a high mortality due to COVID-19 in dialysis patients and kidney transplant recipients across Europe. Kidney Int 98: 1540–1548, 2020 https://doi.org/10.1016/j.kint.2020.09.006
19. Kikuchi K, Nangaku M, Ryuzaki M, Yamakawa T, Hanafusa N, Sakai K, Kanno Y, Ando R, Shinoda T, Nakamoto H, Akizawa T; COVID-19 Task Force Committee of the Japanese Association of Dialysis Physicians; Japanese Society for Dialysis Therapy; Japanese Society of Nephrology: COVID-19 of dialysis patients in Japan: Current status and guidance on preventive measures. Ther Apher Dial 24: 361–365, 2020
20. Siordia JA Jr: Epidemiology and clinical features of COVID-19: A review of current literature. J Clin Virol 127: 104357, 2020 https://doi.org/10.1016/j.jcv.2020.104357
21. Xiong F, Tang H, Liu L, Tu C, Tian JB, Lei CT, Liu J, Dong JW, Chen WL, Wang XH, Luo D, Shi M, Miao XP, Zhang C: Clinical characteristics of and medical interventions for COVID-19 in hemodialysis patients in Wuhan, China. J Am Soc Nephrol 31: 1387–1397, 2020 https://doi.org/10.1681/ASN.2020030354
22. Niiler E: An AI epidemiologist sent the first warnings of the Wuhan virus, 2020. Wired. Available at: https://www.wired.com/story/ai-epidemiologist-wuhan-public-health-warnings/. Accessed April 22, 2020
23. Bogoch II, Watts A, Thomas-Bachli A, Huber C, Kraemer MUG, Khan K: Pneumonia of unknown aetiology in Wuhan, China: Potential for international spread via commercial air travel. J Travel Med 27: taaa008, 2020 https://doi.org/10.1093/jtm/taaa008
24. McCall B: COVID-19 and artificial intelligence: Protecting health-care workers and curbing the spread. Lancet Digit Health 2: e166–e167, 2020 https://doi.org/10.1016/S2589-7500(20)30054-6
25. Alimadadi A, Aryal S, Manandhar I, Munroe PB, Joe B, Cheng X: Artificial intelligence and machine learning to fight COVID-19. Physiol Genomics 52: 200–202, 2020 https://doi.org/10.1152/physiolgenomics.00029.2020
26. Wynants L, Van Calster B, Collins GS, Riley RD, Heinze G, Schuit E, Bonten MMJ, Damen JAA, Debray TPA, De Vos M, Dhiman P, Haller MC, Harhay MO, Henckaerts L, Kreuzberger N, Lohman A, Luijken K, Ma J, Andaur CL, Reitsma JB, Sergeant JC, Shi C, Skoetz N, Smits LJM, Snell KIE, Sperrin M, Spijker R, Steyerberg EW, Takada T, van Kuijk SMJ, van Royen FS, Wallisch C, Hooft L, Moons KGM, van Smeden M: Prediction models for diagnosis and prognosis of covid-19 infection: Systematic review and critical appraisal [published correction appears in BMJ 369: m2204, 2020 10.1136/bmj.m2204]. BMJ 369: m1328, 2020 https://doi.org/10.1136/bmj.m1328
27. Fresenius Medical Care North America: COVID-19 resource and education center. Available at: https://fmcna.com/company/covid-19-resource-center/. Accessed August 31, 2020
28. Chen T, Guestrin C: XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, California, Association for Computing Machinery, 2016 pp 785–794 https://doi.org/10.1145/2939672.2939785
29. Centers for Medicare & Medicaid Services: Preliminary medicare COVID-19 data snapshot, 2020. Available at: https://www.cms.gov/research-statistics-data-systems/preliminary-medicare-covid-19-data-snapshot. Accessed October 14, 2020
30. Anand S, Montez-Rath M, Han J, Bozeman J, Kerschmann R, Beyer P, Parsonnet J, Chertow GM: Prevalence of SARS-CoV-2 antibodies in a large nationwide sample of patients on dialysis in the USA: A cross-sectional study. Lancet 396: 1335–1344, 2020 https://doi.org/10.1016/S0140-6736(20)32009-2
31. Shapley LS: A value for n-person games. In: Contributions to the Theory of Games II. Annals of Mathematics Studies, edited by Kuhn HW, Tucker AW, Vol. 28, Princeton, Princeton University Press, 1953, pp 307–317
32. Štrumbelj E, Kononenko I: Explaining prediction models and individual predictions with feature contributions. Knowl Inf Syst 41: 647–665, 2013 https://doi.org/10.1007/s10115-013-0679-x
33. Lundberg SM, Lee SI: A unified approach to interpreting model predictions. Proceedings from the Advances in Neural Information Processing Systems, Vol. 30, 2017. Available at: https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html. Accessed June 10, 2020
34. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, Katz R, Himmelfarb J, Bansal N, Lee S-I: From local explanations to global understanding with explainable AI for trees. Nat Mach Intell 2: 56–67, 2020 https://doi.org/10.1038/s42256-019-0138-9
35. Saito T, Rehmsmeier M: The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One 10: e0118432, 2015 https://doi.org/10.1371/journal.pone.0118432
36. Usvyat LA, Kotanko P, van der Sande FM, Kooman JP, Carter M, Leunissen KM, Levin NW: Circadian variations in body temperature during dialysis. Nephrol Dial Transplant 27: 1139–1144, 2012 https://doi.org/10.1093/ndt/gfr395
37. Gedney N: Long-term hemodialysis during the COVID-19 pandemic. Clin J Am Soc Nephrol 15: 1073–1074, 2020 https://doi.org/10.2215/CJN.09100620
38. Gagliardi I, Patella G, Michael A, Serra R, Provenzano M, Andreucci M: COVID-19 and the kidney: From epidemiology to clinical practice. J Clin Med 9: 2506, 2020 https://doi.org/10.3390/jcm9082506
39. Zhou F, Yu T, Du R, Fan G, Liu Y, Liu Z, Xiang J, Wang Y, Song B, Gu X, Guan L, Wei Y, Li H, Wu X, Xu J, Tu S, Zhang Y, Chen H, Cao B: Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: A retrospective cohort study. Lancet 395: 1054–1062, 2020 https://doi.org/10.1016/S0140-6736(20)30566-3
40. Fontana F, Giaroni F, Frisina M, Alfano G, Mori G, Lucchi L, Magistroni R, Cappelli G: SARS-CoV-2 infection in dialysis patients in northern Italy: A single-centre experience. Clin Kidney J 13: 334–339, 2020 https://doi.org/10.1093/ckj/sfaa148
41. Shaikh A, Zeldis E, Campbell KN, Chan L: Prolonged SARS-CoV-2 viral RNA shedding and IgG antibody response to SARS-CoV-2 in patients on hemodialysis. Clin J Am Soc Nephrol, 2020 https://doi.org/10.2215/CJN.11120720
42. Vaishya R, Javaid M, Khan IH, Haleem A: Artificial Intelligence (AI) applications for COVID-19 pandemic. Diabetes Metab Syndr 14: 337–339, 2020 https://doi.org/10.1016/j.dsx.2020.04.012
43. Feng C, Wang L, Chen X, Zhai Y, Zhu F, Chen H, Wang Y, Su X, Huang S, Tian L, Zhu W, Sun W, Zhang L, Han Q, Zhang J, Pan F, Chen L, Zhu Z, Xiao H, Liu Y, Liu G, Chen W, Li T: A novel artificial intelligence-assisted triage tool to aid in the diagnosis of suspected COVID-19 pneumonia cases in fever clinics. Annals of Translational Medicine, 2021. Available at: https://atm.amegroups.com/article/view/6078. Accessed January 29, 2021
44. Leung K, Wu JT, Liu D, Leung GM: First-wave COVID-19 transmissibility and severity in China outside Hubei after control measures, and second-wave scenario planning: A modelling impact assessment. Lancet 395: 1382–1393, 2020 https://doi.org/10.1016/S0140-6736(20)30746-7
45. Xu S, Li Y: Beware of the second wave of COVID-19. Lancet 395: 1321–1322, 2020 https://doi.org/10.1016/S0140-6736(20)30845-X
Keywords:

dialysis; artificial intelligence; coronavirus; COVID-19; end stage kidney disease; machine learning; prediction; SARS-CoV-2

Copyright © 2021 by the American Society of Nephrology