Collinearity can destabilize the coefficient estimates of a regression model. We tested the variables in the MLR model for collinearity by calculating the variance inflation factor (VIF). Collinearity was considered present if the largest VIF exceeded 10, and independent variables showing collinearity were excluded from the MLR model (11). We used MATLAB 2012a (Mathworks Inc, Natick, MA) for the machine learning analyses, and all statistical analyses were performed using SPSS 20.0 (IBM Corp, Armonk, NY). All reported P values were two-sided, and P < 0.05 was considered statistically significant.
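As an illustration of the VIF screening described above, the following minimal sketch computes each VIF as the corresponding diagonal element of the inverse correlation matrix of the predictors, a standard identity. It is not the SPSS procedure used in the study; the function names and the toy values for DBP, MAP, and lactate are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}


# Hypothetical toy data: MAP is almost a linear function of DBP.
predictors = {
    "dbp": [80, 75, 70, 62, 55, 50, 44, 40],
    "map": [93, 88, 83, 74, 67, 61, 55, 50],
    "lactate": [1.0, 2.5, 1.2, 3.0, 1.8, 3.6, 2.2, 4.1],
}
vifs = variance_inflation_factors(predictors)
flagged = [name for name, v in vifs.items() if v > 10]  # threshold from the text
print(vifs, flagged)
```

With these toy values, the nearly collinear DBP and MAP columns both exceed the threshold of 10, so one of them would be a candidate for exclusion.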
Data set characteristics
The characteristics of the training and validation sets for the absolute and relative values are shown in Tables 1 and 2, respectively. The P values in Tables 1 and 2, obtained with the Mann–Whitney U test, show no significant difference between the training and validation sets for any variable.
Feature selection for support vector regression
Table 3 shows the feature ranking for the training set with absolute and relative values, based on Spearman's correlation coefficient between each variable and blood loss in percent. With relative values, the new index, MAP, DBP, and perfusion index ranked first to fourth, respectively. All of these variables were also selected by the direct multicategory machine learning method using relative values in our previous study. Comparing absolute and relative values, the correlation coefficients of the new index, perfusion index, temperature, and respiration rate were lower with absolute values than with relative values. In contrast, the correlation coefficients of SBP, shock index, pulse pressure, lactate concentration, and heart rate were lower with relative values than with absolute values. DBP and MAP showed almost the same correlation coefficients for absolute and relative values.
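The ranking step described above can be sketched as follows. This is a minimal illustration, assuming that features are ordered by the absolute value of Spearman's coefficient (the text does not state how the sign was handled). The function names and toy measurements are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def rank_features(features, target):
    """Sort features by |rho| with the target, strongest first."""
    scores = {name: spearman(vals, target) for name, vals in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: blood loss in percent and two candidate variables.
blood_loss = [0, 10, 20, 30, 40, 50]
features = {
    "temperature": [37.0, 36.8, 37.1, 36.9, 37.2, 36.7],
    "map": [100, 90, 80, 70, 60, 50],
}
ranking = rank_features(features, blood_loss)
print(ranking)
```

Here MAP falls monotonically with blood loss and therefore ranks first, while the toy temperature series is only weakly related to it.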
Multivariate linear regression (MLR)
The final results of the MLR models for blood loss in percent with the absolute and relative values are shown in Table 4. The adjusted R² of 0.873 for the MLR model with relative values was slightly better than the 0.844 obtained with absolute values. The MLR model with absolute values included lactate concentration, DBP, pulse pressure, and perfusion index. The MLR model with relative values included DBP, lactate concentration, perfusion index, heart rate, and pulse pressure. Heart rate was excluded from the final MLR model with absolute values, and in the MLR model with relative values it had the smallest standardized beta, indicating the least effect on the model. Therefore, these results suggest that DBP, lactate concentration, perfusion index, and pulse pressure are important variables for predicting blood loss in percent of rats in hypovolaemic shock.
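For reference, the two quantities used above are conventionally defined as follows. These are the textbook forms, not formulas quoted from the study. The symbols are notation introduced here: n is the number of observations, p the number of predictors, b_j the unstandardized coefficient of predictor j, and s_{x_j} and s_y the sample standard deviations of predictor j and of the outcome.

```latex
\[
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1},
\qquad
\beta_j^{\mathrm{std}} = b_j \,\frac{s_{x_j}}{s_y}
\]
```

The adjusted R² penalizes additional predictors, which matters when comparing the four-variable model with absolute values against the five-variable model with relative values.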
Performance of SVR and MLR models for the validation set
To evaluate the performance of the three SVR models and the MLR model, we obtained the accuracy, relative classifier information (RCI), and kappa index for the validation set, together with the number of variables used (Table 5). The SVR model with a linear kernel and relative values showed the best performance, with an accuracy of 88.5%, RCI of 0.754, and kappa index of 0.839 using eight input variables. The SVR models with polynomial and radial basis kernels showed poor performance with both absolute and relative values. The MLR model with relative values also performed well, with an accuracy of 84.6%, RCI of 0.672, and kappa index of 0.782 using five variables. The models with relative values performed much better than those with absolute values because the large individual differences in perfusion were reduced when relative values were used (3, 4).
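Two of the three metrics above can be sketched in a few lines. The example assumes that the kappa index is Cohen's kappa (the text does not name the variant) and omits RCI. The function names and class labels are hypothetical and do not reproduce the study's validation data.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of cases whose predicted class equals the actual class."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def cohen_kappa(actual, predicted):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1.
    """
    n = len(actual)
    p_observed = accuracy(actual, predicted)
    count_actual, count_pred = Counter(actual), Counter(predicted)
    labels = set(count_actual) | set(count_pred)
    p_chance = sum(count_actual[c] * count_pred[c] for c in labels) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ATLS classes for eight animals; one class 2 rat is
# predicted one class too high.
actual_class = [1, 1, 2, 2, 3, 3, 4, 4]
predicted_class = [1, 1, 2, 3, 3, 3, 4, 4]

acc = accuracy(actual_class, predicted_class)
kappa = cohen_kappa(actual_class, predicted_class)
print(acc, kappa)
```

Kappa is lower than raw accuracy here because part of the observed agreement would be expected by chance alone.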
Scatter plots of predicted blood loss in percent
Figure 2 shows four scatter plots of predicted versus actual blood loss in percent for the validation set, together with the criteria for the ATLS shock classes. The four boxes in each scatter plot indicate the criteria for the four ATLS shock classes. The letters in the plots indicate incorrectly categorized rats. Figure 2A and B show the blood loss in percent predicted by SVR and MLR with absolute values, respectively, whereas Figure 2C and D show the predictions by SVR and MLR with relative values. Figure 2C shows the best performance, with only three errors in the validation set, each estimated one class higher than the actual class by the SVR model with relative values. As the error between predicted and actual blood loss in percent decreases, the accuracy of the model for ATLS shock class increases. More rats were incorrectly categorized in class 3 than in the other classes because its criterion box is smaller than the others.
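The step from a predicted blood loss to an ATLS shock class can be sketched as a simple threshold lookup. The cut-offs below follow the commonly cited ATLS blood loss bands (under 15%, 15 to 30%, 30 to 40%, over 40%). How exact boundary values are assigned is an assumption made here, and the function name and example values are hypothetical.

```python
def atls_class(blood_loss_percent):
    """Map blood loss in percent to ATLS shock class 1-4.

    Boundary handling (15 -> class 2, 30 and 40 -> class 3) is an assumption.
    """
    if blood_loss_percent < 15:
        return 1
    if blood_loss_percent < 30:
        return 2
    if blood_loss_percent <= 40:
        return 3
    return 4


# Hypothetical validation cases: (actual %, predicted % from a regression model).
cases = [(10.0, 12.3), (22.0, 25.1), (38.0, 41.5), (45.0, 47.2)]

for actual, predicted in cases:
    a, p = atls_class(actual), atls_class(predicted)
    status = "ok" if a == p else f"misclassified by {p - a:+d} class"
    print(f"actual {actual:5.1f}% (class {a})  predicted {predicted:5.1f}% (class {p})  {status}")
```

Class 3 spans only 10 percentage points, so a prediction error of a few percent near its edges (as in the third case) is enough to shift the class, which is consistent with the smaller criterion box noted above.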
This study successfully discriminated the four ATLS hypovolaemic shock classes in an animal model by predicting blood loss in percent using regression models. The results showed that SVR with a linear kernel and relative values was the best model for accurately predicting ATLS class. Moreover, the models with relative values performed better than those with absolute values. Statistical analysis with the MLR models revealed close associations between blood loss in percent and four variables: DBP, lactate concentration, perfusion index, and pulse pressure.
The SVR model with a linear kernel showed better performance than the MLR model. In the clinical domain, the superiority of machine learning has been shown by several studies (12, 13). SVR is known for its ability to handle over-fitting, especially in multivariate settings (8). These characteristics of SVR could lead to higher performance than the traditional MLR method. However, machine learning approaches have a distinct disadvantage compared with the MLR model: the variables are difficult to interpret because of the “black box” nature of machine learning models (14). The regression coefficients from MLR models can be interpreted in a straightforward manner, and it would be important to see improvements in the interpretability of results from machine learning methods (15).
In our previous study, three popular machine learning algorithms with four feature selection methods for multicategory classification were applied to a rat model of acute hemorrhage to predict the four ATLS classes (4). In the present study, classifying the four ATLS classes by predicting blood loss in percent achieved the same objective as the direct multicategory classification methods of the previous study, using variables such as vital signs instead of blood loss. For the same validation data set, predicting blood loss in percent using the SVR and MLR models (accuracy: 88.5% and 84.6%, respectively) was more accurate than direct multicategory classification using the SVM-OVO (one versus one) model (accuracy: 80.8%). Information loss occurs when continuous data are grouped into discrete intervals (16). The rats in our previous study were grouped into ATLS shock classes according to the ATLS criteria for blood loss in percent, so information contained in the continuous blood loss values was lost. Moreover, the extension of SVM to multicategory problems is more complicated because of the iteration of binary classifiers (15). Therefore, the model constructed in this study, which classifies ATLS shock class by predicting blood loss in percent using SVR or MLR, is more accurate and simpler.
DBP, lactate concentration, perfusion index, and pulse pressure were selected by MLR as important variables for predicting blood loss in percent in rats in hypovolemic shock. DBP and pulse pressure are closely related to blood loss and are already used by ATLS. As a product of anaerobic glycolysis, lactate indirectly indicates oxygen debt. Many studies have demonstrated significant increases in serum lactate concentration in response to hemorrhage (3). Moreover, a new portable blood lactate analyzer (Lactate Pro 2 LT-1730; ARKRAY Inc, Kyoto, Japan) can measure blood samples of only 0.3 μL within 15 s in the field (4). Therefore, measuring lactate concentration as a good predictor of blood loss in the field is simple and easy, as well as noninvasive.
Perfusion index was significantly correlated with blood loss in Table 4, and the perfusion index decreased in response to hemorrhage. Choi et al. (3) and Kaiser et al. (17) reported that perfusion index responded to severe hemorrhage earlier than blood pressure. Noninvasive monitoring of the perfusion index could be an early and sensitive marker of vital tissue hypoperfusion, considering that, in circulatory failure, blood flow is diverted from less important tissues (e.g., skin, subcutaneous tissue, muscle, gastrointestinal tract) to vital organs (e.g., heart, brain, kidneys) (18). The advantages of laser Doppler flowmetry include less complexity, noninvasiveness, and the ability to continuously monitor microcirculatory blood flow in real time, although it is expensive.
In Table 3, the new index and perfusion index ranked first and fourth, respectively, with relative values, but dropped to sixth and seventh with absolute values. As mentioned in our previous study, relative changes in the perfusion index are recommended over absolute values because of large individual differences (4). This disadvantage of the perfusion index in absolute values also degrades the new index, because the new index is defined as the ratio of lactate concentration to perfusion index. Meanwhile, DBP and MAP were almost unaffected by the choice of absolute or relative values, and SBP and pulse pressure had higher correlation coefficients in absolute values than in relative values. This indicates that analyzing relative changes caused a loss of information in SBP regarding its association with blood loss in percent. Therefore, the shock index in absolute values showed a higher correlation with blood loss in percent than the new index. Although the perfusion index showed a relatively low correlation with blood loss in percent, it was selected by the MLR models with both absolute and relative values (Table 4). Moreover, the classification performance of SVR with a linear kernel and of MLR was considerably higher with relative values than with absolute values. However, because each individual's resting values are difficult to determine in emergency situations, we proposed replacing the resting variables with mean values for humans when computing relative values (4).
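The effect of individual differences in perfusion index on the new index can be illustrated with a small sketch. The new index is taken as the ratio of lactate concentration to perfusion index, as stated above. The definition of a relative value as the current value divided by the resting value is an assumption made here for illustration, and the function names and numbers are hypothetical.

```python
def new_index(lactate, perfusion_index):
    """New index: ratio of lactate concentration to perfusion index."""
    return lactate / perfusion_index


def relative_value(current, resting):
    """Assumed definition of a relative value: current / resting."""
    return current / resting


# Two hypothetical rats with very different resting perfusion but the same
# proportional response to hemorrhage (perfusion halves, lactate triples).
rats = {
    "A": {"pi_rest": 200.0, "pi_now": 100.0, "lac_rest": 1.0, "lac_now": 3.0},
    "B": {"pi_rest": 80.0, "pi_now": 40.0, "lac_rest": 1.0, "lac_now": 3.0},
}

absolute = {}
relative = {}
for name, r in rats.items():
    absolute[name] = new_index(r["lac_now"], r["pi_now"])
    relative[name] = new_index(
        relative_value(r["lac_now"], r["lac_rest"]),
        relative_value(r["pi_now"], r["pi_rest"]),
    )

print(absolute)  # differs between rats because resting perfusion differs
print(relative)  # identical for both rats under the assumed definition
```

Under this assumption, the absolute new index differs between the two animals purely because of their resting perfusion, whereas the relative version removes that baseline difference.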
This study demonstrated the possibility of classifying multiple outcomes by predicting a continuous variable (blood loss) using regression models, and showed the superiority of this approach over direct multicategory classification, such as the support vector machine used in our previous study (4). Predicting a continuous variable and then assigning the multicategory group by a standard diagnostic criterion could be simple and serve the same purpose as direct multicategory classification. This method could be applied to the prediction of other diseases, such as classifying normal, osteopenia, or osteoporosis by predicting the T-score of bone mineral density and using the diagnostic criteria stated by the World Health Organization (19). Moreover, it could also discriminate prediabetes from diabetes by predicting the fasting plasma glucose level or HbA1c level and applying diagnostic criteria (20). Early diagnosis and intervention for prediabetes could prevent complications, prevent the transition to diabetes, and be cost-effective (21, 22).
However, this approach to multicategory classification also has disadvantages compared with direct multicategory classification models. First, this method cannot be applied to all multicategory classifications because it requires special conditions: a continuous dependent variable and clear diagnostic criteria. Second, the variables selected by the regression models to predict blood loss in percent were not directly associated with ATLS shock class. However, blood loss in percent could be a mediating variable between the four ATLS shock classes and the variables selected in this study, because blood loss is a diagnostic criterion.
Lu et al. (23) investigated the buccal partial pressure of carbon dioxide (PCO2) in rats with hemorrhagic shock and compared the data with traditional vital signs and perfusion index. The buccal PCO2 differed significantly among four groups (no bleeding, 25%, 35%, and 45% blood loss), approximately 10 min earlier than shock index, heart rate, SBP, and MAP; additionally, PCO2 correlated with perfusion index. Jefferson et al. (24) investigated a prediction model of hemorrhagic blood loss using mean blood pressure, PaO2, SBP, and base excess in 33 rats using machine learning. The model included the biochemical responses PaO2 and base excess, which are difficult to measure in the field, and did not investigate perfusion index or lactate concentration. In contrast, our study predicted blood loss and determined the four ATLS shock classes using twice as many rats as Jefferson's study.
Our study warrants verification of the reproducibility of perfusion index and lactate concentration measurements in the prehospital setting in humans. Perfusion index and lactate concentration measurements in humans are simple and readily available because they require no special measurement skill and the measurement time is less than 2 min, including equipment setup time (4). The reproducibility of perfusion index measurement was obtained in our previous human study of diabetic neuropathy (9.5%, n = 125, reproducibility = standard deviation/mean × 100%) (25).
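The reproducibility measure quoted above (standard deviation/mean × 100%) is the coefficient of variation. A minimal sketch follows, assuming the sample standard deviation is meant (the text does not say whether the sample or population form was used). The function name and readings are hypothetical.

```python
from statistics import mean, stdev


def reproducibility_percent(measurements):
    """Standard deviation / mean x 100, using the sample standard deviation."""
    return stdev(measurements) / mean(measurements) * 100


# Hypothetical repeated perfusion index readings from one subject.
readings = [9.0, 10.0, 11.0]
cv = reproducibility_percent(readings)
print(f"{cv:.1f}%")
```

A lower percentage means that repeated measurements scatter less around their mean.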
This study has four main limitations. First, the sample size was relatively small, particularly in the validation set. Second, we did not include measurements of coagulation or hemostasis, such as prothrombin time or partial thromboplastin time, which would be associated with blood loss. The international normalized ratio, which is derived from prothrombin time, can be obtained in the prehospital setting with portable self-monitoring devices. In future studies, it would be useful to investigate prediction models for blood loss that include variables for functional coagulation. Third, we propose using resting variables with normally distributed data ranges for humans in the model with relative changes, because resting values are not known, especially in emergency situations. Fourth, a larger animal model is warranted to provide more clinical relevance in the future. Given that we used rats to strictly control the hemorrhagic shock model, our experiment cannot be repeated in humans. However, this study showed the potential of perfusion index and lactate concentration, which are not currently measured in emergency situations, for assessing hemorrhage.
In conclusion, we introduced a new approach for discriminating ATLS shock class using regression models that predict blood loss in percent. The regression models showed better performance than the direct multicategory classification method used in our previous study. Moreover, the simple MLR models with both absolute and relative values could form the basis of a clinical decision support system for ATLS shock class and describe the association between the independent variables and blood loss. The perfusion index and the new index are suggested as new variables in relative changes for classifying ATLS classes.
1. Spinella PC, Holcomb JB. Resuscitation and transfusion principles for traumatic hemorrhagic shock. Blood Rev 2009; 23(6):231–240.
2. Mutschler M, Nienaber U, Brockamp T, Wafaisade A, Wyen H, Peiniger S, Paffrath T, Bouillon B, Maegele M. TraumaRegister DGU. A critical reappraisal of the ATLS classification of hypovolaemic shock: does it really reflect clinical reality? Resuscitation 2013; 84(3):309–313.
3. Choi JY, Lee WH, Yoo TK, Park I, Kim DW. A new severity predicting index for hemorrhagic shock using lactate concentration and peripheral perfusion in a rat model. Shock 2012; 38(6):635–641.
4. Choi SB, Park JS, Chung JW, Kim SW, Kim DW. Prediction of ATLS hypovolemic shock class in rats using the perfusion index and lactate concentration 2015; 43(4):361–368.
5. Zhang R, Huang GB, Sundararajan N, Saratchandran P. Multicategory classification using an extreme learning machine for microarray gene expression cancer diagnosis. IEEE/ACM Trans Comput Biol Bioinform 2007; 4(3):485–495.
6. Hoffman H, Lee SI, Garst JH, Lu DS, Li CH, Nagasawa DT, Ghalehsari N, Jahanforouz N, Razaghy M, Espinal M, et al. Use of multivariate linear regression and support vector regression to predict functional outcome after surgery for cervical spondylotic myelopathy. J Clin Neurosci 2015; 22(9):1444–1449.
7. Saeys Y, Inza I, Larrañaga P. A review of feature selection techniques in bioinformatics. Bioinformatics 2007; 23(19):2507–2517.
8. Basak D, Pal S, Patranabis DC. Support vector regression. Neural Inf Process Lett Rev 2007; 11(10):203–224.
9. Parrella F: Online support vector regression. Department of Information Science, University of Genoa, Italy, 2007. Available at: http://onlinesvr.altervista.org/. Accessed August 18, 2015.
10. Guly HR, Bouamra O, Spiers M, Dark P, Coats T, Lecky FE. Vital signs and estimated blood loss in patients with major trauma: testing the validity of the ATLS classification of hypovolaemic shock. Resuscitation 2011; 82(5):556–559.
11. Cho KH, Kang JH, Ki SJ, Park Y, Cha SM, Kim JH. Determination of the optimal parameters in regression models for the prediction of chlorophyll-a: a case study of the Yeongsan Reservoir, Korea. Sci Total Environ 2009; 407(8):2536–2545.
12. Yoo TK, Kim SK, Kim DW, Choi JY, Lee WH, Oh E, Park EC. Osteoporosis risk prediction for bone mineral density assessment of postmenopausal women using machine learning. Yonsei Med J 2013; 54(6):1321–1330.
13. Hsieh CH, Lu RH, Lee NH, Chiu WT, Hsu MH, Li YC. Novel solutions for an old disease: diagnosis of acute appendicitis with random forest, support vector machines, and artificial neural networks. Surgery 2011; 149(1):87–93.
14. Goetz JN, Brenning A, Petschko H, Leopold P. Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling. Comput Geosci 2015; 81(1):1–11.
15. Kruppa J, Liu Y, Biau G, Kohler M, König IR, Malley JD, Ziegler A. Probability estimation with machine learning methods for dichotomous and multicategory outcome: theory. Biom J 2014; 56(4):534–563.
16. Shaw DG, Huffman MD, Haviland MG. Grouping continuous data in discrete intervals: information loss and recovery. JEM 1987; 24(2):167–173.
17. Kaiser ML, Kong AP, Steward E, Whealon M, Patel M, Hoyt DB, Cinat ME. Laser Doppler imaging for early detection of hemorrhage. J Trauma 2011; 71(2):401–406.
18. Lima A, Bakker J. Noninvasive monitoring of peripheral perfusion. Intensive Care Med 2005; 31(10):1316–1326.
19. Brown TT, Qaqish RB. Antiretroviral therapy and the prevalence of osteopenia and osteoporosis: a meta-analytic review. AIDS 2006; 20(17):2165–2174.
20. Tuomilehto J, Lindström J, Eriksson JG, Valle TT, Hämäläinen H, Ilanne-Parikka P, Keinänen-Kiukaanniemi S, Laakso M, Louheranta A, Rastas M, et al. Finnish Diabetes Prevention Study Group: prevention of type 2 diabetes mellitus by changes in lifestyle among subjects with impaired glucose tolerance. N Engl J Med 2001; 344(18):1343–1350.
21. Bertram MY, Lim SS, Barendregt JJ, Vos T. Assessing the cost-effectiveness of drug and lifestyle intervention following opportunistic screening for pre-diabetes in primary care. Diabetologia 2010; 53(5):875–881.
22. Choi SB, Kim WJ, Yoo TK, Park JS, Chung JW, Lee YH, Kang ES, Kim DW. Screening for prediabetes using machine learning models. Comput Math Methods Med
23. Lu H, Zheng J, Zhao P, Zhang G, Wu T. Buccal partial pressure of carbon dioxide outweighs traditional vital signs in predicting the severity of hemorrhagic shock in a rat model. J Surg Res 2014; 187(1):262–269.
24. Jefferson MF, Pendleton N, Mohamed S, Kirkman E, Little RA, Lucas SB, Horan MA. Prediction of hemorrhagic blood loss with a genetic algorithm neural network. J Appl Physiol 1998; 84(1):357–361.
25. Kim SW, Kim SC, Nam KC, Kang ES, Im JJ, Kim DW. A new method of screening for diabetic neuropathy using laser Doppler and photoplethysmography. Med Biol Eng Comput 2008; 46(1):61–67.
Keywords: Lactate concentration; linear regression; multicategory; perfusion; support vector regression; triage
© 2016 by the Shock Society