Original Article

Evaluation of intensive care unit performance in Lithuania using the SAPS II system

Vosylius, S.; Sipylaite, J.; Ivaskevicius, J.

European Journal of Anaesthesiology: August 2004 - Volume 21 - Issue 8 - p 619-624

Introduction
Outcome prediction and evaluation of intensive care unit (ICU) performance have become standard tools for estimating the effectiveness and quality of intensive care provision [1]. Over the last two decades, general severity of illness scoring systems have become popular and have provided an opportunity to make international comparisons of intensive care outcome. A scoring system defines a severity of illness score that can be used to predict the risk of hospital mortality by applying a logistic regression equation. One of the most widely used general severity of illness and prognostic scoring systems is the simplified acute physiology score (SAPS) system. The second version of this scoring system (SAPS II) was developed and validated in a large group of ICU patients from Europe and the US [2]. The SAPS II scoring system has been accepted as a measure of illness severity. It has been shown to stratify the risk of death accurately in a wide range of disease states and clinical settings. This experience has resulted in the widespread use of the SAPS II scoring system as a tool for clinical audit between ICUs.

Our objective was to assess the ability of the SAPS II system to predict patient outcome and to evaluate the performance of an individual ICU; similar studies from Eastern European institutions have not been reported. We also wanted to compare the performance of our ICU in Lithuania with similar data from other studies that used the same methodology.

Methods
Permission for this study was obtained from our Hospital Ethics Committee. A prospective observational study was conducted in a single mixed ICU at Vilnius University Emergency Hospital, Lithuania, between 1 February 1998 and 31 January 2000. Clinical and physiological data were collected applying the criteria and definitions described by Le Gall and colleagues [2]. The data were collected and stored manually on purpose-made forms and afterwards entered into a database using Microsoft® Access 97 software. The software checked all data for illogical, extreme or unlikely values. All data used for the study were routinely collected for clinical purposes.

During the study period, data were collected on 2261 patients admitted consecutively to the ICU. Patients were excluded from the study if they were younger than 18 yr, were admitted for less than 8 h, were readmitted to the ICU or had missing data. For calculating the standardized mortality ratio (SMR), only data from the first admission to the ICU during the same hospitalization were considered. One hundred and ninety-four (8.6%) admissions met one or more of the exclusion criteria and were excluded.

The following data were collected for each patient: patient characteristics (age and gender), location before ICU admission and surgical status (medical, emergency surgical or elective surgical). Patients were defined as surgical if they were operated upon within 1 week before or after ICU admission. Physiological and laboratory variables necessary for the assessment of the severity of illness were collected on the first ICU day. The severity of illness was measured according to the SAPS II system. Length of stay in the ICU was defined as the duration of care from admission to the ICU to discharge from the ICU. Length of stay in the hospital was defined as the duration of care from admission to the ICU to discharge from the hospital. The main outcome was survival status on discharge from the hospital, so deaths in the ICU and deaths in hospital wards after discharge from the ICU were both counted.

Statistical analysis

Univariate comparisons were performed between survivors and non-survivors at hospital discharge. All continuous variables are presented as medians with interquartile ranges (25th-75th percentiles) and were analysed by the Mann-Whitney U-test. Categorical variables are expressed as actual numbers and percentages and were compared using the χ2 test. All statistical tests were two-sided. P < 0.05 was considered significant for all tests.

The observed death rate was compared with the predicted death rate for the study population. Predicted hospital mortality rates for SAPS II were calculated using the logistic regression model of Le Gall and colleagues [2]. The predicted risk of death for each patient was calculated from the patient's SAPS II score using the SAPS II risk of death equation. The predicted death rate was the sum of the SAPS II estimates of hospital mortality risk for individual patients divided by the number of patients in a given group of ICU admissions. The SMR, obtained by dividing the observed number of deaths in each group by the predicted number, was used to compare observed with predicted mortality.
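The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the software used in the study: the coefficients are those of the SAPS II equation as published by Le Gall and colleagues [2] (they are not restated in this article and should be verified against that publication), and the function names and example data are hypothetical.

```python
import math

# Coefficients of the SAPS II hospital-mortality equation as published by
# Le Gall et al. [2]. They are not given in this article, so treat them as
# an assumption to be checked against the original publication.
INTERCEPT = -7.7631
SCORE_COEF = 0.0737
LOG_SCORE_COEF = 0.9971


def saps2_risk(score):
    """Predicted probability of hospital death for one SAPS II score."""
    logit = INTERCEPT + SCORE_COEF * score + LOG_SCORE_COEF * math.log(score + 1)
    return 1.0 / (1.0 + math.exp(-logit))


def predicted_death_rate(scores):
    """Sum of individual predicted risks divided by the number of patients."""
    risks = [saps2_risk(s) for s in scores]
    return sum(risks) / len(risks)


def smr(scores, died):
    """Observed number of deaths divided by the predicted number of deaths."""
    observed = sum(died)
    predicted = sum(saps2_risk(s) for s in scores)
    return observed / predicted


# Hypothetical example data: SAPS II scores and hospital outcome (1 = died).
example_scores = [12, 29, 29, 45, 60, 75]
example_died = [0, 0, 1, 0, 1, 1]
example_smr = smr(example_scores, example_died)
```

An SMR above 1.0 means more deaths were observed than the model predicted for that group; an SMR below 1.0 means fewer.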

The accuracy of prediction (calibration) was tested using the Hosmer-Lemeshow C statistic and calibration curves. For the C statistic, admissions are ranked according to predicted risk of death and divided into groups with approximately equal numbers of patients. The statistic indicates the agreement between observed and predicted mortality across these risk groups. Large C values and low P values (<0.05) suggest that the model does not correctly reflect the actual outcome. The calibration curve plots observed against predicted death rates, stratified into 10 equal, contiguous risk ranges of 10% each.
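A minimal sketch of these two calibration checks follows, using only the Python standard library. It is not the authors' software; the function names are hypothetical. It assumes that every predicted risk lies strictly between 0 and 1, that there are at least as many patients as groups, and that the degrees of freedom equal the number of groups minus 2. The P value helper uses a closed form that holds only for even degrees of freedom.

```python
import math


def hosmer_lemeshow_c(risks, died, groups=10):
    """Hosmer-Lemeshow C statistic over groups of roughly equal size.

    risks: predicted probabilities of death; died: 1 if the patient died, else 0.
    Returns (chi-square statistic, degrees of freedom = groups - 2).
    """
    ranked = sorted(zip(risks, died))  # rank admissions by predicted risk
    n = len(ranked)
    statistic = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected_deaths = sum(r for r, _ in chunk)
        observed_deaths = sum(d for _, d in chunk)
        expected_survivors = size - expected_deaths
        observed_survivors = size - observed_deaths
        statistic += (observed_deaths - expected_deaths) ** 2 / expected_deaths
        statistic += (observed_survivors - expected_survivors) ** 2 / expected_survivors
    return statistic, groups - 2


def chi2_p_value_even_df(statistic, df):
    """Upper-tail chi-square probability (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def calibration_table(risks, died, bins=10):
    """Mean predicted and observed death rates in equal-width risk ranges.

    Returns rows of (low, high, n, mean predicted risk, observed death rate);
    empty risk ranges are skipped.
    """
    rows = []
    for b in range(bins):
        low, high = b / bins, (b + 1) / bins
        members = [(r, d) for r, d in zip(risks, died)
                   if low <= r < high or (b == bins - 1 and r == 1.0)]
        if members:
            mean_predicted = sum(r for r, _ in members) / len(members)
            observed_rate = sum(d for _, d in members) / len(members)
            rows.append((low, high, len(members), mean_predicted, observed_rate))
    return rows
```

Plotting the observed rate against the mean predicted risk for each row of `calibration_table` gives the calibration curve; points on the diagonal indicate perfect agreement.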

Discriminative power, that is, the ability of the model to discriminate between survivors and non-survivors, was assessed by calculating the area under the receiver operating characteristic (ROC) curve, with estimates of the standard error and 95% confidence intervals (CIs). The area under the ROC curve estimates the ability of the model to assign a higher risk of death to patients who die than to patients who survive. The predicted and observed outcomes were compared using 2 × 2 decision matrices at five different decision criteria: predicted risk of death of 0.1, 0.2, 0.5, 0.8 and 0.9 [3]. The sensitivity, specificity, true-positive, true-negative and total correct classification rates derived from the classification tables were recorded.
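Both measures can be illustrated with a short Python sketch. It is a hypothetical illustration rather than the study's own code: the area under the ROC curve is computed by direct pairwise comparison of non-survivors with survivors (ties counted as one half), the standard error and CI are not computed, and treating a risk equal to the decision criterion as a predicted death is an assumption. The data must contain at least one survivor and one non-survivor.

```python
def roc_auc(risks, died):
    """Area under the ROC curve by pairwise comparison.

    Equals the proportion of (non-survivor, survivor) pairs in which the
    non-survivor received the higher predicted risk; ties count one half.
    """
    dead = [r for r, d in zip(risks, died) if d == 1]
    alive = [r for r, d in zip(risks, died) if d == 0]
    wins = 0.0
    for risk_dead in dead:
        for risk_alive in alive:
            if risk_dead > risk_alive:
                wins += 1.0
            elif risk_dead == risk_alive:
                wins += 0.5
    return wins / (len(dead) * len(alive))


def classification_matrix(risks, died, criterion):
    """2 x 2 decision matrix; death is predicted when risk >= criterion (assumption)."""
    tp = fp = tn = fn = 0
    for r, d in zip(risks, died):
        predicted_death = r >= criterion
        if predicted_death and d == 1:
            tp += 1
        elif predicted_death and d == 0:
            fp += 1
        elif d == 1:
            fn += 1
        else:
            tn += 1
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "correct": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical example data: predicted risks and hospital outcome (1 = died).
example_risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
example_died = [0, 0, 0, 1, 0, 1, 1, 1]
example_auc = roc_auc(example_risks, example_died)
matrices = {c: classification_matrix(example_risks, example_died, c)
            for c in (0.1, 0.2, 0.5, 0.8, 0.9)}
```

Raising the decision criterion trades sensitivity for specificity, which is why the matrices are reported at several criteria rather than one.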

Results
The study population consisted of 2067 ICU admissions. Patients' characteristics, severity of illness and surgical status are shown in Table 1. The most common primary reasons for ICU admission were the need for active monitoring (36%), neurological (24%) and cardiovascular emergencies (16%), and trauma (16%). The majority of the patients were surgical, with a clear predominance of emergency surgical patients. The most frequent surgical categories were gastrointestinal and trauma surgery, followed by neurological and vascular surgery. The most frequent medical diagnostic categories were neurological and gastrointestinal.

Table 1: Patient and clinical characteristics of ICU patients.

There were 384 deaths in the ICU (18.6%) and 524 deaths in hospital (25.4%). Fifty-two percent of patients stayed in the ICU for up to 24 h, and 73% for up to 3 days. Compared with the survivors, the non-survivors were older, were more severely ill and had a longer stay in the ICU. Hospital mortality was significantly higher for medical patients than for surgical patients (47.1% vs. 15.9%, P < 0.001). Emergency surgery patients had a significantly higher hospital mortality rate than elective surgery patients (21.7% vs. 5.9%, P < 0.001).

The observed hospital mortality was significantly higher than the mortality predicted by SAPS II (25.4% vs. 19.9%). The SMR for the whole study population was 1.28. Age and gender had no significant influence on the SMR. The SMR was high for medical patients (1.52), showing significant under-prediction of the observed mortality. In contrast, for surgical patients the observed hospital mortality was closer to the predicted mortality (SMR 1.05 and 1.11 for emergency and elective surgery, respectively).
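As a quick arithmetic check, the overall SMR can be recovered from the two rounded mortality rates reported above. This is a sketch only: because both rates refer to the same 2067 admissions, the number of patients cancels, and the use of rounded percentages is an approximation.

```latex
\[
\mathrm{SMR}
  = \frac{\text{observed deaths}}{\text{predicted deaths}}
  = \frac{0.254 \times 2067}{0.199 \times 2067}
  = \frac{25.4}{19.9}
  \approx 1.28
\]
```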

The majority of admissions had low SAPS II probabilities of death. One-half of the patients had a predicted risk of hospital mortality below 0.1, and two-thirds had a predicted risk below 0.3. The Hosmer-Lemeshow statistic failed to confirm adequate calibration against the original SAPS II development database (χ2 = 56.98; df = 8; P < 0.001). The calibration curve (Fig. 1) for the SAPS II equation applied to our data showed marked deviation from the diagonal for the ranges up to 0.7 risk of hospital death. Risk prediction was inaccurate in the low and middle strata of predicted risk, and the curve lay closest to the diagonal for the highest risk groups (>0.7 predicted risk).

Figure 1: Calibration curve for the SAPS II hospital mortality model, showing the observed hospital mortality (solid line) for patients grouped by predicted risk of hospital death. The line of ideal predictive ability (diagonal line) is where the numbers of observed and predicted deaths are equal.

The ability of the SAPS II system to predict prognosis correctly was tested by classification matrices (Table 2). The highest overall correct classification rate was obtained using a decision criterion of 0.5, with a sensitivity of 86.2% and a specificity of 77.4%. Figure 2 shows the ROC curve for the SAPS II equation applied to our data. The area under the ROC curve was 0.883 (standard error 0.008, 95% CI 0.866-0.899), which confirmed the good discrimination of SAPS II.

Table 2: Correct classification rate, sensitivity, specificity, true-positive and false-positive rates for the SAPS II system.

Figure 2: ROC curve for SAPS II (curved line). The relationship between the true-positive rate (sensitivity) and the false-positive rate (1 - specificity) is shown. The diagonal line is the line of chance performance. The area under the curve is 0.883.

Discussion
This study is a large, single-institution, prospective assessment of the SAPS II system in an ICU in Lithuania. Applying a severity scoring system in other countries and comparing the results with the original database may produce useful information for assessing the state of intensive care medicine. Mortality prediction models producing SMRs have been used to interpret ICU performance and to compare patient groups in various countries with different social, demographic, economic and medical environments. The evaluation of ICU performance based on SAPS II severity of illness scoring has previously been reported from many Western European countries [2,4-14].

Our patients had a median SAPS II score of 29, which is within the range reported in other national and international studies [2,4-8]. Compared with the ICUs in the original SAPS II study [2], patients admitted to our ICU were more frequently classified as surgical, and these surgical patients had significantly lower SAPS II scores and death rates than medical patients.

The most frequently used measures of outcome are the ICU and hospital mortality rates. The overall mortality rate alone is insufficient for describing outcome and for comparing groups of critically ill patients treated in different hospitals and countries. The observed hospital mortality rate in the present study (25.4%) was within the range reported in other studies (16% to 34%), which depends on the case-mix, including age, chronic health status, admission diagnosis and severity of illness [4-6,10-12,14,15]. In comparison with reports from Western European countries, the present study revealed that the observed death rate was higher than the predicted death rate corrected for severity of illness (SMR 1.28). However, large variations in SMRs (0.5-1.7) were found among the individual ICUs participating in multicentre studies using the SAPS II model. A higher SMR was observed in Italy, Greece, Scotland and Tunisia [4,12,13,15]. In other studies, better ICU performance was found. In Austrian ICUs the SMRs were significantly lower than 1.0, and customization of the SAPS II equation was applied [14]. In the European Intensive Care Units Study (EURICUS-I), conducted in 12 European countries, SAPS II also over-estimated the risk of death [6].

Analysis of our results showed variation in SMRs between groups of ICU patients. The risk prediction for patients grouped by operative status and diagnostic category showed significantly different patterns of mortality. The majority of patients with lower SAPS II scores and risk of death were from the operative group, while patients with higher scores were predominantly medical. One-half of the patients had a predicted risk of death <0.1; most of them were admitted postoperatively for active monitoring and basic observational care. We observed a specific case-mix in our ICU. Among the ICU patients there was a high proportion of patients admitted because of neurological emergencies, with a significant difference between observed and predicted mortality rates. The neurological group of patients also had a higher degree of severity and an unfavourable prognosis in some other studies [4,9,11].

The performance of ICU mortality prediction models can be influenced by both clinical and non-clinical factors [16]. Inaccuracy in data interpretation can arise from local differences in clinical practice, case-mix or data collection. An observed mortality rate higher than predicted could be explained primarily by the quality of pre-hospital, in-hospital and intensive care. Other important factors are resource limitations, diagnostic diversity, lead-time bias, teaching activity, staffing and the availability of technology [11,16-19]. Outcome prediction models are sensitive to inaccurate and incomplete data collection [20,21].

Whatever severity of illness scoring system is chosen for hospital mortality prediction and evaluation of the ICU, it is essential to know its goodness-of-fit in these areas of application, as well as its discriminatory power. In the present study the SAPS II model showed a good capability of discriminating survivors from non-survivors (area under the ROC curve of 0.883). This estimate is higher than the value (0.823) obtained from the original SAPS II model [2]. In other studies, the area under the ROC curve for the SAPS II model ranged from 0.744 to 0.888 [4-6,8,10,13]. The Hosmer-Lemeshow statistic revealed insufficient calibration, so the SAPS II model did not fit the Lithuanian ICU population very well. Many other studies also reported poor calibration [5,6,11,12,15]. The predictive efficacy of SAPS II in individual patients in our study was high: the correct classification rate of 84% (decision criterion of 0.5) is similar to the rates reported by other studies [11,12].

In conclusion, the findings from the present study confirmed that the SAPS II system was a useful tool for the assessment of ICU outcome in Lithuania. The duration of the study and the number of enrolled patients were sufficient to obtain reliable data. The SMR, rather than the overall mortality rate or the severity of illness score, might be an objective measure of ICU performance. The SAPS II system showed a good ability to distinguish patients who die from patients who live, but had a low degree of correspondence between the estimated probabilities of mortality and the actual mortality in our ICU. The SAPS II prediction model provided an opportunity to make an international comparison of intensive care.

Acknowledgements
Financial support for the study was provided by the Open Society Institute, Regional Program, 1998. We thank Alius Bieliauskas for the preparation of the software for data collection, processing and statistical analysis.

References
1. Teres D, Lemeshow S. When to customize a severity model. Intensive Care Med 1999; 25: 140-142.
2. Le Gall JR, Lemeshow S, Saulnier F. A new simplified acute physiology score (SAPS II) based on a European/North American multicenter study. JAMA 1993; 270: 2957-2963.
3. Civetta JM, Colton T. How to read a medical article and understand basic statistics. In: Civetta JM, Taylor RW, Kirby RR, eds. Critical Care, 3rd edn. Philadelphia, USA: Lippincott-Raven Publishers, 1997: 3-20.
4. Apolone G, Bertolini G, D'Amico R, et al. The performance of SAPS II in a cohort of patients admitted to 99 Italian ICUs: results from GiViTI. Gruppo Italiano per la Valutazione degli interventi in Terapia Intensiva. Intensive Care Med 1996; 22: 1368-1378.
5. Moreno R, Morais P. Outcome prediction in intensive care: results of a prospective, multicentre, Portuguese study. Intensive Care Med 1997; 23: 177-186.
6. Moreno R, Reis Miranda DR, Fidler V, Van Schilfgaarde R. Evaluation of two outcome prediction models on an independent database. Crit Care Med 1998; 26: 50-61.
7. Metnitz PGH, Vesely H, Valentin A, et al. Evaluation of an interdisciplinary data set for national intensive care unit assessment. Crit Care Med 1999; 27: 1486-1491.
8. Timsit JF, Fosse JP, Troche G, et al. Accuracy of a composite score using daily SAPS II and LOD scores for predicting hospital mortality in ICU patients hospitalized for more than 72 h. Intensive Care Med 2001; 27: 1012-1021.
9. Bodmann KF, Ehlers B, Habel U, et al. Epidemiological and prognostic data from 2054 patients of an internal medicine intensive care unit. Dtsch Med Wochenschr 1997; 122: 919-925.
10. Schuster HP, Schuster EP, Ritschel P, Wilts S, Bodmann KF. The ability of the simplified acute physiology score (SAPS II) to predict outcome in coronary care patients. Intensive Care Med 1997; 23: 1056-1061.
11. Markgraf R, Deutschinoff G, Pientka L, Scholten T. Comparison of acute physiology and chronic health evaluations II and III and simplified acute physiology score II: a prospective cohort study evaluating these methods to predict outcome in a German interdisciplinary intensive care unit. Crit Care Med 2000; 28: 26-33.
12. Katsaragakis S, Papadimitropoulos K, Antonakis P, Stergiopoulos S, Konstadoulakis MM, Androulakis G. Comparison of acute physiology and chronic health evaluation II (APACHE II) and simplified acute physiology score II (SAPS II) scoring systems in a single Greek intensive care unit. Crit Care Med 2000; 28: 426-432.
13. Livingston BM, MacKirdy FN, Howie JC, Jones R, Norrie JD. Assessment of the performance of five intensive care scoring models within a large Scottish database. Crit Care Med 2000; 28: 1820-1827.
14. Metnitz PGH, Valentin A, Vesely H, et al. Prognostic performance and customization of the SAPS II: results of a multicenter Austrian study. Intensive Care Med 1999; 25: 192-197.
15. Nouira S, Belghith M, Elatrous S, et al. Predictive value of severity scoring systems: comparison of four models in Tunisian adult intensive care units. Crit Care Med 1998; 26: 852-859.
16. Zimmerman JE, Wagner DP. Prognostic systems in intensive care: how do you interpret an observed mortality that is higher than expected? Crit Care Med 2000; 28: 258-260.
17. Bastos PG, Knaus WA, Zimmerman JE, et al. The importance of technology for achieving superior outcomes from intensive care. Intensive Care Med 1996; 22: 664-669.
18. Glance LG, Osler T, Shinozaki T. Effect of varying the case mix on the standardized mortality ratio and W statistic. A simulation study. Chest 2000; 117: 1112-1117.
19. Bosman RJ, van Straaten HM, Zandstra DF. The use of intensive care information systems alters outcome prediction. Intensive Care Med 1998; 24: 953-958.
20. Arts D, de Keizer N, Scheffer G-J, de Jonge E. Quality of data collected for severity of illness scores in the Dutch national intensive care evaluation (NICE) registry. Intensive Care Med 2002; 28: 656-659.
21. Apolone G. The state of research on multipurpose severity of illness scoring systems: are we on target? Intensive Care Med 2000; 26: 1727-1729.


© 2004 European Academy of Anaesthesiology