Technology, Computing, and Simulation: Research Reports

The Reproducibility of Stewart Parameters for Acid-Base Diagnosis Using Two Central Laboratory Analyzers

Nguyen, Ba-Vinh MD; Vincent, Jean-Louis MD, PhD; Hamm, Jean Baptiste MD; Abalain, Jean-Hervé PhD; Carre, Jean-Luc MD, PhD; Nowak, Emmanuel PhD; Ahmed, Mehdi Ould MD; Arvieux, Charles C. MD, PhD; Gueret, Gildas MD, PhD

doi: 10.1213/ANE.0b013e3181b62664

Acid-base disorders are frequent in the intensive care unit. The classical interpretation of acid-base disturbances according to the Henderson-Hasselbalch concept is now challenged, and many clinicians prefer to use the Stewart model1,2 modified by Figge et al.3,4 This method is based on calculations of the strong ion difference (SID) from measurements of the concentrations of several electrolytes. However, there is some degree of approximation in all measurements, and this can be magnified when derived variables are calculated. Morimatsu et al.5 showed that chloride measurements made with point-of-care technology differed significantly from those made using central laboratory facilities, resulting in different SID values. We hypothesized that the calculated SID could differ considerably depending on the central laboratory analyzer used to measure the constituent electrolytes. If true, interpretation of a patient’s acid-base status using the Stewart method would vary according to which central analyzer was used and, therefore, to which hospital the patient was admitted. We determined plasma sodium, potassium, chloride, bicarbonate, magnesium, albumin, and phosphate, and the apparent SID (SIDapp), the effective SID (SIDeff), and the strong ion gap (SIG) using two different central laboratory automated blood chemistry analyzers and compared the results.

Methods

Data collection was anonymous and performed within an internal quality audit from the biochemical and biomolecular departments for which the institutional ethics committee (Clermont Tonnerre Hospital, Brest, France) waived the need for informed consent. We prospectively examined data from 179 blood samples taken from consecutive patients admitted to the intensive care unit after cardiac surgery between February and April 2006. The data needed for analysis were collected as part of routine standard patient care and were stored electronically. No additional blood sampling was required for this study. The two central laboratory automated blood chemistry analyzers used in this study were the Modular (Roche, Meylan, France) and the LX20 (Beckman, Villepinte, France). These analyzers have the same reference ranges for each variable except for bicarbonate (22–29 mmol/L for Modular and 24–32 mmol/L for LX20). Blood gas analysis was performed using the GEM® Premier™ 3000 (Instrumentation Laboratory, Lexington, MA) with a preheparinized 3-mL blood gas syringe (BD, Plymouth, United Kingdom), using dry calcium-balanced heparin. Because ionized calcium is calculated and not measured by either central laboratory analyzer, and lactate concentration needs a specific tube in our laboratory, these parameters were determined once with the GEM Premier 3000.

Blood samples were collected via arterial lines in serum tubes with gel (Vacuette® Serum Tube, Greiner Bio One, Kremsmünster, Austria, volume 8 mL) and taken to the laboratory immediately after collection. All the sample tubes were filled completely and their volumes were constant.

After centrifugation at 4200 rpm for 10 min, the same serum of each sample was analyzed twice using the LX20 and the Modular biochemical analyzer. Both analyzers use the same technology. Sodium was measured with an ion-selective electrode with a polyvinyl chloride membrane containing a neutral carrier, which provides a cavity for the capture of the sodium ion. Potassium was also measured by an ion-sensitive electrode with a polyvinyl chloride membrane, but the membrane was modified with the antibiotic valinomycin, which makes the electrode selective for the potassium ion. Finally, chloride was measured using an ion-selective electrode membrane with an ion exchanger, which pairs with chloride ions. Bicarbonate and magnesium concentrations were determined by colorimetry techniques, following the manufacturers’ recommendations. Albumin was also measured by colorimetry (bromocresol green with Modular and bromocresol purple with LX20). These analyzers undergo daily calibration and quality control checks. We compared serum sodium, potassium, chloride, bicarbonate, calcium, phosphate, magnesium, and albumin concentrations from the two analyzers. Ionized calcium was measured by a specific electrode. Lactate determination was accomplished by enzymatic reaction of lactate oxidase.

Stewart Method

Quantitative physical-chemical analysis was performed according to Stewart’s quantitative biophysical methods1,2 modified by Figge et al.3,4 This technique estimates unmeasured anion concentrations, incorporating the contributions of the respiratory status (Paco2) and of electrolyte and plasma protein abnormalities to the acid-base imbalance. According to this theory, changes in blood pH are regulated by three independent variables: (i) the SID (difference between fully dissociated cations and anions); (ii) Paco2; and (iii) the total weak acid concentration (Atot) (consisting mainly of albumin and phosphate). Fencl and Leith6 demonstrated the clinical application of these principles, which resulted in the introduction of the term “strong ion gap” (SIG) by Kellum et al.7,8

The SIDapp is the difference between the sum of all measured strong cations and the sum of all measured strong anions, as follows (all concentrations in mEq/L):

SIDapp = [Na+] + [K+] + [Ca2+] + [Mg2+] − [Cl−] − [lactate−]

The SIDeff represents the effect of the corrected Pco2 and the weak acids, albumin, and inorganic phosphate, on the balance of electrical charges in plasma. The formula for SIDeff, as determined by Figge et al.,3 is as follows:

where HCO3 is in mEq/L, albumin in g/L, and phosphate in mEq/L. We calculated the SIDeff using the plasma bicarbonate and phosphate concentrations determined by the two analyzers.

The difference between the calculated SIDapp and SIDeff constitutes the SIG:

SIG = SIDapp − SIDeff

In healthy humans, the SIG should, theoretically, equal 0 (electrical charge neutrality). If this is not the case, there must be unmeasured charges to explain this ion gap. A positive SIG value represents unmeasured anions (such as keto acids, urate, sulfate, citrate, pyruvate, acetate, and gluconate) that are present in the blood and account for the measured pH, the measured levels of strong and weak ions, and the need to maintain electroneutrality.
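As a rough illustration of how the three quantities fit together, the sketch below computes SIDapp, SIDeff, and SIG for a single sample in Python. The SIDapp terms follow the strong ions named in the text. The SIDeff coefficients are an assumption: they are the commonly quoted Figge form (which takes phosphate in mmol/L) and are not reproduced in the text above. The class name, field names, and sample values are hypothetical and are not study data.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One blood sample. Strong ions and HCO3 in mEq/L, albumin in g/L,
    phosphate in mmol/L (assumed unit for the coefficients used below)."""
    na: float
    k: float
    ca: float        # ionized calcium
    mg: float
    cl: float
    lactate: float
    hco3: float
    albumin: float
    phosphate: float
    ph: float


def sid_app(s: Sample) -> float:
    # Measured strong cations minus measured strong anions.
    return s.na + s.k + s.ca + s.mg - s.cl - s.lactate


def sid_eff(s: Sample) -> float:
    # ASSUMPTION: commonly quoted Figge coefficients for the pH-dependent
    # charge carried by albumin and inorganic phosphate.
    albumin_charge = s.albumin * (0.123 * s.ph - 0.631)
    phosphate_charge = s.phosphate * (0.309 * s.ph - 0.469)
    return s.hco3 + albumin_charge + phosphate_charge


def sig(s: Sample) -> float:
    # Strong ion gap: apparent minus effective strong ion difference.
    return sid_app(s) - sid_eff(s)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    demo = Sample(na=140, k=4, ca=2.5, mg=1.5, cl=104, lactate=1,
                  hco3=24, albumin=40, phosphate=1, ph=7.4)
    print(f"SIDapp={sid_app(demo):.1f}  SIDeff={sid_eff(demo):.1f}  "
          f"SIG={sig(demo):.1f}")
```

Because SIG is a difference of two sums, every measured input enters the result, which is why a small shift in one electrolyte between analyzers carries straight through to SIG.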

From the above observations, it is clear that an accurate analysis based on the Stewart-Figge methodology requires an accurate calculation of SIDapp and SIDeff, and that the SIG value depends heavily on the calculated SIDapp and SIDeff.

We calculated the SID using the serum sodium, potassium, chloride, and magnesium concentrations determined by the two analyzers. Lactate and ionized calcium were determined once because they were measured only by one biochemical analyzer. Normal ranges were based on previous studies using the Stewart method.9

We also compared these results with the traditional anion gap (AG), albumin corrected AG,10 and an abbreviated version of SIDapp.

Albumin corrected AG = AG + (0.25 × [44 − albumin]), where albumin is in g/L.
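The correction above is a one-line calculation. A minimal Python sketch, assuming the AG itself has already been computed elsewhere (the function and parameter names are hypothetical):

```python
def albumin_corrected_ag(ag: float, albumin_g_per_l: float) -> float:
    """Albumin corrected AG = AG + 0.25 x (44 - albumin), albumin in g/L."""
    return ag + 0.25 * (44.0 - albumin_g_per_l)


if __name__ == "__main__":
    # Hypothetical values: a low albumin raises the corrected AG.
    print(albumin_corrected_ag(ag=12.0, albumin_g_per_l=24.0))
```

With albumin at 44 g/L the correction term is zero, so the corrected and uncorrected AG coincide.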

Statistical Analysis

Statistical analysis was performed using “R” software (R: a language and environment for statistical computing; R Foundation for Statistical Computing, Vienna, Austria; ISBN 3-900051-07-0). We used a paired Student’s t-test for continuous data. Data are presented as mean ± SD. Agreement between the two analyzers was assessed using Bland-Altman analysis11 and the intraclass correlation coefficient for continuous data,12 and the kappa coefficient for categorical data.13 Pearson correlation coefficients were calculated, and the corresponding t values with n − 2 degrees of freedom were computed to test their significance. Fisher’s Z transform was used to obtain confidence intervals. A P value of <0.05 was considered statistically significant.
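The Bland-Altman statistics reported in the Results can be sketched as follows, assuming the usual definitions: the bias is the mean of the paired differences, and the limits of agreement are the bias ± 1.96 times the standard deviation of those differences. The analysis itself was run in R; Python is used here only for illustration, and the function name and paired values are hypothetical, not study data.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical SIDapp values (mEq/L) from two analyzers.
    analyzer_a = [40.0, 42.0, 38.0, 41.0]
    analyzer_b = [39.0, 40.0, 39.0, 38.0]
    bias, lower, upper = bland_altman(analyzer_a, analyzer_b)
    print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```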

Results

Measured and calculated mean values for each analyzer, bias, limits of agreements, and extreme values of the differences between analyzers are shown in Table 1.

Table 1: Mean Values, Bias, Limits of Agreement (LOA), and Ranges of Data for the Modular and LX20 Analyzers

Figures 1–3 represent the Bland and Altman diagrams for SIDapp, SIDeff, and SIG, respectively. Figure 4 shows the linear regression for SIG.

Figure 1: Bland and Altman diagram for SIDapp.
Figure 2: Bland and Altman diagram for SIDeff.
Figure 3: Bland and Altman diagram for SIG.
Figure 4: Linear regression for SIG.

Table 2 shows the agreement of the two analyzers according to SIDapp values (low, normal, and high). The kappa coefficient for SIDapp was weak (0.50, 95% confidence interval [CI]: 0.37–0.61), compared with 0.67 (95% CI: 0.56–0.77) for chloride. Thirty-one percent of the data points fell into a different category when measured with the other analyzer. The intraclass correlation coefficient was 0.74 for SIDapp (95% CI: 0.62–0.81), 0.86 for SIDeff (95% CI: 0.81–0.90), 0.33 for SIG (95% CI: 0.19–0.47), 0.38 for AG (95% CI: 0.25–0.49), and 0.35 for albumin corrected AG (95% CI: 0.22–0.48). The more variables in the formula, the weaker the correlation between the two analyzers (r2 = 0.77, 95% CI: 0.43–0.92, P < 0.001).
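For readers unfamiliar with the kappa coefficient, the sketch below computes Cohen's kappa from a square cross-classification table such as the low/normal/high grid used for SIDapp. The counts are invented for illustration and are not the Table 2 data; the function name is hypothetical.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square table: rows = analyzer A, columns = analyzer B."""
    size = len(table)
    total = sum(sum(row) for row in table)
    # Proportion of samples that both analyzers place in the same category.
    observed = sum(table[i][i] for i in range(size)) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(size)) / total**2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented counts for categories (low, normal, high).
    counts = [
        [30, 10, 0],
        [8, 40, 6],
        [0, 5, 21],
    ]
    print(f"kappa = {cohen_kappa(counts):.2f}")
```

Kappa discounts the agreement expected by chance, so it can be modest even when most samples fall on the diagonal.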

Table 2: Apparent Strong Ion Difference (SIDapp) Distribution Between Analyzers (P < 0.01)

Table 3 shows examples of measured and calculated data for one patient.

Table 3: Example of Stewart Parameters for One Patient from the Two Analyzers

Discussion

Stewart has shown that the pH of a physiological solution is a function of Pco2, SID, and Atot. However, there is a problem with measuring SID and Atot in patients because not all strong ions and weak acids are part of a regular biochemical screen. Moreover, our study shows that two currently used central laboratory biochemical analyzers can give quite different results for the calculated Stewart variables.

Electrolyte Measurement

Even when biochemical analyzers are accurate, there may be some variability in the results,14 and clinicians must be aware of the magnitude of this variability to be able to interpret the results. If the variability in the results is small, the results can be considered as reliable. Even if a true quantity value is considered unique,15 there is always some variability in measurement in biology. In our study, the mean differences between the two central laboratory analyzers for sodium and chloride concentrations were statistically significant but not clinically relevant (Table 1). From a biological standpoint, the differences are unimportant and remain within the recommendations of quality requirement (<1% for sodium and 1.6% for chloride).16 Hence, for these measurements, the analyzers could be considered reliable and apparently interchangeable. However, hyperchloremia is largely responsible for metabolic acidosis17,18 and, in our study, although 89 patients had hyperchloremia (>105 mmol/L) with both analyzers, 26 patients (22% of all patients with hyperchloremia) had hyperchloremia with one analyzer but not with the other. This shows that the reproducibility for chloride measurement between the analyzers was in fact limited. Morimatsu et al.5 reported similar findings and concluded that the small differences in sodium and chloride concentrations affected the calculated SIDapp values and could lead clinicians to make different assessments of acid-base status in the same patient. The possible explanations these authors gave for the differences in results were differences in the time between sampling and analysis, different sample preparations, and different samples, because whole blood was analyzed with a blood gas analyzer and plasma using multichannel technology. 
In our study, serum was analyzed using the same volume from the same sample at the same time and under the same conditions after centrifugation, eliminating the influence of different volumes from different samples. In addition, aliquoting prevents errors from secondary hemolysis, and indeed no interference due to hemolysis was observed, as confirmed by the very good correlation for potassium between the two analyzers. Electrolyte concentrations can differ significantly between serum and plasma specimens, with potassium and phosphate displaying the largest differences at 8% and 7%, respectively.19 In addition, blood collection tubes from different manufacturers may contain different amounts and/or formulations of heparin that may affect electrolyte results. Sample volumes can also influence electrolyte concentrations, and Carraro and Plebani20 reported that 13% of laboratory errors were due to tube filling errors and 8.1% to inappropriate containers. Such problems introduce preanalytic measurement errors. In our study, serum tubes did not contain heparin, and blood gas analysis using heparin was performed once. The tubes were all the same, and the same serum with the same volume from each tube was analyzed by both analyzers.

Differences in SIDapp, SIDeff, and SIG

Metabolic acid-base disorders are classically described by the plasma concentration of bicarbonate and Paco2. Stewart et al.1,2,9 proposed that acid-base status should rather be described by three variables: Paco2, SIDapp, and weak acids. Among the factors included in the calculation of SIDapp, chloride and sodium variations are the most important.1,2,5,9,21 In our study, a small difference in electrolyte measurements influenced the SIDapp, SIDeff, and SIG (large difference in limits of agreement: 9.8, 6.5, and 11.8 mEq/L, respectively). Other studies have also reported large limits of agreement of estimated SIDapp (−3.4 to +9.5 mmol/L)5 and SIDeff (−4.85 to +4.71 mmol/L).4 A reliable measure of SIDapp is thus difficult to obtain in plasma.22 Furthermore, our study showed a weak correlation between the SIDapp obtained from the two analyzers (r2 = 0.54). Moreover, if the normal ranges of SIDapp defined by Fencl et al.9 are considered, the distribution of SIDapp values among low, normal, and high was significantly different between the two analyzers (Table 2). In contrast, there was a good correlation for SIDeff between the analyzers.

Because SIDapp and SIDeff are used to calculate the SIG, changes in SIDapp or SIDeff will also influence SIG. We found wide limits of agreement for SIG (−5.1 to +6.6 mEq/L) and a weak correlation between the analyzers (r2 = 0.12). Many methods have been used for calculation of SIG, and large differences in values have been reported, from close to zero7,23 to 3 mEq/L,24,25 8 mEq/L,9 10 mEq/L,26 or even 11 mEq/L.27 This has led investigators to reach different conclusions about the prognostic significance of the Stewart method.27–34

In previous studies,25,35 significant clinical variations were considered to be 3.96 mEq/L for SIDapp, 3.78 mEq/L for SIDeff, and 3.1 mEq/L for SIG. Therefore, the limits of agreement required to avoid an impact on clinical decisions should be below these values. In our study, the limits of agreement for SIDapp, SIDeff, and SIG between the analyzers were larger than these values.
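The comparison in this paragraph can be laid out explicitly. The sketch below takes the limits-of-agreement spans quoted earlier in this section (9.8, 6.5, and 11.8 mEq/L) and the clinically significant variations quoted above, and checks each pair. Comparing the full span against the threshold is one reading of the text and is an assumption; the variable names are hypothetical.

```python
# Span of the limits of agreement between analyzers (mEq/L), from the text.
loa_span = {"SIDapp": 9.8, "SIDeff": 6.5, "SIG": 11.8}

# Variation regarded as clinically significant in previous studies (mEq/L).
clinical_threshold = {"SIDapp": 3.96, "SIDeff": 3.78, "SIG": 3.1}

# True where analyzer disagreement exceeds the clinically significant variation.
exceeds = {name: loa_span[name] > clinical_threshold[name] for name in loa_span}

if __name__ == "__main__":
    for name in loa_span:
        ratio = loa_span[name] / clinical_threshold[name]
        print(f"{name}: span {loa_span[name]} vs threshold "
              f"{clinical_threshold[name]} (ratio {ratio:.1f}), "
              f"exceeds={exceeds[name]}")
```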

This variability in the SIDapp, SIDeff, and SIG results between two central laboratory analyzers or between studies can be explained by the combination of errors in each measurement. SIDapp, SIDeff, and SIG are calculated and not measured. The general relationship between the standard uncertainty U of a calculated parameter (e.g., SIDapp) and the uncertainty of each independent parameter on which it depends (for SIDapp: UNa, UK, UCl, UMg, UCa, Ulactate) can be calculated as the square root of the sum of the square of the uncertainties of each parameter. This result has to be multiplied by two to obtain a 95% level of confidence.14,36,37

Because each additional parameter adds a non-negative term to the sum of squares, and the square root is an increasing function, the expanded uncertainty of a calculated parameter will increase with the number of independent parameters on which it depends, as shown by our results. This can explain the poor reproducibility between the analyzers for calculated values influenced by several variables.
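The propagation rule described above (square root of the sum of squared uncertainties, multiplied by two for roughly 95% confidence) can be sketched in a few lines of Python. The per-analyte standard uncertainties below are invented placeholders, not values from this study, and the function names are hypothetical.

```python
import math


def combined_uncertainty(components):
    """Root sum of squares of independent standard uncertainties."""
    return math.sqrt(sum(u * u for u in components))


def expanded_uncertainty(components, coverage_factor=2.0):
    """Combined uncertainty times a coverage factor (2 for about 95%)."""
    return coverage_factor * combined_uncertainty(components)


if __name__ == "__main__":
    # Invented standard uncertainties in mEq/L, for illustration only.
    u = {"Na": 1.0, "K": 0.1, "Cl": 1.0, "Mg": 0.05, "Ca": 0.05, "lactate": 0.1}

    two_terms = expanded_uncertainty([u["Na"], u["Cl"]])
    all_terms = expanded_uncertainty(list(u.values()))
    print(f"Na and Cl only: {two_terms:.2f} mEq/L")
    print(f"all six SIDapp terms: {all_terms:.2f} mEq/L")
```

Adding a term can never reduce the result, which is the sense in which more inclusive formulas such as SIG carry more cumulative error than shorter ones.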

Quantification of the variability in SIDapp, SIDeff, and SIG can help determine how much this variability will affect the result. Without knowledge of the degree of variability, we may mistakenly interpret the significance of variations when making a diagnosis or in the assessment of the evolution of the pathophysiological state of a patient. For example, the patient shown in Table 3 had a base excess of −2.2 mEq/L. Results from the LX20 gave a low SIDapp, a low SIDeff, and a moderately high SIG. We can interpret these results as hyperchloremic acidosis with a moderate increase in unmeasured anions partially compensated by a hypoalbuminemic alkalosis. This could be secondary to excessive administration of saline solutions, and giving further saline fluids should be avoided. Modular results show high SIDapp and SIG and low SIDeff, reflecting a metabolic acidosis due to a large increase in unmeasured anions partially compensated by hypoalbuminemic alkalosis. In this case, there would be no contraindication to saline solutions. Thus, the therapeutic implications can be very different depending on the results from the analyzer. We can extrapolate these findings to suggest that the results of the Stewart parameters, and hence interpretation of a patient’s acid-base status, may depend on the hospital where the patient is treated or the central laboratory analyzer that the hospital uses.

Variability may be the result of consistent differences between the analyzers and perhaps the accumulation of small errors in multiple measurements. In a future study, it would be interesting to analyze whether a similar variability in Stewart parameters would have occurred from multiple measurements of each variable from the same analyzer.

Our results raise an important dilemma when using the Stewart method. We should be cautious with the use of the SIG because, even though it is more inclusive, it can accumulate measurement errors. The reproducibilities of more commonly used parameters, AG and albumin corrected AG, between the two analyzers are also surprisingly weak. The AG is less inclusive but has less cumulative error than SIG and, importantly, remains easier to calculate than SIG. From our results, use of SIG or albumin corrected AG is not entirely satisfactory. Classical AG is a better alternative; an abbreviated version of SIDapp (Na − Cl − lactate) may be better (r2 = 0.63, bias not significant), but its role in clinical practice remains to be determined. To be useful in daily practice, an “ideal” parameter should follow pathophysiological principles, be as precise as possible, be easy to obtain near the bedside, and be reproducible and reliable. The Henderson-Hasselbalch approach is understandable from pathophysiological principles, is easy to perform at the bedside with only an arterial blood sample, and is reliable. Its disadvantage is that it does not allow for the important role of electrolytes and albumin in acid-base disorders. The Stewart approach, in contrast, is very difficult to understand and needs complex calculations that cannot be performed at the bedside, and our study shows that it is not reliable. Its theoretical advantage is that it considers the role of electrolytes in acid-base interpretation. Neither approach can identify unmeasured anions in acid-base disorders. Hence, there is no ideal approach for determination of acid-base balance, but we prefer the Henderson-Hasselbalch approach in our daily practice because it is more practical and reproducible than the Stewart approach.

Our observations demonstrate that even small analytical differences can become clinically significant when they are amplified by mathematical calculation. This reminds clinicians that they need to work with their laboratories to ensure that the laboratory methods and assays used have a reasonable degree of similarity to the analytical methods used in the clinical studies from which clinical guidelines were developed. In fact, many clinical guidelines were developed during clinical trials in which outdated instrumentation and/or nonstandardized assays were used, leaving the generalizability of such results in question.


In conclusion, although the Stewart approach is supported by some, it is relatively complex and our observations question the precision of the measurements. If using the Stewart approach to interpret acid-base disorders, the potential variability of the measured parameters, particularly for calculated values, must be considered, because these variations may lead to different interpretations of clinical situations and possibly to different therapeutic management. These potential differences must also be considered when comparing studies from different hospitals.

References

1. Stewart PA. How to understand acid-base: a quantitative acid-base primer for biology and medicine. New York: Elsevier, 1981
2. Stewart PA. Modern quantitative acid-base chemistry. Can J Physiol Pharmacol 1983;61:1444–61
3. Figge J, Mydosh T, Fencl V. Serum proteins and acid-base equilibria: a follow-up. J Lab Clin Med 1992;120:713–9
4. Figge J, Rossing TH, Fencl V. The role of serum proteins in acid-base equilibria. J Lab Clin Med 1991;117:453–67
5. Morimatsu H, Rocktaschel J, Bellomo R, Uchino S, Goldsmith D, Gutteridge G. Comparison of point-of-care versus central laboratory measurement of electrolyte concentrations on calculations of the anion gap and the strong ion difference. Anesthesiology 2003;98:1077–84
6. Fencl V, Leith DE. Stewart’s quantitative acid-base chemistry: applications in biology and medicine. Respir Physiol 1993;91:1–16
7. Kellum J, Kramer D, Pinsky M. Strong ion gap: a methodology for exploring unexplained anions. J Crit Care 1995;10:51–5
8. Kellum JA. Acid-base disorders and strong ion gap. Contrib Nephrol 2007;156:158–66
9. Fencl V, Jabor A, Kazda A, Figge J. Diagnosis of metabolic acid-base disturbances in critically ill patients. Am J Respir Crit Care Med 2000;162:2246–51
10. Figge J, Jabor A, Kazda A, Fencl V. Anion gap and hypoalbuminemia. Crit Care Med 1998;26:1807–10
11. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307–10
12. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;86:420–8
13. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas 1960;20:37–46
14. Giroud C, Dumontet M, Vassault A, Braconnier F, Ferard G. [Recommendations for expressing uncertainty of measurement of quantitative results in laboratory medicine]. Ann Biol Clin (Paris) 2007;65:185–200
15. Working Group 2 of the JCGM. International vocabulary of metrology—basic and general concepts and associated terms (VIM). 3rd ed. Geneva: International Organization for Standardization, 2006
16. Vassault A, Grafmeyer D, de Graeve J, Cohen R, Beaudonnet A, Bienvenu J. [Quality specifications and allowable standards for validation of methods used in clinical biochemistry]. Ann Biol Clin (Paris) 1999;57:685–95
17. Gunnerson KJ, Saul M, He S, Kellum JA. Lactate versus non-lactate metabolic acidosis: a retrospective outcome evaluation of critically ill patients. Crit Care 2006;10:R22
18. Story DA, Morimatsu H, Bellomo R. Hyperchloremic acidosis in the critically ill: one of the strong-ion acidoses? Anesth Analg 2006;103:144–8
19. Ladenson JH, Tsai LM, Michael JM, Kessler G, Joist JH. Serum versus heparinized plasma for eighteen common chemistry tests: is serum the appropriate specimen? Am J Clin Pathol 1974;62:545–52
20. Carraro P, Plebani M. Errors in a stat laboratory: types and frequencies 10 years later. Clin Chem 2007;53:1338–42
21. Zander R, Lang W. Base excess and strong ion difference: clinical limitations related to inaccuracy. Anesthesiology 2004;100:459–60
22. Staempfli HR, Constable PD. Experimental determination of net protein charge and A(tot) and K(a) of nonvolatile buffers in human plasma. J Appl Physiol 2003;95:620–30
23. Wilkes P. Hypoproteinemia, strong-ion difference, and acid-base status in critically ill patients. J Appl Physiol 1998;84:1740–8
24. Hayhoe M, Bellomo R, Liu G, Kellum JA, McNicol L, Buxton B. Role of the splanchnic circulation in acid-base balance during cardiopulmonary bypass. Crit Care Med 1999;27:2671–7
25. Liskaser FJ, Bellomo R, Hayhoe M, Story D, Poustie S, Smith B, Letis A, Bennett M. Role of pump prime in the etiology and pathogenesis of cardiopulmonary bypass-associated acidosis. Anesthesiology 2000;93:1170–3
26. Dubin A, Menises MM, Masevicius FD, Moseinco MC, Kutscherauer DO, Ventrice E, Laffaire E, Estenssoro E. Comparison of three different methods of evaluation of metabolic acid-base disorders. Crit Care Med 2007;35:1264–70
27. Cusack RJ, Rhodes A, Lochhead P, Jordan B, Perry S, Ball JA, Grounds RM, Bennett ED. The strong ion gap does not have prognostic value in critically ill patients in a mixed medical/surgical adult ICU. Intensive Care Med 2002;28:864–9
28. Balasubramanyan N, Havens PL, Hoffman GM. Unmeasured anions identified by the Fencl-Stewart method predict mortality better than base excess, anion gap, and lactate in patients in the pediatric intensive care unit. Crit Care Med 1999;27:1577–81
29. Carreira F, Anderson RJ. Assessing metabolic acidosis in the intensive care unit: does the method make a difference? Crit Care Med 2004;32:1227–8
30. Cuhaci B. Unmeasured anions and mortality in the critically ill: the chicken or the egg? Crit Care Med 2003;31:2244–5
31. Dondorp AM, Chau TT, Phu NH, Mai NT, Loc PP, Chuong LV, Sinh DX, Taylor A, Hien TT, White NJ, Day NP. Unidentified acids of strong prognostic significance in severe malaria. Crit Care Med 2004;32:1683–8
32. Hatherill M, Waggie Z, Purves L, Reynolds L, Argent A. Mortality and the nature of metabolic acidosis in children with shock. Intensive Care Med 2003;29:286–91
33. Kaplan LJ, Kellum JA. Initial pH, base deficit, lactate, anion gap, strong ion difference, and strong ion gap predict outcome from major vascular injury. Crit Care Med 2004;32:1120–4
34. Kellum JA. Clinical review: reunification of acid-base physiology. Crit Care 2005;9:500–7
35. Rehm M, Finsterer U. Treating intraoperative hyperchloremic acidosis with sodium bicarbonate or tris-hydroxymethyl aminomethane: a randomized prospective study. Anesth Analg 2003;96:1201–8
36. Ellison SLR, Rosslein M, Williams A. EURACHEM/CITAC Guide CG4: Quantifying uncertainty in analytical measurement. Middlesex: EURACHEM/CITAC, 2000
37. Guide to the expression of uncertainty in measurement. Geneva: International Standardization Organization, 1993
© 2009 International Anesthesia Research Society