Barker, Steven J. PhD, MD
Pulse oximetry is a standard of care for every patient in the operating room and is rapidly becoming a standard in all critical care settings. Patient motion can cause a variety of errors in pulse oximeters, including reduced accuracy, loss of signal, false desaturation alarms, and missed hypoxemic events. Patients who are hypoxic may be agitated and moving violently. Thus, a pulse oximeter that fails during motion will not provide saturation data on patients at the greatest risk of hypoxemia.
There have been several laboratory and clinical studies of pulse oximeter failure. An early retrospective study in 1991 found an overall “failure rate” in the operating room of approximately 1% (1). A later prospective study of 9578 recovery room patients found that patient motion was responsible for 56% of the 106 cases in which pulse oximetry was “completely abandoned” (2). This study also showed that pulse oximeters fail more often in sicker patients, with a 7.8% failure rate in ASA IV patients.
Patient motion not only causes a loss of current Spo2 data, but it can also create false alarms, that is, displayed warnings of hypoxemia when the patient is well. A study in the pediatric critical care unit found that 71% of all pulse oximeter alarms were false (3). This high false-positive rate encourages nurses and other care providers to manually disable alarms, thereby risking failure to detect actual sudden hypoxemia.
Motion artifact thus creates both false-positive (false alarm) and false-negative (missed hypoxemia) errors. Changing alarm thresholds to reduce one of these errors will increase the other. In an early volunteer study of motion artifact, Langton and Hanning (4) found that hand vibration caused both false-alarm and missed hypoxemia errors in four 1990-era oximeters. A similar study showed that voluntary hand movements also produced false desaturation alarms and erroneous pulse rate readings (5). Voluntary movements are prone to between-subject variability of the motions, which can affect comparisons of different instruments.
In the 1990s, pulse oximeter manufacturers began making design improvements specifically aimed at reducing motion artifact. As varied approaches to motion artifact were developed and marketed, clinicians needed answers to the following questions: 1) Were these new signal-processing algorithms really better than the old technology? 2) Which technology worked best under the conditions of a specific clinical practice?
A number of studies have attempted to answer these questions in motion-resistant pulse oximeters. Two laboratory studies found that the Nellcor N-3000 Oxismart® produced fewer false alarms and dropouts during motion in volunteers breathing room air than did the older N-200 (6,7). However, neither of these studies investigated the performance of the instruments during actual hypoxemia.
In 1996, this author performed a laboratory study using 10 healthy volunteers who underwent transient hypoxemia while their hands were being moved by an automated motion table (8). The effects of random and periodic hand motions on oximeter accuracy were measured during room air breathing and during both slow and rapid arterial desaturations to an Sao2 value of 70%. This study compared the older Nellcor N-200 with two motion-resistant instruments: the Nellcor N-3000 and the Masimo SET® prototype. The results showed that the Masimo outperformed the other two instruments in terms of accuracy, rejection of false alarms, and continued performance during motions.
Since the time of that study, oximeter manufacturers have made further advances in technology. Several newer instruments are being marketed with claims of accuracy during motion. It is therefore appropriate to conduct a new study, comparing the more recent versions of motion-resistant oximeters during motion and hypoxemia. That is the purpose of this study.
Seventy healthy volunteers participated in this study, with informed consent and approval by the Human Subjects Review Committee. Each subject was monitored with six oximeter sensors: three on digits 2, 3, and 4 of the moving “test” hand and three of the same make and model on the digits of the stationary “control” hand. Sensor assignments to the digits were rotated among the subjects. All sensors were of the disposable tape-on variety, to minimize sensor displacement during hand motion. Twenty different instruments were tested, as listed in Table 1.
The test hand was strapped to a motorized motion table, which produced repeatable, continuous hand motions (Fig. 1). This motion table, which was similar to that used in our previous study (8), moved the hand up and down while the elbow remained fixed. It could be configured so that the fingertips either tapped or rubbed on a smooth surface. The amplitude of the motion was ±2 cm, and the frequency was either fixed or randomly varied (aperiodic) within the range of 1 to 4 Hz. Our previous study found that these motions could cause any pulse oximeter to fail at least occasionally. On the basis of observations in the recovery room and intensive care unit (ICU), we believe that these motions are similar to movements of actual patients.
The peripheral perfusion of healthy, awake volunteers is clearly not that of typical surgical patients. To partially compensate for this fact, we maintained our laboratory temperature in the 16°C–18°C range and kept the subjects’ arms exposed. Skin temperature was monitored by a thermocouple on the fifth digit and was in the range of 19°C–26°C. Values of the perfusion index measured by the Masimo SET pulse oximeter were similar in our cooled subjects and actual patients in the recovery room and ICU.
Each subject underwent the following protocol. The oximeter sensors were applied as described previously and connected to their respective instruments. The oximeters were connected via serial data ports to a computerized data logger that recorded all pulse and saturation values once per second. After recording room air control values with both hands stationary, the motion table was activated, and 2 min of data were recorded for each of 2 motions: 1) fingers tapping at 3 Hz or at a frequency that varied randomly between 1 and 3 Hz and 2) fingers rubbing at the same frequencies. The frequencies of the tapping and rubbing motions were alternated with each subject. Once the two motions were completed and all Spo2 values returned to baseline, the sensors were moved to different test fingers and the series was repeated twice, so that all three test digits were monitored with each test oximeter.
In the next series of tests on each subject, a hypoxemic episode occurred during each motion period. For this purpose, the subjects breathed gases from a modified anesthesia machine via a tight-fitting mask and circle system. Inspired gas was a blend of nitrogen and oxygen that could be adjusted to any desired fraction of inspired oxygen (Fio2). Inspired and expired oxygen and carbon dioxide were monitored by a Datex Capnomac-Ultima™ gas analyzer. After the beginning of each motion, the Fio2 was adjusted to produce a rapid decrease in arterial saturation to a level of approximately 70%. As this value was reached, Fio2 was abruptly changed to 1.0 until the control hand Spo2 value had returned to baseline, at which time the motion was terminated. Each rapid desaturation-resaturation experiment was completed within 3 min. Continuous verbal contact was maintained with the subjects, who were instructed to remove the breathing mask if they experienced unpleasant symptoms. All were able to complete the protocol without difficulty.
The protocol during hypoxemia included the additional feature of disconnecting and reconnecting all test sensors from their respective instruments after the motion had begun. This disconnect-reconnect (DC/RC) experiment simulates the actual clinical scenario of placing an oximeter sensor on a patient who is already moving when the sensor is applied. The instrument should be able to acquire and process the light absorbance signal in real time while motion is occurring. Two DC/RC experiments were performed in a series of four motions with each subject. The full hypoxemic series thus included 1) nonmotion hypoxemia to assess differences in instrument, limb, and finger response times; 2) random tapping motion with DC/RC at start of hypoxemia; 3) 3-Hz tapping motion with DC/RC at start of hypoxemia; 4) 3-Hz tapping without DC/RC during hypoxemia; and 5) random rubbing without DC/RC during hypoxemia. This entire series of events was performed once per subject.
Our experiment was a methods-comparison study, in which two methods of simultaneously measuring the same variable are compared. The two methods compared here were 1) Spo2 measurements on the moving test hand and 2) simultaneous “gold standard” Spo2 measurements on the stationary control hand. Altman and Bland (9) have provided guidelines for analyzing the data resulting from such studies. Following their recommendations, we compared the test and control pulse oximeter measurements using the bias (mean difference between the two methods) and precision (SD of the difference). To calculate bias and precision, it is crucial to compare simultaneous values of the test and “gold standard” instruments. Because of the non-steady-state nature of our hypoxemia tests, differences in circulation time can produce a time axis shift in the Spo2 versus time curves of the two hands. Therefore, we time-shifted the Spo2 versus time curve for each oximeter to provide the best fit with the corresponding control-hand curve. This time shift was always <10 s, did not benefit any particular instrument, and affected no calculated values other than the bias and precision.
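The bias, precision, and time-shift calculations described above can be illustrated with a minimal sketch. It is not the analysis code used in this study. The function names, the use of None to mark a dropout, the least-squares criterion for "best fit," and the search limit of ±9 samples (values were logged once per second, and the shift was always <10 s) are all assumptions made for illustration.

```python
from statistics import mean, stdev

def aligned_pairs(test, control, lag):
    """Pair control[i] with test[i + lag], skipping dropouts (None).

    A positive lag means the test trace runs behind the control trace.
    """
    return [
        (test[i + lag], control[i])
        for i in range(len(control))
        if 0 <= i + lag < len(test) and test[i + lag] is not None
    ]

def best_time_shift(test, control, max_shift=9):
    """Return the lag (in samples) giving the best fit to the control trace.

    Assumed fit criterion: smallest mean squared difference over the overlap.
    """
    best_lag, best_mse = 0, float("inf")
    for lag in range(-max_shift, max_shift + 1):
        pairs = aligned_pairs(test, control, lag)
        if len(pairs) < 2:
            continue
        mse = mean((t - c) ** 2 for t, c in pairs)
        if mse < best_mse:
            best_lag, best_mse = lag, mse
    return best_lag

def bias_and_precision(pairs):
    """Bias = mean of (test - control); precision = SD of those differences."""
    diffs = [t - c for t, c in pairs]
    return mean(diffs), stdev(diffs)

# Hypothetical one-sample-per-second traces: the test hand lags by 2 s.
control = [97, 97, 95, 90, 85, 80, 75, 70, 75, 85, 95, 97, 97, 97]
test = [97, 97] + control[:-2]

lag = best_time_shift(test, control)
bias, precision = bias_and_precision(aligned_pairs(test, control, lag))
print(lag, bias, precision)
```

In this made-up example the test trace is an exact delayed copy of the control trace, so after alignment both bias and precision are zero. Without the alignment, the circulation delay alone would appear as measurement error.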
Bias and precision are relevant only when the test instrument is displaying a current Spo2 value—they do not account for dropouts, or periods during which no data are displayed. Therefore, we calculated the dropout rate (DR), or percentage of measurement time during which no current Spo2 values are displayed. To combine measures of accuracy and reliability, we also calculated the performance index (PI), defined as the percentage of time during which the instrument displayed a current Spo2 value that was within 7% of the simultaneous control value. The rather generous error margin of 7% was chosen because Spo2 changes rapidly over a wide range in these experiments. A smaller error margin will reduce the PI of all instruments, and a larger margin will increase it. Changing the error window from 5% to 10% changes the PI values but does not change the ranked order of any of the instruments. The PI for pulse rate accuracy (PR-PI) was defined in the same manner as for Spo2, but with a window of 10%.
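As a rough illustration of these two definitions, the sketch below computes DR and PI from once-per-second samples. It makes several assumptions that the text does not settle: a dropout is represented by None, "within 7%" is read as 7 saturation percentage points (inclusive) for Spo2, and the 10% pulse-rate window is read as relative to the control pulse rate. The function names are hypothetical.

```python
def dropout_rate(test):
    """Percentage of samples in which no current value is displayed (None)."""
    return 100.0 * sum(v is None for v in test) / len(test)

def performance_index(test, control, window=7.0, relative=False):
    """Percentage of ALL samples with a displayed value close to control.

    relative=False: |test - control| <= window (absolute points; assumed
                    reading of the Spo2 window).
    relative=True:  |test - control| <= window% of the control value
                    (assumed reading of the pulse-rate window).
    Dropouts count against the index because they stay in the denominator.
    """
    good = 0
    for t, c in zip(test, control):
        if t is None:
            continue
        limit = window / 100.0 * c if relative else window
        if abs(t - c) <= limit:
            good += 1
    return 100.0 * good / len(control)

# Hypothetical samples: one dropout and one gross error out of five.
spo2_test = [97, None, 90, 80, 99]
spo2_control = [97, 96, 95, 90, 97]

print(dropout_rate(spo2_test))                                # 20.0
print(performance_index(spo2_test, spo2_control))             # 60.0
print(performance_index(spo2_test, spo2_control, window=10))  # 80.0
```

Widening the window from 7 to 10 points raises the PI in this example. This matches the statement above that the margin changes the PI values, although a five-sample example cannot say anything about instrument rankings.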
The oximeter’s ability to detect hypoxemic events is quantified by sensitivity and specificity. If we define hypoxemia as a control Spo2 value of <90%, then sensitivity is the percentage of time that the test oximeter reads <90% during actual hypoxemia (i.e., the probability of detecting the hypoxemia). Conversely, specificity is the percentage of time that the test oximeter reports an Spo2 value of 90% or more when the control value is also 90% or more (i.e., the probability of detecting normoxemia). We can increase the sensitivity of a particular instrument by simply increasing the threshold Spo2 value for the alarm condition. For example, we can set the hypoxemia alarm at 95% rather than 90%. Although this will increase sensitivity, it will decrease specificity, resulting in more false alarms. By varying the alarm threshold, we can generate a series of paired sensitivity and specificity values for the instrument. Plotting sensitivity versus (1 − specificity) by using these values generates the receiver operating characteristic (ROC) curve. We calculated this curve for each test instrument.
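The trade-off between sensitivity and specificity can be shown with a short sketch that sweeps the alarm threshold over paired samples. It is an illustration under stated assumptions, not the procedure used to produce the ROC curves in this study. Hypoxemia is taken as a control value <90 and normoxemia as its complement, samples with no displayed value (None) are excluded, results are returned as fractions rather than percentages, and all names and data are hypothetical.

```python
def sensitivity_specificity(test, control, alarm_threshold=90.0, hypoxemia_cutoff=90.0):
    """Return (sensitivity, specificity) as fractions for one alarm threshold.

    An alarm is a test reading below alarm_threshold; true hypoxemia is a
    control reading below hypoxemia_cutoff. Dropouts (None) are skipped,
    which is an assumption of this sketch.
    """
    tp = fn = tn = fp = 0
    for t, c in zip(test, control):
        if t is None:
            continue
        alarm = t < alarm_threshold
        hypoxemic = c < hypoxemia_cutoff
        if hypoxemic and alarm:
            tp += 1
        elif hypoxemic:
            fn += 1
        elif alarm:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity

def roc_points(test, control, thresholds):
    """Return sorted (1 - specificity, sensitivity) points, one per threshold."""
    points = []
    for threshold in thresholds:
        sens, spec = sensitivity_specificity(test, control, alarm_threshold=threshold)
        if sens is not None and spec is not None:
            points.append((1 - spec, sens))
    return sorted(points)

# Hypothetical paired samples spanning one desaturation.
control = [97, 95, 88, 80, 75, 85, 93, 97]
test = [97, 89, 92, 85, 76, 84, 94, 96]

print(sensitivity_specificity(test, control, alarm_threshold=90))  # (0.75, 0.75)
print(sensitivity_specificity(test, control, alarm_threshold=95))  # (1.0, 0.5)
print(roc_points(test, control, [90, 95]))
```

Raising the alarm threshold from 90 to 95 in this example detects every hypoxemic sample but doubles the false alarms. Each threshold contributes one point to the ROC curve.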
A typical plot of Spo2 versus time for motion during room air breathing is shown in Figure 2A, and a similar plot for motion during a hypoxemic episode is shown in Figure 2B. Table 1 shows composite values from all motion experiments of bias, precision, DR, saturation PI, PR-PI, sensitivity, and specificity. The instruments are listed in descending order of saturation PI. The newer-generation instruments, many of which are advertised as being motion resistant, are indicated by an asterisk. The older instruments are shown to provide a contrast with the performance of the newer technologies.
Figure 3 shows calculated ROC curves for the tested oximeters. The ideal instrument will have an ROC curve lying in the extreme upper left-hand quadrant of the graph. Detecting hypoxemia by flipping a coin will produce an ROC curve following the line of identity (dotted line; x = y). Some of the tested instruments yielded ROC curves below this line.
The Spo2 versus time plots of Figure 2 show important characteristics of motion artifact errors. In each figure, the “control” curve shows the actual Spo2 values for comparison with the various test-hand curves. Some instruments will display an immediate false decrease in Spo2 shortly after the motion starts (see the N-395 curve in Fig. 2A). This artifact will cause a false alarm to be displayed. During an actual hypoxemic event, such as that of Figure 2B, some instruments will “freeze” at a fixed Spo2 value during the motion (see the N-200 curve in Fig. 2B). This frozen Spo2 value may sometimes be low enough to trigger an alarm, but it will not alert the clinician to the true degree of hypoxemia, nor will it show useful trend information or response to treatment.
As shown in Table 1, we found a wide range of oximeter performance in our study, with PI values varying from 94% to 28%. The top 5 instruments in PI were the Masimo SET (V2) (94%), the Agilent Viridia 24C (rev. B.0) (84%), the Agilent CMS (rev. B) (80%), the Datex-Ohmeda 3740 (80%), and the Datex-Ohmeda 3800 (79%). The Nellcor N-395 (V1620) finished seventh, with a PI of 69%. Sensitivity to hypoxemia ranged from 98% (Masimo) to 28% (BCI 3304), and specificity ranged from 93% (Masimo) to 15% (Criticare 5040). The instruments with lower sensitivity and specificity values also had small DR, meaning that they continued to display an Spo2 value that was grossly in error during motion. The Masimo SET oximeter yielded the highest values of all of the calculated performance statistics (Table 1).
The high Spo2 PI values of the Datex-Ohmeda units (3740, 3800, and AS/3) deserve comment, because these instruments are not marketed as motion resistant. All oximeters were tested in their default modes—the settings that apply when the instrument is first turned on. The default time-averaging settings used by Datex-Ohmeda are 12 seconds for the 3740 and 3800 and 10 seconds for the AS/3, whereas most other instruments use 5- to 8-second averaging. This longer averaging time yields a high value of Spo2 PI, which is a time-averaged variable, but the price is paid in the instrument’s ability to detect rapidly occurring hypoxemia. Note that the sensitivity is rather low for the 3740 (68%) and 3800 (63%) and that the specificity is very low for the AS/3 (45%). Sensitivity measures the ability to detect true hypoxemia, whereas specificity measures the ability to reject false alarms.
The more recent pulse oximeters outperform the older models in this study in terms of both accuracy and reliability during motion. Manufacturers are making frequent improvements in both hardware and software as more clinical experience becomes available. All of the instruments we tested were the current versions during the time period of our experiments, which was from March 1999 to September 2000. Our results will not reflect improvements that have been made since that time. The largest manufacturers of pulse oximeters were represented, but there are a number of other manufacturers that were not included.
Another volunteer study of oximeter performance during motion, which is not yet published in the peer-reviewed literature, has yielded different results (10). This study, performed in the laboratory of an oximeter manufacturer, used intermittent, voluntary hand movements rather than the continuous, machine-generated motions of our study. Only three instruments, all recent versions, were compared in this study. In our experience, it is difficult to obtain consistency between subjects when using voluntary movements. Because we compared 20 different instruments in our study, we could not test all 20 on every volunteer. Therefore, we used every possible means to ensure consistency between subjects.
Our laboratory volunteer protocol, like any other, can be criticized on the grounds that it does not accurately represent the clinical setting and hence may favor one instrument over others. However, the few case reports and clinical studies published appear to support the results of our experiments. Bohnhorst et al. (11) studied hypoxemic events in the neonatal ICU. They found that the Nellcor Oxismart (N-3000) missed 5.4% of hypoxemia episodes and 69% of bradycardia episodes, whereas the Masimo SET missed 0.5% and 7%, respectively. Hay et al. (12) have very recently compared the Masimo SET with the Agilent Viridia 24C, the Nellcor N-395, and the Novametrix MARS in neonatal ICU patients. They found that the Masimo SET generated fewer false alarms while providing better detection of actual hypoxemia than all of the other instruments. Torres et al. (13) studied children undergoing surgical repair of congenital heart defects and found a failure rate of 40% for the Nellcor N-395, compared with 10% for the Masimo SET. The ultimate benchmark for laboratory studies of any patient monitor is the comparison with clinical data.
In summary, our volunteer data provide strong evidence that newer-generation pulse oximeters exhibit improved performance during patient motion. In particular, the Masimo SET appears to provide superior performance during motion, with substantially higher values of PI, sensitivity, and specificity. The Masimo processes the light absorbance signals by using new, unique algorithms running in parallel, as described elsewhere in the literature (8,14). The clinical implications of this performance improvement are significant. Because awake, hypoxic patients tend to be agitated and moving, pulse oximeters are more likely to be affected by motion artifact when the patient is in distress. Motion-resistant or read-through-motion oximeters, particularly the Masimo, will be more capable of displaying accurate Spo2 values in this setting, which will improve our ability to detect life-threatening hypoxemia.
1. Freund PR, Overand PT, Cooper J, et al. A prospective study of intraoperative pulse oximetry failure. J Clin Monit 1991; 7: 253–8.
2. Moller JT, Pedersen T, Rasmussen LS, et al. Randomized evaluation of pulse oximetry in 20,802 patients: I. Anesthesiology 1993; 78: 436–44.
3. Lawless ST. Crying wolf: false alarms in a pediatric intensive care unit. Crit Care Med 1994; 22: 981–5.
4. Langton JA, Hanning CD. Effect of motion artefact on pulse oximeters: evaluation of four instruments and finger probes. Br J Anaesth 1990; 65: 564–70.
5. Plummer JL, Zakaria AZ, Ilsley AH, et al. Evaluation of the influence of movement on saturation readings from pulse oximeters. Anaesthesia 1995; 50: 423–6.
6. Lie C, Kehlet H, Rosenberg J. Comparison of the Nellcor N-200 and N-3000 pulse oximeters during simulated postoperative activities. Anaesthesia 1997; 52: 450–2.
7. Plummer JL, Ilsley AH, Fronsko RR, Owen H. Identification of movement artefact by the Nellcor N-200 and N-3000 pulse oximeters. J Clin Monit 1997; 13: 109–13.
8. Barker SJ, Shah NK. The effects of motion on the performance of pulse oximeters in volunteers. Anesthesiology 1997; 86: 101–8.
9. Altman DG, Bland JM. Measurement in medicine: the analysis of method comparison studies. Statistician 1983; 32: 307–17.
10. Jopling MW, Mannheimer PD, Bebout DE. Sensitivity and specificity performance during motion artifact in three pulse oximeters designed for use in motion [abstract]. Anesthesiology 2000; 93 (3A):A585.
11. Bohnhorst B, Peter CS, Poets CF. Pulse oximeters’ reliability in detecting hypoxemia and bradycardia: comparison between a conventional and two new generation oximeters. Crit Care Med 2000; 28: 1565–8.
12. Hay WW, Rodden DJ, Collins SM, et al. Reliability of conventional and new pulse oximetry in neonatal patients. J Perinatol 2002; 22: 360–6.
13. Torres A, Skender K, Wohrley J, et al. Assessment of two new generation pulse oximeters during low perfusion in children [abstract]. Crit Care Med 2001; 29: A117.
14. Goldman JM, Petterson MT, Kopotic RJ, Barker SJ. Masimo signal extraction pulse oximetry. J Clin Monit 2000; 16: 475–83.