Multiple recent studies have cited racial bias in measurement of oxygen saturation (Spo2) using pulse oximetry.1,2 Some studies concluded that occult hypoxemia, frequently defined as an Spo2 value >88% to 92% when arterial co-oximetry measured oxygen saturation (Sao2) is <88%, is more common in self-identified Black people when compared with White people. In emergency and critical care decision making, this “hidden” hypoxemia can impact clinicians’ ability to appropriately treat patients, and may be associated with increased morbidity and mortality.2
Our research team consists of high-acuity care clinicians and biomedical engineers with advanced expertise in pulse oximetry. We have used high-fidelity pulse oximetry to monitor hospitalized general care patients for more than a decade to dramatically decrease failure to rescue events from serious but treatable complications.3 Given the potential widespread clinical implications of recent pulse oximetry studies, we reviewed the recent literature investigating disproportionately inaccurate Spo2 measurement in Black people and available technical literature citing potential mechanisms of error (Supplemental Table for references, https://links.lww.com/JPS/A546).
The review identified inconsistent results in the trend of pulse oximetry measurement error between racial groups. Eight studies found greater error in Black people than in White people, yet 2 studies found the greatest error in Asian people compared with both White people and Black people, and 1 found no significant differences in error between racial groups. The magnitude and impact of measurement error also varied across studies. Several studies noted that the differences across races would not be considered clinically relevant. Measurement differences in the studies were often within standards set by the Food and Drug Administration and, if less than 1% as reported in multiple studies, would not even be discernible on the device display. One study showed a larger error in self-identified Black people, although mortality for the White cohort was higher, suggesting that differences in pulse oximetry measurement did not affect outcomes.
We noted widespread study method limitations as possible contributors to these observations. Many of the studies reviewed lacked appropriate control of potential sources of error and variation, notably adequate synchronization of arterial and Spo2 measurements. Failure to identify or control for device manufacturer, lack of criterion standard calibration, small subgroup sample sizes, absence of controls for confounding patient characteristics, and documentation inaccuracies were also noted in recent studies. One common assertion across studies is the identification of skin pigmentation as the source of measurement error. In this case, the use of self-identified race to segment populations would also contribute to inaccurate findings. Race is a social construct and thus is a crude indicator of skin color at the measurement site, which is related to the combination of melanin, carotene, arterial blood, and venous blood, among others. When skin pigmentation has been quantified, studies have often used the overly simplistic single light-to-dark Fitzpatrick scale. The studies rarely mentioned or assessed other well-known sources of Spo2 measurement error and artifact that must be managed to obtain accurate Spo2 readings, including ambient light, motion, conditions causing venous pulsations in the tissue bed, electromagnetic interference, poor peripheral perfusion, sensor placement and application to the skin, optical shunting, colored dyes, nail polish, and other hemoglobins (carboxyhemoglobin and methemoglobin). The degree of success in managing many of these issues is device dependent and varies between manufacturers, despite compliance with regulatory standards, as was demonstrated in a recent follow-up investigation.4 One might expect that many of the factors described previously appear as random effects in analysis, with limited effect on results. However, choice of device manufacturer, cultural aspects of patient care, and policy and procedures for device use are often systemic and could result in significant variation contributing to the inconsistencies noted across study results.
Sufficient explanations for how skin pigmentation could lead to Spo2 measurement error were not provided in the retrospective studies reviewed. However, a recent and timely editorial by Bickler and Tremper5 summarizes how this technology works and offers possible mechanisms for pulse oximeter error and bias. They describe how calibration curves are built using human volunteers breathing controlled hypoxic mixtures of gas, and the relationship between Sao2 and Spo2 allows the construction of a calibration graph, which is integrated into each device. They explain how skin pigmentation could act as a variable light filter, which could alter the absorbance of red and infrared light transmitted through the tissue and produce the error observed. They also highlight that the red light–emitting diodes used in pulse oximeters use a distribution of wavelengths that are nonuniformly absorbed by pigments in the skin. The implication is that a single calibration curve, even when human subjects of all skin pigment types are included, may not adequately solve for this source of error. The variation in how successfully each manufacturer is managing this source of error is of critical importance.
We agree with the conclusion drawn by Okunlola et al6 that “Spo2 errors in people with darkly pigmented skin are real and need further study.” More exploration is clearly needed to characterize, understand, and address differences in pulse oximetry measurements and associated outcomes in patients with different skin pigmentation.
In the near term, we believe that clinical confidence could be improved if manufacturers provide access to calibration studies they conducted, showing how they included adequate samples of subjects across skin pigmentation levels. We strongly advocate for the undertaking of prospective studies, testing skin pigment and low perfusion hypotheses of the observed error using the methods considered “best practices” for calibration studies and criterion standard comparisons. We recommend use of better standards for assessing skin pigmentation levels such as colorimetric/spectrophotometric methods (e.g., diffuse reflectance spectroscopy7).
At a minimum, future studies should also account for the manufacturer device/sensor used and the site of monitoring, and apply synchronous sampling of Spo2 and Sao2, with Sao2 being measured directly using co-oximetry. Furthermore, the definition of occult hypoxemia should be standardized to represent error greater than Food and Drug Administration acceptable limits for device measurement error (e.g., Sao2 <88% when Spo2 >92%). Other study confounders should be controlled to the greatest extent possible (e.g., patient factors such as illness severity, nail polish, and peripheral artery disease, as well as variation in study site, education and device use procedures, and clinical data collection/validation). Analysis plans should include methods appropriate for device performance assessment against a criterion standard, such as Bland-Altman graphs.
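As an illustration of the analysis recommended above, the following minimal Python sketch computes the Bland-Altman bias and 95% limits of agreement for paired Spo2 and Sao2 readings, together with the proportion of pairs meeting the example occult hypoxemia definition (Sao2 <88% when Spo2 >92%). The function names, default thresholds, and sample readings are hypothetical and are not drawn from any study reviewed here. The sketch also assumes one synchronized pair per patient; repeated measurements within a patient would call for a modified method.

```python
from statistics import mean, stdev


def bland_altman(spo2, sao2):
    """Return (bias, lower limit, upper limit) for paired readings in percent.

    Bias is the mean of (Spo2 - Sao2); the limits of agreement are
    bias +/- 1.96 standard deviations of the paired differences.
    """
    if len(spo2) != len(sao2) or len(spo2) < 2:
        raise ValueError("need at least two paired readings")
    diffs = [p - a for p, a in zip(spo2, sao2)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def occult_hypoxemia_rate(spo2, sao2, sao2_below=88.0, spo2_above=92.0):
    """Proportion of pairs with Sao2 < sao2_below while Spo2 > spo2_above."""
    if len(spo2) != len(sao2) or not spo2:
        raise ValueError("need at least one paired reading")
    flagged = sum(
        1 for p, a in zip(spo2, sao2) if a < sao2_below and p > spo2_above
    )
    return flagged / len(spo2)


# Hypothetical synchronized readings (percent), one pair per patient.
spo2 = [97.0, 94.0, 93.0, 90.0, 96.0]
sao2 = [96.0, 92.0, 87.0, 89.0, 95.0]

bias, lower, upper = bland_altman(spo2, sao2)
rate = occult_hypoxemia_rate(spo2, sao2)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
print(f"occult hypoxemia rate={rate:.2f}")
```

Reporting bias and limits of agreement by device and by measured skin pigmentation, rather than a single pooled figure, would make results easier to compare across studies.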
Every medical device has performance and use limitations, and pulse oximeters are no exception. Therefore, we also recommend that clinicians be educated regarding pulse oximetry technology, performance specifications, measurement interpretation, and appropriate use in clinical settings. Understanding device features such as waveform display, signal-to-noise indicators, and perfusion indices can allow clinicians to quickly adjust confidence in the Spo2 values and provide better patient care.
The problem of skin pigmentation–associated error in pulse oximetry is serious and warrants the medical community’s immediate attention. However, we are concerned that the current trend in publication of retrospective analyses with limitations described previously could lead to underuse of continuous noninvasive pulse oximetry in settings in which it has been shown to be lifesaving, leading to significant negative patient care consequences. We urge the medical community to move forward with additional studies to characterize the issues at hand using scientific best practices to ensure the safety and well-being of all patients.
1. Sjoding MW, Dickson RP, Iwashyna TJ, et al. Racial bias in pulse oximetry measurement. N Engl J Med
2. Wong AKI, Charpignon M, Kim H, et al. Analysis of discrepancies between pulse oximetry and arterial oxygen saturation measurements by race and ethnicity and association with organ dysfunction and mortality. JAMA Netw Open
3. McGrath SP, McGovern KM, Perreard IM, et al. Inpatient respiratory arrest associated with sedative and analgesic medications: impact of continuous monitoring on patient mortality and severe morbidity. J Patient Saf
4. Barker SJ, Wilson WC. Racial effects on Masimo pulse oximetry: a laboratory study. J Clin Monit Comput. 2022;12:1–8. doi:10.1007/s10877-022-00927-w. PMID: 36370242; PMCID: PMC9652601.
5. Bickler P, Tremper KK. The pulse oximeter is amazing, but not perfect. Anesthesiology
6. Okunlola OE, Lipnick MS, Batchelder PB, et al. Pulse oximeter performance, racial inequity, and the work ahead. Respir Care
7. Gordon RA, Branigan AR, Khan MA, et al. Measuring skin color: consistency, comparability, and meaningfulness of rating scale scores and handheld device readings. J Surv Stat Methodol