Video-Based Physiologic Monitoring During an Acute Hypoxic Challenge: Heart Rate, Respiratory Rate, and Oxygen Saturation : Anesthesia & Analgesia

Critical Care and Resuscitation: Original Clinical Laboratory Report

Video-Based Physiologic Monitoring During an Acute Hypoxic Challenge: Heart Rate, Respiratory Rate, and Oxygen Saturation

Addison, Paul S. PhD*; Jacquel, Dominique PhD*; Foo, David M. H. PhD*; Antunes, André PhD*; Borg, Ulf R. MD, PhD

doi: 10.1213/ANE.0000000000001989

The ability to acquire physiologic signals remotely from the patient has the potential to lower cost, modify work flow, reduce cabling, and hence free up the patient. One approach to this challenge is to extract physiologic signals from video image streams.1 This approach requires several advanced data processing techniques: first to extract a high-quality video photoplethysmogram (video-PPG) using image recognition and denoising methods and then to process the signal to isolate the pertinent signal components (eg, cardiac pulse, respiratory modulations). These components are then further analyzed to determine the physiologic parameter of interest (eg, heart rate [HR], respiratory rate [RR], oxygen saturation, etc).

The extraction of HR and RR from video has been performed previously using a range of signal processing techniques with reasonably accurate results.2–12 Many of these studies, however, involve healthy volunteers, are of an ad hoc nature or short duration, and were performed under relatively benign patient conditions. Some researchers have focused on the more difficult challenge of extracting oxygen saturation information from the video-PPG. Kong et al,13 for example, used a specialized two-camera system, each with a separate narrow band filter (660 and 520 nm). They found good agreement between their video system and reference measurements with respect to HR and oxygen saturation data. However, all their reference saturation values were between 97% and 99%, and the values measured using video were within 3 percentage points of the reference. In patients undergoing hemodialysis, Tarassenko et al7 demonstrated a technique for the extraction of oxygen saturation trending from video images using only a standard visible light (red, green, and blue [RGB]) camera. Their method involved identifying a region of interest (ROI) on the subject together with a second off-subject region of the image, which was used to correct for aliased frequency components caused by strong light flicker. Absolute measurement of oxygen saturation is not possible using the RGB setup and requires calibration using a “ground truth” reference from a pulse oximeter attached to the patient. Subsequent to calibration, the Tarassenko group tracked oxygen saturation trends during a localized obstructive sleep apnea event from one of the patients. More recent work by the same group14 tracked oxygen saturation changes in 5 healthy volunteers during controlled hypoxia under protocolized conditions. The authors identified the known limitations of many previous studies in this field, which suffer from not examining physiologically/clinically relevant Spo2 ranges. 
Such narrow ranges are incompatible with the rigorous hypoxia protocol required for the thorough testing of such monitoring technologies through controlled desaturation events.15,16

In this study, we comprehensively tested the ability of a visible light (RGB) video camera to monitor three main vital signs: HR, RR, and arterial oxygen saturation. The study focused on video monitoring during physiologic changes occurring in a controlled porcine model of hypoxia. An algorithm was developed to extract video-PPG information from the image stream and process it to extract continuously reported HRs, RRs, and trends in oxygen saturation. The work extends current research in this area by (1) adhering to a standardized hypoxia protocol involving large, measurable changes in Spo2; (2) using a single ROI on the skin to determine each physiologic parameter of interest; and (3) striving for high uptimes (the percentage of time that each parameter could be calculated from the data) while retaining accuracy in the measurements.


Methods

Clinical Study

The data were collected opportunistically during a nonrelated study to measure the effect of various interventions on the reported values of another monitoring device. Part of this parallel study protocol involved two desaturation events conducted in close succession, and data from these events were collected for the video research work. The study also involved the acquisition of reference pulse oximetry data (HRp and Spo2) and the respiratory rate (RRp), which was preset on the ventilator (Narkomed 2B; Draeger Medical, Telford, PA). Approval was given for the use of video monitoring.

The study comprised 8 healthy pigs with a mean weight of 13.3 kg (standard deviation [SD] 1.6 kg) and mean age of 8.4 weeks (SD 0.5 weeks). Because we “piggybacked” onto a separate trial involving protocolized desaturation events within a porcine model, we did not prespecify a sample size. Each animal was ventilated with a tidal volume of 6 to 8 mL/kg, Fio2 was adjusted to maintain 95% arterial saturation, and positive end-expiratory pressure was set to 5 cmH2O. Respiratory rate was adjusted to maintain end-tidal CO2 between 38 and 45 mm Hg. Each animal underwent 2 discrete episodes of acute hypoxia (resulting in 16 distinct desaturation episodes in total). The animal was first ventilated with baseline ventilator settings with Fio2 set to 1.0 if not already there. The Fio2 was then rapidly decreased to 0.1 to create acute hypoxia using a mixture of nitrogen and oxygen. A target Spo2 of 40% ± 10% was reached and maintained for a minimum of 2 minutes. (This target and its limits were given as instructions to the laboratory staff conducting the study, as per the protocol.) Thereafter, the Fio2 was rapidly increased to 1.0.

The protocol was reviewed and approved by the Pre-Clinical Research Services Animal Care and Use Committee. The study was conducted in a good laboratory practice-like fashion in accordance with 21 Code of Federal Regulations Part 58 at an Association for Assessment and Accreditation of Laboratory Animal Care-accredited site. The following standards in terms of appropriate use of animals for biomedical research and/or training were adhered to: the US Animal Welfare Act amendment of 1976 (Title 9, Code of Federal Regulations, Chapter 1, Subchapter A, parts 1, 2, and 3) and the current US National Institutes of Health’s Guide for the Care and Use of Laboratory Animals published by the National Research Council. Animals were anesthetized with propofol, fentanyl, and isoflurane and paralyzed with atracurium.

Data Acquisition and Analysis

A commercially available visible light video camera (Panasonic HDC-TM10 High Definition Video Camera) was used to acquire the video information. Video footage was acquired during each of the two desaturation events per animal. The camera was placed on a tripod, aimed at the face of the animal, and located approximately 1 m from the snout of the animal. Because the room had no windows, all illumination came from the fluorescent room lights and operating room (OR) lights above the table. A schematic of the laboratory setup is shown in Figure 1. In all cases, the first desaturation was conducted with the room and OR lights on and the second desaturation conducted with only the OR lights switched on. A Nellcor OxiMax Max-A sensor (Medtronic, Boulder, CO) was applied to the lip of the animal to provide reference HR and oxygen saturation signals. Reference respiratory rates were determined from known ventilator settings on the anesthesia machine. An ROI around the snout, free from hair, was targeted, where the hypoxia-induced skin coloration changes caused by the desaturations could be visible. The ROIs for the HR and breathing computations comprised between 33,216 and 205,088 pixels (ie, between 1.6% and 9.9% of the available 1920 × 1080 image). The ROIs for the SvidO2 calculation comprised between 38,340 and 183,300 pixels (ie, between 1.8% and 8.8% of the available image). We note that the particular choice of ROI can strongly affect the nature of the signal, as demonstrated in Figure 2 and addressed in the Discussion.

Figure 1.:
Schematic of laboratory setup. The camera is placed approximately 1 m away from the animal’s snout.
Figure 2.:
Distinct change to signal morphology of the video photoplethysmogram resulting from movement of the ROI. A, ROI (shown as a box in the superimposed zoomed in image of the nostril) providing a signal with distinct respiratory modulations. A 60-second segment of signal collected using this ROI is shown in the figure. B, ROI producing a signal exhibiting respiratory modulation suppression and cardiac component enhancement. ROI indicates region of interest.

The video image stream was captured at 25 frames per second, and the RGB signals were extracted from the selected ROI. Signals were processed using a fast Fourier transform applied over a sliding temporal window.17 The window was optimized for the two different physiologic signals: to a length of 22 seconds for the pulse component and 48 seconds for the respiratory component. The HR and RR were computed automatically from the resulting frequency spectra by an algorithm, which determined the physiologically relevant local peaks in each spectrum. Unlike Villarroel et al,5 we did not use a separate off-region ROI to cancel out aliased frequencies.
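The sliding-window frequency analysis described above can be illustrated with a minimal Python sketch. It is not the authors' Matlab algorithm: the function names, frequency bands, frequency step, hop length, and Hann taper are all assumptions made for illustration, the spectrum is evaluated with a direct discrete Fourier transform over the band of interest (in place of an FFT call) to keep the example dependency-free, and the strongest peak in the band is taken rather than applying any more elaborate selection of physiologically relevant local peaks.

```python
import cmath
import math

def dominant_rate_per_min(window, fs, lo_hz, hi_hz, step_hz=0.01):
    """Rate (per minute) of the strongest spectral peak within [lo_hz, hi_hz].

    lo_hz, hi_hz and step_hz are hypothetical tuning values, not from the study.
    """
    n = len(window)
    mean = sum(window) / n
    # Remove the mean and apply a Hann taper to limit spectral leakage.
    x = [(v - mean) * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
         for i, v in enumerate(window)]
    best_f, best_power = lo_hz, -1.0
    steps = int(round((hi_hz - lo_hz) / step_hz))
    for k in range(steps + 1):
        f = lo_hz + k * step_hz
        w = -2j * math.pi * f / fs
        # Direct DFT at frequency f (an FFT would give the same spectrum faster).
        power = abs(sum(v * cmath.exp(w * i) for i, v in enumerate(x))) ** 2
        if power > best_power:
            best_f, best_power = f, power
    return 60.0 * best_f

def sliding_rates(signal, fs, win_s, lo_hz, hi_hz, hop_s=1.0):
    """Apply dominant_rate_per_min over a sliding window of win_s seconds."""
    n_win, hop = int(win_s * fs), int(hop_s * fs)
    return [dominant_rate_per_min(signal[s:s + n_win], fs, lo_hz, hi_hz)
            for s in range(0, len(signal) - n_win + 1, hop)]
```

With the 25 frames per second video and the window lengths given in the text, a call such as `sliding_rates(green, 25.0, 22, 1.0, 4.0)` would return one HR estimate per hop, and `sliding_rates(green, 25.0, 48, 0.1, 0.8)` one RR estimate per hop, where `green` is a hypothetical list holding the ROI color-channel signal.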

Changes in oxygen saturation were calculated using a ratio-of-ratios (ROR)18 derived from the red (R) and green (G) signals, where the two signals were first normalized by dividing their cardiac pulse amplitudes by the signal baseline values. The ROR method (based on the Beer-Lambert Law) is defined by the equation:

\[ \mathrm{ROR} = \frac{I_{AC}(\lambda_1) / I_{DC}(\lambda_1)}{I_{AC}(\lambda_2) / I_{DC}(\lambda_2)} \tag{1} \]

where IAC represents the cardiac pulsatile amplitude signal and IDC represents its respective DC component. λ1 and λ2 represent the 2 different wavelengths used in pulse oximetry or, in this case, the 2 different color channels in the video-derived oxygen saturation. We used λ1 for the R channel and λ2 for the G channel. The ROR for a given pair of wavelengths identifies changes in the oxygenated and deoxygenated absorption characteristics of blood and can then be used to determine the oxygen saturation of the arterial blood. Details of the derivation of the oxygen saturation can be found in Webster.18 Normally in pulse oximetry, the wavelengths chosen are localized in the red and near infrared part of the spectrum. When using two video camera channels, a broader spectrum of wavelengths is captured with overlapping spectra. Using a standard RGB camera only allows for a relative saturation value to be determined from this normalized ratio of the amplitude of two of the signals.7 Hence, the signal required calibration against known values from the reference pulse oximeter to provide an absolute value of SvidO2. This calibration was performed by calculating the coefficients A and B that minimized the error between the SvidO2 and the reference Spo2 signal for each individual desaturation episode7 using the following equation:

\[ S_{\mathrm{vid}}\mathrm{O}_2 = A + B \times \mathrm{ROR}_{\mathrm{vid}} \tag{2} \]

where A is an additive coefficient, RORvid is the video-based ROR, and B is a multiplicative coefficient. The signals were also aligned in time to account for differences in temporal reporting characteristics (ie, those resulting from internal filtering and time delays). The values of the individual calibration coefficients for each of the desaturation episodes are tabulated in the Results section (Table 1).
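As an illustration of the ROR and per-episode calibration steps, the following Python sketch computes a red/green ratio-of-ratios and fits the additive coefficient A and multiplicative coefficient B. It assumes that "minimizing the error" means an ordinary least-squares fit of the reference Spo2 against RORvid, which the text does not state explicitly; the function names are hypothetical, and the time alignment of the two signals is omitted.

```python
def ratio_of_ratios(ac_red, dc_red, ac_green, dc_green):
    """Normalized red pulse amplitude divided by normalized green pulse amplitude."""
    return (ac_red / dc_red) / (ac_green / dc_green)

def fit_calibration(ror_vid, spo2_ref):
    """Least-squares fit of spo2_ref ~ A + B * ror_vid; returns (A, B).

    Least squares is an assumed error measure, not confirmed by the source.
    """
    n = len(ror_vid)
    if n < 2 or n != len(spo2_ref):
        raise ValueError("need at least two paired samples")
    mean_x = sum(ror_vid) / n
    mean_y = sum(spo2_ref) / n
    sxx = sum((x - mean_x) ** 2 for x in ror_vid)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ror_vid, spo2_ref))
    b = sxy / sxx          # multiplicative coefficient B
    a = mean_y - b * mean_x  # additive coefficient A
    return a, b

def calibrated_saturation(ror_vid, a, b):
    """Apply the fitted coefficients to obtain a calibrated saturation trace."""
    return [a + b * x for x in ror_vid]
```

Because the coefficients are refitted for every desaturation episode, a fit obtained on one episode is only descriptive of that episode; applying it to another episode is a separate question that the Discussion takes up.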

Table 1.:
Calibration Coefficients for Each Desaturation Event

To demonstrate the requirement for individually calibrated desaturation curves for SvidO2, a standard “textbook” equation of the form

was also used (from Webster’s seminal book on pulse oximetry).18 The results calculated from this single equation are referred to as globally calibrated in this article as distinct from individually calibrated results using the error minimization method described previously with respect to equation 2.

The concordance between the reference Spo2 signal and the video signal was calculated to evaluate the trending ability. This assessment was done both for signals calibrated globally using a single standard equation and for individually calibrated desaturation curves.

We fitted a linear mixed-effects model19 with the predictors (Spo2, HR, and RR) and the lighting condition as fixed effects. Idiosyncratic variation resulting from individual animal differences was added as a random effect. (Note that we also tested light as a random effect, which resulted in an estimate for the variance of the light effect equal to zero. This variance estimate was evidence of a degenerate model,20 which showed that the level of “between-group” variability was insufficient to warrant incorporating lighting condition as a random effect. We thus opted to model the lighting condition as a fixed effect to quantify its effect in our study.) A restricted maximum likelihood method was used to fit the mixed-effects model, with 95% confidence intervals (CIs) evaluated for each of the estimates using a normal approximation of the distribution of the restricted maximum likelihood estimators. P values were calculated via the maximum likelihood ratio (nonrestricted) for each of the estimates.

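In addition, we performed bias and accuracy analyses, as specified in the ISO 80601-2-61:2011 standard21 for the evaluation and qualification of pulse oximeter devices. This analysis was less granular because it does not decouple individual contributions. However, ISO 80601 is the standard by which pulse oximeter equipment is tested for safety and essential performance. The standard defines the bias and accuracy as, respectively, the mean difference and the root mean squared difference between the test and reference values. That is, for n paired test and reference values,

\[ \mathrm{Bias} = \frac{1}{n} \sum_{i=1}^{n} \left( \mathrm{test}_i - \mathrm{ref}_i \right) \tag{4} \]

\[ \mathrm{Accuracy} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \mathrm{test}_i - \mathrm{ref}_i \right)^2} \tag{5} \]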

The latter expression represents a combination of the systematic and random components of the error.
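A short Python sketch of these two statistics follows, using the definitions given in the text (mean difference and root mean squared difference between test and reference values). The function names are hypothetical. The second helper shows why the accuracy figure combines systematic and random error: the squared accuracy equals the squared bias plus the variance of the differences about that bias.

```python
import math

def bias_and_accuracy(test_values, reference_values):
    """Return (bias, accuracy): mean difference and root mean squared difference."""
    if not test_values or len(test_values) != len(reference_values):
        raise ValueError("need equal-length, non-empty sequences")
    diffs = [t - r for t, r in zip(test_values, reference_values)]
    n = len(diffs)
    bias = sum(diffs) / n
    accuracy = math.sqrt(sum(d * d for d in diffs) / n)
    return bias, accuracy

def random_component(bias, accuracy):
    """SD of the differences about the bias, from accuracy**2 = bias**2 + sd**2."""
    return math.sqrt(max(accuracy ** 2 - bias ** 2, 0.0))
```

When the bias is close to zero, the accuracy figure is therefore essentially the SD of the differences.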

An “uptime” was determined for the video-based HR and oxygen saturation parameters. This parameter was defined as the percentage of time that each parameter could be calculated when there was a valid reference value posted by the pulse oximeter. Similarly, a video respiratory rate uptime was calculated as the percentage of time that it could be computed when there was a valid ventilator reference rate. Signal processing was performed in Matlab, version R2015b (Mathworks Inc, Natick, MA), and the statistical analysis in R (R Development Core Team, 2016) using the nlme package.19
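The uptime definition can be expressed as a small Python sketch. It assumes, purely for illustration, that both signals are sampled on a common time base and that a sample that was not posted is represented by `None`; the function name is hypothetical.

```python
def uptime_percent(video_values, reference_values):
    """Percentage of valid-reference samples for which a video value was posted.

    A value of None marks a sample that was not posted (an assumed convention).
    """
    paired = [(v, r) for v, r in zip(video_values, reference_values)
              if r is not None]          # keep only samples with a valid reference
    if not paired:
        raise ValueError("no valid reference samples")
    posted = sum(1 for v, _ in paired if v is not None)
    return 100.0 * posted / len(paired)
```

Under this definition, samples without a valid reference value do not count for or against the video parameter.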


Results

A total of 88 minutes of data was acquired during the 16 hypoxic episodes. Appendix 1, Figure A1 contains plots of the HR, RR, and oxygen saturation time series for the video and reference data for each of the individual desaturation events. Mixed-effect regression analysis of HR, RR, and Spo2 was performed in turn and resulted in relationships between the video and reference parameter as described subsequently. Note that in the equations, the lighting condition (LC) can take 2 possible values (0 or 1). These are ascribed to the two studied LCs, where 0 corresponds to both OR and room lights on and 1 corresponds to the OR lights on only. A summary of the results is provided in Table 2.

Heart Rate

The mixed-effect regression analysis of the HR resulted in the following relationship between the video and reference parameters:

Table 2.:
Estimates for Effects Calculated From the Mixed-Effects Model With Respective Confidence Limits and P Values
Figure 3.:
Scatterplot of HRvid against HRp. Trend lines represent the regression equation from the mixed-effect regression analysis. (Dashed line: regression for the first desaturation; dotted line: regression for the second desaturation.) Expected 1:1 correspondence line shown in a continuous thin line. A, All data plotted. Note that both regression lines corresponding to the two lighting conditions are plotted, but their separation cannot be distinguished at this scale. B, Data for both room and OR lights on (ie, first desaturation in each case). C, Data for only OR lights on (ie, second desaturation in each case). HR indicates heart rate; OR, operating room.

in units of beats per minute. The CIs and P values for the equation estimates are provided in Table 2(a). Note that equation 6 gives 2 regression lines, 1 for each LC. Figure 3 contains both regression lines calculated from the linear mixed-effects model superimposed on all the data (main plot) and split between the 2 LCs (smaller plots). The 2 regression lines on the main plot are effectively superimposed as a result of the small difference (of 0.136) between LCs. The SD of the intersubject random effect was 0.328 (CI, 0.190–0.566), and the residual error was 0.958 (CI, 0.940–0.977). The overall uptime for video-based HR reporting was 100%. Note that the pulse oximeter did not post for a period of time during desaturation episode P18_D2 because the sensor became disconnected at the socket end. (This dropout can be seen in the bottom right plot of Figure A1[a].) This data segment was excluded from the analysis, although the video pulse rate posted continuously through this period. The video and pulse oximeter HR signals during each episode are presented in Figure A1(a) in Appendix 1. Very close agreement between HRs is evident in the plots, where both rates are essentially superimposed on top of each other with only a few minor deviations between them.

Respiration Rate

The mixed-effect regression analysis of the RR resulted in the following relationship between the video and reference parameters:

Figure 4.:
Scatterplot of RRvid against RRvent. Trend lines represent the regression equation from the mixed-effect regression analysis. (Dashed line: regression for the first desaturation; dotted line: regression for the second desaturation.) Expected 1:1 correspondence line shown in continuous thin line. (Note that size of each circle corresponds to number of data points at that location—required for visualization as a result of many colocated RR data points in the plot. Color coding according to event is not possible as a result of multiple colocations of points.) A, All data plotted. Note that both regression lines corresponding to the 2 lighting conditions are plotted, but their separation cannot be distinguished at this scale. B, Data for both room and OR lights on (ie, first desaturation in each case). C, Data for only OR lights on (ie, second desaturation in each case). OR indicates operating room; RR, respiratory rate.

in units of breaths per minute. The CIs and P values for the equation estimates are provided in Table 2(b). As previously, equation 7 gives 2 regression lines, 1 for each LC, and, as for the HR results, the plots are differentiated accordingly (Figure 4). In this case, the 2 regression lines on the main plot are also separated by only a small difference (of 0.064) between LCs. The SD of the intersubject random effect was 0.484 (CI, 0.286–0.818), and the residual error was 0.421 (CI, 0.413–0.429). The overall uptime for reporting RRvid was 100%. Figure A1(b) in Appendix 1 contains plots of RRvid and the ventilator reference rate over time for each episode, where close agreement between the 2 reported values can be observed.

Oxygen Saturation

Figure 5A shows the globally calibrated SvidO2 curves generated if using the single standard textbook equation for the calibration given by equation 3 in the Methods section. From visual inspection, this plot shows (1) that the standard equation is not “tuned” to provide a direct measure of Spo2 from the video equipment; and (2) that calibrating to a single equation results in a wide range of possible Spo2 values corresponding to certain SvidO2 values. For example, at approximately 90% SvidO2, we can see that the whole range of Spo2 values from 28% to 100% is possible depending on which curve is followed. Thus, the absolute value of Spo2 is not determinable from the SvidO2 at this range. In Figure 5B, each desaturation event is calibrated individually by minimizing the error between the reference Spo2 and the calculated SvidO2 as described in the Methods section. The resulting coefficients for each desaturation are provided in Table 1. The mixed-effect regression analysis of the individually calibrated Spo2 produced the following relationship between the video and reference parameters:

Figure 5.:
SvidO2 versus Spo2 with corresponding 4-quadrant concordance plots. A, Globally calibrated data; (B) trend lines represent the regression equation from the mixed-effect regression analysis. (Dashed line: regression for the first desaturation; dotted line: regression for the second desaturation.) Expected 1:1 correspondence line shown in a continuous thin line. Both desaturation events plotted. Note that both regression lines corresponding to the 2 lighting conditions are plotted, but their separation cannot be distinguished at this scale. C, Data and the regression line for the first desaturation event only; (D) data and the regression line for the second desaturation event only. Note that these plots contain the data for the individually calibrated data sets using the reference pulse oximeter.

in units of percentage oxygen saturation. The CIs and P values for the equation estimates are provided in Table 2(c). As previously, equation 8 gives 2 regression lines (Figure 5B), 1 for each LC, and, as for the HR and RR results, the plots are differentiated accordingly (Figure 5C, D). In this case, the 2 regression lines on the main plot are also separated by only a small difference (of 0.098) between LCs. The SD of the intersubject random effect was 0.777 (CI, 0.472–1.277), and the residual error was 4.986 (CI, 4.889–5.084). The overall uptime for SvidO2 reporting was 100%. The accuracy and bias, as per ISO 80601-2-61:2011, were found to be 5.221 and 0.006, respectively, for all data. Figure A1(c) in Appendix 1 contains the individual plots of the desaturation curves for each of the 16 desaturation events. The oxygen saturation curves exhibit a distinct slow decrease in Spo2 to approximately 40% followed by a more rapid increase during resaturation, typical of such protocols.

We also compared the trending ability between globally calibrated and individually calibrated SvidO2 via the use of concordance plots (Figure 5A, B). This comparison showed that although the globally calibrated SvidO2 signal is not a direct measure of oxygen saturation levels, it is capable of identifying the trend in saturations seen in hypoxic events. A 10-second window was used for the calculation of the concordance for both data sets. We omitted data within 0.5% (saturation units) of the origin, which is the reporting accuracy of the reference pulse oximeter device (as the reported Spo2 is an integer number). The same values were used for the exclusion box in the concordance plot for the globally calibrated data. The calculated concordance rates are 94.4% for the globally calibrated data and 86.5% for the individually calibrated data. The smaller concordance rate for the individually calibrated data is explained by the exclusion box not including a substantial number of points close to the origin, as the individual calibration shifted these points outside the box (as can be seen by comparing the concordance plots in Figure 5A, B). We also calculated the concordance rate for windows of 2 seconds (75.0% and 73.1% for the globally calibrated and individually calibrated data sets, respectively) and 5 seconds (74.7% and 74.3% for the globally calibrated and individually calibrated data sets, respectively). We did not calculate the concordance for a 1-second window because this interval was equal to the sampling interval of the pulse oximeter.
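A four-quadrant concordance rate of the kind described here can be sketched in Python as follows. Several details are assumptions rather than statements from the text: changes are taken over consecutive non-overlapping windows, a pair of changes is excluded only when both fall inside the central exclusion box, and a pair counts as concordant only when both changes are nonzero and share the same sign. The function name is hypothetical.

```python
def concordance_rate(reference, test, fs, window_s=10.0, exclusion=0.5):
    """Percentage of change pairs (reference, test) that move in the same direction.

    reference, test : equally sampled saturation series (percentage units)
    fs              : samples per second
    window_s        : interval over which each change is computed
    exclusion       : half-width of the central exclusion box
    """
    lag = int(round(window_s * fs))
    agree = total = 0
    for i in range(0, len(reference) - lag, lag):
        d_ref = reference[i + lag] - reference[i]
        d_test = test[i + lag] - test[i]
        # Skip points inside the exclusion box around the origin.
        if abs(d_ref) <= exclusion and abs(d_test) <= exclusion:
            continue
        total += 1
        if d_ref * d_test > 0:
            agree += 1
    if total == 0:
        raise ValueError("no change pairs outside the exclusion box")
    return 100.0 * agree / total
```

Because the exclusion box is applied to the changes themselves, any rescaling of the test signal (such as a different calibration) can move points into or out of the box and so alter the rate, which is the effect the text describes for the individually calibrated data.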


Discussion

Three main vital signs (HR, RR, and oxygen saturation) were extracted from a video-PPG during porcine desaturation episodes. These signals were compared with reference values, and good agreement was observed in terms of bias and accuracy. HR, RR, and calibrated oxygen saturation demonstrated very good agreement with the slope of the regression lines close to 1 in all cases. In addition, the SvidO2 exhibited an accuracy of just over 5%, a value close to, though above, the limit in the ISO 80601 Standard for Pulse Oximeter Equipment, which states that “Spo2 accuracy of pulse oximeter equipment shall be a root-mean-square difference (RMSD) of less than or equal to 4.0.” However, the current study used a calibration based on the reference device to determine the SvidO2 values and involved anesthetized (motionless) animals, both of which likely aided the saturation performance measures. The reader should also be aware that the SvidO2 results are highly dependent on the calibration method, which involves individually calibrating each desaturation signal (Table 1). Figure 6A shows the individually calibrated signals for the first desaturation events only. Figure 6B shows the signals for the second desaturation event calibrated with the corresponding values of the first desaturation. As can be seen from the plot, the calibrations calculated for the first desaturation event are not necessarily valid for the second desaturation event. This discrepancy may be attributed to the calibration method, which minimizes the RMSD during the hypoxic event by curve-fitting rapidly changing saturations. Because the video and pulse oximeter-derived values have different response characteristics (as a result of different internal filtering in their respective algorithms), the best fitting parameters may vary considerably between different desaturation episodes.
In an ideal case, slowly varying the oxygen saturation during the experiment would allow a longer signal duration for different saturation values, which could potentially filter out such undesired variations.

Figure 6.:
SvidO2–Spo2 scatterplots where the calibration used for the first desaturation was used for the second desaturation. A, First desaturation (D1) with individual calibrations and (B) the corresponding second desaturation (D2) plots using the calibration values of the first desaturation.

All monitoring devices must address the tradeoff between uptime (ie, fewer signal dropouts) and measurement accuracy.22 As a result, all monitoring algorithms will eventually decide not to post when signal quality falls below a threshold of acceptability and the data used to compute the reported value become too old to be clinically relevant. In this study, we targeted both high uptimes and signal accuracy. However, it soon became apparent that both 100% uptime and high accuracy for HR and RR were attainable. We therefore tuned the algorithm to attain 100% uptime for all three parameters. The ability to achieve 100% uptimes across all measurements was attributable, in part, to the relatively still nature of the anesthetized animal. Although a performance improvement might have been achievable by reducing these uptimes slightly (particularly for the case of oxygen saturation), we felt that the expectation for a monitoring device in practice would be 100% uptime for a motionless subject. In addition, we believe that the performance results achieved using a maximal uptime approach provide a useful benchmark for such technologies going forward.

Although motion was not an issue in our controlled animal model, we found markedly varying image quality across animals in terms of both the mean image quality and dynamic lighting “noise” issues. The overall quality of each of the 16 video image clips varied substantially. It was obvious from manual visual inspection that the best video-PPGs were generally produced from video with the following three characteristics: (1) bright images with highly discernable features; (2) the field of view zoomed into a localized area around the snout; and (3) the ROI perpendicular to the line of sight of the camera. (These characteristics were all present in the image of Figure 2.) Dark, zoomed-out images with the snout ROI at an angle to the perpendicular generally produced the most challenging results.

The analysis performed using linear mixed-effects models allowed us to decouple the effect of subject variation and account for the effect of lighting changes. The results achieved by the mixed-effects model for the influence of the lighting effect on the variables of interest (HR, RR, and Spo2) demonstrated very little effect of using the room lights. This finding was reflected in the small values for the LC parameter: –0.136 (95% CI, –0.188 to –0.083) for the HR; 0.064 (95% CI, 0.039–0.088) for the RR; and –0.098 (95% CI, –0.375 to 0.180) for SvidO2. Such a weak lighting effect may be attributed to the individual calibration for each saturation signal, which normalizes the specific LC during each desaturation to the reference signal. The SD of the intersubject grouping random effect was 0.328 (95% CI, 0.190–0.566) for the HR; 0.484 (95% CI, 0.286–0.818) for the RR; and 0.777 (95% CI, 0.472–1.277) for SvidO2. These SD values are all within 1 measurement unit for each parameter. The within-group standard error was 0.958 (95% CI, 0.940–0.977) for the HR; 0.421 (95% CI, 0.413–0.429) for the RR; and 4.986 (95% CI, 4.889–5.084) for SvidO2. These numbers suggest that the amount of error is small for HR and RR, whereas a larger amount of natural error exists for oxygen saturation.

We did not have equipment available to measure the spectra or intensity of the ambient LCs. In addition, we did not randomize LCs for the first and second desaturation events. Doing so would have allowed a clearer assessment of the effect of lighting on our study. However, we were collecting data contemporaneously with another study and had to share protocols. Future studies involving lighting effects should address this issue.

Localized dynamic lighting noise did have more of an effect than the mean room lighting levels. Such localized noise took the form of rapid changes in lighting level as a result of research staff moving in the vicinity of the animal (to take measurements and adjust settings as required by the parallel ongoing study), thus changing the intensity of the reflected light illuminating the ROI. Both motion and light intensity artifacts will need solutions for such methods to be clinically useful. One approach might be to identify and model conditions found in the clinical setting and incorporate these effects into subsequent algorithms.23 Considerably more processing would be involved in such an algorithm, including preprocessing and postprocessing code modules as well as alarm management systems, hardware interface routines, and signal acquisition processes.24

For the analysis work, we chose two separate ROIs on the skin: one for the determination of rates (RRvid and HRvid) and the other for the determination of saturation (SvidO2). These choices were the result of different researchers working on each of these two tasks. It is worth noting that we have found the constitution of the video-PPG to be extremely sensitive to the chosen ROI. This finding is highlighted in Figure 2, where it may be seen that the two ROIs shown in the vicinity of the nostril result in signals with distinctly different morphologies: one with dominant respiratory components and the other with dominant cardiac components. (The green color channel is shown here as an example.) These ROIs are each approximately 5 mm × 5 mm in extent and are in relatively close proximity to each other (<10 mm) and thus emphasize the important effect that ROI selection has on the signal. A further example is provided in Figure 7, which shows a selection of 16 example ROIs for a single animal used in the determination of SvidO2. The corresponding fits to the Spo2 reference are also shown with their respective bias and RMSD statistics. We determined an optimal ROI separately for each animal through trial and error (not automated) based on minimizing the error between SvidO2 and Spo2. For the case shown in Figure 7, the optimal ROI corresponded to region 11. In future work, we may further decouple all three signals, choosing the optimal ROI in each case to maximize the performance of the physiologic parameter of interest.

Figure 7.
Changes to measured SvidO2 signal resulting from ROI location. A, Sixteen candidate ROIs on the snout. B, The 16 derived SvidO2 signals plotted against the Spo2 reference. Bias and RMSD statistics shown. C, The combined plot of all 16 SvidO2 signals with the 1 exhibiting lowest error (from ROI 11) shown bold in red. RMSD indicates root mean square difference; ROI, region of interest.

The analysis algorithms for HR and RR each used a sliding window (of 22 and 48 seconds, respectively) to perform a local frequency analysis from which the relevant dominant peak was selected. Cardiac components manifest in the traditional PPG waveform (ie, from a pulse oximeter) as dominant pulse waveforms, whereas respiratory modulations may manifest in the form of amplitude, baseline, and/or frequency (ie, respiratory sinus arrhythmia) components as described previously.25,26 In this study, we observed dominant respiratory baseline components. Although the etiology of these components is unclear, they may originate from venous return at the ROI in phase with large thoracic pressure changes generated by the positive pressure ventilation, changing the baseline of the reflected light intensity. However, from detailed inspection of the video images, we also noted that the ventilated animal is subject to slight motions synchronous with the ventilator. These motions will be superimposed on the acquired signal. The extent to which each of these factors contributes to the final signal composition is unclear and, although the algorithm is agnostic to the origin of the baseline modulation, it would be of interest to parse out the relative contributions.
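A minimal sketch of a sliding-window dominant-peak rate estimate is given below. It is not the algorithm used in the study: the frequency bands, the hop length, the direct evaluation of the discrete Fourier sum (chosen here so that the example needs only the standard library), and the synthetic test signal are all assumptions made for illustration. Only the window lengths of 22 and 48 seconds are taken from the text.

```python
import cmath
import math


def dominant_rate(window, fs, lo_hz, hi_hz, step_hz=0.01):
    """Return the rate (per minute) of the strongest spectral peak in [lo_hz, hi_hz]."""
    mean = sum(window) / len(window)
    centered = [x - mean for x in window]          # remove the DC level
    best_f, best_power = lo_hz, -1.0
    steps = int(round((hi_hz - lo_hz) / step_hz))
    for k in range(steps + 1):
        f = lo_hz + k * step_hz
        # Fourier sum evaluated directly at frequency f (assumed band grid).
        coeff = sum(x * cmath.exp(-2j * math.pi * f * i / fs)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_f, best_power = f, power
    return 60.0 * best_f


def sliding_rates(signal, fs, window_s, hop_s, lo_hz, hi_hz):
    """Apply dominant_rate to successive windows of the signal."""
    win, hop = int(window_s * fs), int(hop_s * fs)
    return [dominant_rate(signal[s:s + win], fs, lo_hz, hi_hz)
            for s in range(0, len(signal) - win + 1, hop)]


# Synthetic video-PPG: 60 s at 10 samples/s with a 1.5 Hz cardiac pulse (90/min)
# riding on a larger 0.25 Hz respiratory baseline modulation (15/min).
fs = 10.0
ppg = [math.sin(2 * math.pi * 1.5 * i / fs) + 2.0 * math.sin(2 * math.pi * 0.25 * i / fs)
       for i in range(600)]

hr_vid = sliding_rates(ppg, fs, window_s=22, hop_s=5, lo_hz=0.7, hi_hz=3.0)  # assumed HR band
rr_vid = sliding_rates(ppg, fs, window_s=48, hop_s=5, lo_hz=0.1, hi_hz=0.6)  # assumed RR band
```

Restricting the peak search to separate cardiac and respiratory bands is what allows the same signal to yield both rates; a peak picked this way reflects whatever dominates the band, so ventilator-synchronous motion and true baseline modulation are not distinguished.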

Although our study provides proof of principle, it remains unclear whether such video-based monitoring can eventually replace conventional monitoring of one or all of the three vital signs considered in the clinical environment. In this regard, some environments may be more suitable than others. For example, spot checks by a clinician using a pulse oximeter may allow intermittent calibration of a camera system so that continuous monitoring can take place with reasonable accuracy between checks. The home environment is another potential area of interest. The technologies considered in this work may also be combined with other video monitoring methods relevant to the clinical environment, including the monitoring of pulse transit times,27 bed occupancy,28 gait,29 and fall detection.30


Figure A1(a).
Time series of video and pulse oximeter heart rates (HRvid = black; HRp = green). The left-hand signals correspond to both room and operating light illumination. The right-hand signals correspond only to operating light illumination. The dropout in pulse oximeter heart rate in the bottom right plot is the result of a disconnected probe at the connector end (ie, the sensor did not fall off).
Figure A1(b).
Time series of video and pulse oximeter respiratory rates (RRvid = black; RRp = green).
Figure A1(c).
Time series of video and pulse oximeter Spo2 values (SvidO2 = black; Spo2 = green).

Heart rate, respiratory rate, and oxygen saturation for each hypoxic episode are plotted in Figures A1(a), A1(b), and A1(c), respectively. The plots each contain the video-based measurement and its associated reference signal.


In this study, we found that vital sign monitoring of HR, RR, and oxygen saturation trends using standard visible light camera equipment may accurately identify expected changes during acute hypoxic conditions in the anesthetized porcine model. Although absolute values for HR and RR were possible, absolute oxygen saturation values required frequent updating of the calibration coefficients as they changed over time and with desaturation events. Our data were acquired at reasonably low levels of motion, which is a known confounder. Future work is needed to refine such technologies and identify potential clinical uses.


Name: Paul S. Addison, PhD.

Contribution: This author helped conceive and design the study, design the protocol, analyze the data, prepare the manuscript, and revise the manuscript.

Conflicts of Interest: Dr Addison is an employee of Medtronic, a global health care company, which sponsored this study.

Name: Dominique Jacquel, PhD.

Contribution: This author helped analyze the data and prepare the manuscript.

Conflicts of Interest: Dr Jacquel is an employee of Medtronic, a global health care company, which sponsored this study.

Name: David M. H. Foo, PhD.

Contribution: This author helped conceive the study, analyze the data, and prepare the manuscript.

Conflicts of Interest: Dr Foo is an employee of Medtronic, a global health care company, which sponsored this study.

Name: André Antunes, PhD.

Contribution: This author helped analyze the data, prepare the manuscript, and revise the manuscript.

Conflicts of Interest: Dr Antunes is an employee of Medtronic, a global health care company, which sponsored this study.

Name: Ulf R. Borg, MD, PhD.

Contribution: This author helped conceive and design the study, design the protocol, review the analysis, prepare the manuscript, and revise the manuscript.

Conflicts of Interest: Dr Borg is an employee of Medtronic, a global health care company, which sponsored this study.

This manuscript was handled by: Avery Tung, MD, FCCM.


1. McDuff DJ, Estepp JR, Piasecki AM, Blackford EB. A survey of remote optical photoplethysmographic imaging methods. Conf Proc IEEE Eng Med Biol Soc. 2015:6398–6404.
2. Bousefsaf F, Maaoui C, Pruski A. Continuous wavelet filtering on webcam photoplethysmographic signals to remotely assess the instantaneous heart rate. Biomed Signal Process Control. 2013;8:568–574.
3. Sun Y, Hu S, Azorin-Peris V, Kalawsky R, Greenwald S. Noncontact imaging photoplethysmography to effectively access pulse rate variability. J Biomed Opt. 2013;18:061205.
4. Kwon S, Kim H, Park KS. Validation of heart rate extraction using video imaging on a built-in camera system of a smartphone. Conf Proc IEEE Eng Med Biol Soc. 2012:2174–2177.
5. Villarroel M, Guazzi A, Jorge J, et al. Continuous non-contact vital sign monitoring in neonatal intensive care unit. Healthc Technol Lett. 2014;1:87–91.
6. Poh MZ, McDuff DJ, Picard RW. Advancements in noncontact, multiparameter physiological measurements using a webcam. IEEE Trans Biomed Eng. 2011;58:7–11.
7. Tarassenko L, Villarroel M, Guazzi A, Jorge J, Clifton DA, Pugh C. Non-contact video-based vital sign monitoring using ambient light and auto-regressive models. Physiol Meas. 2014;35:807–831.
8. Scalise L, Bernacchia N, Ercoli I, Marchionni P. Heart Rate Measurement in Neonatal Patients Using a Webcamera. In: International Symposium on MeMeA Proceedings. Mexico: IEEE; 2012:1–4.
9. Li MH, Yadollahi A, Taati B. A non-contact vision-based system for respiratory rate estimation. Conf Proc IEEE Eng Med Biol Soc. 2014:2119–2122.
10. Cennini G, Arguel J, Akşit K, van Leest A. Heart rate monitoring via remote photoplethysmography with motion artifacts reduction. Opt Express. 2010;18:4867–4875.
11. Aarts LA, Jeanne V, Cleary JP, et al. Non-contact heart rate monitoring utilizing camera photoplethysmography in the neonatal intensive care unit—a pilot study. Early Hum Dev. 2013;89:943–948.
12. Kumar M, Veeraraghavan A, Sabharwal A. DistancePPG: robust non-contact vital signs monitoring using a camera. Biomed Opt Express. 2015;6:1565–1588.
13. Kong L, Zhao Y, Dong L, et al. Non-contact detection of oxygen saturation based on visible light imaging device using ambient light. Opt Express. 2013;21:17464–17471.
14. Guazzi AR, Villarroel M, Jorge J, et al. Non-contact measurement of oxygen saturation with an RGB camera. Biomed Opt Express. 2015;6:3320–3338.
15. Bickler PE, Feiner JR, Rollins MD. Factors affecting the performance of 5 cerebral oximeters during hypoxia in healthy volunteers. Anesth Analg. 2013;117:813–823.
16. Shah N, Ragaswamy HB, Govindugari K, Estanol L. Performance of three new-generation pulse oximeters during motion and low perfusion in volunteers. J Clin Anesth. 2012;24:385–391.
17. Cooley JW, Tukey JW. An algorithm for the machine calculation of complex Fourier series. Math Comput. 1965;19:297–301.
18. Webster JG. Design of Pulse Oximeters. Boca Raton, FL: CRC Press; 1997.
19. Bates D, Maechler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Software. 2015;67:1–48.
20. Pinheiro J, Bates D, DebRoy S, Sarkar D; R Core Team. nlme: Linear and Nonlinear Mixed Effects Models. R package version 3.1-128. 2016.
21. ISO 80601-2-61:2011. Medical electrical equipment—Part 2-61: particular requirements for basic safety and essential performance of pulse oximeter equipment. International Organization for Standardization; 2011.
22. Kästle SW, Konecny E. Determining the artefact sensitivity of recent pulse oximeters during laboratory benchmarking. J Clin Monit Comput. 2000;16:509–522.
23. Jopling MW, Mannheimer PD, Bebout DE. Issues in the laboratory evaluation of pulse oximeter performance. Anesth Analg. 2002;94:S62–S68.
24. Addison PS. A review of signal processing used in the implementation of the pulse oximetry photoplethysmographic fluid responsiveness parameter. Anesth Analg. 2014;119:1293–1306.
25. Addison PS, Watson JN, Mestek ML, Mecca RS. Developing an algorithm for pulse oximetry derived respiratory rate (RRoxi): a healthy volunteer study. J Clin Monit Comput. 2012;26:45–51.
26. Addison PS, Watson JN, Mestek ML, Ochs JP, Uribe AA, Bergese SD. Pulse oximetry-derived respiratory rate in general care floor patients. J Clin Monit Comput. 2015;29:113–120.
27. Shao D, Yang Y, Liu C, Tsow F, Yu H, Tao N. Noncontact monitoring breathing pattern, exhalation flow rate and pulse transit time. IEEE Trans Biomed Eng. 2014;61:2760–2767.
28. Martinez M, Stiefelhagen R. Automated Multi-camera System for Long Term Behavioral Monitoring in Intensive Care Units. Proceedings of IAPR Conference on Machine Vision Applications; May 20–23, 2013; Kyoto, Japan.
29. Lv Z, Xing X, Wang K, Guan D. Class energy image analysis for video sensor-based gait recognition: a review. Sensors. 2015;15:932–964.
30. Rougier C, Meunier J, St-Arnaud A, Rousseau J. Robust video surveillance for fall detection based on human shape deformation. IEEE Trans Circuits Syst Video Technol. 2011;21:611–622.
Copyright © 2017 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the International Anesthesia Research Society.