Closer monitoring of respiratory pathophysiology by measuring minute ventilation (MV), tidal volume (TV), and respiratory rate (RR) may better identify those most at risk for adverse acute outcomes in a variety of clinical settings.1–5 Intermittent, manual measurements of RR and subjective clinical assessment are the current standard of care for monitoring respiratory status, but are notoriously inconsistent, depending on the provider’s level of experience.6 Technologies, such as pulse oximetry, RR sensors, and capnography, do not meet this need for improved respiratory monitoring.3,4,7,8 No current monitor conveniently, noninvasively, and accurately measures and reports the respiratory volume measurements of MV, TV, and RR in nonintubated patients.
Abnormal RRs are associated with a variety of adverse events, and although intermittent observation by skilled clinical personnel can provide accurate quantification of RR and a number of recently developed devices provide continuous RR measurements within certain variables, RR alone often provides insufficient clinical information to trigger early interventions in respiratory compromise. Spirometry can measure respiratory volumes in nonintubated, compliant patients at a single point in time; however, it is impractical for noncooperative patients and does not allow for continuous measurements, as it requires the use of a nose clip and appropriately positioned mouthpiece or a tightly fitting mask.9 Similarly, MV and TV data are presented on standard ventilators, but are not available after extubation. A point-of-care device that provides quantitative, noninvasive, continuous measurements of respiratory volumes and rates in nonintubated patients would quantify adequacy of ventilation and resultant gas exchange. Such a device could enable early detection of respiratory deterioration, facilitate timely interventions, minimize respiratory complications, improve patient safety, and reduce health care costs.
A noninvasive Respiratory Volume Monitor (RVM) has been developed that provides continuous measurements of MV, TV, and RR, displays a real-time respiratory volume curve, and reports trends of the measured variables. The studies reported here evaluate the performance of an RVM in measuring MV, TV, and RR in a diverse population over a 24-hour period and throughout a range of RRs by comparing the accuracy and precision of the RVM with those of a monitoring turbine spirometer. We hypothesized that the RVM delivers clinically relevant and unbiased measurements with accuracy (defined as the square-root of the mean-squared-difference between RVM and spirometry) within 20% of a conventional spirometer over the course of 24 hours. We used analysis of variance (ANOVA) and an equivalence test to test this hypothesis. Secondary studies assessed the potential clinical utility of the RVM by correlating continuous measurements during a model of obstructed breathing.
These studies were approved by the Schulman Associates IRB, Cincinnati, OH. All subjects responded to an IRB-approved advertisement and gave written informed consent. Inclusion criteria were English-speaking men and women aged 18 to 99 years. Exclusion criteria were persons seen in the emergency department or hospitalized for a respiratory illness within 30 days before the study and pregnant or lactating females. Each subject completed health history forms. Subjects included those of varied demographics (Table 1), anthropometrics (Table 2), and substantive cardiopulmonary comorbidities, i.e., asthma, chronic obstructive pulmonary disease (COPD), obstructive sleep apnea (OSA), and congestive heart failure (CHF).
Two electrode pads are used by the RVM (ExSpiron™, Respiratory Motion, Inc., Waltham, MA). In the recommended placement, 1 electrode pad comprising 3 electrodes is placed along the sternum and the other electrode pad comprising 3 electrodes is placed across the right midaxillary line at the level of the xiphoid. Spacing is determined by the described anatomical landmarks (Fig. 1). Both Wright/Haloscale Respirometer (nSpire Health, Inc., Longmont, CO, “Wright”) and SpiroAir-LT (Morgan Scientific, Inc., Haverhill, MA, “Morgan”) spirometry measurements were collected through a single-use mouthpiece and filter while the patients wore a disposable noseclip (all disposable products obtained from A-M Systems, Sequim, WA).
The RVM displays numerical MV, TV, and RR data, as well as a real-time respiratory volume curve, and provides trends of the measurements. Note that the RVM reports TV measurements that are in fact average TVs, calculated by dividing MV by RR (TV = MV/RR). With the recommended electrode placement and calibration algorithms, strong correlations (0.96 ± 0.16, mean ± 95% confidence interval [CI] for regular and erratic breathing) between RVM and spirometric measurements have been demonstrated.10
For the primary study, we estimated that at least 13 subjects would be needed to demonstrate equivalence of ±15% with 90% statistical power (Appendix).11 However, we decided to collect data from a larger, more diverse cohort. Forty voluntary human subjects were recruited, and 31 successfully completed both days of the study. Subjects were tested in the supine position at 2 time points: on day 1 and, 24 hours later, on day 2. On day 1, the electrode pads were applied in the recommended configuration (see Instrumentation) and the subject underwent 3 or more minute-long practice tests, breathing through the Wright spirometer. The technician then measured the subject’s expired MV with the Wright (MVWright) while simultaneously collecting RVM data. The technician then entered the MVWright collected during this minute into the RVM. The calibration routine was then activated on the RVM system to associate the MVWright with the MV collected by the RVM (MVRVM) during the same minute timeframe. After this calibration, the subject underwent 10 minute-long breathing tests, during which measurements were simultaneously recorded from both the Wright and RVM. Subjects rested for about 1 minute between tests, while the operator reset the Wright and prepared it for the next test. At the end of day 1 testing, the participants were asked to leave the electrode pads on and not to bathe or allow the electrode pads to get wet or sweaty until their return. On day 2, calibration data from day 1 were used. After 3 or more practice tests, the subject underwent 10 breathing tests, in the supine position, with simultaneous Wright and RVM measurements collected as on day 1. At the end of the second session, the electrodes were removed and the skin was checked for irritation.
To optimize data collection from the Wright, we noted that ending the breathing tests at exactly 60 seconds caused part or all of the last breath to be truncated in the Wright readings. Therefore, because the Wright only measured exhalation, the test was initiated during inspiration and was terminated as inspiration began after the last exhalation, which occurred after the 60-second period. The exact test duration (t) was electronically recorded by the RVM and a stopwatch, and the number of breaths (Nb) was counted by the operator. These data were used to calculate MV, TV, and RR as follows:

$$\mathrm{MV} = V_{\mathrm{exp}} \cdot \frac{60}{t}, \qquad \mathrm{RR} = N_b \cdot \frac{60}{t}, \qquad \mathrm{TV} = \frac{\mathrm{MV}}{\mathrm{RR}} = \frac{V_{\mathrm{exp}}}{N_b},$$

where Vexp is the volume measured by the Wright and t is expressed in seconds.
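The per-test calculation can be illustrated with a minimal Python sketch. It assumes that MV and RR are obtained by scaling the expired volume and the breath count from the measured duration t (in seconds) to 1 minute, and that the average TV is the expired volume per breath. The function name and the example values are hypothetical and are not taken from the study.

```python
def respiratory_metrics(v_exp_l, t_s, n_breaths):
    """Return (MV in L/min, TV in L, RR in breaths/min) for one breathing test.

    v_exp_l   -- total expired volume over the test, in liters
    t_s       -- exact test duration, in seconds
    n_breaths -- number of complete breaths counted during the test
    """
    if t_s <= 0 or n_breaths <= 0:
        raise ValueError("duration and breath count must be positive")
    per_minute = 60.0 / t_s          # scale factor from test duration to 1 minute
    mv = v_exp_l * per_minute        # minute ventilation
    rr = n_breaths * per_minute      # respiratory rate
    tv = v_exp_l / n_breaths         # average tidal volume, equal to MV / RR
    return mv, tv, rr


# Hypothetical test: 8.0 L expired over 64 s in 16 breaths.
mv, tv, rr = respiratory_metrics(v_exp_l=8.0, t_s=64.0, n_breaths=16)
```

Because the test ends on a breath boundary rather than at exactly 60 seconds, the scaling by 60/t keeps MV and RR comparable across tests of slightly different length.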
Alterations in Breathing Protocols
To address likely clinical scenarios, we also evaluated RVM performance during breathing at a variety of RRs and breathing patterns (slow, erratic, fast, and against a closed glottis). The conduct of these studies was similar to the primary study, with the following exceptions: (1) performed on a single day; (2) subjects were coached to alter breathing pattern (see below); (3) the Morgan spirometer was used to obtain simultaneous, continuous digital respiratory volume curves; and (4) data were collected for 30 seconds (shortened due to the closed-cell nature of the Morgan).
To evaluate the RVM performance during normal, fast, slow, and erratic breathing patterns, 51 adult subjects (37 men, 14 women, average age 31.1 years, average body mass index [BMI] 25) were studied in 93 sessions (42 subjects took part in 2 sessions, 9 subjects took part in 1 session), generating 2252 trials, with low (4–6), normal (8–20), and high (30–40) RRs as well as with erratic breathing. During the assessment of RVM performance at various RRs, subjects were coached to alter their breathing patterns by following a metronome. During erratic breathing trials subjects were instructed to breathe “erratically” or “unevenly” and to attempt to continuously vary the rate and depth of breathing.
To investigate the effect of airway obstruction on RVM measurements, 10 adult subjects (7 men, 3 women, average age 28.8 years, average BMI 24) performed 3 obstructed breathing trials. In each trial, the subject breathed normally for 15 seconds and then, on cue, continued to breathe against a closed glottis for the remaining 15 seconds of the test, with visible chest movement as noted by the experimenter, but without actually exchanging air as measured by the Morgan spirometer.
RVM and Wright measurements were compared using Bland-Altman analysis.12 A repeated measures single-factor ANOVA was used to assess differences in measurement variability between devices. A Lilliefors test was used to assess the normality of the distributions of relative errors of TV and MV. The null hypothesis was that the relative errors are normally distributed, and the alternative hypothesis was that they are not. The test failed to reject the null hypothesis at a significance level of 0.05 (P = 0.07 for MV and 0.54 for TV), indicating that the relative errors are consistent with a normal distribution. Paired t tests were used to assess the effect of “time from calibration” on the results (comparing accuracy, bias, and precision of day 1 vs day 2). All analyses were performed in Matlab R2009b (Mathworks, Natick, MA).
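For readers unfamiliar with Bland-Altman analysis, the following Python sketch shows one way to compute its quantities for paired measurements. It is an illustration only and not the Matlab code used in the study. The sign convention (RVM minus spirometer), the function name, and the sample values are assumptions. The limits are taken as the mean difference ± 2 SD.

```python
from statistics import mean, stdev


def bland_altman(device, reference, k=2.0):
    """Return per-pair averages, per-pair differences, mean difference, and limits.

    The limits are mean difference +/- k sample standard deviations.
    """
    diffs = [d - r for d, r in zip(device, reference)]
    avgs = [(d + r) / 2.0 for d, r in zip(device, reference)]
    mean_diff = mean(diffs)
    sd = stdev(diffs)                      # sample SD (n - 1 in the denominator)
    limits = (mean_diff - k * sd, mean_diff + k * sd)
    return avgs, diffs, mean_diff, limits


# Hypothetical paired MV readings in L/min (not study data).
rvm = [7.9, 9.6, 11.8, 8.4, 10.1]
wright = [8.0, 9.5, 12.0, 8.5, 10.0]
avgs, diffs, mean_diff, limits = bland_altman(rvm, wright)
```

Plotting `diffs` against `avgs`, with horizontal lines at `mean_diff` and at the two limits, gives the familiar Bland-Altman plot.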
The data comprised 10 sequential tests of approximately 60 seconds for each subject on both day 1 and day 2 for a total of 20 tests. Each 60-second test was analyzed to determine MV, TV, and RR.
Data analyses for all subjects were approached in 3 ways:
- across all 20 tests from both days,
- across all 10 tests from day 1, and
- across all 10 tests from day 2.
In the ANOVA, the calculations were based on individual RVM and spirometer measurements. For the primary ANOVA analyses, the null hypothesis was that the devices had the same mean (zero bias, [δ = 0]). We performed an equivalence test, as a secondary analysis, with a null hypothesis that the devices differed (Appendix). With a bound on relative error of 15%, the alternative hypothesis was that the mean relative error was zero and the devices agreed (95% CI about the mean lies between −15% and 15%).
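The equivalence criterion described above (the 95% CI about the mean relative error lying between −15% and 15%) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it uses a normal-approximation CI rather than a t-based interval, it treats the supplied errors as independent, and the function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def equivalence_by_ci(rel_errors, bound=0.15, level=0.95):
    """Return the CI of the mean relative error and whether it lies within +/- bound."""
    n = len(rel_errors)
    center = mean(rel_errors)
    sem = stdev(rel_errors) / sqrt(n)            # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    low, high = center - z * sem, center + z * sem
    return (low, high), (-bound < low and high < bound)


# Hypothetical per-subject relative errors (fractions, not study data).
ci_ok, equivalent_ok = equivalence_by_ci([0.02, -0.05, 0.08, 0.01, -0.03, 0.06, 0.04, -0.02])
ci_bad, equivalent_bad = equivalence_by_ci([0.20, 0.25, 0.18, 0.22])
```

In the first example the interval falls well inside ±0.15, so equivalence is concluded; in the second the whole interval lies above the bound, so it is not.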
Calculations for bias, precision, and accuracy were based on the relative errors between RVM and spirometer as follows:
“Bias,” “precision,” and “accuracy” were calculated over different test subsets (all tests, day 1 for all subjects, and day 2 for all subjects) using the following definitions:
- “bias” is the mean value of all errors in the selected subset,
- “precision” is the standard deviation of the errors, and
- “accuracy” is the square-root of the mean-squared-error.
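The three definitions above translate directly into code. The Python sketch below is illustrative; it assumes that the relative error is the RVM reading minus the spirometer reading, divided by the spirometer reading, and that precision is the sample standard deviation. Both are assumptions not stated explicitly in the text, and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_errors(rvm, reference):
    """Relative error of each RVM reading with respect to its reference reading."""
    return [(r - s) / s for r, s in zip(rvm, reference)]


def error_stats(errors):
    """Return (bias, precision, accuracy) for a subset of relative errors."""
    bias = mean(errors)                                 # mean error
    precision = stdev(errors)                           # standard deviation of errors
    accuracy = sqrt(mean([e * e for e in errors]))      # root-mean-squared error
    return bias, precision, accuracy


# Hypothetical MV readings in L/min (not study data).
errors = relative_errors([8.4, 9.5, 10.5], [8.0, 10.0, 10.0])
bias, precision, accuracy = error_stats(errors)
```

Accuracy combines both systematic and random error, which is why it is never smaller in magnitude than the bias.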
In the second study, we analyzed the digital respiratory traces from the RVM and the Morgan spirometer. Using a linear regression, we calculated the correlation coefficient between the 2 respiratory traces for each trial. Figure 2A shows 4 examples of respiratory trace pairs. The data in Figure 2B were generated by binning correlation coefficients corresponding to a given RR and computing the mean and CI for each bin (i.e., the data point at 20 breaths/min was based on all breathing trials with RR between 19.5 and 20.5).
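The per-trial correlation and the binning by RR can be sketched as follows. This Python example is an illustration under assumptions: the two traces are taken to be sampled at the same time points, bins are 1 breath/min wide and centered on integer rates (so that 19.5 to 20.5 maps to 20), and only the bin mean is computed. All names and values are hypothetical.

```python
from collections import defaultdict
from math import floor, sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally sampled traces."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_r_by_rr(trials):
    """Group (rr, r) pairs into integer-centered RR bins and average r per bin."""
    bins = defaultdict(list)
    for rr, r in trials:
        bins[floor(rr + 0.5)].append(r)    # e.g. 19.5 <= rr < 20.5 goes to bin 20
    return {rate: mean(values) for rate, values in sorted(bins.items())}


# Hypothetical volume traces and trial summaries (not study data).
r_example = pearson_r([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
binned = mean_r_by_rr([(19.6, 0.97), (20.4, 0.95), (4.2, 0.98)])
```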
In the third study, we used RVM traces during obstructed breathing trials to calculate 2 TV measurements per-trial per-person: (1) while patients were instructed to breathe into the spirometer and (2) after closing the glottis and continuing attempted breathing. The point at which we assumed the glottis was fully closed was 1 second after the point at which the Morgan stopped registering volume changes above 50 mL. Each subject performed 3 closed-glottis trials, and we calculated 2 subject-averaged TVs. We then performed a paired, 1-sided t test between the subject-averaged TVs to test for statistically significant decrease.
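The paired comparison of subject-averaged TVs can be illustrated with the Python sketch below, which computes the paired t statistic and its degrees of freedom. It is a sketch, not the study's Matlab analysis, and the values are hypothetical. Converting the statistic to a 1-sided P value requires the t distribution, which is not in the Python standard library; a statistics package such as SciPy could be used for that step.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(normal_tv, obstructed_tv):
    """Return (t statistic, degrees of freedom) for paired TV measurements.

    A large positive t supports a decrease from normal to obstructed breathing.
    """
    diffs = [a - b for a, b in zip(normal_tv, obstructed_tv)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


# Hypothetical subject-averaged TVs in mL (not study data).
t_stat, dof = paired_t([1300, 1250, 1400, 1200], [200, 180, 210, 170])
```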
Performance Evaluation over 24 Hours
The primary study simultaneously measured MV, TV, and RR with the RVM and the Wright in 31 subjects in 2 visits (10 trials each) 24 hours apart. We computed the individual trial differences in MV between the Wright and the RVM (y-axis in Fig. 3, A and D, “Difference in MV”) and the best estimate of the actual MV (x-axis in Fig. 3, A and D, “Average MV”) as well as subject averages for day 1 (blue dots) and day 2 (red dots). The Bland-Altman plots in Figure 3, A and D show the distribution of these averaged MV measures. The middle black dashed line shows the average difference (−0.16 L/min for day 1, −0.15 L/min for day 2). The upper and lower dashed lines depict the 95% prediction interval (±2 SD, −1.6 to 1.3 L/min for day 1, −1.7 to 1.4 L/min for day 2). Note that the spread of MV differences (corresponding to the spread of measurement errors) is smaller at lower MVs as demonstrated by the tighter clustering of data around the mean difference line for MVs on the order of 7 to 10 L/min (−0.28 ± 1.13 L/min for MV between 7 and 10 L/min vs 0.01 ± 1.75 for MV >10 L/min, F test, F [31, 27] = 2.38, P = 0.01).
Analogous analyses performed for average TV measurements acquired over a 30-second period are shown in Figure 3, B and E. The average TV difference between the 2 devices across both days is −22.8 mL with 95% CI, −149.7 to 104.1 mL. As with MV, there is tighter clustering of TV errors at lower TVs (<1100 mL). Few subject TV differences exceeded 120 mL; those larger differences all occurred at TVs >1 L and constituted <10% error at these volumes. In addition, all individual average MV or TV measurements had <20% relative error (Fig. 3, A–E, green lines).
The differences between the RVM and the stopwatch-registered RR are shown in Figure 3C. No average subject difference exceeded 0.7 breaths/min, and in fact the population average RR difference was 0.0 breaths/min with 95% prediction interval: −0.2 to 0.2 breaths/min.
No significant difference was found between individual MV, TV, and RR RVM measurements and spirometer MV, TV, and RR measurements (F [1, 30] < 0.15, P > 0.7 for all 3 variables) based on repeated measures ANOVAs. The secondary equivalence test rejected the null hypothesis of nonequivalence and accepted the alternative hypothesis of equivalence. The 95% CI for relative error for MV was (−0.025 to 0.075), for TV was (−0.026 to 0.071), and for RR was (−0.010 to 0.014). All these lie well within the interval (−0.15 to 0.15) and all have associated equivalence test P values of <0.001.
We assessed the accuracy, precision, and bias of the RVM’s measurement of MV, TV, and RR over a 24-hour period. Paired 2-tailed t tests between the distribution of the average errors in MV and TV on day 1 and on day 2 showed no significant differences in the average MV or TV measurement errors across 24 hours (P > 0.6). Note that the change in the average RR bias from day 1 to day 2 (−0.55% vs 0.14%, P = 0.01) is of no clinical importance. The data also show no significant measurement bias of MV, TV, or RR (2.2% and −0.2% ± 0.3%, P > 0.09 in all 3 cases, 2-sided t test). Analogous analyses of precision and accuracy found no significant difference across days for either MV or TV (P > 0.3 in all 4 cases). The average measurement precision was 7.2% and 7.1% for MV and TV, whereas the average accuracy was 9.3% and 9.0%, respectively. Note that the upper 95% confidence limits on the accuracy of MV and TV RVM measurements were 15.6% and 15.3%, respectively. Table 3 summarizes the results.
The accuracy of the RVM’s MV and TV measurements was found to be unrelated to BMI (R2 < 0.006, P > 0.6 by linear regression in both cases). Figure 4, A and B shows individual subject accuracy as a function of BMI. Figure 4, C and D shows no significant correlation between RR and the measurement accuracy (R2 < 0.08, P > 0.1 by linear regression for both MV and TV).
Evaluation of Abnormal Breathing
Figure 2A depicts example traces recorded during low, normal, high, and erratic breathing trials. In each of the depicted examples, the Morgan trace (red) is highly correlated with the RVM trace (blue), r > 0.96 for all 4 cases. Figure 2B shows the correlation coefficients for all 2252 trials, binned as a function of RR, as in Figure 2, C and D. Throughout the range of 4 to 40 bpm and during erratic breathing, RVM measurements are highly correlated with Morgan measurements (r = 0.96, 95% CI, 0.93–0.99). Across all subjects, median correlation coefficients ranged from 0.91 to 0.99. The high correlation between Morgan and RVM measurements confirms that the RVM is capable of continuously monitoring respiratory status during normal and erratic breathing, and at high and low breathing rates.
Figure 5A shows 3 example recordings from 3 different subjects breathing against a closed glottis. The Morgan traces (red) show that regular breathing patterns during the first half of each trial essentially disappear during the second half, confirming that air exchange was ineffective. Note that the RVM trace (blue) accurately measures normal TVs during the first half of each trial, after which the measured TVs decrease drastically. Figure 5B shows a summary of the average TVs measured during normal and obstructed breathing in each subject. The measured TVs decrease systematically and significantly from an average of 1311 mL (95% CI, 1117–1505 mL) to an average of 192 mL (95% CI, 118–265 mL) (P < 0.0001, 1-sided paired t test).
A broad cohort of subjects was recruited and specific studies were selected to evaluate the capabilities of the RVM system over a range of patients and breathing patterns seen in clinical practice. Data demonstrate that the RVM measures MV, TV, and RR in ambulatory subjects over 24 hours with average error <10%. The average measurement accuracy (the square-root of the mean-squared-error between the RVM and a spirometry standard) is 9.3% and 9.0% for MV and TV, respectively (95% CI, <16% for both), whereas the average precision (the standard deviation of this error) is 7.2% and 7.1%, respectively (95% CI, <12% for both). Note that the average volumetric differences (MV and TV) between the RVM and the spirometer are 155 and 28 mL, respectively. It is also noteworthy that the population average RR difference between the RVM measurements and the manual counting of individual breaths by the technician during the 24-hour study was 0.0 breaths/min (95% CI, −0.2 to 0.2 breaths/min), demonstrating that the RVM is extremely accurate in measuring RR. Given the potential application to patients with obesity and OSA, data were analyzed, and the effect of BMI on the accuracy of RVM measurements of MV and TV was found to be not significant (P > 0.6 for both). To evaluate the performance of the device under a range of conditions encountered in the clinical environment, the RVM was tested throughout a range of RRs and during erratic breathing. At 4 and 36 breaths/min (the extremes of the rates anticipated to be encountered in the clinical setting), and during erratic breathing, RVM and Morgan measurements were found to be highly correlated (r = 0.98 ± 0.01 [mean ± 95% CI] at 4 breaths/min, 0.97 ± 0.01 at 36, and 0.93 ± 0.04 during erratic breathing).
Importantly, RVM differentiates obstructive breathing.13,14 As measured with the Morgan closed-cell spirometer, no actual air movement occurred during these attempted breaths against a closed glottis. During the obstructed breathing portion of each trial, unsuccessful respiratory efforts led to artifacts related to air motion within the chest cavity due to maximum respiratory effort against the closed glottis. These artifactual measurements by the RVM were demonstrated to be in the range of anatomic dead space (average of 166.5 mL for this group [1 mL/lb ideal body weight per Radford method])15 and below the range of adequate TV. Although not intuitively obvious, the data support this conclusion and can be attributed to the fact that the fundamental technology and algorithms are based on the change in impedance dZ/dt between the electrode sets. This change in impedance has both chest wall and intrathoracic components. In the situation of chest wall motion without air exchange, the only air that can be moved into the small airways and influence the RVM measurements can come from the large airways, i.e., anatomic dead space. These results suggest that, whereas a health care professional observing a patient may believe chest expansions constitute breathing, a consistently low TV or MV reading could be used as an early indicator of impending respiratory failure. The highly accurate continuous RR measurement also provides substantive clinical value.
Respiratory complications and deaths continue to occur despite existing monitoring. Pulse oximetry is currently the leading technology used to assess respiratory status in nonintubated patients.16 However, oxygen saturation (SpO2) is not an indicator of adequate ventilation,3,4 and SpO2 levels decline only after respiratory decompensation has begun.3,17 With the realization that SpO2 cannot provide an indication of the adequacy of ventilation in a timely fashion, the 2011 American Society of Anesthesiologists’ guidelines require monitoring of not just oxygenation, but also ventilation status. Capnography has been developed to assess respiration through the measurement of carbon dioxide (CO2) tension of expired and inspired gas.7,8 Some reviews suggest the benefits of end-tidal CO2 (PETCO2) measurement in predicting impending respiratory compromise over pulse oximetry, but there are clear disadvantages of reliance on PETCO2.8,18,19
RR monitoring by existing methods fails to indicate ineffective respirations. Opioid administration can be associated with both ineffective obstructed breaths and smaller TVs. With the increasing prevalence of OSA and the ubiquitous use of narcotics in the postoperative period, it is important to monitor not just RR, but the overall effectiveness of breathing. By continuously monitoring MV, the RVM system enables health care providers to observe trends of respiratory status over time, providing information to assist with decision support and the potential for development into a closed-loop system in conjunction with a patient-controlled analgesia pump.
Although MV and TV values are reported on standard ventilators, there is currently no other technology providing this fundamental respiratory information in nonintubated patients. Qualitative clinical signs such as chest excursion, auscultation of breath sounds, and the presence of PETCO2 are useful; however, the addition of RVM measurements of MV and TV permits a more comprehensive and quantitative assessment of respiratory status. The RVM’s capability to continuously report RR with extremely high accuracy can also be useful in patient management. The RVM addresses opinions voiced by organizations focused on patient safety, including the American Society of Anesthesiologists, that quantitative monitoring of respiratory status is superior to qualitative methods.20 With this functionality, RVM-derived respiratory data can provide reproducible and comparable data across patients in a wide range of breathing patterns. This can promote better evaluation of respiratory status in multiple situations, such as in the postanesthesia care units, intensive care units, emergency departments, and during rapid responses.
The use of MV measurements can be demonstrated through prediction of extubation success, when the time it takes a patient to return to baseline MV is inversely proportional to successful extubation.21 Additionally, determination of MV postextubation using RVM could potentially aid in faster determination of the necessity and timing of reintubation, or provide an early indication to institute noninvasive ventilation (NIV), in the form of continuous positive airway pressure or biphasic positive airway pressure. It has been noted that the earlier NIV is initiated, the better the primary outcome of preventing intubation or reintubation.22–26 The RVM’s continuous quantitation of TV and MV could facilitate such earlier decision making. Further analysis of the details of the RVM respiratory curves could prove useful in the prediction and prevention of primary respiratory failure and even aid in the diagnosis of the etiology of the respiratory decompensation. In the growing population of patients with obesity and OSA, erratic breathing patterns are common, especially under the influence of common anesthetics.27 These studies demonstrate that RVM accuracy is not influenced by BMI and that the RVM device maintains tight correlation with the Morgan spirometer during various breathing patterns.
The potential impact of the RVM on workload could provide a cost benefit to hospitals. Multiple studies have shown the proportionality of nursing workload to patient acuity.28–31 Monitoring of vital signs and titration of medications for intensive care patients consumes the majority of nursing time.29 RVM has the ability to streamline time-consuming respiratory assessments and reduce nursing workload and associated costs while simultaneously achieving a higher quality of patient monitoring. Without modification, the device would be expected to be limited in its use in certain patient populations such as those with thoracic burns or flail chest, or after total or partial lung resections.
There are several limitations to this study. Although readings from the RVM system in other studies have been shown to have a high degree of correlation with spirometry in the absence of additional calibration steps,10 in this study, the RVM system was calibrated to the individual patient, which adds a step to the measurement process. To perform this individual calibration, initial 1-minute Wright spirometry and RVM measurements were collected simultaneously and the MV reading from the Wright was incorporated into the RVM system. For the study, subsequent RVM readings on day 1 and day 2 were then compared with Wright spirometer measurements. In clinical use, the RVM device may be calibrated with any Food and Drug Administration–approved spirometer or ventilator to optimize its accuracy. Another limitation is that despite the fact that the study was performed on ambulatory volunteers who had multiple comorbidities and represented a broad sample of the population, studies on hospitalized or postoperative patients were not performed. Several additional studies are underway in postoperative and intensive care patients to evaluate the accuracy and clinical utility of the RVM measurements.
By providing an earlier indication of respiratory compromise, use of the RVM could potentially reduce complications, help prevent intubation, assist in the decision for timely reintubation, impact intensive care unit disposition and/or reduce length of stay. Ongoing postextubation studies are evaluating changes in MV and TV associated with narcotic administration and possible airway obstruction, and the need for respiratory interventions, including NIV with biphasic positive airway pressure or continuous positive airway pressure. To determine whether the RVM can provide an earlier indication of respiratory distress, future studies will relate both RVM and pulse oximetry data to clinical course. Other studies will examine whether RVM data can assist in defining the need and timing of respiratory interventions and improve outcome. These studies and additional outcomes-based studies will ultimately define the ability of the RVM to reduce respiratory-related morbidity and mortality.
In spontaneously breathing subjects, the RVM’s transthoracic impedance measurements of MV and TV have a relative error of <10% (9.3% and 9.6%, respectively), and its measurements of RR have a relative error of 1.8%. If ongoing hospital studies confirm these findings, the RVM may improve patient safety and decrease health care costs.
Name: Christopher Voscopoulos, MD.
Contribution: This author helped write the manuscript.
Attestation: Christopher Voscopoulos approved the final manuscript.
Conflicts of Interest: Christopher Voscopoulos has received travel expenses to conferences from Respiratory Motion, Inc.
Name: Jordan Brayanov, PhD.
Contribution: This author helped design and conduct the study, analyze the data, and write the manuscript.
Attestation: Jordan Brayanov approved the final manuscript.
Conflicts of Interest: Jordan Brayanov worked for and has equity interest in Respiratory Motion, Inc.
Name: Diane Ladd, DNP.
Contribution: This author helped write the manuscript.
Attestation: Diane Ladd approved the final manuscript.
Conflicts of Interest: Diane Ladd consulted for Respiratory Motion, Inc. and has received travel expenses to conferences from Respiratory Motion, Inc.
Name: Michael Lalli, BSE.
Contribution: This author helped design and conduct the study, and analyze the data.
Attestation: Michael Lalli approved the final manuscript.
Conflicts of Interest: Michael Lalli worked for and is a shareholder in Respiratory Motion, Inc.
Name: Alexander Panasyuk, PhD.
Contribution: This author helped analyze the data.
Attestation: Alexander Panasyuk approved the final manuscript.
Conflicts of Interest: Alexander Panasyuk worked for and has equity interest in Respiratory Motion, Inc.
Name: Jenny Freeman, MD.
Contribution: This author helped design and conduct the study, analyze the data, and write the manuscript.
Attestation: Jenny Freeman approved the final manuscript.
Conflicts of Interest: Jenny Freeman worked for and has equity interest in Respiratory Motion, Inc.
This manuscript was handled by: Dwayne R. Westenskow, PhD.
Based on preliminary data from our primary study (24 hours), the standard deviation (σ) of the relative error in minute ventilation, tidal volume, and respiratory rate for an individual subject did not exceed 15%, a value that we will use as a proxy for σ.
Using the paired-difference equivalence test power formula11

$$n = \frac{(z_\alpha + z_\beta)^2 \, \sigma^2}{(B - |\delta|)^2},$$

we calculate the number of subjects n needed to demonstrate that the 2 device measurements are equivalent (δ = 0) within 15% (B = 0.15). To obtain a 2-sided alternative with 90% power we set zα = 1.96, and zβ = 1.65. We can then calculate the required n = 13. This suggests that with 13 subjects we would be able to demonstrate equivalence (with 90% power) if the differences in the measurements between the RVM and the Wright are within 15%. Note that, if we had decided to demonstrate equivalence at a clinically acceptable 20% instead, we would need only 8 subjects. However, to better reflect a broader patient population, we enrolled 40 subjects.
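The sample-size arithmetic can be checked with a short Python sketch. It assumes the normal-approximation form n = (zα + zβ)² σ² / (B − |δ|)², which reproduces both values quoted above when σ = 0.15; the function name is hypothetical. The function returns the unrounded value so that the rounding choice is left to the reader.

```python
def equivalence_sample_size(sigma, bound, delta=0.0, z_alpha=1.96, z_beta=1.65):
    """Unrounded number of subjects for a paired-difference equivalence test.

    sigma -- standard deviation of the relative error
    bound -- equivalence bound B
    delta -- assumed true mean difference (bias)
    """
    margin = bound - abs(delta)
    if margin <= 0:
        raise ValueError("the equivalence bound must exceed the assumed bias")
    return (z_alpha + z_beta) ** 2 * sigma ** 2 / margin ** 2


n_15 = equivalence_sample_size(sigma=0.15, bound=0.15)   # about 13.0
n_20 = equivalence_sample_size(sigma=0.15, bound=0.20)   # about 7.3
```

The first value corresponds to the 13 subjects cited for a 15% bound, and the second, rounded up, to the 8 subjects cited for a 20% bound.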
The 31 subjects who completed the study, each measured 20 times with measurements autocorrelated at r = 0.8, provide a total effective sample size of 76 observations, and thus n = 38 independent observations per arm (variance inflation factor is 1 + 19 [0.8] = 16.2 using compound symmetry formula 2.4112). From pilot data the relative errors for minute ventilation, tidal volume, and respiratory rate measures have standard deviations of ≤0.15. Assuming an equivalence bound, B = 0.15, the equivalence test with null hypothesis, H0, that the devices have different means, has over 90% power to reject H0 and conclude that the devices are equivalent (formula 3.211). With the bias, δ = 0.07, the power is 90%: using the paired-difference equivalence test power formula11 with n denoting effective sample size in each arm:
where the bound B = 0.15, type I error of 5%, δ = 0.07, zα = 1.96, and zβ = 1.65. With no bias (δ = 0) the test has 99% power.
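The stated power values can be approximated with the following Python sketch. It assumes the normal-approximation relation zβ = √n (B − |δ|)/σ − zα, with power taken as the standard normal cumulative probability of zβ. This is an assumed form used for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def equivalence_power(n, sigma, bound, delta=0.0, z_alpha=1.96):
    """Approximate power of a paired-difference equivalence test.

    n     -- effective number of independent observations per arm
    sigma -- standard deviation of the relative error
    bound -- equivalence bound B
    delta -- assumed true mean difference (bias)
    """
    z_beta = sqrt(n) * (bound - abs(delta)) / sigma - z_alpha
    return NormalDist().cdf(z_beta)


power_with_bias = equivalence_power(n=38, sigma=0.15, bound=0.15, delta=0.07)
power_no_bias = equivalence_power(n=38, sigma=0.15, bound=0.15, delta=0.0)
```

Under these assumptions, `power_with_bias` is close to 0.91 and `power_no_bias` exceeds 0.99, in line with the 90% and 99% figures given in the text.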
Appendix Figure 1. Statistics of the measurement errors. Colors are consistent with Figure 3 in the paper. Error bars represent 95% confidence interval (±2 SEM). A, Measurement bias across subjects and across days. The data show no significant difference in the average measurement error across 24 hours in either minute ventilation (MV) or tidal volume (TV) (P > 0.6, paired 2-sided t test). The data also show no significant measurement bias of MV, TV, or respiratory rate (RR) (P > 0.09 in all 3 cases, 2-sided t test). B and C show the population precision and accuracy. The average measurement precision, defined as the standard deviation of the error between the 2 devices, is 7.2% and 7.1% for MV and TV, respectively, whereas the average accuracy (the square root of the mean-squared-error) is 9.3% and 9.0%, respectively. There is no significant difference in either precision or accuracy across days for either MV or TV (P > 0.3 in all 4 cases).