Variable Accuracy of Wearable Heart Rate Monitors during Aerobic Exercise

GILLINOV, STEPHEN; ETIWY, MUHAMMAD; WANG, ROBERT; BLACKBURN, GORDON; PHELAN, DERMOT; GILLINOV, A. MARC; HOUGHTALING, PENNY; JAVADIKASGARI, HODA; DESAI, MILIND Y.

Medicine & Science in Sports & Exercise: August 2017 - Volume 49 - Issue 8 - p 1697–1703
doi: 10.1249/MSS.0000000000001284
APPLIED SCIENCES

Purpose: Athletes and members of the public increasingly rely on wearable HR monitors to guide physical activity and training. The accuracy of newer, optically based monitors is unconfirmed. We sought to assess the accuracy of five optically based HR monitors during various types of aerobic exercise.

Methods: Fifty healthy adult volunteers (mean ± SD age = 38 ± 12 yr, 54% female) completed exercise protocols on a treadmill, a stationary bicycle, and an elliptical trainer (±arm movement). Each participant underwent HR monitoring with an electrocardiograph (ECG), a chest strap monitor (Polar H7), a forearm monitor (Scosche Rhythm+), and two randomly assigned wrist-worn HR monitors (chosen from Apple Watch, Fitbit Blaze, Garmin Forerunner 235, and TomTom Spark Cardio), one on each wrist. For each exercise type, HR was recorded at rest and at light, moderate, and vigorous intensity. Agreement between HR measurements was assessed using Lin's concordance correlation coefficient (rc).

Results: Across all exercise conditions, the chest strap monitor (Polar H7) had the best agreement with ECG (rc = 0.996), followed by the Apple Watch (rc = 0.92), the TomTom Spark (rc = 0.83), and the Garmin Forerunner (rc = 0.81). Scosche Rhythm+ and Fitbit Blaze were less accurate (rc = 0.75 and rc = 0.67, respectively). On the treadmill, all devices performed well (rc = 0.88–0.93) except the Fitbit Blaze (rc = 0.76). While bicycling, only the Garmin, Apple Watch, and Scosche Rhythm+ had acceptable agreement (rc > 0.80). On the elliptical trainer without arm levers, only the Apple Watch was accurate (rc = 0.94). None of the devices was accurate during elliptical trainer use with arm levers (all rc < 0.80).

Conclusion: The accuracy of wearable, optically based HR monitors varies with exercise type and is greatest on the treadmill and lowest on the elliptical trainer. Electrode-containing chest monitors should be used when accurate HR measurement is imperative.

The Heart and Vascular Institute, Cleveland Clinic, Cleveland, OH

Address for correspondence: Milind Y. Desai, M.D., Department of Cardiovascular Medicine, Cleveland Clinic/Desk J1-5, Cleveland, OH 44195; E-mail: desaim2@ccf.org.

Stephen Gillinov and Muhammad Etiwy are co-first authors and contributed equally to the work.

Submitted for publication January 2017.

Accepted for publication March 2017.

Over the last decade, there has been a proliferation of commercially available HR monitors and wearable fitness devices. Targeting a larger audience than the elite athletes who use HR monitoring to inform their training and assess aerobic fitness, companies have entered the market of population health, offering a variety of wearable HR and activity monitoring systems to the public. Annual worldwide sales of such devices are projected to reach 100,000,000 units and $50 billion by 2019 (5,13,14).

Although many consumers purchase these wearable fitness trackers to catalog their HR response to exercise, others use them with the hope that they will improve health via weight loss and/or increased aerobic fitness (3,4,10,12,13). However, surveys document substantial attrition in the use of fitness wearables, with up to one-third of individuals discontinuing their use within 6 months of purchase (12).

In clinical practice, physicians and trainers frequently see patients who report physiologic and behavioral data obtained from their wearable devices; such data often include energy expenditure, steps taken, sleep/wake times, and HR. Controlled studies demonstrate variable accuracy of activity trackers, with error margins approaching 25% for some devices (3,7,10,12,17). A previous study suggests a somewhat lower error margin with selected HR monitors (19).

Current questions regarding the accuracy of wearable HR monitors are particularly relevant, given the recent consumer shift from HR monitors that rely on chest straps with electrodes that measure cardiac electrical activity toward more convenient wrist-worn monitors that use optical sensing technology similar to that used for pulse oximetry. Although the accuracy of chest strap monitors has been confirmed in various previous reports (6,18), there is a paucity of data validating the accuracy of wrist-worn, optically based HR monitors (12). A recent study from our group suggests that wrist-worn monitors fail to provide accurate readings during treadmill exercise; however, that study examined only one form of exercise (treadmill walking/running) and included first generation wrist-worn monitors, one of which is no longer commercially available (19). Broad assessment of the monitors' accuracy is important both for the individuals who rely on these monitors to guide their athletic, physical, and rehabilitative activity and for the physicians to whom these individuals report their HR readings for the purpose of potentially guiding therapy.

The objective of this study was to assess the accuracy of five commonly used, currently commercially available, optically based wearable HR monitors in an appropriately powered study under various forms of aerobic exercise conditions.

METHODS

Participants

This prospective study recruited 50 healthy adults 18 yr or older through fliers and Internet notices at Cleveland Clinic from June 2016 through August 2016. Participants were screened to ensure that they were able to safely perform an 18-min exercise protocol, including a treadmill, an elliptical trainer, and a stationary bicycle. The screening tool was adapted from the National Academy of Sports Medicine's screening questionnaire (11). Study exclusion criteria included known cardiovascular or lung disease, presence of a cardiac pacemaker, treatment with beta-blockers or heart rhythm medications, and self-reported chest pain, dizziness, or loss of balance. The protocol was approved by the Institutional Review Board of the Cleveland Clinic, and all subjects provided written informed consent. The study was registered at clinicaltrials.gov (NCT02818244).

HR Monitors

All participants wore standard ECG leads (Mason-Likar electrode placement of torso-mounted limb leads), a Polar H7 chest strap monitor, and a Scosche Rhythm+ on the forearm. In addition, each participant was randomly assigned by a computer program to wear two different wrist-worn HR monitors, one on each wrist; this enabled the assessment of each type of wrist-worn monitor in 25 subjects. The wrist-worn monitors assessed included Fitbit Blaze (Fitbit), Apple Watch (Apple), Garmin Forerunner 235 (Garmin), and TomTom Spark Cardio (TomTom). Four units of each type of monitor were purchased from retail outlets and studied in random order. Each of these optically based wearable monitors measures HR via an optically obtained plethysmogram that is processed according to proprietary algorithms.

Exercise Protocol

In each subject, HR was assessed using five different monitoring modalities (ECG, Polar H7 chest strap, Scosche Rhythm+, and two different wrist-worn monitors). The Mason–Likar electrode placement allowed the assessment of modified leads I, II, and III on ECG. An aggressive electrode preparation was performed at each site, which included cleansing with alcohol and light abrasion to reduce resistance and optimize signal quality. ECG was monitored on a Quinton Q-tel RMS telemetry system, and hard copy rhythm strips were obtained to measure HR. ECG-based HR was determined by visual assessment under direct supervision by a cardiologist. HR was measured during four different types of exercise at varying intensities: treadmill, stationary bicycle, elliptical trainer with arm levers, and elliptical trainer without arm levers. The order of exercises was assigned randomly.

Exercise protocols for each piece of equipment were as follows:

  • Treadmill
    ○ 2 mph for 1.5 min
    ○ 3.5 mph for 1.5 min
    ○ 6 mph for 1.5 min
  • Stationary bicycle
    ○ 25 W for 1.5 min
    ○ 55 W for 1.5 min
    ○ 125 W for 1.5 min
  • Elliptical (without arm levers)
    ○ Light for 1.5 min: crossramp = 1, resistance = 1, cadence = 60–70 min⁻¹
    ○ Moderate for 1.5 min: crossramp = 1, resistance = 5, cadence = 90–100 min⁻¹
    ○ Vigorous for 1.5 min: crossramp = 10, resistance = 10, cadence = 90–100 min⁻¹
  • Elliptical (with arm levers)
    ○ Light for 1.5 min: crossramp = 1, resistance = 1, cadence = 60–70 min⁻¹
    ○ Moderate for 1.5 min: crossramp = 1, resistance = 5, cadence = 90–100 min⁻¹
    ○ Vigorous for 1.5 min: crossramp = 10, resistance = 10, cadence = 90–100 min⁻¹

The treadmill settings of 2, 3.5, and 6 mph correspond to workloads of 2.5, 3.7, and 10.2 METs, respectively. For a 70-kg individual, the bicycle settings of 25, 55, and 125 W correspond to 2.4, 3.7, and 8.8 METs, respectively. Because there are no standard workload settings for elliptical trainers, we identified three settings that were judged to represent light, moderate, and vigorous activity.

Each subject spent 4.5 min at each of the four exercise stations and then rested for 2 min between different exercise stations; therefore, total exercise time was 18 min, and total time of each trial was 24 min. HR signals for all devices were checked at the beginning of each exercise/rest segment to ensure device function. HR was recorded from HR monitors at the completion of each 1.5 min exercise segment and at the end of each 2 min rest period; preliminary studies in three subjects confirmed that HR had reached a steady state at these time points. At each time point, HR was recorded by two trained research personnel (SMG and ME), one situated on each side of the subject. HR recordings from all devices and ECG were obtained for a period of approximately 5 s. Values were entered into an IRB-approved database.
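As a rough check on the timing described above, the protocol can be written out as a small data structure and the totals recomputed. This is a minimal sketch in Python; the names `PROTOCOL`, `SEGMENT_MIN`, and `REST_MIN` are illustrative assumptions and are not taken from the study software.

```python
# Illustrative encoding of the exercise protocol (names are hypothetical).
SEGMENT_MIN = 1.5   # duration of each intensity stage, in minutes
REST_MIN = 2.0      # rest between consecutive exercise stations, in minutes

PROTOCOL = {
    "treadmill": ["2 mph", "3.5 mph", "6 mph"],
    "stationary bicycle": ["25 W", "55 W", "125 W"],
    "elliptical (without arm levers)": ["light", "moderate", "vigorous"],
    "elliptical (with arm levers)": ["light", "moderate", "vigorous"],
}

# Exercise time: every stage of every station lasts SEGMENT_MIN.
exercise_min = sum(len(stages) * SEGMENT_MIN for stages in PROTOCOL.values())

# Rest time: one rest period between each pair of consecutive stations.
rest_min = (len(PROTOCOL) - 1) * REST_MIN

total_min = exercise_min + rest_min
print(exercise_min, rest_min, total_min)
```

Under these assumptions the totals match the 18 min of exercise and 24 min per trial stated in the text.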

Statistical Methods

Sample size

Sample size was based on the use of Lin's concordance correlation coefficient (rc) to compare HR measurements with wearable, optically based HR monitors to those obtained with the ECG, which is considered the standard (10). On the basis of previous work, we deemed an rc > 0.8 to represent acceptable accuracy in HR measurement (20). Generation of 25 pairs of data for each device (i.e., device and ECG) was necessary to provide 90% power to determine a difference from rc of 0.82 to rc of 0.93.

Analysis plan

Paired differences

Using the ECG-determined HR as the standard, each of the HR monitoring systems was assessed for accuracy by calculating the difference between its reading and the ECG value. The paired differences, both relative (signed) and absolute, were calculated from (HRecg – HRdevice) for each device under the various conditions. The absolute percent differences were calculated as (|HRecg – HRdevice|/HRecg × 100).
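The three difference measures can be illustrated with a short Python sketch. The function name and the readings below are hypothetical and invented for illustration; they are not study data. The example also shows why both measures are useful: signed errors of opposite direction cancel in the mean, whereas absolute errors do not.

```python
from statistics import mean

def paired_differences(hr_ecg, hr_device):
    """Return (signed, absolute, absolute percent) difference for one reading pair."""
    signed = hr_ecg - hr_device              # positive when the device reads low
    absolute = abs(signed)                   # magnitude regardless of direction
    absolute_pct = absolute / hr_ecg * 100   # relative to the ECG standard
    return signed, absolute, absolute_pct

# Hypothetical readings in bpm, invented for illustration only.
ecg = [100, 120, 150, 160]
device = [110, 110, 140, 170]

rows = [paired_differences(e, d) for e, d in zip(ecg, device)]
mean_signed = mean(r[0] for r in rows)        # over- and under-estimates cancel
mean_absolute = mean(r[1] for r in rows)      # average size of the error
mean_absolute_pct = mean(r[2] for r in rows)  # average error as percent of ECG HR
print(mean_signed, mean_absolute, mean_absolute_pct)
```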

Agreement

The Bland–Altman analysis was performed to assess agreement for each device with ECG (2). In addition, Lin's concordance correlation coefficients (rc) and associated 95% confidence intervals were calculated to provide a measure of agreement for each device with ECG. The concordance correlation coefficient (rc) measures the degree to which the paired observations fall on the identity line (9).
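For readers unfamiliar with these two agreement measures, the following Python sketch shows one common way to compute them. It is not the analysis code used in the study (which relied on SAS and R); the function names are hypothetical, the concordance coefficient uses the 1/n moment estimates, and the Bland–Altman limits are taken as the mean difference ± 1.96 sample standard deviations, which is a conventional choice assumed here.

```python
from statistics import mean, stdev

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient using 1/n moment estimates."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # The (mx - my)**2 term penalizes a systematic offset from the identity line.
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)

def bland_altman_limits(reference, device):
    """Return (bias, lower limit, upper limit) of agreement."""
    diffs = [r - d for r, d in zip(reference, device)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical readings in bpm, invented for illustration only.
ecg = [60, 80, 100, 120]
shifted = [hr + 10 for hr in ecg]   # perfectly correlated but offset by 10 bpm
noisy = [58, 83, 99, 116]

print(lin_ccc(ecg, ecg))            # identical readings lie on the identity line
print(lin_ccc(ecg, shifted))        # the offset lowers concordance below 1
print(bland_altman_limits(ecg, noisy))
```

The shifted series has a Pearson correlation of 1 with the reference yet a concordance coefficient below 1, which is why a concordance measure rather than simple correlation is appropriate for assessing agreement.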

Multivariable testing

Repeated-measures mixed model ANOVA was used to test the overall effect of the fitness devices while adjusting for covariates and taking into account multiple measurements for each subject. In addition to HR device and exercise condition (activity type and intensity), factors in the final adjusted model included age, gender, body mass index, wrist size, and days of typical aerobic exercise per week.

Data were analyzed using SAS version 9.4 (SAS Institute Inc., Cary, NC) and R software version 3.2.3 (15).

Presentation

Continuous variables are reported as mean ± SD, with median and percentile values. Categorical variables are reported as percent and frequency.

RESULTS

Subjects

The study randomized 50 subjects (mean ± SD age = 38 ± 12 yr, 27 [54%] females, 6 [12%] non-Whites) (Table 1, Fig. 1). Subjects were examined for the presence of tattoos on the wrist; none had tattoos in this location. All subjects engaged in regular aerobic exercise (including walking), and 82% reported that they exercised regularly to the point of perspiration. Subjects' mean ± SD resting HR on ECG was 86 ± 18 bpm.

FIGURE 1

Aggregate results

Of the 4000 possible HR measurements, 3985 were recorded (99.6%). Across all ECG tracings, there was minimal artifact, and in no situation did ECG artifact interfere with visual HR determination. Missing data were attributable to failure of the device to record HR (eight for Apple Watch, four for Fitbit, two for Scosche Rhythm+, and one for Garmin Forerunner 235).
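The reported counts can be reconciled with a few lines of Python. This is only an arithmetic check using the figures stated above; the variable names are illustrative.

```python
# Missing readings per device, as reported in the text.
possible = 4000
missing = {
    "Apple Watch": 8,
    "Fitbit Blaze": 4,
    "Scosche Rhythm+": 2,
    "Garmin Forerunner 235": 1,
}

recorded = possible - sum(missing.values())
recorded_pct = round(recorded / possible * 100, 1)
print(recorded, recorded_pct)
```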

Measured HR ranged from 51 to 184 bpm. Average differences from the ECG standard were less than 1 bpm for the Polar H7 under all exercise conditions but extended to nearly 20 bpm for other monitors (Table 2). The average differences from the ECG standard were calculated as both relative error (which averages positive and negative differences from the ECG standard) and the absolute value of error, regardless of direction. HR values on the wrist-worn monitors varied from the ECG standard by approximately 2% to nearly 20%, depending on the monitor and the activity (Table 2).

Bland–Altman analysis revealed that all monitors had some measurements that did not reflect HR accurately (Fig. 2); however, this variation was not linked to specific HR values, meaning that variability was not influenced by the HR magnitude. The Apple Watch had 95% of differences fall within −17 and 20 bpm of the ECG, whereas TomTom Spark Cardio and Garmin Forerunner 235 had 95% of values fall within −24 and 31 bpm and −27 and 33 bpm, respectively. The corresponding values for Scosche Rhythm+ and Fitbit Blaze were −31 and 38 bpm and −30 and 45 bpm, respectively.

FIGURE 2

Under all conditions combined, the Polar chest strap had the highest agreement with ECG (rc = 0.99). Among wrist-worn monitors, the Apple Watch performed best with rc = 0.92, followed by the TomTom Spark Cardio (rc = 0.83) and Garmin Forerunner 235 (rc = 0.81). The Scosche Rhythm+ and Fitbit Blaze had rc = 0.75 and rc = 0.67, respectively (Fig. 3).

FIGURE 3

TABLE 1

TABLE 2

The results of the mixed model confirmed that among the optically based HR monitors, the Apple Watch was the most accurate, with no statistical difference from ECG (P = 0.22), even after adjustment for all other factors. The other optically based HR monitors often underestimated the true HR (P < 0.0001). Subject factors (age, gender, body mass index, wrist circumference, and days of typical aerobic exercise per week) were not associated with HR monitor accuracy.

Agreement with ECG during various types of exercise

The Polar H7 Chest Strap performed well during all different aerobic exercise modalities (rc = 0.99), but other HR monitors' agreement with ECG varied with the type of exercise (Table 2). At rest, all monitors had rc > 0.88. With the treadmill, all devices provided acceptable agreement (rc = 0.88–0.93) except the Fitbit Blaze (rc = 0.76). While biking, Garmin Forerunner 235, Apple Watch, and Scosche Rhythm+ had the highest agreement with ECG (rc > 0.80). On the elliptical trainer without using the arm levers, only the Apple Watch provided readings that agreed with the ECG (rc = 0.94). None of the optically based HR monitors provided good agreement with ECG during elliptical trainer use with the arm levers engaged (rc < 0.80).

Although HR monitor agreement with ECG varied with the type of exercise, it did not vary with the intensity of exercise from easy to moderate on each piece of equipment. However, during vigorous exercise, only the Apple Watch maintained agreement with ECG similar to that obtained during less intense exercise; all other monitors had lower agreement during vigorous activity (P < 0.003).

DISCUSSION

The results of this study demonstrate that optically based wearable HR monitors are less accurate than electrode-containing chest strap monitors. In addition, the accuracy of these monitors varies with the type of aerobic activity. These findings raise questions concerning the role of such monitors in individuals' management of their health, assessment of their fitness, and guidance of their fitness regimens.

Introduced in the 1980s, chest strap–based HR monitors function much like an ECG, sensing cardiac electrical activity. Several studies confirm the accuracy of most of these HR monitors under conditions of both rest and moderate exercise (6,8,18). Although chest strap–based HR monitors have been favored by elite athletes because of their proven accuracy, they are relatively inconvenient and have not been widely adopted by the public. By contrast, the recent introduction of convenient, wrist-worn HR monitors that include the capability for wireless transmission has stirred widespread public interest in HR monitoring. However, as reported by major media outlets, individuals' experiences with the newer class of HR monitors suggest that their accuracy may be poor, particularly during exercise (16). This controversy has reached the courtroom in the form of a class action lawsuit alleging that the Fitbit device is inaccurate and potentially harmful (14).

The new wrist-worn HR monitors do not measure cardiac electrical activity; rather, they rely on photoplethysmography. The monitor illuminates the skin with an LED and then measures the amount of light reflected back to a photodiode sensor; this enables detection of variations in blood volume associated with the pulse of blood caused by each cardiac contraction. Potential sources of error with optically based monitors include motion artifact from physical movement, misalignment between the skin and the optical sensor, variations in skin color/tone, ambient light, and poor tissue perfusion (1). The accuracy of such monitors during exercise is controversial, with some studies suggesting that wrist-worn HR monitors perform best at rest or during slow walking, and others asserting good accuracy even during vigorous exercise (1,5). In a recent study examining subjects on a treadmill, we found variable accuracy between different optically based HR monitors; however, when compared with an ECG, the tested monitors all had a concordance correlation coefficient exceeding 0.80 (19).
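The principle of deriving HR from a pulsatile optical signal can be sketched with a toy example in Python. This is purely illustrative: the commercial monitors process the plethysmogram with proprietary algorithms that are not described here, and the clean sine wave, sampling rate, threshold, and function names below are all assumptions. A real signal would also contain the motion and ambient-light artifacts listed above, which simple peak counting does not handle.

```python
import math

def synthetic_ppg(hr_bpm, fs_hz, duration_s):
    """Generate an idealized pulse waveform (a pure sine) at the given heart rate."""
    n = int(fs_hz * duration_s)
    beat_hz = hr_bpm / 60.0
    return [math.sin(2 * math.pi * beat_hz * i / fs_hz) for i in range(n)]

def estimate_hr(signal, fs_hz, threshold=0.5):
    """Estimate HR in bpm from the mean interval between local maxima."""
    peaks = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]
    if len(peaks) < 2:
        return None  # not enough beats detected to measure an interval
    mean_interval_s = (peaks[-1] - peaks[0]) / (len(peaks) - 1) / fs_hz
    return 60.0 / mean_interval_s

# Assumed example: 72 bpm pulse sampled at 50 Hz for 10 s.
signal = synthetic_ppg(hr_bpm=72, fs_hz=50, duration_s=10)
print(estimate_hr(signal, fs_hz=50))
```

Because peak positions are quantized to the sampling grid, the estimate is close to, but not exactly, the simulated rate.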

Extending that work, the current study assessed the performance of wearable HR monitors using varying aerobic exercise modalities (treadmill, stationary bicycle, and elliptical trainer with and without arms) and at different levels of intensity. Recognizing that people engage in a variety of types of exercise beyond walking on a treadmill, the primary purpose of the current study was to assess the monitors' agreement with ECG during different forms of aerobic activity. Distinct from the previous study, the current study enrolled a new cadre of subjects and assessed several monitors that had not previously been tested. Although all monitors performed well in subjects at rest, their accuracy varied with different exercise modalities. Certain monitors were better suited for the stationary bicycle and the elliptical trainer (without arm motion), and this may be a result of variable tolerances for motion artifact associated with different exercises. In particular, none of the optical monitors performed well when assessing HR in subjects using the elliptical trainer with arm motion, likely a result of motion artifact related to arm movement (1). By contrast, a chest strap containing an electrically based monitor provided accurate measurements, regardless of exercise intensity or modality.

Although this study is the largest of its kind and included nearly 4000 HR measurements, it has limitations. The current study methodology (e.g., visual recording of HR on ECG) may have contributed to some error as compared with a more rigorous approach wherein time stamped raw device data were extracted. The results apply only to the HR monitors tested. These monitors were chosen because of their apparent popularity with the public, and each monitor was the manufacturer's most recent offering at the time of the study; however, they represent an opportunistic sample of the wide range of available HR monitors. Continuous HR assessment, which is currently not feasible with all devices, would enable more detailed comparisons. The devices were assessed in young, healthy volunteers exercising in a laboratory setting. Results may vary for different subsets of individuals, including cardiac patients. Although we accounted for participant factors including age and BMI, the relatively narrow distribution of age and BMI in this study of young, healthy volunteers does not enable us to rule out a potential effect of these factors on the accuracy of HR measurement. In addition, these results may not be representative of those obtained during more vigorous exercise or during different activities (e.g., running on pavement, swimming, or other sports participation).

CONCLUSION

This study demonstrates that optically based wrist-worn HR monitors vary in their accuracy and that their accuracy is activity dependent. Individuals who use such monitors should be aware of the possibility of inaccurate measurements and that some monitors (i.e., the Apple Watch) provide greater agreement with ECG than do other monitors. Apparently spurious HR measurements should be confirmed by simple palpation to measure HR or, if readily available, by ECG. When accurate HR monitoring is essential, an electrically based chest strap monitor should be used.

This study was supported by the Mary Elizabeth Holdsworth Fund at the Cleveland Clinic.

The Mary Elizabeth Holdsworth Fund had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

There are no relevant conflicts of interest to disclose.

The results of the present study do not constitute endorsement by the American College of Sports Medicine. The results of the study are presented clearly, honestly, and without fabrication, falsification, or inappropriate data manipulation.

REFERENCES

1. Alzahrani A, Hu S, Azorin-Peris V, et al. A multi-channel opto-electronic sensor to accurately monitor heart rate against motion artefact during exercise. Sensors (Basel). 2015;15(10):25681–702.
2. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–10.
3. Case MA, Burwick HA, Volpp KG, Patel MS. Accuracy of smartphone applications and wearable devices for tracking physical activity data. JAMA. 2015;313(6):625–6.
4. Diaz KM, Krupka DJ, Chang MJ, et al. Fitbit®: an accurate and reliable device for wireless physical activity tracking. Int J Cardiol. 2015;185:138–40.
5. El-Amrawy F, Nounou MI. Are currently available wearable devices for activity tracking and heart rate monitoring accurate, precise, and medically beneficial? Healthc Inform Res. 2015;21(4):315–20.
6. Laukkanen RM, Virtanen PK. Heart rate monitors: state of the art. J Sports Sci. 1998;16(Suppl):S3–7.
7. Lee JM, Kim Y, Welk GJ. Validity of consumer-based physical activity monitors. Med Sci Sports Exerc. 2014;46(9):1840–8.
8. Léger L, Thivierge M. Heart rate monitors: validity, stability, and functionality. Phys Sportsmed. 1988;16(5):143–51.
9. Lin LI. A concordance correlation coefficient to evaluate reproducibility. Biometrics. 1989;45(1):255–68.
10. Murakami H, Kawakami R, Nakae S, et al. Accuracy of wearable devices for estimating total energy expenditure: comparison with metabolic chamber and doubly labeled water method. JAMA Intern Med. 2016;176(5):702–3.
11. National Academy of Sports Medicine Data Collection Sheet [cited 2016 April 5]. Available from: http://www.nasm.org/docs/pdf/nasm:par-q-(pdf-21k).pdf.
12. Patel MS, Asch DA, Volpp KG. Wearable devices as facilitators, not drivers, of health behavior change. JAMA. 2015;313(5):459–60.
13. Piwek L, Ellis DA, Andrews S, Joinson A. The rise of consumer health wearables: promises and barriers. PLoS Med. 2016;13(2):e1001953.
14. Profils S. Do wristband heart trackers actually work? A checkup [cited 2016 April 4]. Available from: http://www.cnet.com/news/how-accurate-are-wristband-heart-rate-monitors/.
15. R package epiR for calculating concordance correlation coefficients [cited 2016 April 4]. Available from: http://cran.r-project.org/web/packages/epiR.
16. Stern J. Fitness bands with heart-rate tracking are missing a beat. Wall Street Journal. December 16, 2014.
17. Swan M. Emerging patient-driven health care models: an examination of health social networks, consumer personalized medicine and quantified self-tracking. Int J Environ Res Public Health. 2009;6:492–525.
18. Terbizan DJ, Dolezal BA, Albano C. Validity of seven commercially available heart rate monitors. Measurement in Physical Education and Exercise Science. 2002;6(4):243–7.
19. Wang R, Blackburn G, Desai M, et al. Accuracy of wrist-worn heart rate monitors. JAMA Cardio. 2016;2(1):104–6.

Keywords: WEARABLE HEART RATE MONITORS; ACCURACY; RANDOMIZED TRIAL

© 2017 American College of Sports Medicine