Original Research

Validation of Heart Rate Monitor-Based Predictions of Oxygen Uptake and Energy Expenditure

Montgomery, Paul G1,2; Green, Daniel J1; Etxebarria, Naroa1; Pyne, David B1,3; Saunders, Philo U1; Minahan, Clare L2

Journal of Strength and Conditioning Research: August 2009 - Volume 23 - Issue 5 - p 1489-1495
doi: 10.1519/JSC.0b013e3181a39277



Introduction

The measurement of oxygen consumption (V̇O2) and energy expenditure in the laboratory is well established for individual endurance-type sports such as running, cycling, and rowing. In contrast, measuring or monitoring changes in V̇O2 specific to a team-based field or court sport is technically difficult. Information on the changes in V̇O2 and energy expenditure during play would give coaches and scientists insight into the metabolic demands of the sport and the adaptations that occur with training. Moreover, laboratory determination of V̇O2 is time consuming in the context of a busy training schedule, and testing an entire team is financially challenging.

The accuracy (typical error [TE]) of laboratory-based V̇O2 systems should be <3% (26). A comprehensive review of the relevant literature shows that many of these systems, if not calibrated correctly, show error values up to 12% (18). However, the growing awareness of test reliability, along with the guidelines of acceptable tolerances (26), highlights the need for scientists to quantify the level of accuracy in laboratory equipment. Portable metabolic measurement systems have been developed and validated (4,6,17,18); however, these have their limitations and are generally impractical for team-based sports. Several studies have shown a high degree of error, from 2 to 22%, across low- and high-intensity workloads (1,18). Errors in these systems arise from both mechanical and sampling issues and the inherent biological variability in subjects from test to test.

Heart rate (HR) is a reasonable surrogate measure of V̇O2 and energy expenditure, given its linear relationship with these parameters at submaximal exercise intensities (5). A correlation coefficient of 0.91 showed that after adjusting for age, gender, body mass, and fitness, it is possible to estimate the energy expenditure during physical activity from HR values (15). However, the estimation of V̇O2 and energy expenditure from HR values is limited in the team sport setting, where steady state conditions are infrequent. The recent development of HR monitoring systems, which incorporate algorithm-based predictive software to assess V̇O2 and energy expenditure, is appealing to many sport practitioners. Heart rate-based monitoring of these variables could be useful in quantifying physiological responses to the training and competitive environment. However, the reliability and validity of these systems with highly trained athletes need independent evaluation before widespread use.

The Suunto personal HR monitoring system includes software for estimation of V̇O2 and energy expenditure based on methods developed by Firstbeat Technologies, Ltd (Jyväskylä, Finland). In brief, neural networks were constructed to estimate V̇O2 and energy expenditure from R-R heart beat intervals, R-R-derived respiration rate, and the on-and-off V̇O2 dynamics during various exercise conditions (8,21-24). Although the investigators acknowledge the limitations in the accuracy of the predictions when individual values for maximal HR and V̇O2 are included, they give little information on the validity against pulmonary gas exchange values or correction factors to account for variation in these estimates. Recently, evaluation of the Firstbeat software in predicting V̇O2 and energy expenditure across 25 low- to high-intensity daily tasks revealed a mean underprediction of 1.5 ml·kg−1·min−1 and 27 kcal (113 kJ) (25). No substantial difference was observed in the low-intensity tasks, with the variation increasing to 3.5 and 2.2 ml·kg−1·min−1 for the moderate- and high-intensity tasks, respectively. Although these findings are informative, it is unclear whether these magnitudes of variation in predictive capacity are maintained at the higher exercise intensities commonly undertaken by elite athletes. The purpose of this study was to determine the validity and variation of the Suunto HR system compared with pulmonary gas exchange values for the estimation of V̇O2 and energy expenditure during submaximal- and maximal-intensity treadmill running in well-trained runners.


Methods

Experimental Approach to the Problem

Each participant completed a 2-component (submaximal and maximal) treadmill running test where pulmonary gas exchange was measured and recorded over 30-second intervals throughout the testing period. We used an open-circuit, computerized, metabolic cart comprising Ametek O2 and CO2 analyzers as described previously by Pierce et al. (20). The analyzers were calibrated with 3 α gases of known concentration (BOC Gases Australia) before each test. Calibration was accepted at ±0.03% of the target value. The accuracy of the analysis system was compared with an automated V̇O2 calibrator for open-circuit indirect calorimetric systems (11). Estimated V̇O2 values were within ±5% as specified by guidelines of the National Sport Science Quality Assurance Program (Australian Sports Commission, Canberra, Australia). During the test, each participant wore a commercially available HR monitoring device (Suunto; Vantaa, Finland), and HR was recorded continuously during the entire testing period. The peak HR was recorded every 30 seconds during each component. The validity of the Suunto software for estimating V̇O2 and energy expenditure was assessed against the criterion values of the metabolic cart. Three levels of the Suunto software analysis were evaluated. The first level of analysis required the input of the participant's basic personal information (BI) of age, body mass, height, gender, and level of activity. The software then predicted maximal HR and V̇O2. The second level of analysis used the same basic personal information with the addition of the maximal HR (BIhr) as determined from the treadmill test. The third level of analysis added the laboratory-measured V̇O2peak to the maximal HR and basic personal information (BIhr + v). In total, the HR recordings for each participant's test were analyzed 3 times to determine any improvement in accuracy for the software estimations.
Estimations of energy expenditure were calculated from equations and tables of energy equivalents for the oxidation of fat-carbohydrate mixtures as described previously (7).
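
As an illustration of how energy expenditure can be derived from gas exchange data, the following minimal sketch converts measured gas volumes into kilojoules. It does not reproduce the energy-equivalent tables of reference 7; it substitutes the abbreviated Weir equation (without the urinary nitrogen term), which is a different but widely used indirect calorimetry formula. The function names and the example values are hypothetical.

```python
# Illustrative sketch only: the abbreviated Weir equation is used here as a
# stand-in for the energy-equivalent tables cited in the text (reference 7).

KJ_PER_KCAL = 4.184  # thermochemical calorie conversion


def weir_kcal_per_min(vo2_l_min, vco2_l_min):
    """Energy expenditure rate (kcal/min) from VO2 and VCO2 in L/min."""
    return 3.941 * vo2_l_min + 1.106 * vco2_l_min


def stage_energy_kj(vo2_l_min, vco2_l_min, minutes):
    """Total energy (kJ) for a stage of the given duration at steady gas exchange."""
    return weir_kcal_per_min(vo2_l_min, vco2_l_min) * minutes * KJ_PER_KCAL


# Hypothetical 4-minute stage: VO2 = 3.0 L/min, VCO2 = 2.7 L/min
print(round(stage_energy_kj(3.0, 2.7, 4.0), 1), "kJ")
```

Absolute gas volumes (L·min−1) are needed for this calculation, so relative values in ml·kg−1·min−1 would first have to be multiplied by body mass and divided by 1000.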


Subjects

Ten male (age 29.8 ± 4.3 years [mean ± SD], body mass 70.0 ± 7.7 kg, height 1.79 ± 0.51 m, V̇O2peak 65.9 ± 9.7 ml·kg−1·min−1, and maximum HR 189 ± 8 b·min−1) and 7 female (25.6 ± 3.6 years, 59.6 ± 2.9 kg, 1.69 ± 0.39 m, 57.0 ± 4.2 ml·kg−1·min−1, and 189 ± 11 b·min−1) well-trained runners, who had been training continuously for the previous 6 months, volunteered to participate in the study. The study was approved by the Ethics Committee of the Australian Institute of Sport, and all subjects were verbally informed of the study requirements and signed an informed consent document before commencement.


Procedures

Subjects were required to fast overnight before performing submaximal and maximal treadmill testing (0600-0800 hours). At the beginning of the test, all subjects were seated passively (semisupine) on the treadmill for a 5-minute period while resting expired gases and HR were recorded. The treadmill protocol included 2 exercise components: (a) submaximal, a series of at least five 4-minute exercise intervals (stages) performed below the individual's gas exchange threshold, and (b) maximal, a short incremental run to exhaustion. Depending on the participant's running ability, the first stage of the submaximal component commenced at a predetermined running speed (range 9-15 km·h−1), at a set gradient of 1%. The predetermined running speed was set at an intensity that would allow the subject to complete at least five 4-minute submaximal stages of increasing intensity and reach a blood lactate concentration of 4 mmol·L−1. A 1-minute rest period was taken between stages for collection of a capillary blood sample. Running speed for each subsequent stage increased by 1 km·h−1. At the completion of the submaximal component, subjects kept the mouthpiece in place and remained seated on the treadmill while expired gases were collected for a further 10 minutes. After the 10-minute period, subjects were allowed 1 minute to prepare for the maximal component of the test, which consisted of 1-minute stages commencing at the same initial running speed as the submaximal component. Running speed was increased by 1 km·h−1 every minute until volitional fatigue, with the treadmill gradient held constant at 1% for the duration of the test. After the maximal component of the test, subjects remained seated with the mouthpiece in place for another 10 minutes.
Data from the last 60 seconds of each of the 4-minute submaximal stages were used to determine the associated steady-state O2 consumption, and V̇O2peak was calculated from the highest value recorded during any 60 seconds of the maximal running component.
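
The data reduction described above can be sketched as follows, assuming that gas exchange is available as a list of consecutive 30-second values and that "any 60 seconds" means any two adjacent 30-second samples. Both assumptions, the function names, and the sample values are illustrative and not taken from the study.

```python
from statistics import mean


def steady_state(stage_samples):
    """Mean of the final two 30-second samples (last 60 s) of a submaximal stage."""
    if len(stage_samples) < 2:
        raise ValueError("need at least two 30-second samples")
    return mean(stage_samples[-2:])


def peak_60s(samples):
    """Highest mean of any two consecutive 30-second samples (a 60-second window)."""
    if len(samples) < 2:
        raise ValueError("need at least two 30-second samples")
    return max(mean(pair) for pair in zip(samples, samples[1:]))


# Made-up 30-second oxygen uptake values (ml/kg/min)
submax_stage = [38.0, 41.5, 43.0, 43.6, 44.0, 44.2, 44.1, 44.3]  # one 4-minute stage
maximal_run = [45.0, 52.0, 58.5, 62.0, 64.5, 65.5, 66.1, 65.9]

print(round(steady_state(submax_stage), 1))
print(round(peak_60s(maximal_run), 1))
```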

Statistical Analyses

Simple descriptive statistics are reported as mean and SD. Raw values for V̇O2 and energy expenditure from the metabolic cart and Suunto were log transformed to account for any nonuniformity of effects and errors. Validity was expressed as the standard error of the estimate (SEE) and the coefficient of variation (CV). Reliability was determined from 8 subjects who completed a retest within 7 days of their initial test; the results were expressed as TE and CV. Precision of estimation was indicated with 90% confidence limits where applicable. Bias between the practical and criterion measures was assessed by linear regression. The correlations between the criterion and predicted measurements were calculated with a Pearson correlation coefficient and expressed as an r value. The criteria for interpreting the magnitude of correlation were as follows: r < 0.1, trivial; r = 0.1-0.3, small; r = 0.3-0.5, moderate; and r > 0.5, large. The smallest worthwhile change in an outcome measurement was established with a small effect size (0.2 × between-subject SD) as described previously (14).
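
The reliability and validity statistics named above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis: the typical error is taken as the SD of the test-retest differences divided by √2, the percent CV is back-transformed from natural-log data, and the SEE comes from an ordinary least-squares regression of the criterion on the predicted values. All function names and data are hypothetical.

```python
import math
from statistics import mean, stdev


def typical_error(test, retest):
    """Typical error: SD of the paired differences divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(test, retest)]
    return stdev(diffs) / math.sqrt(2)


def typical_error_cv(test, retest):
    """Typical error as a percent CV, computed on natural-log transformed values."""
    te_log = typical_error([math.log(x) for x in test],
                           [math.log(x) for x in retest])
    return 100 * (math.exp(te_log) - 1)


def smallest_worthwhile_change(values):
    """Small effect size: 0.2 x between-subject SD."""
    return 0.2 * stdev(values)


def standard_error_of_estimate(predicted, criterion):
    """SEE from a least-squares regression of criterion on predicted values."""
    n = len(predicted)
    mx, my = mean(predicted), mean(criterion)
    sxx = sum((x - mx) ** 2 for x in predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(predicted, criterion))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(predicted, criterion))
    return math.sqrt(ss_res / (n - 2))  # n - 2 degrees of freedom


# Made-up test-retest values (ml/kg/min) for four subjects
test = [50.0, 52.0, 54.0, 56.0]
retest = [51.0, 52.0, 55.0, 56.0]
print(round(typical_error(test, retest), 3))
print(round(typical_error_cv(test, retest), 2))
print(round(smallest_worthwhile_change(test), 3))
```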


Results

The validity of the predicted V̇O2 measurements from the Suunto system, expressed with 90% confidence limits, is shown in Table 1. The validity of the V̇O2 and energy expenditure values predicted by the Suunto system improved across the 3 levels of analysis with the sequential addition of the measured physiological information. There was little difference between the V̇O2 estimates for BI, BIhr, and BIhr + v and the corresponding percent coefficient of variation (%CV). The degree of bias compared with the criterion V̇O2 showed an underestimation by the Suunto system, with the difference improving from −10.9% for BI to −3.9% for BIhr and −0.4% for BIhr + v.

Table 1:
SEE and CV in V̇O2 measures across 3 levels of analysis within the Suunto software compared with the criterion measures from the metabolic cart.*†

The validity of the predicted energy expenditure measurements from the Suunto system is shown in Table 2. Values of energy expenditure generated by the software were also underestimated in comparison with criterion gas analysis. The mean error of the estimated energy expenditure, compared with the criterion gas measure, showed small improvements from BI to BIhr and BIhr + v with corresponding %CV of 13.6, 12.2, and 12.7%, respectively. The degree of bias compared with the criterion showed an improvement in the underestimation over the 3 levels of analysis.

Table 2:
SEE and CV in energy expenditure measures across 3 levels of analysis within the Suunto software compared with the criterion measures of the metabolic cart.*†

The reliability of the Suunto system for V̇O2 expressed as the TE is shown in Table 1. Small improvements in TE were seen across the 3 levels of analysis of BI, BIhr, and BIhr + v, with TE values of 0.64, 0.72, and 0.57 ml·kg−1·min−1, respectively, and corresponding %CV of 1.4, 1.8, and 1.3. For energy expenditure, there were small improvements in the TE values of 1.49, 2.70, and 1.38 kJ for BI, BIhr, and BIhr + v, with little difference in the corresponding %CV values of 2.3, 4.3, and 2.3.


Discussion

Providing key physiological information on V̇O2 and energy expenditure has great appeal for practitioners monitoring long-term changes in athletes. This study has shown that during submaximal- and maximal-intensity treadmill running, the estimates of V̇O2 from the Suunto HR system typically vary by ∼6% in comparison with criterion measures of a calibrated expired gas analysis system. This relative inaccuracy can be reduced when known (measured) maximal values of HR and V̇O2 are entered into the software analysis. However, even with the addition of measured HR and V̇O2 values, the level of error is inferior to laboratory-based methods. Estimates derived via the Suunto system are therefore not directly interchangeable with those from laboratory-based analysis systems. Nevertheless, the Suunto system should be useful for identifying moderate or large changes in oxygen demand and energy expenditure in field settings.

The smallest worthwhile change concept is useful for assessing the practical or clinical significance of effects in a sports setting (14). Quantifying the test-retest reliability of a performance test or measurement tool generates the TE. The TE permits the utility of a test to be interpreted via the signal to noise ratio. Where the TE (noise) is less than the smallest worthwhile change (signal), the ability of the tool to detect worthwhile change is good. Conversely, if the TE is substantially greater than the smallest worthwhile change, a researcher or practitioner is less confident in detecting worthwhile changes in the laboratory or field. On this basis, given a TE of ∼0.6 ml·kg−1·min−1 and a smallest worthwhile change of ∼2.3 ml·kg−1·min−1, it follows that the Suunto system is useful for identifying moderate or large changes in estimated V̇O2. However, the margin of error is too large for a practitioner to be confident of detecting subtle (but worthwhile) changes observed in a highly trained athlete during serial monitoring. Similarly, the TE of ∼2.1 kJ and smallest worthwhile change of ∼1.7 kJ for energy expenditure indicate that the system is not as useful in predicting energy expenditure.
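
The noise-versus-signal comparison in the preceding paragraph reduces to a simple check, sketched here with the approximate TE and smallest worthwhile change values quoted in the text. The function names are hypothetical, and the strict "less than" threshold is one plausible reading of the rule rather than a prescribed criterion.

```python
def noise_to_signal(typical_error, smallest_worthwhile_change):
    """Ratio of noise (TE) to signal (smallest worthwhile change)."""
    return typical_error / smallest_worthwhile_change


def detects_worthwhile_change(typical_error, smallest_worthwhile_change):
    """True when the noise is smaller than the signal it must resolve."""
    return typical_error < smallest_worthwhile_change


# Approximate values quoted in the text
measures = {
    "oxygen uptake (ml/kg/min)": (0.6, 2.3),
    "energy expenditure (kJ)": (2.1, 1.7),
}

for name, (te, swc) in measures.items():
    ratio = noise_to_signal(te, swc)
    verdict = "good" if detects_worthwhile_change(te, swc) else "poor"
    print(f"{name}: TE/SWC = {ratio:.2f}, ability to detect worthwhile change is {verdict}")
```

With these inputs the ratio falls below 1 for oxygen uptake and above 1 for energy expenditure, which matches the contrast drawn in the text.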

In a team-based situation, the Suunto system may be adequate for assessing the moderate to large changes in within-player fitness measures taken at preseason through competition and off-season periods. Substantial changes in V̇O2 have been observed during these training phases in various team sports (2,9,12,13,16). The system should also have utility in assessing the physiological responses and categorizing the energy demands of various field (or court) training sessions. Distinguishing low-, moderate-, and high-intensity training drills is useful information for conditioning coaches in team-based settings, as it allows training to be modified according to the accumulated load and intensity experienced during a series of drills. The Suunto system has acceptable reliability, which allows practitioners to compare drills or sessions that have the same temporal and training characteristics, for differences in intensity and physiological demand. The system may also have utility in monitoring the long-term development of the aerobic capacity of junior athletes as they progress to senior levels and for injured players undertaking rehabilitation programs. Given the limited amount of technology available for use in the field for team sports, the Suunto system seems to be an advanced method of quantifying activity compared with current practices.

The utility of the Suunto system can be improved by providing additional individual athlete inputs to the software before estimation of V̇O2 and energy expenditure values. However, practitioners need to account for the amount of bias in the estimates. At the basic personal information level, there is substantial uncertainty (up to 10%) in the precision of the estimates of V̇O2 and energy expenditure. The large amount of bias associated with the basic level decreases the confidence in the results, but as the bias improves across the additional levels, the results have a higher degree of utility. Presumably, the error associated with the basic level relates to the quality and quantity of the subjects in the original studies on which the algorithms are based (23,24). The use of a more homogeneous, highly trained subject group may decrease the amount of error. The outcomes from the subsequent levels of analysis highlight the close relationship between HR and V̇O2 (3): the prediction agrees more closely with the criterion measure when the measured values of these variables are included. There is only minimal improvement with the addition of the measured V̇O2 values (Figure 1). Given that the software only requires a single figure for maximal values of HR and V̇O2, determined at the conclusion of the maximal test, it is of interest that the software is able to make relatively accurate predictions of submaximal V̇O2 values across several running speeds.

Figure 1:
V̇O2 comparisons for gas analysis (▪) and Suunto software (▴). The top panel (BI) represents basic personal information of age, body mass, height, gender, and level of activity. The middle panel (BIhr) represents the addition of the measured maximal HR to basic personal information. The bottom panel (BIhr + v) represents the inclusion of the measured maximal HR and V̇O2. Error bars are SD. Software predictions for V̇O2 are underestimated across all running speeds when only basic information is included. HR = heart rate.

Our estimates for V̇O2 at higher intensity running are in agreement with previous reports of V̇O2 for “high-intensity” daily tasks (25). We observed that the values for higher intensity running are underestimated by ∼2 ml·kg−1·min−1. In general settings, this margin of error would be acceptable as the broad categorization of the energy cost for daily tasks may not require the precision of laboratory-based measures and would allow practitioners involved with energy balance to make suitable dietary or activity-related decisions. A moderate degree of accuracy may be acceptable for those practitioners and researchers who are interested in the demands of lower level activities or daily living activities. However, given that the Suunto system is marketed as a sports training tool, the benefit for higher level recreational and elite athletes is somewhat limited for assessing small serial changes in physiological measures during a training season.

Energy expenditure in the current study was underestimated during the submaximal component of the treadmill test across all levels of analysis. Although the SEEs of energy expenditure showed large improvements over the 3 levels of analysis, the degree of bias did not improve, showing an underestimation against the criterion measure. Similarly, the CV did not change substantially with the inclusion of all measured variables (Figure 2). This finding demonstrates the (in)effectiveness of the algorithms when all measured information is included; the large variation at the basic level of assessment has implications for those using the system in the absence of measured maximal values.

Figure 2:
Energy expenditure (kJ) comparisons for gas analysis (▪) and Suunto software (▴). The top panel (BI) represents basic personal information of age, body mass, height, gender, and level of activity. The middle panel (BIhr) represents the addition of the measured maximal HR to basic personal information. The bottom panel (BIhr + v) represents the inclusion of the measured maximal HR and V̇O2. Error bars are SD. HR = heart rate.

During the maximal stage of the test, energy expenditure was overestimated at the initial running speeds. One possible explanation is that HR was slightly elevated in response to the previous submaximal component. We observed an elevation in HR of ∼5 b·min−1 (3%) in the first 4 stages of the maximal component compared with the corresponding stages (i.e., same speeds) of the submaximal component. Although we stipulated a 10-minute rest period between the submaximal and maximal components of the test, this was insufficient to reestablish a resting HR. Even though subjects were permitted to drink ad libitum between stages, elevated HR may reflect cardiac drift associated with dehydration (19) or an altered autonomic tone (10) carried over from the previous submaximal component. These ancillary elevations in HR are a limitation of the Suunto system, as misleading estimations of energy expenditure could be made from HR values not necessarily associated with the underlying exercise demand.

Practical Applications

The Suunto HR monitoring system is useful for field-based estimations of V̇O2 and, to a lesser extent, energy expenditure. However, the Suunto HR monitoring system lacks the precision needed to be a viable alternative to a calibrated metabolic cart in the laboratory setting. Although the system shows a high degree of reliability in a test-retest situation with a TE of ∼1.5%, the magnitude of error shown in this study of ∼6% compared with an accurate metabolic cart is outside the acceptable limits of <3% for quantifying small changes or differences in energy demand. The estimation of energy expenditure seems more problematic with error values of ∼13%. The system may have some utility in field-based assessments of gross changes in aerobic capacity or assessing the energy demands of a training/competition session, as a practitioner can be assured of the accuracy between sessions. Although the measures provided by the Suunto software are underestimated in comparison with criterion measures of V̇O2 and energy expenditure, these predictions are certainly no worse than information reported for portable V̇O2 systems. The Suunto system appears suitable for assessing energy balance estimations in daily tasks; however, practitioners should be aware of the bias associated with the software and account for this in their reporting and exercise programming. The ease of use of the system, with telemetry functionality, facilitates immediate feedback of changes in both V̇O2 and energy expenditure, which should provide new insight into field-based assessments.


Acknowledgments

The authors thank the subjects for their time and effort expended in the experimental procedures of the study. Grateful appreciation is also given to the National Sport Science Quality Assurance Program (Australian Sports Commission, Canberra, Australia) for their financial support toward the completion of this study.


References

1. Astorino, TA, Tam, PA, Rietschel, JC, Johnson, SM, and Freedman, TP. Changes in physical fitness parameters during a competitive field hockey season. J Strength Cond Res 18: 850-854, 2004.
2. Atkinson, G, Davison, RC, and Nevill, AM. Performance characteristics of gas analysis systems: what we know and what we need to know. Int J Sports Med 26(Suppl 1): S2-S10, 2005.
3. Berggren, G and Christensen, EH. Heart rate and body temperature as indices of metabolic rate during work. Eur J Appl Physiol 14: 255-260, 1950.
4. Brehm, MA, Harlaar, J, and Groepenhof, H. Validation of the portable VmaxST system for oxygen-uptake measurement. Gait Posture 20: 67-73, 2004.
5. Brooks, GA, Fahey, TD, White, TP, and Baldwin, KM. Cardiovascular dynamics during exercise. In: Exercise Physiology: Human Bioenergetics and Its Applications. Mountain View, CA: Mayfield Publishing Company, 2000. pp. 317-339.
6. Duffield, R, Dawson, B, Pinnington, HC, and Wong, P. Accuracy and reliability of a Cosmed K4b2 portable gas analysis system. J Sci Med Sport 7: 11-22, 2004.
7. Elia, M and Livesey, G. Energy expenditure and fuel selection in biological systems: the theory and practice of calculations based on indirect calorimetry and tracer methods. In: Metabolic control of eating, energy expenditure, and the bioenergetics of obesity. Simopoulos, AP, ed, New York: Karger, 1992. pp. 68-131.
8. Firstbeat. VO2 estimation method based on heart rate measurement. Firstbeat Technologies Ltd, White paper, Jyväskylä, 2005.
9. Gabbett, TJ. Changes in physiological and anthropometric characteristics of rugby league players during a competitive season. J Strength Cond Res 19: 400-408, 2005.
10. Goldberger, JJ, Kannankeril, J, Le, FK, and Kadish, AH. Characteristics of heart rate recovery after maximal exercise. J Am Coll Cardiol 39(Suppl 1): 100, 2002.
11. Gore, CJ, Catcheside, PG, French, SN, Bennett, JM, and Laforgia, J. Automated VO2max calibrator for open-circuit indirect calorimetry systems. Med Sci Sports Exerc 29: 1095-1103, 1997.
12. Gorostiaga, EM, Granados, C, Ibanez, J, Gonzalez-Badillo, JJ, and Izquierdo, M. Effects of an entire season on physical fitness changes in elite male handball players. Med Sci Sports Exerc 38: 357-366, 2006.
13. Hakkinen, K and Sinnemaki, P. Changes in physical fitness profile during the competitive season in elite bandy players. J Sports Med Phys Fitness 31: 37-43, 1991.
14. Hopkins, WG. Measures of reliability in sports medicine and science. Sports Med 30: 1-15, 2000.
15. Keytel, LR, Goedecke, JH, Noakes, TD, Hiiloskorpi, H, Laukkanen, R, van der Merwe, L, and Lambert, EV. Prediction of energy expenditure from heart rate monitoring during submaximal exercise. J Sports Sci 23: 289-297, 2005.
16. Koutedakis, Y. Seasonal variation in fitness parameters in competitive athletes. Sports Med 19: 373-392, 1995.
17. Lucia, A, Fleck, SJ, Gotshall, RW, and Kearney, JT. Validity and reliability of the Cosmed K2 instrument. Int J Sports Med 14: 380-386, 1993.
18. Macfarlane, DJ. Automated metabolic gas analysis systems: a review. Sports Med 31: 841-861, 2001.
19. Montain, SJ and Coyle, EF. Influence of graded dehydration on hyperthermia and cardiovascular drift during exercise. J Appl Physiol 73: 1340-1350, 1992.
20. Pierce, SJ, Hahn, AG, Davie, A, and Lawton, EW. Prolonged incremental tests do not necessarily compromise VO2max in well-trained athletes. J Sci Med Sport 2: 356-363, 1999.
21. Pulkkinen, A, Kettunen, J, Saalasti, S, and Rusko, HK. Accuracy of VO2 estimation increases with heart period derived measure of respiration. [Abstract]. Med Sci Sports Exerc 35: 192, 2003.
22. Pulkkinen, A, Kettunen, J, Martinmäki, K, Saalasti, S, and Rusko, HK. On- and off dynamics and respiration rate enhance the accuracy of heart rate based VO2 estimation. [Abstract]. Med Sci Sports Exerc 36: 253, 2004a.
23. Pulkkinen, A, Kettunen, J, Martinmäki, K, Saalasti, S, and Rusko, HK. On- and off dynamics and respiration rate enhance the accuracy of heart rate based VO2 estimation. In: Proceedings of the ACSM Congress, Indianapolis, June 2-5, 2004, 2004b.
24. Pulkkinen, A, Saalasti, S, and Rusko, HK. Energy expenditure can be accurately estimated from HR without individual laboratory calibration. In: Proceedings of the ACSM Congress, Indianapolis, June 2-5, 2004.
25. Smolander, J, Rusko, H, Ajoviita, M, Juuti, T, and Nummela, A. A novel method for using heart rate variability data for estimation of oxygen consumption and energy expenditure: A validation study. In: Proceedings of the ECSS Congress, Jyväskylä, Finland, July 11-14, 2007.
26. Withers, R, Gore, C, Gass, G, and Hahn, A. Determination of maximal oxygen consumption (VO2max) or maximal aerobic power. In: Physiological tests for elite athletes. Gore, CJ, ed. Champaign, IL: Human Kinetics, 2000. pp. 114-128.

Keywords: aerobic capacity; intermittent exercise; team sport

© 2009 National Strength and Conditioning Association