Cycling power output during training and competition can be estimated using mathematical models (13,15–17) or measured directly using powermeters or rear-wheel hubs instrumented with strain gauges. Such power monitoring devices (e.g., SRM and Power Tap) can interface with an athlete’s bicycle, making it possible to measure power output, cadence, and speed during competition and laboratory testing.
The SRM (Schoberer Rad Messtechnik, Welldorf, Germany) power monitoring system consists of an SRM powermeter (instrumented crank), an SRM powercontrol (data logger and onboard data display), and a sensor cable (linking data transfer from crank to the onboard powercontrol). Three different SRM powermeters are currently produced for use with track, road, and mountain bikes. Although the mountain bike SRM powermeter is only produced in a four-strain-gauge model (professional model), there are two track models (professional: four strain gauges and scientific: eight strain gauges) and three road models (amateur: two strain gauges, professional: four strain gauges, and scientific: eight strain gauges). The manufacturer reports that the accuracy of these SRM powermeters increases with the number of strain gauges (amateur = ± 5%; professional = ± 2%; scientific = ± 0.5%). The more recently introduced Power Tap (PT) also consists of an onboard data logger, a sensor cable, and an eight-strain-gauge instrumented rear-wheel hub. The manufacturers of PT claim an accuracy of ± 2.5%.
The majority of sports scientists and coaches using power monitoring technology have not addressed validity issues, choosing to rely on the manufacturers’ reported values. For example, there is no mention of whether SRM powermeters were calibrated before use by Golich and Broker (6) or Jeukendrup and Van Diemen (9). In other studies, the term “calibration” appears to have been inappropriately used to describe the process of resetting the zero power output offset (2,4,5). Finally, in those studies that have attempted to check SRM powermeter calibration (10,13), Monark cycle ergometers have been used for comparison, despite reports that these ergometers underestimate true power output by up to 8% over a range of 59–353 W (14,18). Rarely, if at all, is the zero offset drift during a trial reported or corrected for, and in only a few published reports do authors indicate the model of SRM powermeter used (2,3,10).
The primary purpose of this study was to establish whether the SRM and PT powermeters accurately measure power output during simulated high-intensity cycling. In a series of four experiments, the accuracy of both systems was assessed using a first principles dynamic CALRIG (18). Specifically, we addressed the following research questions: 1) Are multiple power monitoring units accurate during repeated calibration trials? 2) What is the influence of cadence on the accuracy of power output? 3) What is the influence of pretrial instrumentation temperature and warm-up on power output during prolonged trials of 30–60 min? and 4) What is the relationship between power output, cadence, and speed in the most accurate SRM and PT units during a training session in the field and a simulated session in the laboratory?
In the first of four experiments, the accuracy of power data from 19 SRM and 5 PT powermeters was initially assessed using a dynamic calibration rig (CALRIG) (Tom Stanef, SASI, Australia) (18). Powermeters were examined over the range of 50–1000 W. After the initial trials, 15 of the SRM powermeters were retested after 1 yr of use, whereas the five PT hubs were tested again 2 d later. The second experiment used the most accurate SRM and PT units (i.e., one of each) to examine the influence of cadence (60, 80, 100, and 120 rpm) on accuracy over a range of power outputs (50–1000 W). The third experiment examined the influence of temperature and equipment “warm-up” on power output measurements. More specifically, power output was measured simultaneously in both devices over one continuous hour using a CALRIG. These trials were repeated on separate days in order to establish repeatability. The fourth experiment examined the relationship between power output, cadence, and speed of an SRM and PT unit during a typical training ride in the field and a simulated CALRIG session in the laboratory.
The CALRIG used in this study was calibrated using four accredited masses up to 11.39 kg. For the purposes of this study, the most popular professional version of the SRM powermeters was used (i.e., four strain gauges). SRM powermeters and PT were mounted to a standard nine-speed road bicycle (Giant, Taiwan), which was attached to a stationary air-braked trainer (RX-5, Blackburn, Sydney, Australia) (Fig. 1). The PT were all less than 1 yr old whereas the SRM were less than 3 yr old with serial numbers ranging between 235 and 1031.
Standard calibration procedure.
After a 10-min warm-up at a power output of ~200 W, the zero offset for the SRM was set and the PT torque was zeroed; both procedures were carried out in accordance with the manufacturer’s instructions. The CALRIG load cell was calibrated with verified precision masses up to 11.39 kg, and dynamic system losses were accounted for by repeating unloaded zero offset procedures before and after each trial. Unless specified, all testing sessions were performed at 100 rpm and involved shifting through the full range of nine gears in the rear cluster using both the large and small front chain rings. Due to technical limitations with the PT and CALRIG download process, a second-by-second comparison between the CALRIG, PT, and SRM data could not be made. Therefore, the recording protocol involved a 40-s stabilization period in each gear, followed by instantaneous manual readings of power output from each instrument at 40, 45, 50, and 55 s. The four power outputs were then averaged for further analysis. All trials were run under standard environmental conditions (~21°C, 40–55% RH, 695–710 mm Hg), and the same rear chain cluster (Dura-Ace, Shimano, Japan) was used for each trial. All bicycle components were in good condition and less than 2 yr old, and the chain was well lubricated. At least 1 h separated the trials in order for the instrumentation to cool.
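The per-gear recording step described above can be sketched as follows. This is only an illustration: the function name and the readings are hypothetical, and the code simply averages the four manual readings taken at 40, 45, 50, and 55 s.

```python
from statistics import mean

# Time points (s) at which manual readings were taken in each gear (from the text).
READ_TIMES_S = (40, 45, 50, 55)

def gear_average(readings_w):
    """Average the four instantaneous power readings (W) recorded in one gear."""
    if len(readings_w) != len(READ_TIMES_S):
        raise ValueError("expected one reading per time point")
    return mean(readings_w)

# Hypothetical readings for one gear; not data from the study.
calrig_w = gear_average([301.0, 299.0, 300.0, 300.0])
srm_w = gear_average([298.0, 297.0, 299.0, 298.0])
print(calrig_w, srm_w)
```

Averaging a fixed set of readings per gear gives one paired value per instrument and gear, which is the unit of comparison used in the later error analysis.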
Experiment 1: accuracy and reliability of SRM and PT data.
Using the procedure outlined above, the accuracy of 19 SRM powermeters was initially evaluated throughout a power range of 50–1000 W. After this, the frequency versus torque slopes were adjusted so that the power reading was within ± 2% of the CALRIG. The SRM calibration slope can be calculated via the following equation (Schoberer, U. SRM Training System Online Manual. Accessed November 2003, www.srm.de/OnlineManual/index):
Eleven months later, after a full racing season, the accuracy of 15 of the powermeters was again assessed. After these trials and using the same calibration procedure, the accuracy of five PT rear-wheel hubs was assessed over a 2-d period during repeat random trials. As PT calibration cannot be altered by the user, no adjustment was made between trials.
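The slope adjustment used in experiment 1 can be illustrated with a short sketch. It assumes the commonly described relation in which crank torque is proportional to the difference between the output frequency and the zero offset, with the slope expressed in Hz per N·m; this relation, the function names, and all numbers are assumptions made for illustration and are not taken from the manufacturer’s manual.

```python
import math

def srm_power_w(freq_hz, zero_offset_hz, slope_hz_per_nm, cadence_rpm):
    """Power (W) from crank output frequency, under the assumed linear relation."""
    torque_nm = (freq_hz - zero_offset_hz) / slope_hz_per_nm
    angular_velocity = 2.0 * math.pi * cadence_rpm / 60.0  # rad/s
    return torque_nm * angular_velocity

def corrected_slope(old_slope, srm_power, reference_power):
    """Rescale the slope so the SRM reading matches the reference power."""
    return old_slope * srm_power / reference_power

# Hypothetical unit that reads 285 W against a 300-W reference at 100 rpm.
old_slope = 20.0   # Hz per N·m, hypothetical
zero = 500.0       # Hz, hypothetical
omega = 2.0 * math.pi * 100.0 / 60.0
freq = zero + old_slope * (285.0 / omega)

reading = srm_power_w(freq, zero, old_slope, 100.0)
new_slope = corrected_slope(old_slope, reading, 300.0)
adjusted = srm_power_w(freq, zero, new_slope, 100.0)
print(round(reading, 1), round(new_slope, 2), round(adjusted, 1))
```

Because power is inversely proportional to the slope in this assumed model, a unit that reads low is corrected by lowering the slope in the same proportion.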
Experiment 2: effects of cadence on SRM and PT power data.
Power measures from the most accurate SRM and PT units from the above trials were compared using the CALRIG as a reference at different cadences (60, 80, 100, and 120 rpm) over a 2-d period. The same equipment calibration, setup, and testing procedures were used as in experiment 1.
Experiment 3: effects of pretrial temperature on SRM and PT power data.
The most accurate SRM powermeter and PT hub were once again tested using the same equipment as that used in experiments 1 and 2. The equipment was first tested and then retested after 12 h of exposure to 6°C (55–65% RH, 695–710 mm Hg) and after 10 h of exposure to 21°C (40–55% RH, 695–710 mm Hg).
Before testing, the CALRIG was subjected to the same warm-up and calibration protocol as previously used. The zero offset for the SRM was set and the PT torque was zeroed before and after each testing session. In order to represent a “worst-case” scenario, the zero offset was initially set without warm-up and then rechecked after the trial to examine any drift. Each testing session was performed at a fixed power output (~300 W) for 1 h. Data were recorded at 1, 3, 5, 7, 10 min, and then every 5 min until the hour trial was completed.
After these trials, a further 30-min trial was completed using the standard 10-min warm-up and zeroing procedure outlined in experiment 1. This allowed the effect of our standard warm-up on zero offset and power output drift to be compared with the cool and standard condition trials.
Experiment 4: comparison of SRM and PT data from the laboratory and field.
Powermeters were compared in two ways. First, the most accurate SRM and PT units were mounted onto a standard road bicycle and attached to the CALRIG. The test involved randomly changing both the cadence (46–122 rpm) and the gear (all nine gears) in order to produce fluctuations in power output (0–550 W). Power output and cadence were collected for analysis.
In the second comparison, power output, cadence, and speed data were collected during a consistent shallow-grade 7-min hill climb using the same bicycle. Before the experiment, both devices were once again zeroed. Data during the ride were then recorded and downloaded for further analysis. The same tire circumference was entered into both computers for comparison of speed.
Modified Bland-Altman plots (1) were constructed to quantify and display the magnitude of the error between the SRM, PT, and CALRIG for each of the experiments performed. The percent error was calculated as ((estimate − criterion)/estimate). Data were also summarized using descriptive statistics (mean ± SD, minimum and maximum error values). Because of differences in the downloadable time base of the two instruments (1.0 s for SRM and 1.26 s for PT), we relied on descriptive statistics to summarize data. Descriptive statistical procedures were chosen for this series of experiments with the intention of displaying a direct and practical comparison to the manufacturers’ reported accuracy while also providing a simple way of interpreting the error associated with this technology.
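The error summary described above can be sketched in a few lines. The error formula follows the expression given in the text; multiplying by 100 to express it as a percentage is an assumption, and the function names and paired readings are hypothetical.

```python
from statistics import mean, stdev

def percent_error(estimate_w, criterion_w):
    """Percent error as stated in the text; the factor of 100 is assumed."""
    return 100.0 * (estimate_w - criterion_w) / estimate_w

def summarize(estimates_w, criteria_w):
    """Mean, SD, minimum, and maximum percent error for paired readings."""
    errors = [percent_error(e, c) for e, c in zip(estimates_w, criteria_w)]
    return {
        "mean": mean(errors),
        "sd": stdev(errors),
        "min": min(errors),
        "max": max(errors),
    }

# Hypothetical paired readings (W): powermeter estimate vs CALRIG criterion.
calrig = [100.0, 200.0, 400.0, 800.0]
meter = [100.0, 198.0, 394.0, 780.0]
summary = summarize(meter, calrig)
print({k: round(v, 2) for k, v in summary.items()})
```

Plotting each percent error against the corresponding CALRIG power, rather than only reporting the mean, is what exposes the scatter that a single average conceals.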
Table 1 shows mean percent error, SD, min, and max for SRM and PT data when compared with the CALRIG at 100 rpm. During the first trial of experiment 1, the range in the error scores for the 19 SRM powermeters was −10.4 to 1.0% compared with a range of −2.9 to −2.0% observed for the 5 PT. Despite the differences in range, the mean percent error from all trials was similar. The best SRM showed greater accuracy than the best PT over the range of power outputs tested (1.0% for SRM vs −2.0% for PT). The 5 PT were, however, more consistently accurate. Furthermore, the results from the second trial of experiment 1 showed that once corrected to within 0.0–2.1% accuracy, 14 of the 15 SRM powermeters retested after the 11-month period of use stayed within that range. It was also observed that although PT calibration cannot be altered by the user, it appeared to remain similar over a 2-d period.
Figure 2 presents modified Bland-Altman plots of percent error from the CALRIG for the most accurate SRM and PT during the repeat trials at 100 rpm. All data points collected throughout the range of power outputs for both the SRM powermeters and the PT hub are represented. The mean errors for SRM were 0.1 ± 1.1% for T1 and −0.9 ± 0.7% for T2. The mean percent errors for PT were −2.1 ± 1.3% for T1 and −1.5 ± 0.6% for T2. During the initial trials (T1) we observed that the most accurate SRM available to us produced a range of percent error scores from 0% at 100 W to −3% at 800 W. In contrast, the most accurate of the five PT produced a range of error scores from −3.5% at 100 W to −0.5% at 600 W.
Figure 3 shows modified Bland-Altman plots of percent error from the CALRIG for the influence of cadence on the accuracy of the most accurate SRM and PT. The mean percent errors for the SRM and the PT units were −0.1 ± 1.0% and −1.9 ± 0.8% at 60 rpm, −1.0 ± 0.4% and −2.0 ± 0.7% at 80 rpm, −0.9 ± 0.7% and −2.1 ± 1.3% at 100 rpm, and −1.3 ± 0.4% and −1.2 ± 0.4% at 120 rpm, respectively.
Figure 4 presents modified Bland-Altman plots of percent error from the CALRIG showing the response to warm-up and drift during four 1 h and one 30 min constant power output and cadence (~300 W and 100 rpm) trials. When zeroed in cool conditions, the PT required up to 15 min of steady-state activity to stabilize. A common finding with both devices was that when zeroed after exposure to cool conditions, they gave a positive error (3.7 ± 0.4% for SRM and 5.5 ± 2.4% for PT) as opposed to a negative error when zeroed in standard lab conditions (−1.5 ± 0.4% for SRM and −3.2 ± 0.2% for PT). Therefore, when used by an inexperienced operator who may not take into consideration the environmental conditions, mean percent difference in power output values between standard lab and cool conditions may be as large as 5.2% for SRM and 8.4% for PT. This was also observed when recording the drift in both zero offset measurements. The zero offset of the SRM was set at 584 Hz before the initial (T1 cool) cool room trial and drifted to 615 Hz posttrial (31 Hz). Similarly, in the second cool room trial (T2 cool), the zero offset drifted 18 Hz for the SRM. During both PT trials, the zero drifted 7 W pre- to posttrial. During the standard condition trials the zero drift was not as large, with 4 and 1 Hz versus 3 and 2 W difference observed for SRM and PT units, respectively, during T1 and T2 standard.
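To give a sense of scale for the zero offset drift reported above, the following sketch converts a drift in Hz into an apparent power change. It assumes a linear frequency-to-torque relation with a slope in Hz per N·m; the slope value is hypothetical (none is reported in the text), so the resulting wattage is illustrative only.

```python
import math

def drift_power_error_w(drift_hz, slope_hz_per_nm, cadence_rpm):
    """Apparent power change (W) caused by an uncorrected zero-offset drift."""
    torque_error_nm = drift_hz / slope_hz_per_nm
    return torque_error_nm * 2.0 * math.pi * cadence_rpm / 60.0

drift_hz = 615 - 584   # T1 cool trial: zero offset before and after (from the text)
slope = 20.0           # Hz per N·m, hypothetical; not reported in the text
error_w = drift_power_error_w(drift_hz, slope, 100.0)
print(drift_hz, round(error_w, 1))
```

Under this assumed model the power error from a given drift grows with cadence and shrinks as the slope increases, which is one reason the zero offset is worth rechecking before and after a trial.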
The power profile for the direct comparison between SRM and PT during dynamic CALRIG trials is presented in Figure 5a. The average power output recorded from the SRM and PT during these trials was 245 ± 88 W and 251 ± 88 W, respectively. Although average power output was similar, maximum power output was 73 W higher using the SRM compared with PT (547 W for SRM and 474 W for PT). During the trial, the PT signal often dropped out, causing the phase delay seen when trials were overlaid (Fig. 5a). Figure 5b shows the cadence recorded for both units during the CALRIG trial. It can be seen that because the PT hub relies on an indirect estimation of cadence, the device cannot be calibrated when torque is applied throughout the whole pedal stroke. The average cadence recorded for the SRM during this trial was 94 ± 13 rpm compared with 54 ± 30 rpm for the PT hub.
Figure 6a shows a graph of the fluctuations in power throughout the hill climb field trial; again there was a phase delay for PT. The average power output was 411 ± 63 W for SRM and 432 ± 79 W for PT (Fig. 6a); however, in contrast to the CALRIG trials, maximum power output was 51 W lower for SRM compared with PT (651 W for SRM and 702 W for PT). In Figure 6b, the cadence profiles of the ride were compared. Although not as pronounced as those shown in Figure 5, one can observe that relying on indirect estimation of cadence sometimes causes false peaks and troughs. The average cadence recorded for SRM was 87 ± 9 rpm compared with 84 ± 14 rpm for PT, whereas maximum cadence was 105 rpm for the SRM compared with 112 rpm for the PT. Even with signal drop-out, average speed was similar, with 27.6 ± 3.5 km·h−1 for SRM and 27.7 ± 4.1 km·h−1 for PT (Fig. 6c).
The primary purpose of this study was to establish whether the SRM powermeter and PT hub accurately measure power output during simulated high-intensity cycling under a variety of conditions. The present data show that, on average, both SRM and PT are usually within their manufacturers’ specifications. However, when the data are expressed using modified Bland-Altman plots, there is considerable scatter around the mean percent error. Acknowledging the range of scatter has important implications when trying to interpret laboratory intertrial reliability or technical error data using this type of power monitoring system. The present data also show that during trial 1 (with factory calibration) the most accurate SRM unit was more accurate than the best PT unit; however, the least accurate SRM was less accurate than the worst PT. It appears that once adjusted, the calibration of SRM is stable throughout an 11-month racing season. These results need to be considered in light of the low number of available PT (N = 5 for T1 and T2) as opposed to the number of SRM units evaluated (N = 19 for T1 and N = 15 for T2). The other main finding from the present study is that both the SRM and PT units are sensitive to differences in ambient temperature. Finally, our data suggest that comparing power data between SRM and PT from a training ride may not be accurate unless both devices have previously been calibrated via the same procedure.
It is important for coaches and sport scientists involved with cycling to be confident that the power monitoring device they are using is accurate. At present, however, it is difficult to interpret some of the published observations regarding power output demands of cycling. For instance, there is an extremely large range of power outputs associated with a track cycling speed of 60 km·h−1 (5), and it has also been reported that the Kingcycle ergometer provides a less reproducible (2) and less valid (3) measure of peak power output when compared with an SRM powermeter. More recently, mountain bike SRM data were published evaluating the technical aspects of mountain bikes (12), and the authors reported similar oxygen uptake values for significantly different power outputs. It is possible that all these findings were simply the result of instrumentation drift and inappropriate calibration methodology.
Although a few validation trials on SRM powermeters have been published (10,13), no attention has been directed toward also validating the commonly used PT. To establish a “best-case scenario,” we ran a number of experiments using the most accurate SRM and PT units available to us at the time. In the present study SRM and PT data were compared with power output data generated by a dynamic CALRIG at the crank, thus allowing a comparison of power data with an ecologically valid reference point based on first principles (15). It should be noted when interpreting the data that PT measures power at the rear hub and may display approximately 2% lower values than the CALRIG because of transmission losses in chain and sprocket drive mechanism (11). Although very important to account for when modeling performance, the purpose of these trials was to focus on the accuracy of the two powermeters when a known amount of power was produced at the bottom bracket. Most users of these systems will be monitoring power in order to reflect fitness and changes in fitness. It is not clear whether Power Tap tries to account for transmission losses and attempts to reflect power produced at the bottom bracket or whether the displayed power is the actual power produced at the rear hub. In elite athletes, the detectable change in performance from an ergogenic or training intervention is usually of a magnitude less than 2% (7,8). Thus, researchers require a high degree of precision in the power output monitoring equipment.
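The effect of drivetrain losses on the apparent PT error can be illustrated with a small calculation. It assumes the approximately 2% transmission loss cited above and uses the percent error expression given in the methods (with an assumed factor of 100); the function names and the 300-W input are hypothetical.

```python
def expected_hub_power_w(crank_power_w, drivetrain_loss=0.02):
    """Power (W) reaching the rear hub after an assumed fractional drivetrain loss."""
    return crank_power_w * (1.0 - drivetrain_loss)

def apparent_error_pct(hub_reading_w, crank_reference_w):
    """Percent error of a hub reading against a crank-based reference."""
    return 100.0 * (hub_reading_w - crank_reference_w) / hub_reading_w

# A hub that measured its own input perfectly would still read below the crank reference.
crank_w = 300.0  # hypothetical power applied at the bottom bracket
hub_w = expected_hub_power_w(crank_w)
print(round(hub_w, 1), round(apparent_error_pct(hub_w, crank_w), 2))
```

In other words, an error of roughly −2% against a crank-based reference is what a hub-based device would be expected to show if it reported true hub power without compensating for transmission losses.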
As expected, overall accuracy differed both between and within SRM and PT units; not all units are equally accurate. Based on their initial factory calibration, the SRM powermeters ranged from −10.4 to 1.0% average error, whereas the PT hubs were within a range of −2.9 to −2.0% average error. It is important for readers to be aware that 8 of the 19 SRM units tested were outside the manufacturer’s reported accuracy on first calibration; however, when adjusted and retested after a full racing season, 14 of the 15 were still within 2.0% average error. It is therefore concerning that evaluations of power monitoring systems are being published based on a comparison with SRM powermeters without any reported calibration, as though the SRM were the new “gold standard” in power measurement. The present data underscore the need to evaluate SRM cranks using first principles and to correct each unit to minimize the average error. During the first calibration, one of the five PT units assessed was outside of the manufacturer’s accuracy claims. Unfortunately, there is no way for the user to change the calibration if poor accuracy is found. In addition, none of the PT units assessed here were retested after an extensive period of use; therefore, more research is needed to assess the accuracy of PT after a racing season.
We were particularly interested in determining whether accuracy of the most accurate SRM and PT units varied across different power outputs and over different cadences. The present data show that accuracy can vary from low to high power outputs and at different cadences for the same power output. This raises the need for a specific power-cadence band calibration. For example, it may be important for power monitoring devices to be calibrated to the specific power-cadence range adopted during the event of interest (track sprint vs road time trial).
The finding that both the SRM and PT units are sensitive to temperature has implications for both athletes and sports scientists. We found that SRM and PT warm-up is important before setting the zero offset on both devices. We also found that when zeroed in cool conditions (6°C) after 12 h of exposure, both units produced dissimilar average error after each testing occasion at room temperature (21°C). Interestingly, when zeroed at room temperature, little to no warm-up of the crank components (primarily strain gauges) was necessary. This finding suggests that in order to gain accurate data when starting on a ride in cool conditions and when temperature changes during the trial are expected, the zero offset should be reset at regular intervals.
As another point of note, the SRM produced three positive outlier error scores during the cadence trials at 80 rpm despite a mean percent error of −1.0 ± 0.4%. These positive error values occurred after the high power output produced on the small chain ring was switched to low powers on the large chain ring (highlighting a possible hysteresis in some units). When the power changed from high to low, such as from a small chain ring (39 × 15) to a large chain ring (52 × 21), deformation of the strain gauge strips may have resulted in hysteresis. This has implications for all cycling disciplines where high power is randomly interspersed with lower power. More work is needed to determine whether hysteresis is also observed in other SRM powermeters and in what situation it is most likely to occur.
Along with the observed hysteresis, differences observed in maximum power readings between the SRM powermeter and PT hub during the dynamic trials raise concern with respect to interpretation of data. There is a clear need for dynamic calibration at high power outputs, especially if the monitoring devices are used to analyze high-intensity sprints. Also observed during the laboratory and field dynamic trials was the drop-out and the erratic nature of cadence measurement by the PT, especially when run on the CALRIG at a constant torque. Recently, a new model of PT has been produced that uses a traditional magnet sensor for cadence. All of these issues highlight the need for caution when interpreting average data from both devices. It should also be noted for scientific studies where power or cadence data are expressed as average values that the exported time base (1.26 s) for the PT hub is much less flexible than that of the SRM.
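One way the time-base mismatch could be handled is to resample the PT record onto the 1.0-s SRM grid by linear interpolation. This is not the approach taken in the present study, which relied on descriptive statistics; the sketch below is a hypothetical alternative, with an invented function name and invented data, and it does not address signal drop-out.

```python
def resample_linear(times_s, values, new_times_s):
    """Linearly interpolate (times_s, values) at new_times_s; all lists ascending."""
    out = []
    j = 0
    for t in new_times_s:
        if t < times_s[0] or t > times_s[-1]:
            raise ValueError("target time outside recorded range")
        # Advance to the recorded interval [times_s[j], times_s[j + 1]] containing t.
        while times_s[j + 1] < t:
            j += 1
        t0, t1 = times_s[j], times_s[j + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out

# Hypothetical PT record at 1.26-s intervals with power rising 50 W per second.
pt_times = [0.0, 1.26, 2.52, 3.78]
pt_power = [200.0, 263.0, 326.0, 389.0]
srm_grid = [0.0, 1.0, 2.0, 3.0]
pt_on_srm_grid = resample_linear(pt_times, pt_power, srm_grid)
print([round(p, 1) for p in pt_on_srm_grid])
```

Interpolation smooths rapid changes between samples, so peak values from a resampled record should be interpreted with the same caution as the averages discussed above.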
In conclusion, both SRM and PT are valuable instruments for the monitoring of power output during cycling. However, researchers and scientists should be aware of the limitations in both models of SRM and PT tested here when trying to detect performance changes of less than 2%, common among elite athletes (7,8). Researchers should also note that setting the zero offset does not substitute for a standardized calibration. In summary, accuracy may be different between and within SRM and PT units, and both units are affected by temperature. It is possible that with continual refinements by the manufacturers, improvements will be, and may already have been, made to both devices.
The authors would like to extend gratitude to Prof. Allan Hahn and the staff of the Physiology Department at the Australian Institute of Sport for providing facilities and support in order to make this project possible. Thanks must also go to Dr. Inigo Mujika for his advice and critique of the manuscript.
1. Altman, D. G., and J. M. Bland. Measurement in medicine: the analysis of method comparison studies. Stat.
2. Balmer, J., R. C. Davison, and S. R. Bird. Reliability of an air-braked ergometer to record peak power during a maximal cycling test. Med. Sci. Sports Exerc.
3. Balmer, J., R. C. Davison, D. A. Coleman, and S. R. Bird. The validity of power output recorded during exercise performance tests using a Kingcycle air-braked cycle ergometer when compared with an SRM powermeter. Int. J. Sports Med.
4. Bassett, D. R., Jr., C. R. Kyle, L. Passfield, J. P. Broker, and E. R. Burke. Comparing cycling world hour records, 1967–1996: modeling with empirical data. Med. Sci. Sports Exerc.
5. Broker, J. P., C. R. Kyle, and E. R. Burke. Racing cyclist power requirements in the 4000-m individual and team pursuits. Med. Sci. Sports Exerc.
6. Golich, D., and J. Broker. SRM bicycle instrumentation and the power output of elite male cyclists during the 1994 Tour Dupont. Performance Conditioning for Cycling.
7. Hopkins, W. G., J. A. Hawley, and L. M. Burke. Design and analysis of research on sport performance enhancement. Med. Sci. Sports Exerc.
8. Hopkins, W. G., E. J. Schabort, and J. A. Hawley. Reliability of power in physical performance tests. Sports Med.
9. Jeukendrup, A., and A. Van Diemen. Heart-rate monitoring during training and competition in cyclists. J. Sports Sci.
10. Jones, S. M., and L. Passfield. The dynamic calibration of bicycle power measuring cranks. In: The Engineering of Sport, S. J. Haake (Ed.). Oxford: Blackwell Science Ltd, 1998, pp. 265–274.
11. Kyle, C. R. Chain friction, windy hills and other quick calculations. Cycling Sci.
12. MacRae, H. S. H., K. J. Hise, and P. J. Allen. Effects of front and dual suspension mountain bike systems on uphill cycling performance. Med. Sci. Sports Exerc.
13. Martin, J. C., D. L. Millikin, J. E. Cobb, K. L. McFadden, and A. R. Coggan. Validation of a mathematical model for road cycling power. J. Appl. Biomech.
14. Maxwell, B. F., R. T. Withers, A. H. Ilsley, M. J. Wakim, G. F. Woods, and L. Day. Dynamic calibration of mechanically, air- and electromagnetically braked cycle ergometers. Eur. J. Appl. Physiol. Occup. Physiol.
15. Olds, T. Modelling human locomotion: applications to cycling. Sports Med.
16. Olds, T., K. Norton, N. Craig, S. Olive, and E. Lowe. The limits of the possible: models of power supply and demand in cycling. Aust. J. Sci. Med. Sport.
17. Olds, T. S., K. I. Norton, and N. P. Craig. Mathematical model of cycling performance. J. Appl. Physiol.
18. Woods, G. F., L. Day, R. T. Withers, A. H. Ilsley, and B. F. Maxwell. The dynamic calibration of cycle ergometers. Int. J. Sports Med.