
Applied Sciences: Physical Fitness and Performance

Reliability and Variability of Running Economy in Elite Distance Runners


Medicine & Science in Sports & Exercise: November 2004 - Volume 36 - Issue 11 - p 1972-1976
doi: 10.1249/01.MSS.0000145468.17329.9F


Running economy (RE) is proportional to (and, for the purpose of this paper, defined as) the O2 cost for a given velocity, and estimated from measuring steady-state O2 consumption (V̇O2) during submaximal running (23). In metabolic terms, runners with good RE use less O2 at the same velocity than runners with poor RE (30). There is a strong association between improved RE and distance running performance (1,6,7,21). In elite or near-elite distance runners, who have similar maximal oxygen uptakes (V̇O2max), RE may be a better predictor of performance than V̇O2max (9,20), and is considered to be an important factor in determining success for distance runners (11). Other factors include the percentage of V̇O2max a runner can sustain without accumulating lactic acid, and, for longer events, the ability to use fat as a fuel at high work rates, thereby “sparing” carbohydrates (10). The velocity associated with attainment of V̇O2max and the velocity at the onset of blood lactate accumulation are also good indicators of distance running performance (3). Given the importance of RE, investigators and coaches require a reasonable degree of confidence or certainty that assessments of RE are reliable. Investigators need to consider the reliability of their measures when interpreting serial physiological assessments of distance runners.

Consideration of the typical intraindividual variation in RE is useful when investigating the effectiveness of interventions aimed at modifying RE. Small sample size and omission of the typical error (TE) associated with equipment, testing and biological variation restrict the degree to which meaningful conclusions on the impact of an intervention on RE can be drawn (20). Along with a rigorous experimental design to control confounding variables and permit a valid determination of the impact of interventions on RE, researchers should provide a statement of the test-retest reliability or TE. Reliability studies measuring RE in moderately to well-trained athletes show intraindividual variations between 1.5 and 5% (4,18,19,22,24–26), indicating that within-subject results are relatively stable. Variation of this size suggests that the test is precise enough to detect real changes. To add weight to this approach, Hopkins (15) has proposed the concept of the smallest worthwhile change (SWC) to determine the practical significance of interventions. If TE (noise) is less than SWC (signal), then the ability of the test to detect real and worthwhile within-subject changes is rated as “good”; if TE is about the same as SWC, then the test is rated as “satisfactory”; and if TE is much greater than SWC, then the utility of the test is rated as “marginal” (17). No previous study has systematically addressed these issues in relation to RE in distance runners. The purpose of the current study was to determine the TE of RE, using a standard treadmill protocol, in a group of elite distance runners (national/international caliber). A secondary purpose was to determine the magnitude of SWC of RE in these runners, to assist researchers and coaches in their interpretation of the significance of training interventions aimed at improving RE.



Eleven elite, male, middle/long distance runners volunteered to participate in the reliability section of this study. Subjects all competed at a national level with six competing internationally. To determine the magnitude of the SWC using a representative set of observations, RE in 70 highly trained male distance runners was assessed over a 3-yr period. The subjects were all well-trained distance runners ranging from top junior athletes to national senior representatives competing in events from the 800 m up to a marathon. Descriptive characteristics of the subjects are presented in Table 1. Subjects were informed of experimental procedures and possible risks involved with participation before providing written consent. The Australian Institute of Sport ethics committee approved all testing protocols.

Subject characteristics.

Experimental Protocol.

For calculation of test-retest reliability, the RE test was completed twice within a 7-d period, with V̇O2max measured after the second RE test to characterize each subject’s aerobic capacity and give an indication of training status. To reduce intrasubject variability, the type of shoes worn, time of day and testing equipment were all standardized for both RE tests. Temperature was controlled by air conditioning and set at 20–22°C. A light fan that circulated air around the subjects was provided for both trials. All training in the 24-h period before testing was standardized between tests for each athlete, but the type of training differed between subjects. To determine the magnitude of the SWC, 70 runners were tested at least once over a 3-yr period. All athletes performed the same RE protocol followed by the V̇O2max test.

Running economy.

RE was determined by measuring submaximal V̇O2 on a custom-built motorized treadmill (Australian Institute of Sport, Belconnen, Australia) at an ambient temperature of 20–22°C. The protocol involved subjects running at three set running speeds (14, 16, and 18 km·h−1; 0% grade) for 4 min each, separated by a 1-min rest period. After the completion of each 4-min stage, the treadmill was stopped for 1 min, during which a capillary blood sample was taken from a fingertip for measurement of blood lactate concentration (Lac). An in-house automated metabolic system was used to calculate V̇E, V̇O2, and RER. HR was measured by short-range telemetry (Polar Vantage NV, Kempele, Finland) during each minute of the 4-min submaximal runs. In our laboratory, the TE associated with submaximal HR is 3.7%, while the TE for Lac is 11.1% for values <5 mM, and 4.8% for values between 5 and 10 mM. Stride rate (SR) was determined by counting 50 steps (25 strides) and dividing this number into 1500 to determine the number of strides per minute at each running speed. RE was calculated as the V̇O2 during the last min of each 4-min stage. Table 2 indicates that steady state occurred during the 4 min at each running speed.
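The stride-rate arithmetic can be sketched as follows. This is an illustration only, and it rests on an assumption: the text does not say which number is divided into 1500, so the sketch assumes it is the time in seconds taken to complete the 50 counted steps (25 strides × 60 s·min−1 = 1500). The function name and the example timing are hypothetical.

```python
def strides_per_minute(seconds_for_50_steps: float) -> float:
    """Stride rate from the time taken to complete 50 steps (25 strides).

    Assumption: the divisor is the elapsed time in seconds, which the
    source text implies but does not state.
    """
    if seconds_for_50_steps <= 0:
        raise ValueError("elapsed time must be positive")
    # 25 strides * 60 s per minute = 1500, so rate = 1500 / elapsed seconds
    return 1500.0 / seconds_for_50_steps


# Hypothetical timing, not a value measured in the study
print(strides_per_minute(16.0))  # 93.75
```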

Oxygen consumption during each minute of the 3- × 4-min protocol at running speeds 14, 16, and 18 km·h−1.

Maximal oxygen uptake.

V̇O2max was determined during an incremental test to volitional exhaustion, described in detail by Telford (29), which commenced 2 min after completion of the third submaximal effort of the second RE test. Briefly, subjects completed an incremental protocol starting at 18 km·h−1 and increasing by 1 km·h−1 every minute up to 20 km·h−1. Once this velocity was attained, the treadmill gradient was increased by 1% every minute until volitional exhaustion was reached. The V̇O2max was determined as the V̇O2 beyond which no further increases in V̇O2 occurred, despite continued effort. The TE for V̇O2max measured in our laboratory is 2.2%.

Gas analysis.

A custom-designed open-circuit, indirect calorimetric system and associated in-house software (Australian Institute of Sport, Belconnen, Australia) were used to monitor gas exchange patterns during exercise. The automated system measured gas exchange every 30 s, permitting calculation of pulmonary ventilation (V̇E), V̇O2, V̇CO2, and RER. Subjects breathed through a Hans Rudolph one-way valve with two openings, one allowing the inhalation of room air and the other directing expired air into respiratory tubing. The Douglas bag principle (13) was used to collect all expirate in one of two 150-L aluminized mylar bags, with each bag receiving expirate for 30 s. After this time the gas was evacuated by a calibrated precision piston pump. Measures of piston displacement recorded from the pump enabled the system to automatically calculate the volume of gas collected during each 30-s period using Boyle’s Law (where P1V1 = P2V2). A small volume of gas was sampled at the end of each 30-s period and passed through a desiccant column of anhydrous calcium chloride (Asia Pacific Specialty Chemicals Ltd, Sydney, Australia) before analysis by electronic O2 and CO2 analyzers (Applied Electrochemistry, Ametek, Pittsburgh, PA). These analyzers were calibrated before each test with three precision gas mixtures, and a calibration was accepted when readings were within ±0.03% of the target value. The accuracy of the system was compared to the automated V̇O2 calibrator for open-circuit indirect calorimetry systems (14), and was within ±5%, the acceptable range given by the Laboratory Standards Assistance Scheme (LSAS).
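As a reminder of what the Boyle’s Law step involves, the relation P1V1 = P2V2 can be rearranged to give the volume a gas would occupy at a second pressure, assuming constant temperature. The sketch below is generic: how the in-house software applied the relation to piston displacement is not described in the text, and the function name, units and numbers are hypothetical.

```python
def boyle_volume(p1_kpa: float, v1_litres: float, p2_kpa: float) -> float:
    """Volume at pressure p2 of gas that occupies v1 at pressure p1.

    Rearranges P1*V1 = P2*V2 to V2 = P1*V1 / P2 (constant temperature).
    """
    if p1_kpa <= 0 or p2_kpa <= 0:
        raise ValueError("pressures must be positive")
    if v1_litres < 0:
        raise ValueError("volume cannot be negative")
    return p1_kpa * v1_litres / p2_kpa


# Hypothetical values: 60 L measured at 95 kPa, expressed at 100 kPa
print(boyle_volume(95.0, 60.0, 100.0))  # 57.0
```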

Blood handling procedure.

Capillary blood samples were collected from a fingertip using an Autolet II (Owen Mumford Ltd Medical Division, Oxford, England) sterile lancet. Approximately 95 μL of blood was collected into a Clinitube (Radiometer Medical A/S, Copenhagen, Denmark) by capillary action for automated analysis of Lac using an ABL700 Series Blood Gas Analyzer (Radiometer Medical A/S, Copenhagen, Denmark).

Statistical analysis.

Data were checked for heteroscedasticity using plots of the raw and log-transformed data, with the change scores plotted against the initial scores and the uniformity of the scatter checked. Reliability measurements were calculated using methods described by Hopkins (15). The TE for each measure was calculated as the SD of the change scores divided by √2, within a test-retest design using the same subjects in two different trials. The absolute and percentage changes between repeated analyses were established. The percentage change was calculated from the change scores of the log-transformed data of the two trials. Plots were constructed displaying these changes along with the 95% confidence intervals, to indicate the direction and magnitude of the changes. The effect size (ES), which represents the magnitude of the difference between two groups in terms of SD, was calculated from the log-transformed data by dividing the change in the mean by the average of the SD of the repeated analysis. The modified scale to interpret the ES was established as: trivial, <0.2%; small, 0.2–0.6%; moderate, 0.6–1.2%; and large, >1.2% (, 2002). The SWC for elite athletes was calculated as 0.2 times the between-subject SD for a particular test, based on Cohen’s effect size (5). This calculation gives an indirect estimation of the smallest worthwhile change in RE for this particular cohort of distance runners. Other studies will need to produce their own estimation of the SWC, which may vary depending on the homogeneity of the sample group.
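The TE calculation described above can be sketched in a few lines. The raw-unit version follows the text directly (SD of the change scores divided by √2). The percentage version is an assumption about the details: it uses 100 × natural log of each value and back-transforms the result to a percentage, which is one common way of working with log-transformed change scores, but the exact procedure used in the study is not spelled out. Function names and data are hypothetical.

```python
import math
import statistics


def _change_scores(trial1, trial2):
    """Paired differences (trial 2 minus trial 1) for the same subjects."""
    if len(trial1) != len(trial2) or len(trial1) < 2:
        raise ValueError("need two equal-length trials with at least 2 subjects")
    return [b - a for a, b in zip(trial1, trial2)]


def typical_error(trial1, trial2):
    """TE in raw units: SD of the change scores divided by sqrt(2)."""
    return statistics.stdev(_change_scores(trial1, trial2)) / math.sqrt(2)


def typical_error_percent(trial1, trial2):
    """TE as a percentage, from change scores of 100*ln-transformed data.

    Assumption: the log-scale TE is back-transformed with
    100 * (exp(TE / 100) - 1) to give a percentage.
    """
    log1 = [100 * math.log(x) for x in trial1]
    log2 = [100 * math.log(x) for x in trial2]
    te_log = statistics.stdev(_change_scores(log1, log2)) / math.sqrt(2)
    return 100 * (math.exp(te_log / 100) - 1)


# Hypothetical V̇O2 values (mL·min−1·kg−1) for four runners, two trials
test1 = [50.0, 52.0, 48.0, 51.0]
test2 = [51.0, 51.0, 49.0, 52.0]
print(round(typical_error(test1, test2), 3))
print(round(typical_error_percent(test1, test2), 2))
```

Dividing by √2 reflects that a change score contains the random error of both trials, so its SD is √2 times the error of a single measurement.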


All parameters measured in the submaximal treadmill running test (Table 3) show that the TE of RE between tests was ∼2–3%. When RE was reported as an absolute measure (L·min−1) the TE between test 1 and test 2 was 2.4% (14 km·h−1), 2.5% (16 km·h−1), and 2.4% (18 km·h−1). When expressed relative to body mass (mL·min−1·kg−1), the TE in RE between test 1 and test 2 was 2.9% (14 km·h−1), 2.5% (16 km·h−1), and 2.7% (18 km·h−1). The SWC in RE for the highly trained distance runners was 2.6% at 14 km·h−1, 2.4% at 16 km·h−1, and 2.2% at 18 km·h−1; this averaged 2.4% across the three running speeds. The magnitude of the change scores between RE tests was trivial at the three speeds, with all effect sizes (ES) between 0.1 and 0.2. The difference between repeated analyses is shown in Figure 1.

Reliability of running economy measures.
FIGURE 1—The change in means for parameters measured in the RE tests. Values are the lower and upper 95% confidence intervals for the change in the means between test 1 and test 2; N = 11 subjects. All values are averages of the three running speeds.

There was poor reliability in Lac between tests, with a TE of 52% (14 km·h−1), 20% (16 km·h−1), and 10% (18 km·h−1). However, absolute differences in blood lactate concentration were quite small, averaging 0.6, 0.6, and 0.5 mM at the three running speeds. The ES for Lac was also small, with the change scores ranging between 0.2 and 0.3 at all speeds.


RE is represented by the energy expenditure and expressed as the submaximal O2 consumption (V̇O2submax) at a given running speed (1,6,7,21). Runners with good RE use less O2 than runners with poor RE at the same steady-state speed (30). The relationship between RE and performance is well documented, with many independent reports demonstrating a strong relationship between these two variables (6–9,12,27). This investigation has quantified the reliability of a submaximal running test at three different running speeds in elite male distance runners. Additionally, it has determined the magnitude required for practical significance due to an intervention (SWC) in RE in a cohort of highly trained runners. The TE of RE averaged 2.4% at 14, 16, and 18 km·h−1, which is in the general range (∼2–5%) of previous research conducted in this area (4,18,19,22,24–26). The SWC averaged 2.4% across the three running speeds, and given that the magnitudes of the TE and SWC are similar, the current RE test (three 4-min stages) should prove useful in determining whether important within-subject changes have occurred. The relatively small TE facilitates interpretation of changes in RE over time or with a targeted intervention, in an individual or a group. The current findings show the TE and SWC are similar, indicating that the test is “satisfactory” (17). The RE results should be considered in light of parallel changes in other measures such as V̇O2max, time to exhaustion and body composition, and the recent history of training and competitive performance.

RE was described as steady-state O2 consumption (V̇O2) during submaximal running (23). The current study utilized 4 min of running at each of three running speeds to determine the test-retest reliability of RE at different speeds in highly trained runners. We used 4-min stages because runners were familiar with treadmill running, and by the time they completed the 18 km·h−1 stage, the runners had been running for a total of 12 min. Runners reached a steady state, with a clear plateau, in the final 2 min at each running speed (Table 2).

Subjects used in many of the previous studies investigating reliability of RE were of considerably lower standard than those assessed in the current study. Studies using well-trained subjects showed variations in RE of 5% (4), 3–5% (19), 2% (25), and 2% (22), which indicates there is little difference in the stability of RE between moderately trained and well-trained runners. The subjects in the current study are elite, with six of 11 having represented Australia in international competition. The TE reported (2.4%) in our highly trained group demonstrates the suitability of our RE test, and the independence of TE of RE from the subject’s fitness level in the range of 60–75 mL·min−1·kg−1.

The SWC identifies the magnitude of change required to elicit a meaningful or significant improvement in RE. The SWC, calculated as a proportion of the effect size, represents the magnitude of improvement in a variable as a function of the between-athlete standard deviation of the particular cohort. We chose an indirect method of estimating the SWC, using a small Cohen’s effect size as suggested by Hopkins (15). We estimated a SWC of 2.4% for RE in 70 highly trained distance runners. As a practical example, our group has demonstrated recently that 20 d of live-high train-low altitude exposure improved RE by 3.3% for the pooled data at 14, 16, and 18 km·h−1 (28). In conjunction with our reported statistical analysis, when this finding is compared with the SWC of 2.4% and TE of 2.4% reported in the current study, we can be confident at the 99% level that the observed 3.3% improvement in RE is meaningful and real, as it is greater than both the SWC and TE. We chose the SWC, described by Hopkins et al. (16), as a method of determining the practical significance of our RE test; however, other methods are also available to determine the magnitude of a worthwhile change (2).
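The SWC calculation and the comparison made in the practical example can be sketched as below. The SWC in raw units follows the text (0.2 times the between-subject SD). Expressing it as a percentage of the cohort mean is an assumption made here for illustration, since the text does not state how the percentage was derived. The comparison function only checks whether an observed change exceeds both TE and SWC; it does not reproduce the confidence-level statement. Function names and cohort data are hypothetical.

```python
import statistics


def smallest_worthwhile_change(values, fraction=0.2):
    """SWC in raw units: a fraction (default 0.2) of the between-subject SD."""
    if len(values) < 2:
        raise ValueError("need at least 2 subjects")
    return fraction * statistics.stdev(values)


def swc_percent_of_mean(values, fraction=0.2):
    """SWC as a percentage of the cohort mean (an assumed convention)."""
    return 100 * smallest_worthwhile_change(values, fraction) / statistics.mean(values)


def exceeds_te_and_swc(observed_change_pct, te_pct, swc_pct):
    """True when the observed change is larger than both noise and signal thresholds."""
    return abs(observed_change_pct) > te_pct and abs(observed_change_pct) > swc_pct


# Hypothetical submaximal V̇O2 values (mL·min−1·kg−1) for five runners
cohort = [48.0, 50.0, 52.0, 54.0, 46.0]
print(round(smallest_worthwhile_change(cohort), 3))
print(round(swc_percent_of_mean(cohort), 2))

# Figures quoted in the text: 3.3% improvement, TE 2.4%, SWC 2.4%
print(exceeds_te_and_swc(3.3, 2.4, 2.4))  # True
```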

The other performance and physiological parameters measured in the RE test were relatively stable, with TE averaging 0.8–1.5% for SR and 1.7–2.4% for HR across the three submaximal running speeds. RER and V̇E were more variable, with TE averaging 3.4–4.4% (RER) and 6.6–8.3% (V̇E) for the three running speeds. Collectively, these findings suggest that, in highly trained runners, relatively small changes in SR and HR, but larger changes in RER and V̇E, are required before an investigator can be relatively confident of the significance of the observed change. The highest level of variation of any variable under investigation was in the postexercise Lac. In percentage terms there was a large (52%) TE observed at 14 km·h−1, reduced (20%) at 16 km·h−1, and further reduced (10%) at 18 km·h−1. However, the higher relative TE at 14 km·h−1 can be attributed to the slower running speed eliciting very little lactate accumulation in these elite runners; Lac averaged 2.0 mM at 14 km·h−1, 2.8 mM at 16 km·h−1, and 5.2 mM at 18 km·h−1 across the two trials. The absolute variation in Lac was small, averaging 0.6, 0.7, and 0.5 mM at the three respective running speeds. The magnitude of the variation observed at 18 km·h−1 is consistent with previously reported data (4).

In conclusion, we have demonstrated that RE in national/international caliber male distance runners is a relatively stable measure, with a TE of 2.4% observed during three 4-min stages of running at 14, 16, and 18 km·h−1. We also have demonstrated that the SWC required in highly trained distance runners is 2.4%. Therefore, an elite distance runner must improve their RE by >2.4% before a coach or scientist can be relatively confident that a real change has occurred. Both SR and HR were relatively stable, with greater variation associated with the physiological measures of V̇E, RER, and Lac. This study provides a framework for examining the significance of changes in RE and associated measures in elite distance runners.


1. Anderson, T. Biomechanics and running economy. Sports Med. 22:76–89, 1996.
2. Atkinson, G. Does size matter for sports performance researchers? J. Sports Sci. 21:73–74, 2003.
3. Billat, V. L., B. Flechet, B. Petit, G. Muriaux, and J. P. Koralsztein. Interval training at V̇O2max: effects on aerobic performance and overtraining markers. Med. Sci. Sports Exerc. 31:156–163, 1999.
4. Brisswalter, J., and P. Legros. Daily stability in energy cost of running, respiratory parameters and stride rate among well-trained middle distance runners. Int. J. Sports Med. 15:238–241, 1994.
5. Cohen, J. Statistical power analysis for the behavioral sciences. 2nd ed. Mahwah, NJ: Lawrence Erlbaum, 1988, pp. 16–18.
6. Conley, D. L., and G. S. Krahenbuhl. Running economy and distance running performance of highly trained athletes. Med. Sci. Sports Exerc. 12:357–360, 1980.
7. Conley, D. L., G. S. Krahenbuhl, L. N. Burkett, and A. L. Millar. Following Steve Scott: physiological changes accompanying training. Phys Sports Med. 12:103–106, 1984.
8. Costill, D. L. The relationship between selected physiological variables and distance running performance. J. Sports Med. Phys Fitness. 7:61–66, 1967.
9. Costill, D. L., H. Thomason, and E. Roberts. Fractional utilization of the aerobic capacity during distance running. Med. Sci. Sports. 5:248–252, 1973.
10. Coyle, E. F. Physiological determinants of endurance exercise performance. J. Sci. Med. Sport. 2:181–189, 1999.
11. Daniels, J. T. A physiologist’s view of running economy. Med. Sci. Sports Exerc. 17:332–338, 1985.
12. Di Prampero, P. E., C. Capelli, P. Pagliaro, G. Antonutto, M. Girardis, P. Zamparo, et al. Energetics of best performances in middle-distance running. J. Appl. Physiol. 74:2318–2324, 1993.
13. Douglas, C. G. A method for determining the total respiratory exchange in man. J. Physiol. 42:17–18, 1911.
14. Gore, C. J., P. G. Catcheside, S. N. French, J. M. Bennett, and J. Laforgia. Automated VO2max calibrator for open-circuit indirect calorimetry systems. Med. Sci. Sports Exerc. 29:1095–1103, 1997.
15. Hopkins, W. G. Measures of reliability in sports medicine and science. Sports Med. 30:1–15, 2000.
16. Hopkins, W. G., J. A. Hawley, and L. M. Burke. Design and analysis of research on sport performance enhancement. Med. Sci. Sports Exerc. 31:472–485, 1999.
17. Liow, D. K., and W. G. Hopkins. Velocity specificity of weight training for kayak sprint performance. Med. Sci. Sports Exerc. 35:1232–1237, 2003.
18. Morgan, D. W. Effects of a prolonged maximal run on running economy and running mechanics. In: Unpublished doctoral dissertation: Arizona State University, 1988.
19. Morgan, D. W., F. D. Baldini, and P. E. Martin. Day-to-day stability in running economy and step length among well-trained male runners [abstract]. Int. J. Sports Med. 8:242, 1987.
20. Morgan, D. W., F. D. Baldini, P. E. Martin, and W. M. Kohrt. Ten kilometer performance and predicted velocity at VO2max among well-trained male runners. Med. Sci. Sports Exerc. 21:78–83, 1989.
21. Morgan, D. W., and M. Craib. Physiological aspects of running economy. Med. Sci. Sports Exerc. 24:456–461, 1992.
22. Morgan, D. W., M. W. Craib, G. S. Krahenbuhl, K. Woodall, S. Jordan, K. Filarski, et al. Daily variability in running economy among well-trained male and female distance runners. Res Q Exerc. Sport. 65:72–77, 1994.
23. Morgan, D. W., P. E. Martin, and G. S. Krahenbuhl. Factors affecting running economy. Sports Med. 7:310–330, 1989.
24. Morgan, D. W., P. E. Martin, G. S. Krahenbuhl, and F. D. Baldini. Variability in running economy and mechanics among trained male runners. Med. Sci. Sports Exerc. 23:378–383, 1991.
25. Pereira, M. A., and P. S. Freedson. Intraindividual variation of running economy in highly trained and moderately trained males. Int. J. Sports Med. 18:118–124, 1997.
26. Pereira, M. A., P. S. Freedson, and A. F. Maliszewski. Intra-individual variation during inclined steady rate treadmill running. Res. Q Exerc. Sport. 65:184–188, 1994.
27. Pollock, M. L. Submaximal and maximal working capacity of elite distance runners. Part I: Cardiorespiratory aspects. Ann. N Y Acad Sci. 301:310–322, 1977.
28. Saunders, P. U., R. D. Telford, D. B. Pyne, R. B. Cunningham, C. J. Gore, A. G. Hahn, et al. Improved running economy in elite runners after 20 days of simulated moderate-altitude exposure. J. Appl. Physiol. 96:931–937, 2004.
29. Telford, R. D. Physiological assessment of the runner. In: Test Methods Manual, J. Draper, B. Minikin, and R. D. Telford (Eds.). Canberra: National Sports Research Centre, 1991, p. 10.
30. Thomas, D. Q., B. Fernhall, and H. Grant. Changes in running economy during a 5-km run in trained men and women runners. J. Strength Cond. Res. 13:162–167, 1999.


©2004 The American College of Sports Medicine