Original Research

Reliability of the Ekblom Soccer-Specific Endurance Test

Williams, Morgan D1,2; Wiltshire, Huw D3; Lorenzen, Christian1,2; Wilson, Cameron J1,2; Meehan, Daniel L1,2; Cicioni Kolsky, Daniel J1,2

Journal of Strength and Conditioning Research: August 2009 - Volume 23 - Issue 5 - p 1378-1382
doi: 10.1519/JSC.0b013e31819f1e6c


Introduction

Choosing an appropriate assessment to measure endurance capacity for soccer is important for the prescription of conditioning and evaluation of the player. Understanding the change in status is especially important after an extended period of rest such as the off-season break or break due to injury. Scientifically acceptable assessments of endurance for soccer players are well established (2,5,7,14,15). Laboratory-based assessments involving gas and blood analysis permit accurate measures of aerobic capacity and anaerobic threshold but are expensive and time consuming. In particular, maximal oxygen uptake (V̇O2max), the gold standard measure of aerobic power, is criticized for lacking sensitivity in identifying intraseasonal training status changes in professional soccer players (8). Moreover, treadmill running used in laboratory-based assessments lacks specificity to overground running (13). Field-based performance tests may, therefore, offer more appropriate assessments of soccer-specific endurance and on-field performance.

The 20-m multistage fitness test (20 m MSFT) is the most established and widely published field assessment of aerobic capacity (10). The test involves running 20-m shuttles, in which the velocity is increased incrementally until the participant fails to keep up with the “pace” indicated by a beep. This assessment has some appealing qualities for the evaluation of soccer players: large groups can be tested at once with minimal equipment; testing can be performed indoors, therefore standardizing the environment; the area required is small compared with a soccer field; the test involves a change of direction every 20 m; and data are widely available from other soccer squads at all levels of competition for comparisons.

Since the introduction of the 20 m MSFT, a similar assessment called the Yo-Yo test has been introduced that has been specifically adapted for soccer (2). The Yo-Yo test consists of 20-m shuttle runs similar to the 20 m MSFT but is different because after two 20-m shuttles, a 10-second recovery run over a 5-m distance is included, reflecting the intermittent nature of soccer (8). This intermittent aspect of the Yo-Yo test is more specific to soccer demands, and the recovery allows the test to be performed at greater running velocities compared with the 20 m MSFT. Furthermore, the Yo-Yo test has 2 maximal protocols available depending on the level of athlete, and a measure of recovery can be obtained as well as endurance performance (9). The most appealing option for the Yo-Yo test, however, is a submaximal version, which permits in-season testing with minimal disruption to match preparation (8).

As a measurement tool, the Yo-Yo test has been well supported within the literature by a high level of reliability (r = 0.98). In addition, during a precompetition period, soccer players (55 mL·kg⁻¹·min⁻¹) improved Yo-Yo test performance by 25% (from 1760 to 2211 m) and V̇O2max by 7% (from 55 to 59 mL·kg⁻¹·min⁻¹). High-intensity running covered by the players during games was also correlated with the Yo-Yo test performance but not with V̇O2max (8). Both these findings support the assertion from the recent reviews that the Yo-Yo test is more sensitive than V̇O2max in evaluating soccer players' specific endurance capacity. In addition, the Yo-Yo test was found to distinguish between positions played within the team structure, showing fullbacks and midfielders covering greater distances during matches than central defenders and forwards.

Despite both the 20 m MSFT and Yo-Yo tests involving running and changing direction, other forms of locomotion are omitted. Running backwards, sideways, slalom running, and jumping are also important forms of locomotion in soccer. Multidirectional running places additional physiological demands beyond those experienced in the aforementioned assessments (1,3,12). Furthermore, these other forms of locomotion may be overlooked in periodized models developed for early preseason and rehabilitation conditioning. During these periods, the program goal is targeted at the development of an aerobic base, and success may be assessed by field tests such as the 20 m MSFT or the Yo-Yo test. An assessment that incorporates a greater variety of locomotion modes and activities specific to soccer may be a more accurate determinant of “readiness” for competition.

The Hoff test is one such assessment (5): it includes a variety of locomotion modes, and the participant completes as many laps of the circuit as possible with a ball. It has been shown to be highly reliable (r = 0.96) and, unlike the Yo-Yo test, the Hoff test has correlated well with measures of V̇O2max (5). However, players are often reluctant to perform a test that involves running with a ball for a long duration. Furthermore, dribbling a ball for such extended times is not specific to soccer matches, because running in possession of the ball during match play is limited to brief moments. An alternative assessment, which is also performed on a soccer pitch, is the Ekblom endurance test (4). It is organized as a time trial circuit in which participants complete 4 laps as quickly as possible. The advantage of this test compared with other assessments of endurance capacity is the incorporation of sport-specific movements, rather than simply running and changing direction, and it does not involve dribbling a ball. Since publication, however, the Ekblom endurance test has received little attention compared with the other protocols. Sensitivity to changes in training status was reported in the original Ekblom paper that assessed a semiprofessional team (n = 11). The mean time to complete the circuit was found to decrease from 10 minutes 21 ± 43 seconds in preseason to 8 minutes 49 ± 24 seconds during the in-season training cycles (4). No test-retest reliability was, however, provided for the Ekblom endurance test, which may help explain its lack of popularity and the reluctance of researchers and coaches to adopt it. The aim of this study was to explore and quantify measurement reliability of the Ekblom endurance test for a university representative soccer team.

Methods

Experimental Approach to the Problem

To quantify the reliability of the Ekblom endurance test, experienced university soccer players completed the circuit on 3 separate occasions. Test-retest reliability analysis was performed using the outcome measure (time to complete the circuit). It was anticipated that, once quantified, the findings would provide useful information for the practical application and administration of the test.

Subjects

The study was approved by the University's Ethics Committee, and the participants volunteering for the trials provided an informed consent statement before testing. In addition, before each testing session (trial), participants declared they were physically able to perform the tests and were free of injury. The participants (n = 19; age = 20.5 ± 2.5 years; mass = 80.4 ± 9.8 kg; and stature = 179.0 ± 6.0 cm) were experienced (more than 6 years of competitive soccer) university representative soccer players, who had just completed a full season. Players performed the Ekblom test on 3 separate occasions, all within 10 days of the first trial and with a minimum of 48 hours of rest between trials. No activity embargo was placed upon the participants; however, they were encouraged not to exercise excessively. Regular pre- and post-training diet, which included hydration practices, was also recommended. It was assumed that the time between trials was sufficient for recovery, but short enough that fitness/preparedness would not change. The collected data were then used to establish measurement reliability.

Testing for the study was performed outdoors, on the same astro-turf in springtime; although the climatic conditions were not recorded, the weather was similar on all days. That is, no testing was performed under adverse winds, temperature, or precipitation. All participants were familiar with the surface, having trained on it regularly, and all wore specifically designed soccer footwear for astro-turf.

Before commencing the test, participants followed a standardized warm-up that included 10 minutes of running drills, dynamic stretching of the lower limbs, and 2 submaximal attempts around the circuit (Figure 1). The aim of the test was to complete 4 circuits in the shortest time possible (total distance = 1905 m), changing the locomotion modes at the specified areas in the circuit. Color-coded cones (height = 15 cm) were placed throughout the circuit to guide the participants. Poles (height = 150 cm) were used for the slalom running and prevented the participants from cutting corners. Over the trials, consistency of cone and pole placement was achieved using the soccer field markings and measured distance from the landmarks.

Figure 1: The Ekblom endurance test circuit (4).

In accordance with the original protocol (4), 5 players were tested together with 15-second intervals in their start times. A stopwatch was used to measure time to complete the 4 laps, measured to the nearest whole second. The nonparticipating members of the squad were also positioned strategically around the circuit to ensure that the correct path was followed.

Statistical Analyses

All data analyses were performed using SPSS for Windows (version 14.0, SPSS, Inc., Chicago, IL). Relevant data sets were assessed for normal distribution using the Shapiro-Wilk test (p ≤ 0.05). Test-retest reliability was assessed using the format outlined by Weir (16). This first included performing an analysis of variance with repeated measures (ANOVA-RM) to determine if a detectable bias occurred between test-retest trials (p ≤ 0.05). When the p value was above 0.05 but below 0.10, the result was treated with caution. When this occurred, to help protect against making a type II error, the partial eta squared (ηp²), an estimate of the effect size, was examined in conjunction with the mean difference and 95% confidence intervals to evaluate if the differences between trials were trivial. “Learning to set an effective pace” was anticipated to cause a reduction in time from the participant's first to second attempt (systematic bias). In the case of a systematic bias, trials 2 and 3 would then be evaluated. Otherwise, when no systematic bias was observed, only data from trials 1 and 2 would be used.
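As an illustration of the mean difference and 95% confidence interval check described above, the following is a minimal sketch rather than the authors' SPSS procedure. The function name `paired_difference_ci`, the five-player data set, and the hard-coded critical t value are all assumptions made for the example; for the 19 players in this study the two-tailed 95% critical value would be approximately 2.101 (18 degrees of freedom).

```python
import math
import statistics


def paired_difference_ci(trial_a, trial_b, t_crit):
    """Return the mean paired difference (a - b) and its confidence limits.

    t_crit is the two-tailed critical t value for n - 1 degrees of freedom.
    It is passed in because the standard library has no t distribution.
    """
    diffs = [a - b for a, b in zip(trial_a, trial_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff - t_crit * se, mean_diff + t_crit * se


# Hypothetical circuit times (seconds) for five players, NOT study data.
trial_1 = [552, 530, 575, 541, 560]
trial_2 = [549, 529, 571, 540, 556]
T_CRIT_DF4 = 2.776  # two-tailed 95% critical t for 4 degrees of freedom

mean_diff, lower, upper = paired_difference_ci(trial_1, trial_2, T_CRIT_DF4)
print(f"mean difference = {mean_diff:.1f} s, 95% CI {lower:.1f} to {upper:.1f} s")
```

An interval that excludes or only just reaches zero, as in this made-up example, is the kind of result that would prompt a closer look for systematic bias even when the ANOVA p value is slightly above 0.05.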

To determine test-retest reliability, an evaluation of the relationship between the 2 measures was performed using a 3,1 intraclass correlation (ICC) model. Calculated using equation 1, this model evaluates the random error component only and does not include variance associated with systematic bias (16):

$$\mathrm{ICC}_{3,1} = \frac{MS_S - MS_E}{MS_S + (k - 1)\,MS_E} \qquad (1)$$

where $MS_S$ is the subject's mean square, $MS_E$ is the error mean square, and $k$ is the number of trials.
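The 3,1 model can be sketched in a few lines. This example assumes the standard ICC(3,1) form described by Weir (16), (MS_S − MS_E) / (MS_S + (k − 1) MS_E), with the mean squares taken from a two-way (subjects × trials) ANOVA. The function name `icc_3_1` and the data are hypothetical and are not the study's data or software.

```python
def icc_3_1(scores):
    """ICC(3,1) from a table with one row per participant and one column per trial.

    Returns (icc, ms_subjects, ms_error).
    """
    n = len(scores)       # participants
    k = len(scores[0])    # trials
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA partition of the total sum of squares
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_trials = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_trials

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    icc = (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)
    return icc, ms_subjects, ms_error


# Hypothetical times (seconds) for five players over two trials, NOT study data.
times = [[549, 552], [529, 527], [571, 570], [540, 543], [556, 554]]
icc, ms_s, ms_e = icc_3_1(times)
print(f"ICC(3,1) = {icc:.3f}, MS_E = {ms_e:.2f}")
```

Because the between-trial sum of squares is removed before the error term is formed, a constant shift between trials (for example, every player improving by the same amount) leaves this ICC at 1.0, which is why systematic bias has to be checked separately.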

To estimate the expected trial-to-trial “noise,” the SEM was calculated using the square root of the error mean square:

$$\mathrm{SEM} = \sqrt{MS_E}$$

The SEM indicates the random variation in a subject's score or values across repeated measures after any shifts in the mean have been taken into account (6). In addition, to complement the SEM, the smallest worthwhile change (SWC) was reported. The SWC can be determined by rearranging Cohen's d effect size calculation, where the smallest worthwhile effect (0.2) is multiplied by the between-subject SD (11). It can be interpreted as the minimal change that is required in a test performance before a coach can be confident that a real change has occurred. By comparing the SWC and SEM, the sensitivity of the test can be determined using the thresholds proposed by Liow and Hopkins (11). When the SEM is lower than the SWC, the ability of the test to detect a change is “good”; if the SEM is equal to the SWC, then the ability of the test is “satisfactory”; and when the SEM is greater than the SWC, the test is rated “marginal.” Before reporting the relevant data in the units of measurement, heteroscedasticity was assessed using a zero-order correlation between the absolute residuals and the predicted scores for each participant.

Finally, to help predict a meaningful change for future testing (e.g., post-training phase), an estimate of the true score (TS) and the standard error of prediction (SEP) was determined (equations 2 and 3):

$$TS = \bar{X} + \mathrm{ICC}\,(d) \qquad (2)$$

$$SEP = SD\sqrt{1 - \mathrm{ICC}^{2}} \qquad (3)$$

where $\bar{X}$ is the grand mean, SD is the standard deviation around $\bar{X}$, and $d$ is the obtained score − $\bar{X}$. The SEP was reported at a 95% confidence interval (SEP95%), which required the SEP to be multiplied by 1.96.

Results

No significant difference (F(1,18) = 4.119, p = 0.057, ηp² = 0.186) in the time to complete the circuit was found between trial 1 (549 ± 26 seconds) and trial 2 (547 ± 26 seconds). The estimated effect size (ηp²), however, was interpreted as being more than trivial, with almost 20% of the variation explained by the trials. Furthermore, the mean difference (2 seconds) and 95% confidence intervals (0 to 5 seconds) suggested a likely systematic bias. Therefore, the time to complete trial 1 was removed from the analyses, and trial 2 data were compared with trial 3 data (548 ± 27 seconds). In the subsequent analysis (F(1,18) = 0.740, p = 0.401, ηp² = 0.039), the p value increased, whereas the estimated effect size decreased; therefore, the risk of making a type II error was reduced compared with the previous trial 1 vs. trial 2 comparison. Further support justifying the removal of trial 1 data from the reliability analyses was evidenced by the mean difference (0 seconds) and 95% confidence intervals (−3 to 1 seconds) between trials 2 and 3.
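The reported partial eta squared values can be cross-checked from the F ratios and their degrees of freedom. This is a minimal sketch that uses the general identity partial η² = (F × df_effect) / (F × df_effect + df_error). The function name is illustrative, and this is a consistency check, not the SPSS output itself.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios and degrees of freedom as reported in the text above
trial_1_vs_2 = partial_eta_squared(4.119, 1, 18)
trial_2_vs_3 = partial_eta_squared(0.740, 1, 18)
print(round(trial_1_vs_2, 3), round(trial_2_vs_3, 3))  # 0.186 0.039
```

Both values agree with those reported, so the "almost 20% of the variation" interpretation for trials 1 and 2 follows directly from the F ratio.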

Residual data, for the trial 2 and trial 3 comparison, were normally distributed (Shapiro-Wilk p = 0.574), and no heteroscedasticity was found (n = 19, r = 0.070, p = 0.775).

From the reliability analyses (3,1 ICC = 0.983, SEM = ±3 seconds, SWC = 5 seconds, SEP95% = ±9 seconds), a high level of measurement reliability was observed and the sensitivity of the test to monitor changes was “good.”
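The “good” rating can be retraced from the reported figures. The sketch below assumes that the between-subject SD used for the SWC was the 26 seconds reported for trial 2 (using the 27 seconds of trial 3 gives the same rounded result). The function names are illustrative only.

```python
import math


def smallest_worthwhile_change(between_subject_sd, smallest_effect=0.2):
    """SWC = smallest worthwhile effect (Cohen's d of 0.2) x between-subject SD."""
    return smallest_effect * between_subject_sd


def rate_sensitivity(sem, swc):
    """Thresholds as described in the Statistical Analyses section (11)."""
    if math.isclose(sem, swc):
        return "satisfactory"
    return "good" if sem < swc else "marginal"


swc = smallest_worthwhile_change(26)   # assumed SD of 26 s; about 5 s
rating = rate_sensitivity(3, swc)      # reported SEM of 3 s
print(f"SWC = {swc:.1f} s, sensitivity = {rating}")
```

With an SEM of 3 seconds against an SWC of roughly 5 seconds, the typical trial-to-trial noise is smaller than the smallest change worth detecting, which is what the “good” label expresses.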

Discussion

In this study, the Ekblom test was shown to be highly reliable and sensitive enough to detect change. To our knowledge, this is the first time such an evaluation of this soccer-specific endurance field test has been reported. The absence of support within the literature, combined with the test's superior specificity compared with more established assessments, underpinned the rationale for this study. The inclusion of a variety of soccer-specific movements gave rise to the potential for poor measurement repeatability. Some evidence of improved “pace” strategy was found, in that times to complete the circuit decreased from trial 1 to trial 2. After the removal of trial 1 data, the analysis was performed on trials 2 and 3, where no further evidence of systematic bias was observed. From these findings and the additional reliability analysis, it was interpreted that test familiarization is an important consideration when first introducing the Ekblom test and that, after a familiarization trial, the inclusion of additional movements was not detrimental to the integrity of the test. Further research is required to confirm these findings and assess the individual locomotion modes throughout the test. This can be achieved by using timing gates placed throughout the circuit and an assessment of jump performance at the 3 locations around the circuit. Future research may also be directed toward other populations (e.g., female players) and exploring the relationship between the Ekblom test and more established measures of endurance.

Within the constraints of this study, the findings are promising but are limited to the specific population sample (university soccer players) and the identical circuit dimensions used within the study. An inherent problem with using a soccer field for the circuit is that pitch dimensions are not standardized: regulations allow pitches to vary in both length and width. Furthermore, the Ekblom test cannot be easily transported to other locations with teams that travel to training camps or competitions for extended periods. If portability were a required quality of the assessment, Yo-Yo tests (8,9) would be more accessible and feasible. In summary, the Ekblom test offers a repeatable and sensitive monitoring tool to assess the soccer-specific endurance of university players.

Practical Applications

For strength and conditioning coaches considering using the Ekblom test, a similar reliability study to the one presented is encouraged. This will provide indices of reliability specific to the population and soccer pitch dimensions/surfaces that are available to them. The transfer of data presented herein to other populations is, however, somewhat limited and discouraged. The reliability analyses previously described (11,16) and used in the present study may be helpful to the strength and conditioning coach. Previously, reliability of soccer-specific endurance has been established by relationships between test-retest scores alone. The analyses presented here comprised an assessment of systematic bias (ANOVA-RM), which in this study identified the need for test familiarization; estimates of the random error component of the measures (SEM); an evaluation of test sensitivity to detect the smallest worthwhile change (SWC); and a prediction of the change an individual would need to show in subsequent measures of the test for the strength and conditioning coach to be confident that a “real” change had occurred (TS ± SEP95%). To help interpret the findings from this study, a participant who took 499 seconds to complete the circuit in trial 3 in this study would require a time outside of 491-509 seconds to be (95%) confident that a change (either positive or negative) in score was due to a change in specific endurance fitness and not error alone. This was determined using equations 2 and 3, given that the grand mean ($\bar{X}$) was 547 ± 26 seconds.
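The 491-509 second interval can be reproduced with a short calculation. This sketch assumes that equations 2 and 3 take the form given by Weir (16), TS = grand mean + ICC × (obtained score − grand mean) and SEP = SD × √(1 − ICC²). It also assumes that TS and SEP95% are each rounded to whole seconds before being combined, matching the whole-second values reported in the Results. The function names are illustrative.

```python
import math


def true_score(obtained, grand_mean, icc):
    """Estimated true score: the obtained score regressed toward the grand mean."""
    return grand_mean + icc * (obtained - grand_mean)


def standard_error_of_prediction(sd, icc):
    """SEP = SD * sqrt(1 - ICC^2)."""
    return sd * math.sqrt(1.0 - icc ** 2)


def change_bounds(obtained, grand_mean, sd, icc, z=1.96):
    """Whole-second limits outside which a retest time suggests a real change."""
    ts = round(true_score(obtained, grand_mean, icc))
    sep95 = round(z * standard_error_of_prediction(sd, icc))
    return ts - sep95, ts + sep95


# Values reported in this study: ICC = 0.983, grand mean = 547 s, SD = 26 s
bounds = change_bounds(499, 547, 26, 0.983)
print(bounds)  # (491, 509)
```

A coach applying the same approach to another squad would substitute that squad's own grand mean, SD, and ICC, in line with the recommendation above not to transfer these figures to other populations.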

References

1. Bangsbo, J. Energy demands in competitive soccer. J Sport Sci 12: S5-S12, 1994.
2. Bangsbo, J. Fitness Training in Football: A Scientific Approach. Bagsvaerd, Denmark: HO+Storm, 1994.
3. Bangsbo, J. The physiology of soccer with special reference to intense intermittent exercise. Acta Physiol Scand S619: 1-155, 1994.
4. Ekblom, B. A field test for soccer players. Sci Football 1: 13-15, 1989.
5. Hoff, J. Training and testing physical capacities for elite soccer players. J Sport Sci 23: 573-582, 2005.
6. Hopkins, WG. Measures of reliability in sports medicine and science. Sport Med 30: 1-15, 2000.
7. Impellizzeri, FM, Rampinini, E, and Marcora, SM. Physiological assessment of aerobic training in soccer. J Sport Sci 23: 583-592, 2005.
8. Krustrup, P, Mohr, M, Amstrup, T, Rysgaard, T, Johansen, J, Steensberg, A, Pedersen, PK, and Bangsbo, J. The Yo-Yo intermittent recovery test: Physiological response, reliability, and validity. Med Sci Sports Exerc 35: 697-705, 2003.
9. Krustrup, P, Mohr, M, Nybo, L, Majgaard, JJ, Jung, N, and Bangsbo, J. The Yo-Yo IR2 test: Physiological response, reliability, and application to elite soccer. Med Sci Sports Exerc 38: 1666-1673, 2006.
10. Leger, LA, Mercier, D, Gadoury, C, and Lambert, J. The multistage 20 metre shuttle run test for aerobic fitness. J Sport Sci 6: 93-101, 1988.
11. Liow, DK and Hopkins, WG. Velocity specificity of weight training for kayak sprint performance. Med Sci Sports Exerc 35: 1232-1237, 2003.
12. Reilly, T. Physiological profile of the player. In: Football (soccer). Ekblom, B, ed. London, UK: Blackwell, 1994. pp. 78-95.
13. Schache, AG, Blanch, PD, Rath, DA, Wrigley, TV, Starr, R, and Bennell, KL. A comparison of overground and treadmill running for measuring the three-dimensional kinematics of the lumbo-pelvic-hip complex. Clin Biomech 16: 667-680, 2001.
14. Stølen, T, Chamari, K, Castagna, C, and Wisløff, U. Physiology of soccer: An update. Sports Med 35: 501-536, 2005.
15. Svensson, M and Drust, B. Testing soccer players. J Sport Sci 23: 601-618, 2005.
16. Weir, JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res 19: 231-240, 2005.

Key Words: field testing; endurance capacity; intraclass correlation; smallest worthwhile change; standard error of measurement; standard error of prediction

© 2009 National Strength and Conditioning Association