Original Research

The Reliability of Functional Movement Screening and In-Season Changes in Physical Function and Performance Among Elite Rugby League Players

Waldron, Mark1; Gray, Adrian1; Worsfold, Paul2; Twist, Craig2

The Journal of Strength & Conditioning Research: April 2016 - Volume 30 - Issue 4 - p 910-918
doi: 10.1519/JSC.0000000000000270

Functional movement screening (FMS) is advocated as a practical and objective tool to assess the “fundamental movement characteristics” of athletes (3). The entire screening process is categorized into 7 different movements, designed to reflect the basic positions and movement patterns required for whole-body function (3). In the FMS protocol, fundamental movements are characterized by the simultaneous requirement of motion, stability, and balance, which are thought to be underpinned by muscle strength, flexibility, range of motion, coordination, and proprioception (9). The quality of the participant's movement is subjectively assessed by an observer according to specific criteria and is proposed to detect the functional limitations and asymmetries of the participant (3). The scores are placed on an ordinal scale of measurement, ranging from 0 to 3. A score of 0 indicates pain on movement, a score of 1 indicates an incomplete movement, a score of 2 indicates a compensated movement, and a score of 3 indicates that the movement was fully completed (3). Individual scores are summed to determine an accumulated FMS score.
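The summation described above can be illustrated with a minimal Python sketch. The function and test names are hypothetical, and the rule that a bilateral movement contributes the lower of its right and left scores is assumed from the standard FMS scoring guidelines rather than stated in this article.

```python
# Illustrative sketch only: names are hypothetical, and the "lower of the two
# sides" rule for bilateral tests is an assumption taken from the standard FMS
# guidelines, not from this article.
UNILATERAL = ("squat", "push_up")
BILATERAL = (
    "hurdle_step",
    "lunge",
    "shoulder_mobility",
    "active_straight_leg_raise",
    "rotary_stability",
)


def _check(score):
    """Validate that a score lies on the 0-3 ordinal scale."""
    if score not in (0, 1, 2, 3):
        raise ValueError(f"FMS scores must be 0, 1, 2, or 3, got {score!r}")
    return score


def accumulated_fms_score(scores):
    """Sum the 7 movement scores into an accumulated score out of 21.

    `scores` maps each unilateral test to a single score and each bilateral
    test to a (right, left) pair.
    """
    total = sum(_check(scores[name]) for name in UNILATERAL)
    for name in BILATERAL:
        right, left = scores[name]
        total += min(_check(right), _check(left))
    return total


example = {
    "squat": 2,
    "hurdle_step": (2, 3),
    "lunge": (3, 3),
    "shoulder_mobility": (3, 2),
    "active_straight_leg_raise": (1, 2),
    "push_up": 3,
    "rotary_stability": (2, 2),
}
print(accumulated_fms_score(example))  # 15
```

Under this assumed rule, the 12 individual scores reduce to 7 movement scores, which is why the accumulated score has a maximum of 21.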

Because FMS relies on the judgment of the observer, it might be anticipated that factors such as experience and qualification will influence the interpretation of the movement pattern (17). However, Minick et al. (11) reported weighted kappa coefficients equating to either “substantial” or “excellent” agreement between novice (recently trained with FMS) and experienced observers (instructors of FMS). There was, however, an increase reported in the frequency of “moderate” agreements between the 2 novice observers, which was attributed to their lack of training with the FMS descriptors for each movement pattern. On the basis of these data, novice users can reliably use FMS to assess their athletes. Indeed, using the kappa statistic, other studies have reported “moderate” to “high” agreement between FMS testing sessions among observers of a nonspecified experience (14,16). However, the kappa statistic is a dimensionless value that prevents an interpretation of reliability within the context of its eventual use. The level of test-retest reliability that is acceptable should be considered a priori, according to the magnitude of systematic change that might occur in an individual's FMS score over time. Kiesel et al. (8) reported common mean changes in accumulated FMS score (owing to a training intervention) of less than 1, which is, in fact, an unattainable score on the ordinal scale. The ordinal scoring system (0–3) should also preclude the use of other statistical processes, such as SEM, which has been applied in previous FMS reliability analyses (14). Finally, previous studies have failed to assess the reliability of the FMS within each of the individual movement categories, which is important if erroneous interpretation of the accumulated FMS score is to be avoided. A method posited by Cooper et al. (4) provides an alternative approach for the assessment of FMS intraobserver reliability. 
This method adopts the use of a “practically important reference value” and identifies both the systematic and random error of a measurement. In the case of FMS, such an approach establishes the minimum change in an FMS score to be considered meaningful, and in turn, what level of reliability is “acceptable.”

A recent study identified American Football players with an accumulated FMS score of less than 14 of 21 as “injury-risks” (9). That is, an accumulated FMS score of <14 has been used to predict athletes losing 3 weeks or more of training time with specificity of 0.91 and sensitivity of 0.54 (9). The administration of an off-season conditioning intervention among American Football players increased the number of players with accumulated FMS scores above 14 compared with baseline values (8). Although FMS scores seem sensitive to training intervention, only poor-to-moderate relationships between accumulated FMS score and tests of physical fitness have been found among elite golfers (15) or untrained individuals (13), thus questioning the generalizability of the accumulated FMS score as a correlate of athletic ability. It is currently unknown whether FMS scores are “sensitive” to changes in various components of physical fitness that typically occur across in-season periods, particularly when there is no specific aim to improve the performance of players on the components of the FMS. During such periods, one might expect changes in the physical fitness of athletes, in response to modified training and match loads (6). However, it is not currently known whether changes in whole-body function (as determined by the FMS) will occur without deliberately following a program designed to improve such movement patterns, and indeed, whether changes in physical fitness rely on an improvement in FMS score. This is important because practitioners should be aware of the different qualities that underpin general physical fitness or specific whole-body function, as determined by the FMS.
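For readers less familiar with the terms, the sensitivity and specificity of a cut-off score can be computed as in the following sketch. The data are invented for illustration and are not taken from any cited study; the function name and the strict "less than 14" rule (following the wording above) are assumptions.

```python
def screen_accuracy(fms_scores, injured, cutoff=14):
    """Return (sensitivity, specificity) for the rule 'score < cutoff flags risk'.

    fms_scores: accumulated FMS scores, one per athlete.
    injured:    booleans, True if the athlete went on to sustain an injury.
    """
    if len(fms_scores) != len(injured):
        raise ValueError("fms_scores and injured must have the same length")
    tp = fp = tn = fn = 0
    for score, hurt in zip(fms_scores, injured):
        flagged = score < cutoff
        if flagged and hurt:
            tp += 1      # flagged and injured
        elif flagged:
            fp += 1      # flagged but not injured
        elif hurt:
            fn += 1      # injured but not flagged
        else:
            tn += 1      # neither flagged nor injured
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one injured and one uninjured athlete")
    sensitivity = tp / (tp + fn)  # proportion of injured athletes who were flagged
    specificity = tn / (tn + fp)  # proportion of uninjured athletes not flagged
    return sensitivity, specificity


# Hypothetical data for 8 athletes.
scores = [12, 13, 15, 16, 17, 13, 18, 14]
injured = [True, False, True, False, False, True, False, False]
sens, spec = screen_accuracy(scores, injured)
print(round(sens, 2), round(spec, 2))  # 0.67 0.8
```

A high specificity with a modest sensitivity, as in the values reported above, means that few uninjured athletes are flagged but a sizeable share of those who become injured are missed.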

Given that the statistical methods used to assess the test-retest reliability of FMS to date are inappropriate, the first aim of this study was to assess the real-time test-retest reliability of the FMS, adopting a nonparametric statistical approach (4). Because training interventions have improved accumulated FMS score in other contact sports (8,9), the second aim was to assess the changes in FMS score and concurrent tests of physical fitness at 3 stages of a competitive season in elite rugby league players.


Experimental Approach to the Problem

A descriptive study was undertaken to determine whether the players' whole-body function (as described by the FMS tool) changed in accordance with anticipated changes in physical fitness (speed, strength, and jump height) over 3 stages of a competitive season (preseason, midseason, and late season). Specifically, these were 6 weeks before the first competitive match of the season (January), the middle week of the season (April), and 4 weeks before the end of the season (August). At each stage, elite rugby league players were tested for speed, counter-movement jump (CMJ) height, 3 repetition maximum (3RM) full squat, and 1 repetition maximum (1RM) bench press and underwent FMS. Although the training program (below) followed by the players was aimed at increasing strength, speed, and power, there was no attempt to change their performance on the FMS. Intraobserver reliability of the FMS was assessed during the first period of data collection, comprising 2 screens, separated by 1 week. The observer was deemed to be of an intermediate standard, with 1 year of experience using the FMS and the associated descriptors.


Thirteen elite male under-19 rugby league players contracted to a professional club in England volunteered to participate in the study (age: 18.2 ± 0.5 years; body mass: 92.5 ± 12.5 kg; stature: 182.2 ± 6.0 cm). The players had an average playing experience of 8 ± 1 years and had been contracted to the professional club for the previous 5 years. The players followed a supervised training program, attending 3 sessions per week, comprising field-based aerobic training, gym-based resistance training, and small-sided rugby games and played competitively for the club. The sample was later reduced to 12 as 1 player was omitted from the study owing to injury between the start- and mid-season stages. No other participants sustained an injury that prevented them from taking part in more than 1 training session or match. During the study period, the players typically trained for 3–4 days of the week and played 1 match at the weekend (in-season only). Their training included a variety of strength, power, endurance, core stability, and skill-related exercises arranged to accommodate the demands of the season. More specifically, the players followed a periodized cycle beginning in the December preseason (focus on hypertrophy and gains in strength alongside continuous aerobic conditioning), changing to an in-season program in February (power-based activities alongside higher-intensity/sprint interval training). Hypertrophy sessions included moderate-load, high-volume resistance sessions, using compound upper- and lower-body movements. Upper- and lower-body plyometric exercises and Olympic-style lifts were incorporated into the players' program during the in-season periods, performed at lower volumes and higher intensities. Importantly, none of the exercises were aimed specifically at improving performance on the FMS alone. 
Consent was obtained from the players and their parents/guardians pursuant to law, and Institutional Board approval for the study was granted by the Faculty of Applied Health Sciences Ethics Committee.

Functional Movement Screening

The FMS procedure was performed in accordance with the guidelines of Cook et al. (3). In brief, the full screen included 12 tests (in order); the squat, hurdle step (right then left), lunge (right then left), shoulder mobility (right then left), active straight leg raise (right then left), push-up, and rotary stability (right then left) (3). In the days before the screen, each movement was demonstrated to the participants by an experienced strength and conditioning practitioner using the guidelines provided by Cook et al. (3), without any cues or suggestions relating to the quality of their movement from the researcher or observer. The movement was scored according to the criteria outlined in previous studies (11,13). Table 1 describes the instructions provided for each movement pattern and additional scoring information used by the researcher to assess the quality of the movement (see Cook et al. (3)). The equipment used was a 121.9 × 5.1 × 15.3 cm PVC measurement board with removable dowel (76.2 cm) inserts, a 121.9-cm PVC dowel, and elastic band for the hurdle-step movement. The use of the equipment is described in Table 1.

Table 1 (parts a and b): Functional movement screening instructions and additional scoring information (see Cook et al. (3)).*

The participants performed the movements twice, permitting the observer to view the athlete's movement in both the sagittal and frontal planes of motion. An identical screening procedure was administered at both the mid- and end-of-season phase, at the same time of the day (1300–2000 hours), using the same equipment. The FMS was performed by the participants at the start of a designated testing week, which was followed by the physical performance tests in the subsequent days. For consistency, the fitness tests were also performed at the same time of day (between 1600 and 1900 hours). None of the participants took part in exercise, outside of that required for completion of the study, in the 48 hours before any of the functional tests or performance measurements. As part of their support program, the players' hydration and nutritional status was regularly monitored, and if required, adjusted (weekly) by sports scientists employed by the club.

Sprinting Speed and Counter-Movement Jump Height

Both the sprinting and CMJ tests were performed on the day after the FMS procedure. The tests of sprinting speed and CMJ height were preceded by a standardized warm-up that comprised moderate intensity jogging, calisthenics, and dynamic stretching. The sprinting protocol consisted of 2 maximal sprint efforts, starting from a standing position, separated by a 3-minute recovery period. The sprinting course was marked with a premeasured (tape measure) straight painted line, on which timing gates were positioned at 10 and 40 m. At each interval, timing gate height was set at 60 cm (5). On both occasions, participants were instructed to start sprinting from 30 cm behind the first timing gate, from their preferred foot, until they reached the final cone. Split times were recorded at 10 and 40 m from a wireless receiver (Brower Timing Systems, Utah, USA) accurate to 0.01 seconds. The coefficient of variation (CV) for sprinting times over 10 and 40 m was 1.1 and 1.4%, respectively. The CMJ test was performed approximately 10 minutes after the sprinting test. Maintaining a stance at shoulder width, participants flexed their knees in a rapid downward motion, reaching approximately 90°, before rapidly extending their knees and driving in an upward motion to complete the jump. The participants performed 3 jumps with the highest jump used for analysis. Counter-movement jump height (in centimeters) was calculated from flight time (the difference between landing and take-off time), recorded using a timing mat system (Just Jump System, Probotics Inc., Huntsville, AL, USA). The CV of the CMJ height was 2.4%.
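The conversion from flight time to jump height can be sketched as follows. This assumes the standard projectile-motion relation, height = g × t² / 8, where t is the flight time; the internal computation of the timing mat is not described in this article, and the function names and example flight times are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2


def jump_height_cm(flight_time_s):
    """Estimate jump height (cm) from flight time (s) using h = g * t^2 / 8.

    Assumes take-off and landing occur in the same body position, so that the
    rise and fall each last half of the flight time.
    """
    if flight_time_s < 0:
        raise ValueError("flight time cannot be negative")
    return G * flight_time_s ** 2 / 8 * 100


def best_jump_cm(flight_times_s):
    """Return the highest of several jumps, as used for analysis in the text."""
    return max(jump_height_cm(t) for t in flight_times_s)


# Three hypothetical jumps with flight times in seconds.
print(round(best_jump_cm([0.48, 0.50, 0.49]), 1))  # 30.7
```

Because height scales with the square of flight time, small timing differences produce proportionally larger differences in the estimated height.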

Three Repetition Maximum Full Squat and 1 Repetition Maximum Bench Press

For each of the strength tests, the procedures followed those outlined for the assessment of rugby league players (1). For the full squat, the warm-up consisted of dynamic stretching and 3 light sets of 3 repetitions with progressively heavier loads, finishing 10–15 kg less than the individually prescribed goal 3RM. The participants then performed a 3RM load that was estimated in accordance with their training history and lifts during the warm-up. The participants were offered 1 additional attempt at the 3RM, whether or not the prescribed load was successfully lifted: the load was increased by 5% of the original load after a successful lift or reduced by 5% after an unsuccessful lift. For the 1RM bench press exercise, the participants followed an identical warm-up pattern using single repetitions rather than sets of 3 repetitions. For both the bench press and full squat exercise, the bar was lifted in a smooth motion (i.e., approximately 2-second eccentric, 1-second pause, and 2-second concentric action). For the bench press exercise, the bar was gripped marginally outside of shoulder width, with the feet remaining in contact with the floor and the buttocks remaining in contact with the bench. For the full squat, the participant descended eccentrically until the top of the thigh was slightly below parallel with the floor. A qualified strength and conditioning coach monitored the lifting technique of each exercise, with the heaviest load lifted recorded as the participant's final score. We have previously determined the test-retest reliability (CV%) of the bench press and full squat exercises to be 1.7 and 2.0%, respectively.

Statistical Analyses

The distributions of the data sets were checked for normality using the Shapiro-Wilk statistic, with equality of variance being assessed through Levene's test. Because violations to normality were observed (p ≤ 0.05) in the FMS data sets, changes in each of the 12 FMS scores over the season were assessed using a nonparametric Friedman test, with seasonal stage (start season, midseason, and late season) as the independent variable. Changes in physical fitness (10-m speed, 40-m speed, CMJ height, bench press, and full squat) were assessed using a series of one-way analyses of variance (ANOVAs) with repeated measures (one-way ANOVA-RM), using seasonal stage as the independent variable. Pairwise comparisons were performed post hoc using paired t-tests. Statistical significance was set at p ≤ 0.05, and analysis was performed using SPSS (SPSS Statistics for Windows, Version 19.0, Chicago, IL).
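To make the Friedman analysis concrete, the sketch below computes the tie-corrected Friedman statistic for 3 repeated measurements per player. It is an illustration rather than a reproduction of the SPSS procedure: the function names and data are hypothetical, the p-value relies on the chi-square approximation (which for 2 degrees of freedom reduces to exp(−χ²/2)), and returning p = 1 when every player's scores are identical across stages is a convention assumed here.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for idx in order[start:end + 1]:
            ranks[idx] = mean_rank
        start = end + 1
    return ranks


def friedman_three_stages(rows):
    """Tie-corrected Friedman test for (pre, mid, late) scores per player.

    Returns (chi_square, p_value) using the chi-square approximation, df = 2.
    """
    k = 3
    n = len(rows)
    if n == 0:
        raise ValueError("at least one player is required")
    rank_sums = [0.0] * k
    ties = 0
    for row in rows:
        row = tuple(row)
        if len(row) != k:
            raise ValueError("each player needs exactly 3 scores")
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
        # Tie term: t^3 - t for each group of t tied scores within a player.
        ties += sum(row.count(v) ** 3 - row.count(v) for v in set(row))
    correction = 1 - ties / (n * k * (k * k - 1))
    if correction == 0:
        # Every player scored identically at all stages (assumed convention).
        return 0.0, 1.0
    raw = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    chi2 = max(0.0, raw / correction)
    return chi2, math.exp(-chi2 / 2)


# Hypothetical scores for one FMS component, unchanged across the season.
unchanged = [(2, 2, 2)] * 12
print(friedman_three_stages(unchanged))  # (0.0, 1.0)
```

With ordinal scores that rarely change, most within-player ranks are tied, which is why the tie correction matters for data of this kind.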

Intraobserver test-retest reliability was assessed using the nonparametric statistical technique of Cooper et al. (4). First, this technique required that the presence of bias between the test and retest trials of the observer was checked through a median sign test. Second, the degree of random variation between trials was evaluated by calculating the percentage of agreement and associated 95% confidence intervals (CIs) between trials inside a “practically important” reference value (12). In the context of FMS among elite team sport athletes, previous studies have demonstrated common mean changes of less than 1 in each movement component after a preseason training intervention program (8). Such changes contributed to a significant increase in the composite FMS score of the athletes, rendering them as lower injury risks (8). As such, small (1 or less) changes in each FMS component require identification using the FMS protocol. Therefore, any variability (error) between repeated FMS measurements exceeding 1 would mean that any potential change (perhaps owing to a training intervention) is undetectable. Accordingly, a reference value of “perfect agreement” (zero difference between observations) was deemed as “practically important” for each movement being assessed. For demonstrative purposes, a secondary (hypothetical) reference value of ±1 (a difference of one in either direction) was also set. An error of ±1 was adopted because this is the smallest possible error that can be made on the 0–3 ordinal scale. It was anticipated that adopting a more tolerant reference value of ±1 would highlight the margins within which the FMS components might be considered reliable.
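The two steps of this reliability analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and scores are hypothetical, the sign test drops zero differences, and the Wilson score interval is used for the 95% CI because the interval method is not specified in this article.

```python
import math


def sign_test_p(test, retest):
    """Two-sided exact sign test for bias between paired trials.

    Pairs with no difference are dropped, as is usual for the sign test.
    """
    if len(test) != len(retest):
        raise ValueError("test and retest must have the same length")
    diffs = [b - a for a, b in zip(test, retest) if b != a]
    n = len(diffs)
    if n == 0:
        return 1.0  # no differences at all, so no evidence of bias
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def agreement_within(test, retest, reference=0, z=1.96):
    """Percentage of pairs differing by no more than `reference`.

    Returns (percent, lower, upper); the CI uses the Wilson score interval,
    which is an assumption of this sketch.
    """
    if len(test) != len(retest) or not test:
        raise ValueError("test and retest must be non-empty and equal in length")
    n = len(test)
    agree = sum(abs(b - a) <= reference for a, b in zip(test, retest))
    p = agree / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return 100 * p, 100 * max(0.0, centre - half), 100 * min(1.0, centre + half)


# Hypothetical scores for one FMS component in 12 players, 1 week apart.
trial_1 = [2, 3, 2, 1, 2, 3, 2, 2, 3, 2, 1, 2]
trial_2 = [2, 3, 2, 2, 2, 3, 2, 2, 3, 2, 1, 2]
print(sign_test_p(trial_1, trial_2))                  # 1.0
print(agreement_within(trial_1, trial_2, 0)[0])       # perfect agreement, in %
print(agreement_within(trial_1, trial_2, 1)[0])       # 100.0
```

In this invented example a single one-point disagreement lowers perfect agreement below 100% while leaving agreement within ±1 at 100%, which mirrors the contrast between the two reference values described above.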


Intraobserver Reliability of the Functional Movement Screening

There was no systematic bias (p > 0.05) found between trials for the scoring of any FMS component. Within the practically important reference value of “perfect agreement,” the observer demonstrated 100% agreement in all tests apart from the left lunge, left leg raise, and the right leg raise. In the worst case, the left and right leg raises showed agreements of 88.3%, with 95% CIs ranging between 65.6 and 100% (Table 2).

Table 2: The intraobserver reliability of the functional movement screening components.*

Seasonal Changes in Functional Movement Screening and Physical Performance

There were no effects of season stage on any of the FMS components (p > 0.05; Table 3). Similarly, there was no change (p > 0.05) in the accumulated FMS score across the season (preseason = median 14, 95% CI = 14–18; midseason = median 14, 95% CI = 14–18; late season = median 14, 95% CI = 14–18).

Table 3: Intraseason changes in FMS scores in rugby league players (medians and interquartile range).*

The ANOVA-RM demonstrated differences across the season for 10-m sprint time (F(1,11) = 3719, p < 0.001), 40-m sprint time (F(1,11) = 4410, p < 0.001), CMJ height (F(1,11) = 750, p < 0.001), 3RM squat (F(1,11) = 467, p < 0.001), and 1RM bench press (F(1,11) = 335, p < 0.001). Post hoc paired t-tests revealed an improvement between pre- and both mid- and late-season periods in every performance test (Table 4). For each of the outcome measures, between 83% (n = 10) and 100% (n = 12) of the participants showed a descriptive improvement in their fitness, reflecting a consistency in their response to training across the season. There were no differences (p > 0.05) found between midseason and late season in any performance test, which was the case for all the participants.

Table 4: Intraseason changes in physical performance tests of fitness.*†


Our findings have demonstrated that the majority of FMS components, in relation to a “practically relevant” analytical goal, can be considered reliable. Such findings reaffirm the conclusions of previous studies (11), while adopting the use of a more appropriate nonparametric statistical technique. In particular, the squat, hurdle step (right and left), lunge (left), shoulder mobility (right and left), push-up, and rotary stability (right and left) demonstrated 100% perfect agreement between trials, with a population range (CIs) showing no potential for random error in these measurements. The level of reliability shown in the aforementioned tests would permit the detection of changes in FMS components of the magnitude (i.e., <1) previously reported by Kiesel et al. (8). Although there was no systematic bias between trials for any FMS component (Table 2), there was more random error found for the active leg raise (right and left) and the right-sided lunge. Although 100% agreement was demonstrated within the hypothetical tolerance of ±1 in the above tests, perfect agreement was marginally poorer, ranging from 88.3 to 91.6% (Table 2). Based on the 95% CIs, agreement for the active leg raise (both left and right) could range from as low as 65.6% to as high as 100% if the tests were to be performed again. In the context of the current analysis, an agreement of 65.6% would result in the hamstring flexibility of 34.4% of athletes being misinterpreted. It is for practitioners to recognize the implications of such findings when working with athletes and to consider the acceptability of this error range. However, as we have demonstrated, it is unlikely that an error range of ±1 would be acceptable for team sport practitioners. As such, some caution is warranted when using the active leg raise tests in conjunction with the FMS protocol. The poorer reliability in the active leg raise components might be a consequence of insufficient prewarming of the muscle group before the assessment. 
Incidentally, there is no standard warm-up advocated before the FMS assessment, which may also influence the scoring of this test. Within the hypothetical practical reference value of greater tolerance (±1), 100% agreement was found for every FMS component. Such findings highlight the use of a priori reference values that suit the context of eventual use. For example, potential users of the FMS who are less concerned with reaching perfect agreement and are willing to incorporate an error of ±1 into their assessment can be assured of the credibility of the FMS to detect changes of a larger magnitude.

The improvements in speed, strength, and jump height between the start and middle of the season are similar to those reported among rugby league players of a similar age and ability (7). Such changes reflect an acute neuromuscular adaptation (7), as well as increases in the contractile fibers of the relevant muscle groups (1), that would support an increase in strength and power. In this study, each parameter of fitness remained the same between the mid- and late-season periods (Table 4). This plateau was anticipated among the participants owing to the reductions in training load that are typical of a mid- to late-season transition (6). Given that the physical fitness of the participants changed markedly from the start to the middle of the season, one might expect concomitant changes in their FMS scores. However, no changes (either increase or decrease) in FMS components were found across the entire season (Table 3). This finding suggests either that the qualities suggested to underpin the FMS tests (muscle strength, flexibility, range of motion, coordination, and proprioception) do not contribute to performance in the physical fitness tests or that FMS scoring is not sensitive to these changes. Alternatively, it has been suggested that athletes can enhance athletic ability, independent of proper function, through compensatory adaptations (3). The lack of change in the FMS score in the presence of improvements in general physical fitness over the course of a competitive season has highlighted that the FMS tool should not be confused as a marker of athletic ability. Such findings are in agreement with previous studies that have demonstrated no relationship between physical performance and FMS scores among elite golfers (15). Although it is not our contention that the FMS tool should be used to measure athletic ability, it is interesting that changes in strength, power, and speed are not coupled with changes in functional movement. 
In this regard, our findings have 2 potential implications. First, it is possible that the widely adopted 21-point scale lacks sensitivity to changes in whole-body function, and thus physical performance. The recent development of a 100-point FMS protocol (2) might be a useful, and perhaps more sensitive, method of identifying changes in function. Second, in accordance with the view of the developers of the FMS tool (3), our findings demonstrate that physical performance and physical function are separate constructs.

A 12-week off-season training intervention among elite American Football players helped to increase the cumulative FMS score, leading to the attainment of scores beyond the injury threshold of “14” on the 21-point scale (8). In this study, there were no such changes in the accumulated FMS score across the 3 season periods. Kiesel et al. (8) designed individualized training intervention programs, specifically focusing on improving the movement patterns of the FMS that were deficient at baseline. In contrast, a program designed to improve rugby league performance, rather than the FMS patterns alone, does not induce the same changes in the FMS. Potential users should be aware that a sport-specific training program that does not focus on deficient components of the FMS, yet encompasses various core stability and flexibility regimes (among others), will not improve performance in the FMS. Indeed, Kiesel et al. (8) suggested that the lack of a control group (i.e., not undergoing FMS-specific training) was a limitation of their study. It is possible that FMS-specific training programs condition the athlete to perform simple, yet novel, movements over a prolonged period of time. Future research should consider evaluating the relationship between FMS and injury incidence among rugby league players, which was not an aim of this study. Such an analysis is required to ratify the FMS, and in turn programs aimed to improve FMS performance, as an indicator of the well-known injury risk in rugby league (10).

Practical Applications

Within a reference value of “practical importance,” the FMS components can be reliably administered to elite rugby league players. Some caution is warranted with the active leg raise components, dependent on the degree of error that can be tolerated by the user. Although the FMS is a reliable measure, our findings provide evidence that systematic changes in athletic ability can be developed independent of changes in movement function. Such findings demonstrate the apparent differences between physical performance and physical function and should encourage practitioners to separately assess each construct. The important finding for rugby league practitioners is that improvements in speed, strength, and power across a competitive season do not rely on changes in FMS score, and without specific focus on developing the FMS movement patterns, FMS scores will not improve. Our findings should not necessarily deter practitioners from using the FMS but should prompt them to question the specific qualities that are being assessed through its administration, as well as highlighting its poor relationship to improvements in selected parameters of fitness.

References


1. Baker D, Nance S. The relation between strength and power in professional rugby league players. J Strength Cond Res 13: 224–229, 1999.
2. Butler RJ, Plisky PJ, Kiesel KB. Inter-rater reliability of videotaped performance on the functional movement screen using the 100-point scoring scale. Athl Train Sport Health Care 4: 103–109, 2012.
3. Cook G, Burton L, Kiesel K, Rose G, Bryant MF. Movement: Functional Movement Systems: Screening, Assessment, and Corrective Strategies. Aptos, CA: On Target Publications, 2010.
4. Cooper SM, Hughes M, O'Donoghue P, Nevill AM. A simple statistical method for assessing the reliability of data entered into sport performance analysis systems. Int J Perf Anal Sport 7: 87–109, 2007.
5. Cronin JB, Templeton RL. Timing light height affects sprint times. J Strength Cond Res 22: 318–320, 2008.
6. Gabbett TJ. Reductions in pre-season training loads reduce training injury rates in rugby league players. Br J Sports Med 38: 743–749, 2004.
7. Gabbett TJ, Johns J, Riemann M. Performance changes following training in junior rugby players. J Strength Cond Res 22: 910–917, 2008.
8. Kiesel K, Plisky P, Butler L. Functional movement test scores improve following a standardized off-season intervention program in professional football players. Scand J Med Sci Sports 21: 287–292, 2011.
9. Kiesel K, Plisky P, Voight M. Can serious injury in professional football be predicted by a preseason functional movement screen? N Am J Sports Phys Ther 2: 147–158, 2007.
10. King DA, Hume AP, Milburn PD, Guttenbeil D. Match and training injuries in rugby league: A review of published studies. Sports Med 40: 163–178, 2010.
11. Minick KI, Kiesel KB, Burton L, Taylor A, Plisky P, Butler RJ. Inter-rater reliability of the functional movement screen. J Strength Cond Res 24: 479–486, 2010.
12. Nevill AM, Lane AM, Kilgour LJ, Bowes N, Whyte GP. Stability of psychometric questionnaires. J Sports Sci 19: 273–278, 2001.
13. Okada T, Huxel KC, Nesser TW. Relationship between core stability, functional movement and performance. J Strength Cond Res 25: 252–261, 2011.
14. Onate JA, Dewey T, Kollock RO, Thomas KS, Van Lunen BL, DeMaio M, Ringleb SI. Real-time intersession and inter-rater reliability of the functional movement screen. J Strength Cond Res 26: 408–415, 2012.
15. Parchmann CJ, McBride JM. Relationship between functional movement screen and athletic performance. J Strength Cond Res 25: 3378–3384, 2011.
16. Shultz R, Mooney K, Anderson S, Marcello B, Garzal D, Matheson GO, Besier T. Functional movement screen: Inter-rater and subject reliability. Br J Sports Med 45: 374, 2011.
17. Waldron M, Worsfold P, Twist C, Lamb K. The reliability of tests for sport-specific motor skill amongst elite youth rugby league players. Eur J Sport Sci 14(sup1): S471–S477, 2012.

Keywords: team sport; mobility; measurement error

Copyright © 2016 by the National Strength & Conditioning Association.