A Machine Learning Approach to Assess Injury Risk in Elite Youth Football Players
It is a team physician's dream to have a tool to identify athletes at risk for injury prior to the season and implement an individualized program to decrease injury risk. Previous studies using multiple-regression analysis of preseason characteristics to predict in-season injuries have not had the precision to be particularly useful. In the August 2020 issue of Medicine & Science in Sports & Exercise®, Rommers and colleagues from Belgium used a machine learning approach to construct a prediction model for in-season injuries among elite youth football (soccer) players (1).
Anthropometric and motor performance measures were obtained on 734 male youth football (soccer) players in the U10 to U15 Belgian Premier League football clubs prior to the 2017 to 2018 competitive season. Anthropometric variables included height, sitting height, leg length, body weight, and estimated years from peak height velocity (PHV). Tests of general motor coordination (jumping sideways, moving sideways, and balancing backward) and a football-specific motor coordination test involving both agility and dribbling were performed. Physical performance tests were the sit-and-reach test for flexibility, three strength tests (standing broad jump, countermovement jump, and curl-ups) and speed and agility assessments.
Injury was defined as any musculoskeletal condition that required an assessment by the medical or paramedical staff. All teams had on-site medical coverage. Injuries that had a specific identifiable inciting event were categorized as acute, while injuries that lacked an identifiable inciting event at their onset were labeled as overuse injuries. For each player, only the first occurring injury during the season was used in the analysis.
During the study season, 368 initial injuries — 173 overuse and 195 acute — were identified. Machine learning modeling was done using the XGBoost application. The initial model was developed on 80% of players who were randomly selected. This model was then applied to the remaining 20% (147) of the players.
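The 80/20 split described above can be illustrated with a minimal sketch. It assumes a simple shuffled holdout with rounding to the nearest whole player; the function name, the seed, and the use of integer player IDs are hypothetical and are not taken from the study, which may have split the data differently.

```python
import random

def holdout_split(player_ids, test_fraction=0.2, seed=0):
    """Randomly split IDs into (train, test) lists.

    Assumption: a plain shuffled holdout; the study's exact procedure
    (for example, any stratification) is not described in this summary.
    """
    ids = list(player_ids)
    random.Random(seed).shuffle(ids)          # reproducible shuffle
    n_test = round(len(ids) * test_fraction)  # 734 * 0.2 = 146.8, rounds to 147
    return ids[n_test:], ids[:n_test]

# 734 players, identified here by hypothetical integer IDs.
train_ids, test_ids = holdout_split(range(734))
print(len(train_ids), len(test_ids))
```

With 734 players, this leaves 587 for model development and 147 held out, which matches the holdout size reported above.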
The initial machine learning model had a precision (or positive predictive value) of 84%, which means that 16% of the players the model identified as injured were actually uninjured. The recall (or sensitivity) was 83%, which means that 17% of injured players were incorrectly identified as uninjured. When this model was applied to the remaining 147 players, the precision and recall were both 85%. This indicates that the injury prediction model did equally well on the subset of players whose data were not included in the model development. An analysis of the model found that the variables that had the greatest influence on predicting injury were higher age at PHV (late maturers), higher body height and leg length, and lower fat percentage.
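Because precision and recall are easy to confuse, a short worked example may help. Precision is the share of players predicted to be injured who actually were injured, while recall is the share of actually injured players whom the model flagged. The counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the study.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from confusion-matrix counts."""
    # Precision: of all players predicted injured, how many were injured?
    precision = true_pos / (true_pos + false_pos)
    # Recall: of all players actually injured, how many were predicted injured?
    recall = true_pos / (true_pos + false_neg)
    return precision, recall

# Hypothetical counts for illustration only; not data from the study.
precision, recall = precision_recall(true_pos=68, false_pos=12, false_neg=12)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this made-up case, 12 of the 80 players predicted to be injured were not injured (precision 85%), and 12 of the 80 injured players were missed (recall 85%).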
A second machine learning analysis was done to develop a model to predict if injured athletes would sustain acute or overuse injuries. Data from the 368 injured athletes were utilized. The model developed on 294 (80%) of the injured athletes had a precision (positive predictive value) and recall (sensitivity) of 82%. When applied to the remaining 74 athletes, precision and recall dropped to 78%. The influence of specific variables was less strong than in the overall injury model. The variables most associated with overuse injuries were lower predicted age at PHV (early maturers), higher sitting height, slower agility test times, and lower moving sideways test scores.
A machine learning approach to injury prediction may have some advantages over a traditional multiple-regression analysis. The machine learning model can better capture the interaction between variables and the influence of variables that have a nonlinear impact on injury. However, the large number of variables (in this study, 29) makes it unlikely that changing one or two variables (if possible) will decrease injury risk. The individual influence of the variables that were subsequently identified as “most predictive” was not strong, particularly in the prediction of acute or overuse injury. In addition, by design, the model is based on a very specific population, so the generalizability to other athletic populations is questionable. However, the machine learning approach could be useful in assessing injury risk in a specific athletic population and in identifying variable interactions that could potentially be incorporated in an injury prevention program.
Bottom Line. A machine learning approach was able to create a model using preseason anthropometric and motor performance variables that had a precision (PPV) of 85% and recall (sensitivity) of 85% to predict injury incidence during the ensuing season in elite Belgian youth soccer players. Future studies need to be done to see if information gained from this model can be used to guide interventions to reduce injury risk.
Velocity Loss as a Critical Variable Determining the Adaptations to Strength Training
Muscular strength and balance are clearly important in athletic performance and injury prevention. There also is a growing body of evidence that enhancing muscle mass and function and preventing the loss of muscle mass with aging are important for health promotion and disease prevention. Recommending specific strength training programs can be daunting with all the variables that are involved. Lift velocity loss has been proposed as a factor that influences the impact of a strength program and allows for more precise program individualization. This carefully controlled study from Seville, Spain, compared the effects of four resistance training programs that differed only in the magnitude of velocity loss (VL) during repetitive lifts on muscular strength, hypertrophy, and performance (2).
The study group consisted of 55 men (average age, 24.1 ± 4.3 years) experienced in resistance training randomized to one of four resistance training groups. Each group trained twice a week for 8 wk under identical conditions. The only lift performed was a full squat (FS) performed at maximal intended velocity. The load for the FS increased from 70% to 85% of one repetition maximum during the course of the study. Each training session consisted of three sets of FS with 4 min between sets. The number of repetitions per set was limited by the VL: sets were stopped when the assigned VL threshold was exceeded. The VLs were 0% (VL0), 10% (VL10), 20% (VL20), and 40% (VL40).
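The stopping rule can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' protocol: velocity loss is taken here as the percentage drop from the fastest repetition so far in the set, and the set ends on the first repetition whose loss reaches or exceeds the threshold. The function name and the repetition velocities are hypothetical.

```python
def reps_until_velocity_loss(velocities, threshold_pct):
    """Count the repetitions performed before the set is stopped.

    Assumptions (not from the study): loss is measured against the
    fastest repetition so far in the set, and the set stops on the
    first repetition whose loss reaches or exceeds the threshold.
    Velocities must be positive.
    """
    fastest = 0.0
    for count, velocity in enumerate(velocities, start=1):
        fastest = max(fastest, velocity)
        loss_pct = (fastest - velocity) / fastest * 100.0
        if loss_pct >= threshold_pct:
            return count
    return len(velocities)  # threshold never reached in the recorded reps

# Hypothetical mean velocities (m/s) for one set; not data from the study.
set_velocities = [0.80, 0.78, 0.74, 0.70, 0.66, 0.60, 0.52, 0.47]
reps_by_threshold = {
    pct: reps_until_velocity_loss(set_velocities, pct) for pct in (0, 10, 20, 40)
}
print(reps_by_threshold)

# Sets per participant: 2 sessions/wk x 8 wk x 3 sets per session.
total_sets = 2 * 8 * 3
print(total_sets)
```

Under these assumptions, a 0% threshold stops every set after a single repetition. Across 48 sets that gives 48 repetitions, the same as the VL0 total in the study's results.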
Each subject underwent evaluation and testing 3 d before and 3 d after the 8-wk training period. Cross-sectional area (CSA) and muscle architecture of the vastus lateralis (VLA) were assessed by ultrasound. Muscle strength was measured as the maximal isometric force (MIF) and maximal rate of force development against a force plate. A progressive loading test and a muscle fatigue test using FS also were done. Neuromuscular function was assessed by tensiomyography (TMG) of the VLA and vastus medialis muscles, in which the response to an electrically evoked contraction is recorded. Measurements were maximum radial displacement of the muscle belly (DM), time from 10% to 90% of DM (contraction time), and the delay time (Td) corresponding to the time between electrical stimulus and 10% DM. Performance tests were a 20-m sprint and a vertical jump.
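The three TMG measurements can be made concrete with a small sketch that extracts them from a sampled displacement trace. It assumes the stimulus is delivered at time zero, the trace starts at zero displacement, and crossing times are found by linear interpolation between samples. The function names and the trace are hypothetical; commercial TMG software may compute these values differently.

```python
def first_crossing(times, values, level):
    """Linearly interpolated time at which values first rise to level."""
    for i in range(1, len(values)):
        low, high = values[i - 1], values[i]
        if low < level <= high:
            fraction = (level - low) / (high - low)
            return times[i - 1] + fraction * (times[i] - times[i - 1])
    raise ValueError("level is never reached on a rising segment")

def tmg_parameters(times_ms, displacement_mm):
    """Return DM (mm), Td (ms), and contraction time Tc (ms).

    Assumption: the electrical stimulus occurs at t = 0 ms.
    """
    dm = max(displacement_mm)                                  # peak radial displacement
    t10 = first_crossing(times_ms, displacement_mm, 0.10 * dm)
    t90 = first_crossing(times_ms, displacement_mm, 0.90 * dm)
    return {"DM": dm, "Td": t10, "Tc": t90 - t10}

# Hypothetical trace for illustration only; not data from the study.
times_ms = [0, 10, 20, 30, 40, 50, 60]
displacement_mm = [0.0, 0.0, 2.0, 6.0, 9.0, 10.0, 8.0]
params = tmg_parameters(times_ms, displacement_mm)
print(params)
```

In this made-up trace the muscle reaches 10% of DM about 15 ms after the stimulus (Td) and takes about another 25 ms to reach 90% of DM (contraction time).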
The training groups with higher VL thresholds accumulated higher training volumes because they performed more repetitions in each set. The total number of FS repetitions done during the training program ranged from 48 for the VL0 group to an average of 306 in the VL40 group. The higher VL threshold groups (VL40 and VL20) showed significant muscle hypertrophy of the VLA (increased CSA), while the lower threshold groups (VL10 and VL0) did not. However, all four groups improved in MIF strength and vertical jump, and showed similar improvements on the muscle fatigue and progressive loading tests, with no between-groups difference. On TMG, Td increased significantly in the VL40 group. A higher Td may indicate a poorer neuromuscular adaptation to training. The authors conclude that the VL threshold of 20% provided similar benefits in muscle hypertrophy and strength while avoiding the possible detrimental effects of delayed muscle responsiveness seen in the highest (VL40) group.
The results of this study are provocative in that they challenge the traditional adage that resistance training should go to complete “failure” to be maximally effective. However, this study did have limitations. The number of subjects was low, so the study may not have been sufficiently powered to show differences between groups. In addition, longer programs may produce more differences in strength gains, and this study involved only squats. Other muscle groups may respond differently. The response of novice lifters or older or younger age groups also is not known.
Bottom Line. Gains in strength and hypertrophy can be obtained with velocity loss-based resistance training at relatively low velocity loss thresholds. Future studies need to develop practical ways to apply these concepts to unmonitored resistance training.
1. Rommers N, Rössler R, Verhagen E, et al. A machine learning approach to assess injury risk in elite youth football players. Med. Sci. Sports Exerc. 2020; 52:1745–51.
2. Pareja-Blanco F, Alcazar J, Sánchez-Valdepeñas J, et al. Velocity loss as a critical variable determining the adaptations to strength training. Med. Sci. Sports Exerc. 2020; 52:1752–62.