The National Football League (NFL) conducts an annual week-long event known as the NFL combine. The purpose of the combine is to allow personnel from the 32 professional football teams making up the NFL to evaluate top prospects for the upcoming NFL draft. Player participation in the combine is by invitation and predominantly includes seniors from National Collegiate Athletic Association (NCAA) colleges. Approximately 3% (i.e., approximately 330 players) of all football players representing the NCAA Division I teams are invited to the combine (9,10). During the combine, prospects are assessed, in part through a number of standardized physical tests. Because these tests are standardized and performed under similar conditions, the data collected at the combine are viewed as representative of a player's abilities. Although college players on successful teams may build more impressive resumes throughout their college careers than those on less successful teams, competition at the combine is viewed as occurring on a level playing field.
Considerable effort and expense are incurred in holding the combine. Over the course of the combine, a number of measures are evaluated, including skill, anthropometric, physical, cognitive, and injury susceptibility measures. Furthermore, drug screening and extensive interviewing (each of the 32 teams may conduct 60 interviews) are conducted. Presumably, such effort and expense are incurred because the combine is viewed as a valid tool by which to assess NFL prospects.
Accurate assessment of prospects has financial implications for NFL teams. The median salary across teams in the NFL in 2009 ranged between $541,630 and $1,315,000 (13). Teams have active player rosters of 53 and also carry injured reserve and practice rosters. Large player rosters result in significant salary costs for NFL teams. The average NFL team value exceeds $US1 billion (2). As a rule, team values increase with game success (2,12). Given the relatively large financial implications for NFL teams of hiring (i.e., drafting) well, or not, accurate assessment of prospects is clearly important.
Regardless of the considerable effort and expense incurred to hold the combine, the value of certain aspects of the combine in terms of predicting performance in the NFL draft has been questioned (9,10). Specifically, it has been suggested that some of the exercises making up physical testing (36.6-m sprint, vertical jump, broad jump, 18.3-m shuttle, 3-cone drill, and bench press) have limited usefulness with respect to predicting draft order for certain positions (9,10). Kumitz and Adams (9) reported that one-third or less of the physical performance measures making up the combine test batteries correlated well with draft performance in the quarterback, running back, and wide receiver positions. It would appear from these findings that NFL teams do not rely heavily on physical performance data collected at the combine when making draft decisions. McGee and Burkett (10) generated prediction equations for a number of positions, or groups of positions, based on data from all drafted athletes in their respective positions and reported greater prediction success for a number of positions. Normalization of data may be another, less laborious method by which to manipulate data for the purpose of more accurately predicting NFL draft order from combine data.
Normalization of data refers to the manipulation of data to account for differences in body size. Commonly, performance data are divided by body mass (BM) in an attempt to normalize performance. This type of scaling, known as ratio scaling, assumes a linear relationship between mass and performance, and, as such, favors lighter individuals (1,8). Allometric scaling emphasizes BM using a power exponent and may offer a more effective method of normalizing performance data (1,4,5,7). It has been suggested that a power exponent based on the theory of geometric symmetry (0.67) may not be appropriate in populations of elite athletes in which a greater proportion of muscle to BM is prevalent (3,11). Power exponents ranging between 0.33 and 0.64 have been derived in populations of elite rugby union players (5), Olympic lifters (3), and power lifters (6). Similar to generating regression equations, deriving allometric exponents specific to a group or population could also be viewed as laborious.
A less time-consuming method of normalizing data (e.g., using a predetermined allometric exponent), which may then be used to better predict performance, could be useful. Such methods could be applied to NFL combine data and provide NFL teams with information enabling them to better assess the abilities of draft prospects. The primary purpose of this study was to examine the validity of raw and scaled, ratio and allometric, NFL combine data as a predictor of draft order selection. To the best of the author's knowledge, such research investigating normalized data does not exist. Furthermore, the current research examined the ability of combine data to predict draft performance over a more comprehensive (17) and specific (no collapsing of data across positions) set of positions than previous research has. It was hypothesized that data scaled using an allometric exponent reflecting the body composition of elite athletes would better predict draft order selection.
Experimental Approach to the Problem
The NFL combine data were examined for draft order predictive ability. Combine data were examined in raw form, ratio-scaled (outcome/BM), and allometrically scaled (outcome/BM^a), where the exponent a is 0.50. It has been suggested that the exponent 0.67, derived from the theory of geometric symmetry, is too large for use in athletic populations exhibiting relatively high proportions of muscle mass to BM (3,5,6) and that exponents ranging from 0.33 to 0.64 are more effective with respect to normalizing data in athletic populations (3,5,6). An exponent of 0.50 was used in an attempt to allometrically scale data so that they may adequately normalize performance and provide a standard, simple-to-use method by which to scale combine data to predict draft order. Using combine data in their raw, ratio, and allometric forms, tables describing correlations between combine measures and draft order are presented for each of 17 positions. Specifically, correlation tables are presented for the following positions: center, cornerback, defensive end, defensive tackle, free safety, fullback, inside linebacker, kicker, offensive guard, offensive tackle, outside linebacker, punter, quarterback, running back, strong safety, tight end, and wide receiver. The position of long snapper was not included, as during the period examined (i.e., 2005-2009), only 2 players satisfied the inclusion criteria. To add statistical power to the study, data were aggregated by position for the 5-year period.
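The three forms of each outcome described above can be sketched in a few lines of Python. This is a minimal illustration rather than the author's actual procedure (the analysis was run in SPSS); the function names and the example player are hypothetical, and only the formulas (outcome/BM and outcome/BM raised to the exponent 0.50) come from the text.

```python
# Hypothetical helper names; only the scaling formulas come from the text.
ALLOMETRIC_EXPONENT = 0.50  # the fixed exponent a used in this study


def ratio_scale(outcome, body_mass_kg):
    """Ratio scaling: outcome divided by body mass (BM)."""
    return outcome / body_mass_kg


def allometric_scale(outcome, body_mass_kg, exponent=ALLOMETRIC_EXPONENT):
    """Allometric scaling: outcome divided by BM raised to the exponent a."""
    return outcome / body_mass_kg ** exponent


def scaled_forms(outcome, body_mass_kg):
    """Return the raw, ratio-scaled, and allometrically scaled values."""
    return {
        "raw": outcome,
        "ratio": ratio_scale(outcome, body_mass_kg),
        "allometric": allometric_scale(outcome, body_mass_kg),
    }


# Made-up player: 25 bench press repetitions at a body mass of 100 kg.
example = scaled_forms(25, 100.0)
print(example)
```

With a body mass of 100 kg, the ratio-scaled value is the raw score divided by 100, whereas the allometrically scaled value is the raw score divided by 10 (the square root of 100), so the allometric form penalizes heavier players less than the ratio form does.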
Participants included players invited to NFL combines between 2005 and 2009 who were drafted in the same year they took part in the combine. A total of 1,155 players were included in the study. Table 1 presents the number of participants by position for each year. Institutional review was considered a nonissue because the study was retrospective, no names are revealed, and all data were retrieved from public access domains.
National Football League combine data were collected from NFLdraftscout.com, and draft order data were collected from football.about.com. Data from these websites are deemed accurate. Not all combine invitees included in the study completed every physical test included in the combine. Furthermore, certain positions are exempt from performing certain tests (e.g., quarterbacks are exempt from the bench press test). Therefore, certain correlations will not be provided for certain positions (e.g., bench press-draft order for quarterbacks) and certain provided correlations may have been calculated from a sample somewhat smaller than the sample size presented in Table 1 (e.g., although 90 running backs were included in the study, the vertical jump-draft order correlations presented in Table 15 were derived from a sample of 73 running backs). Data for kickers and punters were retrievable for the 36.6-m sprint only. Also, in very limited instances (< 5%), combine data for a drafted player were not available. In such instances, it was assumed that the player did not attend the combine, and he was therefore not included in the study. Data from the following combine tests were analyzed.
36.6-m Sprint
The 36.6-m sprint is a test of speed, acceleration, and power. From a 3-point stance, a player runs 36.6 m as fast as he can. Split times are also recorded at 9.1 and 18.3 m. Thus, the 36.6-m sprint test provides 3 separate outcome measures.
Vertical Jump
The vertical jump is a measure of lower body strength and power. Jump height is measured using a device (e.g., Vertec) whereby players jump for maximal height from a standing 2-footed position in a countermovement manner. At the peak of the jump, the player reaches as high as possible with a single hand to move horizontal vanes of the Vertec. Vertical jump height is calculated by subtracting the player's standing reach height from the height of the highest vane moved.
Standing Broad Jump
The standing broad jump is a test of lower body strength and power. Horizontal jump distance is measured. From a standing 2-footed position, the player jumps forward for maximal distance. Jump distance is measured as the distance from the start line to the nearest body part upon landing (this is typically the point of heel contact).
18.3-m Shuttle
The 18.3-m shuttle is a test of power, acceleration, and change of direction. From the starting position, a player runs 4.6 m in 1 direction, quickly changes direction and runs 9.1 m in the opposite direction, and then changes direction again and runs a final 4.6 m in the opposite direction (i.e., the direction in which he initially ran). The test is run in both directions (i.e., left and right) for maximal speed, and the average of the 2 tests is recorded as the score.
3-Cone Drill
The 3-cone drill is a test of speed, power, and change of direction. The player runs around 3 cones placed in the shape of an “L,” with 4.6 m between each cone. From a 3-point stance, the player runs a predetermined route as quickly as possible.
Bench Press
The bench press test is a test of upper body strength. Players bench press 102.1 kg for maximum repetitions.
To determine whether the performance measures predicted draft order, Pearson's r correlation matrices were generated using SPSS version 18. Correlations were generated for combine data in its raw form and when scaled in a ratio (outcome/BM) and allometric (outcome/BM^a) manner. Each of the 3 correlations was generated for all available performance measures in the 17 examined positions. For example, to determine if broad jump performance predicted draft order in offensive linemen, 3 correlations were generated (i.e., raw data-draft order, ratio-scaled data-draft order, and allometric-scaled data-draft order). To better represent draft order by position, the overall draft order was converted to draft order within position. For example, if a wide receiver was drafted 100th overall, but was the 10th wide receiver drafted that year, a draft order score of 10 was given, rather than 100. A 5% level of significance (p ≤ 0.05) was used to determine statistically significant correlations.
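The conversion of overall draft order to draft order within position, followed by a Pearson correlation, might be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the SPSS procedure used in the study: the function names, the tuple layout, and the sample picks are hypothetical, and the sketch computes r only, without the accompanying significance test.

```python
from collections import defaultdict
from math import sqrt


def draft_order_within_position(picks):
    """Convert overall pick numbers to rank within position for one draft year.

    picks: list of (player, position, overall_pick) tuples (hypothetical layout).
    Returns a dict mapping player -> rank among players drafted at that position.
    """
    by_position = defaultdict(list)
    for player, position, overall_pick in picks:
        by_position[position].append((overall_pick, player))
    ranks = {}
    for entries in by_position.values():
        # Earlier overall picks receive lower (better) within-position ranks.
        for rank, (_, player) in enumerate(sorted(entries), start=1):
            ranks[player] = rank
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Made-up picks from a single draft year: (player, position, overall pick).
picks = [("A", "WR", 5), ("B", "WR", 40), ("C", "QB", 1), ("D", "WR", 100)]
ranks = draft_order_within_position(picks)
print(ranks["D"])  # D is the 3rd wide receiver taken, so this prints 3
```

Because the study aggregated 5 years of data by position, the conversion would be applied to each draft year separately before pooling. A statistics package is needed for the p value; for example, SciPy's `scipy.stats.pearsonr` returns both r and a p value.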
A total of 360 correlations of the combine performance measures and draft order were performed. Tables 2-18 present correlations of draft order with data from the 8 combine tests for each of the 17 positions. A correlation table is provided for each position using data in its raw form and also when scaled in a ratio and allometric manner. Depending on the relationship between draft order and performance measure, the interpretation of the sign of the correlation may not be obvious. Firstly, a lower draft order is desirable (e.g., being drafted first is better than being drafted tenth) and, as such, a score of 1 is better than a score of 10 with respect to draft order. Regarding the performance measures, lower scores are preferable for the 9.1-, 18.3-, and 36.6-m sprint; 18.3-m shuttle; and 3-cone drill. Conversely, higher scores are preferable for the vertical and broad jump and the bench press performance measures. Therefore, positive correlations between draft order and the 9.1-, 18.3-, and 36.6-m sprint; 18.3-m shuttle; and 3-cone drill indicate that higher draft picks perform these drills in less time (i.e., better), whereas negative correlations between draft order and vertical jump, broad jump, and bench press indicate higher draft picks perform better in the drills. A number of correlations under each of the 3 data sets (raw, ratio, and allometric) are in a counterintuitive direction, that is, the sign of the correlation implies that a higher draft selection is associated with lesser performance. The numbers of significant correlations by position and performance measure are presented in Tables 19 and 20, respectively.
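The sign convention described above can be made explicit with a small lookup. The sketch below is illustrative only; the function name and the measure labels are hypothetical, and the rule encoded is simply the one stated in the text (positive r is the expected direction for timed tests, negative r for jumps and the bench press).

```python
# Measure labels are illustrative; the grouping follows the text.
LOWER_IS_BETTER = {
    "9.1-m sprint", "18.3-m sprint", "36.6-m sprint",
    "18.3-m shuttle", "3-cone drill",
}
HIGHER_IS_BETTER = {"vertical jump", "broad jump", "bench press"}


def correlation_direction(measure, r):
    """Classify the sign of a draft order-performance correlation.

    Draft order is scored so that lower is better (1 = first player taken).
    """
    if measure in LOWER_IS_BETTER:
        expected_positive = True   # less time goes with a lower draft order
    elif measure in HIGHER_IS_BETTER:
        expected_positive = False  # a bigger score goes with a lower draft order
    else:
        raise ValueError(f"unknown measure: {measure}")
    if r == 0:
        return "none"
    return "expected" if (r > 0) == expected_positive else "counterintuitive"


print(correlation_direction("36.6-m sprint", 0.30))  # expected
print(correlation_direction("bench press", 0.30))    # counterintuitive
```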
The NFL combine is a multimillion-dollar event that allows NFL personnel to evaluate top football prospects before the annual NFL draft. The primary finding of the current research is that, whether raw or normalized, the draft order predictive ability of the outcomes of many of the physical performance measures undertaken at the combine is questionable. Not only are statistically significant relationships between the various performance measures and draft order scarce, but a number of the correlations under each of the 3 data sets are also in a direction indicating that higher draft order selection is associated with lesser performance. Because of the independent nature of the data gathered (raw) and generated (scaled) in the research for this study, before discussing the results in an aggregate manner, the data need to be treated and discussed in mutually exclusive silos. Each set of data (raw, ratio, and allometric), performance measurements (9.1-, 18.3-, and 36.6-m sprint; vertical and broad jumps; 18.3-m shuttle; 3-cone drill; and bench press) and positions (center, cornerback, defensive end, defensive tackle, free safety, fullback, inside linebacker, kicker, offensive guard, offensive tackle, outside linebacker, punter, quarterback, running back, strong safety, tight end, and wide receiver) deserves specific attention. Furthermore, each of these 3 categories must be viewed in context of the other 2.
It was hypothesized that data scaled using an allometric exponent reflecting the body composition of elite athletes would better predict draft order selection. This hypothesis was rejected. Overall, the raw data provided the greatest number of significant correlations at 29 followed by the allometric data set at 27. With respect to individual performance measures, the allometric data seemed to provide much better predictive ability for the 3-cone drill only. The 3-cone drill measures speed, power, and change of direction ability (9). Although speed and power are attributes likely to be associated with a number of the other performance measures (e.g., 9.1-, 18.3-, and 36.6-m sprint; vertical and broad jumps; 18.3-m shuttle), change of direction ability is specific to the 3-cone drill and 18.3-m shuttle. It is possible that allometric scaling of data generated from the 3-cone drill, using an exponent of 0.50, may better predict success in the NFL draft. In-depth discussion as to why allometric scaling of the 3-cone drill combine data appears to better predict draft order is beyond the scope of this study.
Ratio-scaled combine data do not appear to hold many advantages over raw data with respect to predictive ability for draft order. The exception may be superiority in predicting draft success from outcomes generated from the 18.3-m shuttle and 3-cone drill. Change of direction is common to the 18.3-m shuttle and 3-cone drill and may provide a basis from which to explain the greater number of significant correlations using the normalized, as compared to the raw, data. It is possible that the normalization of data generated from tests measuring change of direction ability better predicts draft order.
The raw data were comparable to, or better than, the normalized data sets with respect to predicting draft success via all combine measures with the exception of the 2 tests discussed above. However, this is not to suggest that the predictive ability of combine test raw data is high. In fact, this research is in agreement with previous findings (9), in that performance in combine physical tests appears to have little association with subsequent draft order selection. Kumitz and Adams (9) observed, when collapsed across position, that approximately 30% of performance measures were significantly correlated with draft order success. Similarly, the present research determined that when aggregated across positions, the raw data provided approximately 24% (29/120) significant correlations.
Of the 8 performance measures examined in this study, some would appear more useful than others in terms of predicting draft order. Excluding the postulation above regarding the possibility that normalized data may hold some advantage in terms of predictive ability with respect to the 18.3-m shuttle and 3-cone drill, it would appear that the 36.6-m sprint and associated split times, and the jumping measures, best predict draft order. With the exception of the broad jump, which predicted draft order best when data were left in their raw state, all 3 data sets were quite similar in their ability to predict draft order from the data associated with these performance measures. Under each of the data sets, 77 correlations were performed on these 5 measures. Specifically, 17, 15, 15, 15, and 15 correlations were performed on the 9.1-, 18.3-, and 36.6-m sprint and vertical and broad jumps, respectively. Under the raw data set, there were 7, 8, 4, 5, and 4 significant correlations, respectively. This equates to 41, 53, 27, 33, and 27%, respectively. One would assume that the 9.1-, 18.3-, and 36.6-m split times are related, as are the vertical and broad jumps. It appears that straight sprint time and jumping ability are the best predictors of success in the NFL draft. To save time and cost, combine personnel should consider a single sprint test and single jumping task.
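The percentages quoted above follow directly from the counts given for the raw data set, as the short check below shows. The variable names are illustrative; the counts are those reported in the text.

```python
measures = ["9.1-m sprint", "18.3-m sprint", "36.6-m sprint",
            "vertical jump", "broad jump"]
performed = [17, 15, 15, 15, 15]   # correlations performed per measure
significant = [7, 8, 4, 5, 4]      # significant correlations, raw data set

# Percentage of correlations that were significant, rounded to whole numbers.
percent = [round(100 * s / p) for s, p in zip(significant, performed)]

print(sum(performed))  # 77 correlations on these 5 measures
for name, pct in zip(measures, percent):
    print(f"{name}: {pct}%")
```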
The raw data were comparable to, or better than, the normalized data in terms of predicting draft order success for all positions with the exception of defensive end and quarterback. With respect to these positions, neither the ratio nor allometric data were particularly better than the other in terms of predicting draft order. It is difficult to speculate as to why normalized data may better predict draft success in these positions. It is possible that the body type (i.e., lower BM) of the defensive ends and quarterbacks taken higher in the draft was more likely to be affected by normalization. For example, if, in these 2 positions, lighter players were selected before heavier players, this may have influenced results.
Excluding kickers and punters, for which only 1 test was performed, positions in which >50% of correlations were significant under any of the 3 data sets include offensive tackle, outside linebacker, and running back. The highest number of significant correlations under any position using any data set was 5 (62.5%). Under each data set, only 1 position was associated with 5 significant correlations. That is, in a possible 45 sets of 7 or 8 correlations only 3 sets indicated significant correlations exceeding half. Conversely, among the same 45 sets of correlations, 25 sets presented 0 or 1 significant correlation. Furthermore, as previously mentioned, a number of the correlations generated under each of the data sets are in a counterintuitive direction (i.e., draft success is associated with lesser performance). It would appear that regardless of position, the current battery of physical tests undertaken at the combine holds little value in terms of predicting draft order.
Raw data appear to be as effective as either of the 2 normalized data sets in predicting draft order. Although it is possible that normalized data may be advantageous with respect to a very limited number of certain positions and tasks, it is difficult to make any convincing argument for either method of normalization. Thus, it would seem to make little sense to spend any resources normalizing data in attempts to predict draft order from combine data. It is important to note that although the raw data set may be the best and most practical method for predicting draft success, as investigated here, this is certainly not to suggest that raw data from the current combine test battery is adequate for prediction purposes. In fact, although the combine test battery raw data set proved to be the most predictive overall, the test battery likely has little predictive value in its current state. The finding that raw data were better than normalized data with respect to predicting draft order is magnified when presented in conjunction with the suggestion that the straight sprinting and jumping exercises may be the only exercises of any predictive value in the current battery. Because the sprint tests are likely highly correlated, as are the jumping tests, it would seem prudent for organizers of the combine to reduce the current battery of tests to 2 (i.e., a sprint test and a single jumping test). It is possible that other physical performance measures, not currently undertaken at the combine, may have predictive ability. Alternative tests should be investigated for predictive ability.
It seems somewhat paradoxical that the NFL goes to such lengths to hold the physical performance portion of the combine for the purpose of evaluating draft prospects and subsequently drafts those prospects with little regard for the data generated by the very tool which it has created to evaluate those prospects. This does not appear to be a recent phenomenon, as research examining the 6 years before 2005 came to similar conclusions. It is important to note that the physical test battery examined in this study makes up only a portion of the combine evaluations. Comment on the usefulness of other aspects of the combine is beyond the scope of the current research. However, regardless of the value of the other portions of the combine, if the combine battery of physical tests is seemingly of such little interest to NFL teams, one has to wonder why it has not been abolished or modified to be more meaningful. It is possible that the media coverage and hype surrounding the combine are of such value that the testing is secondary. It is also possible that the other aspects of the combine hold more weight with NFL teams when it comes to player selection. The author is not questioning the value of the combine, but rather, the value of the current physical test battery. If NFL teams are not interested in the results of a number of the tests, perhaps elimination of those tests should be considered. If the measurement of certain physical attributes is important to NFL teams, a test battery measuring those attributes deemed important should be devised. Alternatively, if NFL teams place more weight on other aspects (e.g., technical or mental) evaluated at the combine, perhaps more time should be allotted to these tests in lieu of some, or all, of the physical tests.
Considerable expertise goes into the current combine test battery every year. Top strength and conditioning personnel are involved. Thus, rather than the test battery being inappropriate, it is perhaps more likely that NFL teams place greater emphasis on skills and traits other than those reflected by the physical test battery. Resources may be more effectively used evaluating other skills and traits that are more “important” or less obvious (i.e., requiring more time to accurately evaluate). Traits such as “willingness to learn” or “team attitude” may not be as readily obvious or as easily tested as traits such as upper body strength (e.g., via the bench press). It is possible that NFL teams place more weight on aspects of performance other than those reflected in the combine test battery, as those attributes may be viewed as “learnable.” That is, NFL teams may presume a level of physical development in NCAA athletes that is adequate to build on. The mental aspects of the game necessary to perform at the professional level may be viewed as less “learnable,” and therefore more worthy of attention in making draft decisions.
Various explanations exist as to why performance in a number of the combine tests is not strongly correlated with draft order. It has been suggested that the lack of a strong relationship between the performance measures and the draft order may be because of the rigorous preparation invitees undertake before attending the combine (9). Without question, players partake in training designed specifically to perform well in the combine tests, and this may lead to equalization of outcomes. Indeed, this may explain the lack of significant correlations in the data. But presumably, if “outsiders” (i.e., those not directly involved in the NFL or combine) are aware of this, those directly concerned (i.e., NFL personnel) are also aware of this. In fact, the underlying premise of this argument is that NFL personnel are aware of the task-specific preparation invitees undertake and thus discount the combine test outcomes. The question remains “Why is the test battery conducted in its current state?” Alternatively, it may be that the combine tests are not sport specific and therefore have little bearing on a player's true ability and consequently receive little attention from NFL personnel. Again, this may explain the lack of significant correlations in the data but does not explain why the NFL continues to perform tests and collect data in which it has little interest. Perhaps NFL teams are not concerned with a number of the performance measures tested at the combine because of a belief that these physical skills can be taught. Regardless of the explanation as to why the test battery scores may be unimportant to NFL teams and therefore not correlated well with draft selection, one wonders at the time and cost of annually performing the tests on hundreds of players. Why has the combine test battery not been modified to better reflect or measure what NFL teams are interested in with respect to players' physical performance? If the construction of such a battery is deemed impossible or unnecessary, why has the testing not been abolished?
There are limitations to this study. It has been assumed that the test data collected at the combine were done so appropriately. Because the data were mined and not directly collected by the author, it is impossible to comment on collection technique rigor. It is also possible that because of the large number of correlations performed, spurious findings are present. It is likely that at least some of the significant correlations found are attributable to the random chance model. However, if this is the case, it strengthens the argument that NFL teams pay little attention to the data generated by the physical test battery undertaken at the combine.
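The concern about chance findings can be given a rough scale. Assuming, purely for illustration, that no performance measure were truly related to draft order, each test at the 5% level would still have a 5% chance of reaching significance, so the expected number of chance findings is the number of tests multiplied by 0.05. The sketch below applies this to the 360 correlations overall and to the 120 correlations per data set; it is a back-of-envelope check with hypothetical names, not a formal correction for multiple comparisons and not part of the original analysis.

```python
ALPHA = 0.05               # significance level used in the study
TOTAL_CORRELATIONS = 360   # all correlations performed
PER_DATA_SET = 120         # correlations per data set (raw, ratio, allometric)


def expected_chance_findings(n_tests, alpha=ALPHA):
    """Expected number of significant results if every null hypothesis were true."""
    return n_tests * alpha


print(round(expected_chance_findings(TOTAL_CORRELATIONS), 1))  # 18.0
print(round(expected_chance_findings(PER_DATA_SET), 1))        # 6.0
```

Against roughly 6 chance findings per data set under that assumption, the 29 significant correlations in the raw data set and 27 in the allometric data set exceed what chance alone would be expected to produce, although some of them may still be spurious.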
Although some research (9,10) has examined the predictive ability of combine data in terms of predicting draft order, none has done so using normalized data. Furthermore, previous research has not investigated such a comprehensive set of positions. The results of the current research suggest that NFL teams are not particularly interested in physical attributes beyond straight sprint time and jumping ability, and, in fact, that the relevance of these attributes may be questionable for some positions. Thus, presumably, the highest level football personnel are more concerned with other attributes deemed more important at the professional level. If attributes such as football-playing ability, mental strength, team attitude, and willingness to learn, as compared to attributes such as upper body strength and change of direction ability, are given more weight by NFL personnel when drafting college prospects, young players may be well advised to develop the former. It may be that the physical attributes necessary to compete at the NCAA level are deemed by NFL personnel to provide adequate building blocks from which to progress to the professional level. That is, perhaps NFL teams are of the view that given a sound physical foundation, further advances in physical development can be achieved. If this is a belief commonly held by NFL teams, then attributes such as “willingness to learn” become increasingly important. This has implications for athletes at all levels and across all sports.
Specifically regarding the mental preparation of football players aspiring to enter the NFL, some cautious comments can be made. Elite college American football players invited to the NFL combine are evaluated for cognitive ability, in part, via the Wonderlic Personnel Test. Although not investigated in this study, previous research (9) has suggested that performance on the Wonderlic Personnel Test is not associated with draft order success. As such, players preparing for the combine are cautiously advised to focus mental preparation on combine components other than the Wonderlic Personnel Test. Although speculative, it may be that the interview component of the combine holds considerable weight. If this is the case, interview preparation would seem prudent.
The NFL is arguably one of the best-run professional sport leagues in the country and likely the world. Assuming such a well-run league involves competency at the drafting level, and further assuming that desirable athlete traits are transferable across sports, one could argue that the insight into what is and is not important provided by this study can be applied to all sports. Specifically, it may be advantageous for athletes in all sports to increase time spent developing aspects of performance other than the purely physical. This is not to suggest physical development is unimportant. Rather, especially as competition levels increase, the decisive factor(s) for moving to the next level may be other than physical. Technical and physical development commonly receive great attention. Mental preparation specifically for athletic competition may not receive similar attention. Resources and tools (e.g., mental preparation programs and psychological coaches) are available. At advanced competition levels, if mental preparation is not a current component of training, such resources should be sought out by athletes and coaches.
The author has no conflict of interest that is directly relevant to the contents of this manuscript.