Hospital for Special Surgery Pediatric Functional Activity Brief Scale Predicts Physical Fitness Testing Performance : Clinical Orthopaedics and Related Research®

Clinical Research

Hospital for Special Surgery Pediatric Functional Activity Brief Scale Predicts Physical Fitness Testing Performance

Fabricant, Peter D. MD, MPH1,a; Robles, Alex BS2; McLaren, Son H. MD3; Marx, Robert G. MD, MSc4; Widmann, Roger F. MD1; Green, Daniel W. MD, MS1

Clinical Orthopaedics and Related Research 472(5):p 1610-1616, May 2014. | DOI: 10.1007/s11999-013-3429-1

Introduction

Activity level is known to be a key prognostic variable for active patients with musculoskeletal injuries, and until recently, there were no standardized, validated scales to assess physical activity in children and adolescents [2]. To that end, an eight-item activity scale was developed and validated (Hospital for Special Surgery Pediatric Functional Activity Brief Scale, Appendix 1) to fill this critical gap in the realm of pediatric sports medicine [2]. Whereas the original validation study was performed by comparing the activity scale with existing adult questionnaires and self-reported measures of activity, its implications in practical settings with respect to physical fitness remain unclear. Furthermore, as with any survey method, the activity scale questionnaire inherently depends on a reporter's ability to accurately interpret questions as well as to identify and categorize relevant activities [10]. Thus, validation of the activity scale with a standardized quantitative assessment of physical fitness would further establish its ability to objectively discriminate between children of varying activity levels.

FITNESSGRAM® (Human Kinetics, Champaign, IL, USA) is a series of assessments and corresponding web-based software tool designed to objectively and uniformly assess various components of physical fitness in children through age 17 years, including aerobic capacity, body composition, muscular strength, and endurance [9]. Rather than using normative-referenced standards to rank an individual among their peers, it uses criterion-referenced standards, which are less susceptible to the makeup of any one reference population [4, 9, 14-17]. After development, these criterion-referenced standards are repeatedly evaluated, refined, and validated [14], thus making it a fitness tool of choice for schools nationwide [13].

The purpose of this study is to use FITNESSGRAM® testing to prospectively determine whether scores on the activity scale exhibit any floor or ceiling effects and explore whether activity scale scores are correlated with standardized strength and aerobic physical fitness metrics in a cohort of otherwise healthy high school students. If an association exists between the activity scale and VO2-max, a third objective is to determine the discrimination ability of the activity scale to differentiate between adolescents with healthy or unhealthy levels of aerobic capacity and calculate an appropriate cutoff value for its use as a screening tool.

Patients and Methods

This study was approved by the hospital's institutional review board as well as the school administration at the high school in which it was implemented. Consent of students and parents was obtained using routine school protocols. To protect the privacy of the students, only age, sex, body mass index, and grade level demographics were provided to the research team.

Subjects were recruited in a cross-sectional design from a single high school and considered for inclusion if they were enrolled in physical education. Of the 186 who met these inclusion criteria, four (2%) were excluded because they were medically excused from physical education during the time of the study, leaving 182 adolescents (66 boys, 116 girls) ages 14 to 17 years (mean, 15.3 years old) for prospective evaluation. No student refused to participate. Each student completed the activity questionnaire, and one of five trained physical education teachers assessed each student's performance on standardized metrics of physical fitness as part of the normal curriculum. These metrics included number of pushups, number of sit-ups, performance on a Progressive Aerobic Cardiovascular Endurance Run (PACER) timed 20-m shuttle run exercise, and FITNESSGRAM®-calculated VO2-max [4, 7, 9, 12].

Testing of each performance metric was done uniformly per FITNESSGRAM® standardized protocols. Pushups were tested on a hard surface at a rate of 20 per minute until failure and counted only if the student performed a pushup from full extension to an elbow angle of 90° and back to full extension. Sit-ups were performed on a mat at a rate of 20 per minute until failure, two corrections for inadequate form, or after completing 75 repetitions. Adequate form for the sit-up test included keeping the student's knees bent and feet flat throughout the exercise with their arms by their side on the mat (rather than across the chest or behind the head). Sit-up range of motion started with the head touching the mat, curling up from the abdominals with the hands remaining on the floor, and gliding across the surface of the mat for a minimum of 4.5 inches of excursion. The PACER was performed by running back and forth across a 20-m distance at a specified pace that gets faster each minute until the student fails to complete the 20-m run in the specified pace time twice. VO2-max was calculated using a previously validated and crossvalidated FITNESSGRAM® regression model, taking into account physical performance, body metrics, age, and sex [7].

Because age, sex, and body mass index were used in the original activity scale validation study for discriminant validity testing [2], we hypothesized that all four physical fitness metrics would correlate with the activity scale, whereas age, sex, and body mass index would not.

To investigate for a clinically relevant activity score threshold, each subject's FITNESSGRAM®-calculated VO2-max was categorized with respect to an age- and sex-specific cutoff (Table 1). These healthy fitness zone values were established in a previous study of 1966 children by Welk et al. [16] in which each participant performed VO2-max testing and was also evaluated for the presence of metabolic syndrome markers (waist circumference, hypertension, low high-density lipoprotein cholesterol, hypertriglyceridemia, and elevated fasting glucose). These markers were used to define the presence or absence of metabolic syndrome based on pediatric criteria established by the International Diabetes Federation as adapted from the National Cholesterol Education Program Adult Treatment Panel [6]. In the current study, receiver operating characteristic analyses were performed using activity score as the predictor variable and VO2-max either above or below the healthy fitness zone cutoff as the outcome variable. The aim was to identify a threshold activity score that could be used clinically to screen for children whose aerobic capacity falls below their healthy fitness zone.

Table 1:
Age- and sex-specific thresholds for healthy aerobic fitness (VO2-max, mL/kg/min [12, 16])

Statistical Methods

Statistical analyses were performed by a member of the research team (PDF) with advanced training in epidemiology and biostatistics using SAS Version 9.3 software (SAS Institute, Inc, Cary, NC, USA). Descriptive statistics were used to evaluate the distribution of continuous data. Pearson correlation coefficients were used to assess for correlations between the brief activity scale and FITNESSGRAM® metrics. Unpaired Student's t-tests were used to evaluate for sex differences between the continuous variables, and linear regression analyses were used to investigate potential confounding variables. Receiver operating characteristic analysis was performed using a customized SAS macro to investigate potential activity scale thresholds that may be predictive of children with values of VO2-max at risk of metabolic syndrome. All analyses that generated p values were two-tailed and used p = 0.05 as the threshold for statistical significance.


Results

Mean activity score was 19.9 ± 7.4 (range, 0-30) (Table 2). Based on criteria defining floor and ceiling effects as > 15% of participants scoring the minimum or maximum score, respectively [8, 11], there were no floor or ceiling effects observed in this cohort; 1.1% (two of 182) scored 0 points and 1.6% (three of 182) scored the maximum 30 points.
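The floor and ceiling check described above reduces to simple arithmetic. The following Python sketch is illustrative only and is not the SAS code used in the study; the function name `floor_ceiling_effects` and its parameters are assumptions introduced here, while the counts and the 15% criterion are taken from the text.

```python
def floor_ceiling_effects(n_at_min, n_at_max, n_total, threshold=0.15):
    """Share of respondents at each extreme, and whether it exceeds the threshold.

    The 15% default follows the criterion cited in the text [8, 11];
    the function itself is a hypothetical helper, not part of the study.
    """
    floor_share = n_at_min / n_total
    ceiling_share = n_at_max / n_total
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Counts reported above: two of 182 scored 0 points and three of 182 scored 30.
result = floor_ceiling_effects(n_at_min=2, n_at_max=3, n_total=182)
print(f"floor: {result['floor_share']:.1%}, ceiling: {result['ceiling_share']:.1%}")
# prints "floor: 1.1%, ceiling: 1.6%"
```

Both shares fall well below the 15% criterion, which is consistent with the absence of floor and ceiling effects reported for this cohort.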

Table 2:
Means, SDs, and ranges for each measured variable

Pushups, sit-ups, 20-m PACER, and VO2-max were all positively correlated with scores on the brief activity scale (Pearson correlations, all p < 0.001), whereas activity scores were not correlated with student age or body mass index (Table 3). Males scored 3 points higher than females on the activity scale, on average, when controlling for age and body mass index (linear regression analysis, p < 0.001). Further investigation of sex as a potential confounder of the relationship between the activity score and physical fitness revealed no associations between sex and any of the four standardized physical fitness metrics (unpaired t-tests, p > 0.05 for all), thus excluding it as a confounding variable.

Table 3:
Correlations between activity score and each of the continuous fitness assessments and demographic variables

Receiver operating characteristic analysis (Fig. 1) revealed that a brief activity scale score of ≤ 14 predicted a VO2-max value corresponding to a range of those at risk of metabolic syndrome with a sensitivity of 83% and specificity of 57%. Area under the receiver operating characteristic curve was acceptable (AUC = 0.80) [1].

Fig. 1:
Receiver operating characteristic analysis determined that an activity score of ≤ 14 (corresponding data point labeled “14”) predicted a VO2-max value below a subject's age- and sex-specific healthy fitness zone with a sensitivity of 83% and specificity of 57%. Area under the curve = 0.80.
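To make the cutoff logic concrete, the sketch below shows how sensitivity and specificity can be computed for a rule of the form "score ≤ cutoff screens positive." It is a minimal Python illustration, not the customized SAS macro used in the study. The function name and the toy data are hypothetical and were invented for this example; they are not the study data and will not reproduce the reported 83% and 57%.

```python
def sensitivity_specificity(scores, at_risk, cutoff):
    """Treat score <= cutoff as a positive screen and compare with true status.

    Assumes at least one at-risk and one not-at-risk subject,
    otherwise a denominator below would be zero.
    """
    tp = fp = tn = fn = 0
    for score, risk in zip(scores, at_risk):
        positive = score <= cutoff
        if positive and risk:
            tp += 1  # low score, truly at risk
        elif positive and not risk:
            fp += 1  # low score, but healthy aerobic capacity
        elif not positive and risk:
            fn += 1  # high score, but at risk (missed by the screen)
        else:
            tn += 1  # high score, healthy aerobic capacity
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data for illustration only (NOT the study data).
toy_scores = [5, 9, 12, 14, 16, 18, 21, 24, 27, 30]
toy_at_risk = [True, True, False, True, False, False, True, False, False, False]

# Sweeping the cutoff traces out points on a receiver operating characteristic curve.
for cutoff in (10, 14, 20):
    sens, spec = sensitivity_specificity(toy_scores, toy_at_risk, cutoff)
    print(f"cutoff <= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Raising the cutoff labels more subjects as positive, which tends to increase sensitivity at the expense of specificity; a screening threshold is chosen along that tradeoff.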


Discussion

The goal of this study was to prospectively investigate whether scores on the Hospital for Special Surgery Pediatric Functional Activity Brief Scale exhibited any floor or ceiling effects and to determine if scores are correlated with standardized strength and aerobic physical fitness metrics in a cohort of otherwise healthy high school students. Furthermore, we sought to use receiver operating characteristic analysis to establish a clinically significant threshold for activity level in children. This study is the first to our knowledge to prospectively validate a musculoskeletal activity scale through comparison with standardized quantitative physical fitness assessments. The current study has shown the lack of floor and ceiling effects, significant positive correlations between activity score and four standardized physical fitness metrics, and discriminative ability for the scale to be used as a screening tool for detecting an unhealthy level of aerobic fitness.

The current study has some limitations. Because this investigation was conducted at a single high school, generalizability may be limited. Validating a questionnaire as a screening test typically requires larger, more diverse study populations; however, this cross-sectional design allowed for the inclusion of subjects with a wide range of activity levels and physical fitness. This provides preliminary evidence for its use as a screening tool in addition to further validation as a quantification of physical activity. Also, as a result of unavoidable privacy policies at the high school, organized activity and sports participation information, race/ethnicity, and current medical problems were not collected. This information would have been helpful in further analyzing the relationship between activity score and physical fitness. Whereas knowing athletic participation and organized activities would have been helpful for further validation testing, it is unlikely that race/ethnicity would have any impact on physical activity scores. We can also be certain that the included subjects were healthy without any medical conditions that would have precluded activity at levels required by a high school physical education curriculum, thereby limiting the usefulness of further information on subjects' medical history. Finally, although the initial validation study included patients aged 10 to 18 years, this study included subjects aged 14 to 17 years. Therefore, although the scale may be used to quantify activity in children aged 10 to 18 years, the results of the current study may only be applied to high school-aged adolescents.

The investigation for floor and ceiling effects of activity scale scores in this cohort revealed no such phenomenon. This confirms the findings of the original validation study [2]. Furthermore, in establishing a scale as a potential screening tool, it is vital for the scale to be free of floor and ceiling effects when applied to the target population. This indicates that the scale can appropriately measure activity along a broad range of levels, encompassing all of those present in the cohort of interest.

In the current study, activity scale scores were positively correlated with all four physical fitness metrics (pushups, sit-ups, performance on the 20-m PACER, and VO2-max) but not associated with age or body mass index, as expected. These results further validate this activity scale by comparing scores with a quantitative, standardized assessment of physical fitness as a secondary measure of physical activity rather than depending on a subject's unbiased ability to accurately interpret questions as well as identify and categorize relevant activities. By doing so, this establishes the scale's ability to discriminate between children of varying fitness levels. Such an analysis investigating convergent validity between a questionnaire-based activity scale and standardized quantitative physical fitness assessments has not heretofore been performed, to our knowledge. When demonstrating convergent validity, correlation coefficients in the 0.3 to 0.6 range indicate a clinically significant correlation. Because the correlation coefficient between the activity scale and the calculated VO2-max was in this range (ρ = 0.43), we continued with our analysis of the third research question to evaluate the discriminative capacity of the activity scale to predict age- and sex-normalized healthy or unhealthy levels of VO2-max.

Perhaps most importantly, in evaluating our third research question, the current study used receiver operating characteristic analysis to establish an activity score threshold that is predictive of VO2-max values that have been previously shown to be associated with children at risk of metabolic syndrome. This further establishes the Hospital for Special Surgery Pediatric Functional Activity Brief Scale as a useful tool that may also be used to screen for children with at-risk levels of aerobic physical fitness and thereby identify those who might benefit from early intervention. Although it is impossible to determine from the current study methodology whether increasing activity score would change the risk profile for metabolic syndrome in any one child, it is nonetheless a useful screening tool for identifying at-risk children with a sensitivity of 83% and a specificity of 57%. This high sensitivity is favorable because screening tests preferentially optimize sensitivity to rule out a given condition in the event of a negative test result [3, 5]. Given the rarity of at-risk FITNESSGRAM®-calculated VO2-max levels below the age- and sex-specific healthy fitness zones in this cohort (sample prevalence of 8%), this yields a negative predictive value (NPV) of 97.5% and a positive predictive value (PPV) of 14.4%. Therefore, although 14.4% of subjects with an activity score of ≤ 14 actually exhibited at-risk aerobic fitness levels, those with an activity score > 14 had a risk of only 2.5%. The Hospital for Special Surgery Pediatric Functional Activity Brief Scale is therefore a good screening test in ruling out an unhealthy level of aerobic fitness. Because sensitivity and specificity are inherent to a screening test, whereas PPV and NPV are dependent on disease prevalence in a population, these PPV and NPV values would vary in different cohorts based on prevalence of children with at-risk aerobic fitness levels, whereas the sensitivity and specificity should remain constant. To that end, if the prevalence of obesity and coexisting comorbid conditions increases, the NPV may decrease with a resultant increase in the PPV.
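The dependence of PPV and NPV on prevalence follows directly from Bayes' theorem and can be checked with a few lines of Python. The sketch below is illustrative; the function name `predictive_values` is an assumption introduced here, while the sensitivity (83%), specificity (57%), and sample prevalence (8%) are the values reported above. The 20% prevalence in the second call is a hypothetical value chosen only to show the direction of change.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values reported in this study.
ppv, npv = predictive_values(sensitivity=0.83, specificity=0.57, prevalence=0.08)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
# prints "PPV = 14.4%, NPV = 97.5%"

# Hypothetical higher prevalence (20%), for illustration only.
ppv_high, npv_high = predictive_values(0.83, 0.57, 0.20)
print(f"PPV = {ppv_high:.1%}, NPV = {npv_high:.1%}")
```

With sensitivity and specificity held fixed, the higher-prevalence call returns a larger PPV and a smaller NPV, which matches the direction described in the text.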

One unexpected finding was the statistically significant difference in activity scores between sexes. Although males had statistically higher activity scores, the 3-point difference is of unclear clinical and analytical significance. Clinically, it is not clear whether a 3-point difference between sexes is meaningful. Analytically, it was important to evaluate whether sex was a true confounding variable. This would allow for the determination of whether the scale is biased toward male responders (ie, a true confounder) or if male sex was in fact associated with higher activity level. Sex was evaluated and formally excluded as a confounding variable through regression analyses and lack of correlation with any physical fitness (outcome) variables. In fact, aggregate unpublished data obtained from the school indicated that a slightly higher proportion of males participate in organized sports; 62% of males and 59% of females are involved in any organized sport with males participating in 0.92 sports per student and females participating in 0.83 sports per student. This small difference in sports participation may account for our finding of males having modestly higher scores than females, on average, and suggests that this sample is representative of the high school population, in which males are in fact slightly more active than females. In other words, sex was a marker for increased activity in this high school cohort rather than an indication of inherent bias in the scale.

In conclusion, the current study further validated the Hospital for Special Surgery Pediatric Functional Activity Brief Scale by showing the absence of floor and ceiling effects and revealed positive correlations with four uniformly measured quantitative physical fitness metrics. This study also determined that those with a score ≤ 14 were more likely to have an at-risk aerobic capacity that fell in a range that has been previously shown to be associated with an elevated risk of metabolic syndrome. The use of this activity scale can help physicians treating adolescents to objectively quantify activity, screen for unhealthy activity levels, and aid in clinical research in sports medicine where activity level is a prognostic variable. Future research should determine a threshold for the minimally clinically important difference of the scale as well as expand the generalizability of these results by implementing the scale across larger populations of children and adolescents in a multicenter approach. This will provide further confirmation of the results of the current study as well as more robustly examine the scale's reliability and validity.


Acknowledgments

We thank James Flanagan of Daniel Hand High School in Madison, CT, USA, for his invaluable assistance with coordinating and executing this study.


References

1. Bradley AP. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition. 1997;7:1145-1159. doi:10.1016/S0031-3203(96)00142-2.
2. Fabricant PD, Robles A, Downey-Zayas T, Do HT, Marx RG, Widmann RF, Green DW. Development and validation of a pediatric sports activity rating scale: the Hospital for Special Surgery Pediatric Functional Activity Brief Scale (HSS Pedi-FABS). Am J Sports Med. 2013;41:2421-2429. doi:10.1177/0363546513496548.
3. Gilbert R, Logan S, Moyer VA, Elliott EJ. Assessing diagnostic and screening tests: Part 1. concepts. West J Med. 2001;6:405-409. doi:10.1136/ewjm.174.6.405.
4. Human Kinetics. FITNESSGRAM: Activity and fitness assessment, reporting, and tracking. 2013. Available at: Accessed June 13, 2013.
5. Kocher MS, Zurakowski D. Clinical epidemiology and biostatistics: a primer for orthopaedic surgeons. J Bone Joint Surg Am. 2004;3:607-620.
6. Lee L, Sanders RA. Metabolic syndrome. Pediatrics in Review. 2012;10:459-468. doi:10.1542/pir.33-10-459.
7. Mahar MT, Guerieri AM, Hanna MS, Kemble CD. Estimation of aerobic fitness from 20-m multistage shuttle run test performance. Am J Prev Med. 2011;4(Suppl 2):S117-S123. doi:10.1016/j.amepre.2011.07.008.
8. McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res. 1995;4:293-307. doi:10.1007/BF01593882.
9. Plowman SA, Sterling CL, Corbin CB, Meredith MD, Welk GJ, Morrow JR. The history of FITNESSGRAM. Journal of Physical Activity & Health. 2006;2:S5-S20.
10. Sirard JR, Pate RR. Physical activity assessment in children and adolescents. Sports Med. 2001;6:439-454. doi:10.2165/00007256-200131060-00004.
11. Terwee CB, Bot SD, Boer MR, Windt DA, Knol DL, Dekker J, Bouter LM, Vet HC. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;1:34-42. doi:10.1016/j.jclinepi.2006.03.012.
12. The Cooper Institute. Healthy fitness zone standards. 2010. Available at: Accessed June 13, 2013.
13. The President's Council on Fitness, Sports and Nutrition. Celebrate your achievements: the president's challenge. 2013. Available at: Accessed June 13, 2013.
14. Welk GJ, De Saint-Maurice Maduro PF, Laurson KR, Brown DD. Field evaluation of the new FITNESSGRAM® criterion-referenced standards. Am J Prev Med. 2011;4(Suppl 2):S131-S142. doi:10.1016/j.amepre.2011.07.011.
15. Welk GJ, Going SB, Morrow JR Jr, Meredith MD. Development of new criterion-referenced fitness standards in the FITNESSGRAM® program: rationale and conceptual overview. Am J Prev Med. 2011;4(Suppl 2):S63-S67. doi:10.1016/j.amepre.2011.07.012.
16. Welk GJ, Laurson KR, Eisenmann JC, Cureton KJ. Development of youth aerobic-capacity standards using receiver operating characteristic curves. Am J Prev Med. 2011;4(Suppl 2):S111-S116. doi:10.1016/j.amepre.2011.07.007.
17. Zhu W, Mahar MT, Welk GJ, Going SB, Cureton KJ. Approaches for development of criterion-referenced standards in health-related youth fitness tests. Am J Prev Med. 2011;4(Suppl 2):S68-S76. doi:10.1016/j.amepre.2011.07.001.

Appendix 1

Scoring of the Hospital for Special Surgery Pediatric Functional Activity Brief Scale is performed by adding the points from each question, for a total possible score of 0 to 30 points. For "Running", "Cutting", "Decelerating", "Pivoting", "Duration", and "Endurance", each question is worth 0, 1, 2, 3, or 4 points. For the "Competition" and "Supervision" questions, each question is worth 0, 1, 2, or 3 points. Reprinted with permission from Fabricant PD, Robles A, Downey-Zayas T, Do HT, Marx RG, Widmann RF, Green DW. Development and validation of a pediatric sports activity rating scale: the Hospital for Special Surgery Pediatric Functional Activity Brief Scale (HSS Pedi-FABS). Am J Sports Med. 2013;41:2421-2429.
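The scoring rule above can be expressed as a short Python sketch. This is an illustration under stated assumptions rather than an official scoring implementation: the item names and per-item maximums come from the appendix text, whereas the dictionary representation and the function name `total_score` are hypothetical choices made for this example. The per-item point values listed above imply totals between 0 and 30.

```python
# Maximum points per item, as listed in the appendix text.
MAX_POINTS = {
    "Running": 4,
    "Cutting": 4,
    "Decelerating": 4,
    "Pivoting": 4,
    "Duration": 4,
    "Endurance": 4,
    "Competition": 3,
    "Supervision": 3,
}


def total_score(responses):
    """Sum the eight item scores after checking that each is in its valid range.

    `responses` maps each item name to an integer number of points.
    """
    if set(responses) != set(MAX_POINTS):
        raise ValueError("responses must contain exactly the eight scale items")
    for item, points in responses.items():
        if not isinstance(points, int) or not 0 <= points <= MAX_POINTS[item]:
            raise ValueError(f"{item} must be an integer from 0 to {MAX_POINTS[item]}")
    return sum(responses.values())


# Highest and lowest possible totals implied by the per-item ranges.
highest = total_score(dict(MAX_POINTS))
lowest = total_score({item: 0 for item in MAX_POINTS})
print(lowest, highest)  # prints "0 30"
```

Validating each item against its own range guards against a common data-entry error, such as recording 4 points on one of the two 3-point questions.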

© 2014 Lippincott Williams & Wilkins, Inc.