Reliability and Validity of YRBS Physical Activity Items among Middle School Students


Medicine & Science in Sports & Exercise: March 2007 - Volume 39 - Issue 3 - p 416-425
doi: 10.1249/mss.0b013e31802d97af
BASIC SCIENCES: Epidemiology

Purpose: To assess test-retest reliability and validity of the Youth Risk Behavior Survey (YRBS) items for moderate and vigorous physical activity in middle school students.

Methods: Students (N = 125; 12.7 ± 0.6 yr) wore Actigraph accelerometers for 6.1 ± 1.0 d and twice completed surveys that included YRBS moderate and vigorous physical activity items. Accelerometer counts were transformed into minutes of moderate (3-6 METs) and vigorous (> 6 METs) physical activity. Days per week meeting moderate and vigorous physical activity recommendations were estimated using four summary methods. Reliability was assessed using intraclass correlation coefficients (ICC) from the two surveys. Validity was assessed as percent concordance, kappa coefficients, and sensitivity and specificity using binary YRBS and Actigraph outcomes.

Results: Test-retest ICC for the moderate and vigorous physical activity items were 0.51 and 0.46, respectively. Twenty-two percent of students met the recommended level of moderate physical activity (≥ 30 min·d−1, ≥ 5 d·wk−1) according to self-reports, whereas 90.4 and 66.4% met the recommendation according to accumulated accelerometer minutes and 5-min-bout criteria, respectively. Concordance between YRBS and Actigraph moderate physical activity measures was highest using accumulated accelerometer minutes. Sensitivity of the moderate YRBS item ranged from 0.19 to 0.23 for four comparisons, and specificity was 0.74-0.92. More than two thirds of students reported vigorous physical activity at recommended levels (≥ 20 min·d−1, ≥ 3 d·wk−1), whereas the highest prevalence according to Actigraph monitoring was 22.4%. Sensitivity of the YRBS vigorous item was high (0.75-0.92) compared with the four Actigraph measures; specificity was low (0.23-0.26).

Conclusion: YRBS questions underestimate the proportion of students attaining recommended levels of moderate physical activity and overestimate the proportion meeting vigorous recommendations. Use of accelerometry for physical activity surveillance seems to be indicated. At the minimum, new questions demonstrating greater validity are needed.

1Department of Health and Kinesiology, Purdue University, West Lafayette, IN; 2Department of Society, Human Development and Health, Harvard School of Public Health, Boston, MA; 3Department of Kinesiology, University of Connecticut, Storrs, CT; 4Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN; 5Mathematica Policy Research, Inc., Cambridge, MA; 6Department of Kinesiology and Community Health, University of Illinois, Urbana-Champaign, IL; and 7Department of Nutrition, Harvard School of Public Health, Boston, MA

Address for correspondence: Philip J. Troped, Ph.D., M.S., Department of Health and Kinesiology, Purdue University, Lambert Fieldhouse, Room 106-B, 800 West Stadium Avenue, West Lafayette, IN 47907-2046; E-mail:

Submitted for publication April 2006.

Accepted for publication October 2006.

Increasing physical activity participation among youth is a priority public health objective in the United States (27,30). Healthy People 2010 objectives for high school youth are to increase the prevalence of participation in sufficient moderate (≥ 30 min·d−1, ≥ 5 d·wk−1) and vigorous physical activity (≥ 20 min·d−1, ≥ 3 d·wk−1) to 35 and 85%, respectively. Recommendations contained within the 2005 Dietary Guidelines for Americans also emphasize that children and youth engage in 60 min of physical activity on most days of the week (29). Physiologic and health benefits of regular physical activity for youth, such as weight control, increases in lean muscle mass, and reductions in body fat, have been documented in the Surgeon General's Report on Physical Activity and Health (28). Physical inactivity is also recognized as playing a role in the growing epidemic of overweight and obesity among children and youth (24,25) and in the increase in pediatric type 2 diabetes (22).

Availability of valid, reliable physical activity surveillance measures that accurately characterize the population is key to monitoring the progress of local, state, and national efforts to promote physical activity among youth. Although objective measurement of physical activity with devices such as accelerometers seems less prone to bias and inaccuracies in free-living populations (8,31), some still consider self-reports more feasible for population surveillance (2). In the early 1990s, the U.S. Centers for Disease Control and Prevention (CDC) developed the Youth Risk Behavior Survey (YRBS) to monitor physical activity and other health risk behaviors in high school students, grades 9-12 (13). The YRBS includes a national school-based survey conducted by the CDC and state, territorial, and local surveys conducted by health and education agencies (10). A middle school YRBS module is available from the CDC and has items adapted from the high school YRBS, including one item that assesses vigorous physical activity during the past 7 d. Although not endorsed by the CDC, the YRBS item for moderate physical activity also has been used with middle school youth in some states, such as Vermont (see, p. 75) and Massachusetts (5-2-1 Go! project). It is critical to determine whether these items are reliable and valid, that is, whether they correctly identify the proportions of youth obtaining recommended levels of moderate and vigorous physical activity.

To our knowledge, the YRBS physical activity items have not been tested in high school or middle school youth using objective measures. Two studies have assessed the test-retest reliability of select YRBS physical activity items among middle school and high school youth (3,4), yet neither evaluated the items for moderate and vigorous physical activity. Another study assessed reliability and validity of YRBS-based physical activity items among middle school and high school students and found a statistically significant correlation between accelerometer data and self-reported vigorous physical activity (21), but the wording of these items was different.

Other recent research suggests that compared with objective methods, the YRBS items produce very different surveillance estimates of youth meeting physical activity recommendations. Using accelerometers, Pate and colleagues (20) found that 93% of seventh- to ninth graders and 76% of 10th- to 12th graders from a Massachusetts sample met Healthy People 2010 guidelines for moderate physical activity (≥ 30 min·d−1, ≥ 5 d·wk−1), whereas only 1-3% met the guideline for sustained vigorous physical activity. In contrast, contemporaneous prevalence estimates for Massachusetts high school youth obtained from the YRBS show that 25% reported meeting the guideline for moderate physical activity, whereas 63% reported meeting the objective for vigorous activity (16).

The purposes of this study were twofold: 1) to assess the test-retest reliability of YRBS items for moderate and vigorous physical activity in a sample of middle school students, and 2) to evaluate the validity of these items by comparing self-report data against objective physical activity data obtained with an Actigraph accelerometer (formerly known as Computer Science and Applications and Manufacturing Technology Inc.; now Actigraph, LLC, Fort Walton Beach, FL). Specifically, we determined how well YRBS items classified students as meeting the recommended levels of moderate and vigorous physical activity.


Study participants.

All study procedures were approved by the Massachusetts Department of Public Health (MDPH) Human Research Review Committee and the Human Subjects Committee at the Harvard School of Public Health. Sixth- and seventh-grade students from Massachusetts middle schools participating in a 2-yr nutrition and physical activity intervention (5-2-1 Go!) sponsored by MDPH were randomly selected to participate in the YRBS validation study, using a random-number generator in SAS (version 9.1, SAS Institute Inc., Cary, NC).

Six of 13 schools participating in 5-2-1 Go! were selected to be in this validation study. The schools were located in urban and suburban communities in eastern and central Massachusetts. The recruitment goal was to enlist 25 students from each of the schools for a total sample of 150 students. Additional student names selected at random were added to the recruitment list when students declined to participate or did not return a consent form. Contacts with students and parents were coordinated by a staff member at each school (in most cases a school nurse).

In total, recruitment letters and informed consent documents were sent to the parents or guardians of 293 students either by mail or student delivery. Overall, 144 students (49%) returned a signed consent form to participate in the study, 45 (15%) returned a signed form declining to participate, and 104 (36%) students did not return a form. Five students with consent either were absent on days when accelerometers were distributed or they chose not to participate. Thus, 139 students (47.4% of those recruited) ultimately participated in the study.

Study design.

Accelerometer and survey data used in this study were collected during spring 2003. Research staff met with students at their schools and instructed them to wear an Actigraph accelerometer for 7 d. On the day monitors were collected, students completed a brief survey, which included moderate and vigorous physical activity items from the middle school and high school YRBS (survey test 1). Data from this post-monitoring period survey were used to assess the validity of the two YRBS items.

In addition to the short survey, students completed a longer survey including the two YRBS items (survey test 2) as part of their participation in the 5-2-1 Go! study. The protocol specified an interval of 1-2 wk between the two survey administrations. Because of logistical constraints of some participating schools, the actual interval between surveys ranged from 1 to 40 d. Forty-seven students (35.9%) completed surveys 1-6 d apart, and 47 (35.9%) completed surveys 10-15 d apart. The remaining 37 students (28.2%) from two schools completed the two surveys 26 and 40 d apart, respectively. Data from the two survey administrations were used to assess test-retest reliability of the YRBS items.

To compare accelerometer study participants with other students in 5-2-1 Go!, we used additional data from the baseline 5-2-1 Go! survey and anthropometry measurements collected during fall 2002. Besides the YRBS items for moderate and vigorous activity, students answered questions about participation in physical education during the past 7 d and the number of sports teams they had been on during the previous year. Each subject's height and weight were measured by school personnel. The proportion of students that were a healthy weight or underweight (BMI < 85th percentile), at risk for overweight (BMI ≥ 85th percentile and < 95th percentile), and overweight (≥ 95th percentile) was calculated using an age- and sex-based algorithm developed by the CDC (available at

Accelerometer protocol.

For objective monitoring of physical activity, we used an Actigraph accelerometer (model 7164), a lightweight (1.5 ounces) uniaxial monitor designed to measure accelerations in the vertical plane. Although accelerometers cannot necessarily be considered the "gold standard" method for measuring physical activity (2), they seem to be the best available objective measure of free-living activity. Previous laboratory- and field-based studies have shown that the Actigraph accurately measures the volume, intensity, and temporal patterns of dynamic activities such as walking and running when worn on the hip (7,18,19). Activity monitors were set to sample at 60-s time intervals. Therefore, activity count data were obtained as counts per minute. The Actigraph monitors were secured on an adjustable nylon belt that minimized extraneous movement of the monitor. Students were instructed to wear the monitor on the right hip for seven consecutive days. They were asked to wear the monitor at all times, except when sleeping, bathing, or swimming, and they were given daily log sheets to record times they put their monitors on and removed them. Activity monitors and log sheets were collected at each school after 1 wk, and students were given a small cash stipend for their participation. All monitors were checked for calibration before data collection using the manufacturer's calibration device, and the monitors were checked after each deployment using a brief walk-test protocol.

Actigraph data reduction.

Accelerometer data were downloaded and processed using a SAS data-reduction program with built-in quality-assurance checks as described previously (17). Minute-by-minute accelerometer counts were classified as light-intensity activity (< 3 METs), moderate-intensity activity (3-6 METs), and vigorous-intensity activity (> 6 METs), using an age-based equation (METs = 2.757 + (0.0015 × counts per minute) − (0.0896 × age [yr]) − (0.000038 × counts per minute × age [yr])) developed by Freedson and colleagues (9).
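The classification step can be illustrated with a minimal Python sketch. It uses only the coefficients and MET cut points quoted above; the function names are hypothetical, and this is not the SAS program used in the study.

```python
def freedson_mets(counts_per_minute: float, age_years: float) -> float:
    """Predicted METs from the age-based equation quoted in the text."""
    return (2.757
            + 0.0015 * counts_per_minute
            - 0.0896 * age_years
            - 0.000038 * counts_per_minute * age_years)


def classify_intensity(counts_per_minute: float, age_years: float) -> str:
    """Label one minute as light (< 3 METs), moderate (3-6), or vigorous (> 6)."""
    mets = freedson_mets(counts_per_minute, age_years)
    if mets < 3.0:
        return "light"
    if mets <= 6.0:
        return "moderate"
    return "vigorous"


# Illustrative values only: a 12.7-yr-old at three different count levels.
for cpm in (0, 2000, 6000):
    print(cpm, round(freedson_mets(cpm, 12.7), 2), classify_intensity(cpm, 12.7))
```

Because age enters the equation both directly and through the interaction term, the count level that corresponds to 3 or 6 METs differs by age, so the same counts per minute can fall into different intensity categories for younger and older students.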

Because the YRBS questions do not specify whether bout length is important in accumulating minutes of vigorous and moderate physical activity, we varied the operational definition for our objective Actigraph measures. Daily total minutes of moderate- and vigorous-intensity physical activity were computed four ways for each intensity: total accumulated minutes above the moderate or vigorous threshold, and minutes estimated from bouts ≥ 5 min, ≥ 10 min, ≥ 30 min (moderate physical activity only), and ≥ 20 min (vigorous physical activity only). A bout started when a minimum intensity threshold for moderate or vigorous activity was exceeded. Bout duration was estimated by extracting the number of subsequent minutes that were spent in bouts of activity above specific intensity levels. We allowed bouts to include interruptions of 1 or 2 min, during which activity intensity was below the threshold. A bout ended whenever there were three consecutive minutes below the threshold. In these cases, the last 3 min were not included in the bout duration.
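The bout rules described above can be sketched in Python as follows. This is an illustrative reading of the rules, not the study's SAS code: the function name and the `max_gap` parameter are hypothetical, and the sketch assumes that below-threshold minutes at the very end of a bout (including a 1- or 2-min run at the end of the record) are not counted toward its duration.

```python
from typing import List


def find_bouts(above: List[bool], max_gap: int = 2) -> List[int]:
    """Return bout durations (min) from a per-minute above-threshold series.

    A bout starts at a minute above the threshold, tolerates up to `max_gap`
    consecutive below-threshold minutes, and ends once three consecutive
    minutes fall below the threshold. Trailing below-threshold minutes are
    not counted in the duration.
    """
    bouts = []
    start = None       # index of the first minute of the current bout
    last_above = None  # index of the most recent above-threshold minute
    for i, flag in enumerate(above):
        if flag:
            if start is None:
                start = i
            last_above = i
        elif start is not None and i - last_above > max_gap:
            bouts.append(last_above - start + 1)
            start = None
    if start is not None:
        bouts.append(last_above - start + 1)
    return bouts


# Three active minutes, a 2-min interruption, two more active minutes,
# then three minutes below threshold (which close the bout), then one minute.
series = [True, True, True, False, False, True, True,
          False, False, False, True]
print(find_bouts(series))  # [7, 1]
```

Durations returned this way would then be filtered by the minimum bout length of interest (for example, keeping only bouts of at least 5 min).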

The SAS program noted above created dichotomous bout variables, which detected any occurrences of the following bouts: 5-9, 10-19, 20-29, 30-44, and ≥ 45 min. To estimate daily duration of moderate and vigorous physical activity, and to determine whether someone met the activity recommendation for a given day (30 min for moderate and 20 min for vigorous), we did the following. For the 5- to 9-min- and the 10- to 19-min-bout measures, we estimated minutes in bouts using the midpoint of the bout range (e.g., 7 min for 5- to 9-min bouts) multiplied by the number of bout occurrences. For bouts longer than 19 min, we used the lower endpoint of the bout range. It is important to note that when we estimated daily minutes from bouts ≥ 5 min, we included minutes obtained through 5- to 9-min bouts as well as longer bouts (e.g., 10-19 min); similarly, physical activity accumulated in bouts ≥ 10 min in duration included 10- to 19-min bouts as well as longer ones.

As an example, subject A had two 5- to 9-min bouts of moderate activity, two bouts of 10-19 min, no bouts of 30-44 min, and no bouts ≥ 45 min on a given day. Thus, this subject's 5-min-bout measure was two (7 min) plus two (14.5 min) = 43 min of moderate activity obtained in bouts ≥ 5 min.
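The arithmetic in this example can be reproduced with a short Python sketch. The per-bin minute values follow the rules stated above (midpoints for the two shortest bins, lower endpoints for the longer ones); the dictionary keys and function name are hypothetical labels chosen for illustration.

```python
# Minutes credited per bout occurrence in each bin: midpoints for 5-9 and
# 10-19 min bouts, lower endpoints for bouts longer than 19 min.
BOUT_MINUTES = {
    "5-9": 7.0,
    "10-19": 14.5,
    "20-29": 20.0,
    "30-44": 30.0,
    "45+": 45.0,
}
BIN_ORDER = ["5-9", "10-19", "20-29", "30-44", "45+"]


def minutes_in_bouts(bout_counts: dict, min_bin: str = "5-9") -> float:
    """Estimate daily minutes from bouts in `min_bin` and all longer bins."""
    first = BIN_ORDER.index(min_bin)
    return sum(BOUT_MINUTES[b] * bout_counts.get(b, 0) for b in BIN_ORDER[first:])


subject_a = {"5-9": 2, "10-19": 2}
print(minutes_in_bouts(subject_a, "5-9"))    # 43.0 (bouts >= 5 min)
print(minutes_in_bouts(subject_a, "10-19"))  # 29.0 (bouts >= 10 min)
```

Under the 5-min-bout measure this subject exceeds the 30-min daily threshold for moderate activity, whereas under the 10-min-bout measure the same day falls just short.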

Data were summarized using an automated scoring algorithm, which assumed that periods of monitor inactivity (0 counts per minute) for 20 consecutive minutes indicated that the monitor was not being worn by the participant. Except for day 1, valid days of monitoring were defined as days during which the monitor was worn for at least 600 min·d−1 (71% of 14 waking hours), using wear estimates from the algorithm. We included day 1 if a student had at least 383 min of data on that day (71% of nine waking hours, with the monitor initialized to start recording at 12:00 p.m.). Our analytic sample for the validation analyses comprised 125 students who had at least 4 d of valid Actigraph monitoring and who completed the brief post-monitoring period survey. The overall average number of valid days per assessment period in this sample was 6.1 ± 1.0, and the estimated mean daily duration of wear was 803.9 ± 63.3 min·d−1 (about 13.4 h). Average daily minutes of moderate and vigorous physical activity also were obtained using estimates of total minutes each day divided by the number of valid days of monitoring.
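A minimal Python sketch of the wear-time rule is given below. It assumes that every run of at least 20 consecutive zero-count minutes is treated as non-wear and subtracted from the day's recorded minutes; the function names are hypothetical, and the actual scoring algorithm may have handled edge cases differently.

```python
def wear_minutes(counts, nonwear_run=20):
    """Recorded minutes minus zero-count runs lasting >= `nonwear_run` min."""
    worn = len(counts)
    run = 0
    for c in list(counts) + [1]:  # sentinel value closes a trailing zero run
        if c == 0:
            run += 1
        else:
            if run >= nonwear_run:
                worn -= run
            run = 0
    return worn


def is_valid_day(counts, first_day=False):
    """Apply the 600-min criterion (383 min on day 1) quoted in the text."""
    threshold = 383 if first_day else 600
    return wear_minutes(counts) >= threshold


# Illustrative day: 700 active minutes, a 30-min removal, 10 active minutes,
# and a 10-min still period that is too short to count as non-wear.
day = [100] * 700 + [0] * 30 + [50] * 10 + [0] * 10
print(wear_minutes(day), is_valid_day(day))  # 720 True
```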

Survey physical activity measures.

Both the short and longer survey administered to students included YRBS items for moderate and vigorous physical activity (available at The moderate physical activity item was worded: "On how many of the past 7 d did you participate in physical activity for at least 30 min that did not make you sweat or breathe hard, such as fast walking, slow bicycling, skating, pushing a lawn mower, or mopping floors?" The vigorous physical activity item was worded: "On how many of the past 7 d did you exercise or participate in physical activity for at least 20 min that made you sweat and breathe hard, such as basketball, soccer, running, swimming laps, fast bicycling, fast dancing, or similar aerobic activities?" Response categories for both items were 0 to 7 d.

For test-retest reliability analyses, we used the number of days reported by students on the two surveys. For the assessment of YRBS item validity, we created a dichotomous (yes/no) measure for moderate (≥ 30 min, ≥ 5 d·wk−1) and vigorous physical activity (≥ 20 min, ≥ 3 d·wk−1). These cut points are consistent with current recommendations for moderate and vigorous physical activity and recently published YRBS data (5).

Actigraph measures.

Actigraph data were used to create four frequency variables for both moderate and vigorous physical activity to enable comparisons with the YRBS items for moderate and vigorous activity. Frequency was defined as the number of days a student obtained 30 min of moderate and 20 min of vigorous physical activity, respectively. As noted, we used four different methods for estimating daily duration from the accelerometer: 1) accumulated minutes (total minutes during the day), 2) minutes per day obtained through bouts at least 5 min in duration (bouts ≥ 5-9 min), 3) minutes per day obtained through bouts at least 10 min in duration (bouts ≥ 10-19 min), and 4) minutes obtained through bouts of sustained activity during a day (≥ 30 min for moderate and ≥ 20 min for vigorous).

These duration data were used to estimate the numbers of days per week of moderate and vigorous physical activity meeting recommendations for daily duration from valid days of monitoring. The purpose was to derive a frequency estimate for the past 7 d comparable with what is derived from the YRBS questions. Finally, we created eight dichotomous Actigraph outcomes for moderate (≥ 30 min, ≥ 5 d·wk−1) and vigorous physical activity (≥ 20 min, ≥ 3 d·wk−1), comparable with the two derived YRBS measures.
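One way such a days-per-week estimate could be derived is sketched below in Python. The text does not state how days meeting the daily duration were scaled to a 7-d week when fewer than seven valid days were available, so the proportional scaling used here is an assumption made purely for illustration, as are the function names.

```python
def days_per_week_meeting(daily_minutes, threshold_min):
    """Estimate days per week at or above `threshold_min`.

    ASSUMPTION: days meeting the threshold are scaled proportionally from
    the number of valid monitoring days to a 7-d week.
    """
    valid_days = len(daily_minutes)
    if valid_days == 0:
        raise ValueError("at least one valid day is required")
    days_met = sum(1 for m in daily_minutes if m >= threshold_min)
    return 7.0 * days_met / valid_days


def meets_recommendation(daily_minutes, threshold_min, days_required):
    """Dichotomous outcome, e.g. >= 30 min on >= 5 d/wk for moderate activity."""
    return days_per_week_meeting(daily_minutes, threshold_min) >= days_required


# Illustrative values: six valid days of moderate-activity minutes.
moderate = [35, 40, 10, 31, 30, 50]
print(round(days_per_week_meeting(moderate, 30), 2))  # 5.83
print(meets_recommendation(moderate, 30, 5))           # True
```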

Statistical analysis.

To assess test-retest reliability of the two YRBS items, we estimated intraclass correlation coefficients (ICC) for the overall sample and then for subgroups stratified by sex and time between survey administrations (≤ 15 d and > 15 d). The 15-d cut point is consistent with the time frame used in previous YRBS reliability assessments (3,4).
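For readers unfamiliar with the statistic, the Python sketch below computes a one-way random-effects ICC for paired test-retest responses. The text does not specify which ICC form was estimated in SAS, so the choice of the one-way model here is an assumption, and the function name is hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for (test, retest) pairs.

    ASSUMPTION: single-measure, one-way model with k = 2 administrations;
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW).
    """
    n = len(pairs)
    k = 2
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    subject_means = [(a + b) / k for a, b in pairs]
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum((a - m) ** 2 + (b - m) ** 2
                    for (a, b), m in zip(pairs, subject_means))
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Illustrative data: days reported (0-7) on the first and second survey.
print(icc_oneway([(1, 1), (3, 3), (5, 5)]))  # 1.0, identical responses
print(icc_oneway([(1, 3), (3, 1)]))          # -1.0, no between-student variance
```

Values near 1 indicate that students gave nearly the same number of days on both surveys relative to the spread between students; values near 0 indicate that within-student disagreement is as large as the differences between students.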

We used three statistical procedures to assess the validity of the YRBS items. First, we cross-tabulated the YRBS and Actigraph measures to assess the prevalence of concordant classification for meeting moderate and vigorous physical activity recommendations, for the overall sample and separately for girls and boys. Second, we estimated kappa coefficients for each comparison between the YRBS and Actigraph classification. Third, using data from these 2 × 2 tables, we calculated the sensitivity and specificity of the YRBS items for the four different Actigraph measures of moderate- and vigorous-intensity physical activity. Sensitivity was the probability of the YRBS items correctly classifying students as meeting recommendations, where the Actigraph data were considered the true prevalence (i.e., the criterion measure). Specificity was the probability of YRBS correctly classifying students as not meeting the recommended level of physical activity. All analyses were conducted with SAS.
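The quantities derived from each 2 × 2 table can be sketched in Python as follows, treating the Actigraph classification as the criterion. The function name and the cell counts in the example are hypothetical and are not study data; the sketch reports overall percent agreement alongside Cohen's kappa and does not guard against empty rows or columns.

```python
def two_by_two_metrics(tp, fp, fn, tn):
    """Validity statistics for a survey item against a criterion measure.

    tp: survey yes, criterion yes    fp: survey yes, criterion no
    fn: survey no,  criterion yes    tn: survey no,  criterion no
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    agreement = (tp + tn) / n
    # Agreement expected by chance, from the marginal proportions.
    expected = (((tp + fp) / n) * ((tp + fn) / n)
                + ((fn + tn) / n) * ((fp + tn) / n))
    kappa = (agreement - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "agreement": agreement,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only.
result = two_by_two_metrics(tp=20, fp=10, fn=30, tn=40)
for name, value in result.items():
    print(name, round(value, 2))
```

Kappa corrects percent agreement for the agreement expected from the marginal prevalences alone, which is why an item can show a sizable share of agreeing pairs and still have a kappa close to zero.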


As shown in Table 1, study participants were comparable with other sixth- and seventh-grade students participating in the 5-2-1 Go! intervention at the six schools. A slightly higher percentage of Asian students participated in the accelerometer study (P = 0.03), and a higher proportion of students in the validation study reported lower levels of participation in physical education classes (i.e., 1-2 d). No differences were seen in other demographic characteristics, weight status, self-reported moderate and vigorous physical activity derived from YRBS questions, or sports team participation.



Students accumulated an average of 77.7 ± 28.1 min·d−1 of moderate and 10.8 ± 10.5 min·d−1 of vigorous physical activity according to their Actigraph data (data not shown). Values for the 25th, 50th, 75th, and 100th percentiles were 58.3, 76.0, 96.4, and 156.1 min for moderate physical activity and 4.1, 8.3, 14.0, and 69.0 min for vigorous physical activity. For the 122 returned log sheets that were reviewed, 38 students reported a total of 75 instances where they removed the Actigraph before a specific moderate- or vigorous-intensity activity (e.g., swimming, soccer, baseball, cheerleading).

Test-retest reliability for YRBS items.

The overall ICC for the YRBS moderate and vigorous physical activity items were in the moderate range at 0.51 and 0.46, respectively (Table 2). Test-retest reliability tended to be higher for girls than boys. For both moderate and vigorous physical activity items, ICC were higher when the time between survey administrations was ≤ 15 d, compared with > 15 d. The reduction in magnitude of the ICC with a longer test-retest interval was more pronounced for moderate physical activity.



Validity of YRBS moderate physical activity item.

Overall, 21.6% of students met the recommended level of moderate physical activity according to the YRBS item (Fig. 1). On the basis of the four Actigraph measures, the proportion of students performing moderate physical activity for 30 min, ≥ 5 d·wk−1 was 90.4% (accumulated minutes), 66.4% (5-min bouts), 25.6% (10-min bouts), and 2.4% (sustained). The one statistically significant difference between boys and girls was found for the Actigraph 10-min-bout measure, where 36.7% of boys met the recommended level of physical activity compared with only 15.4% of girls (χ2 = 7.4, P = 0.007).



As shown in Table 3, the highest level of concordance (20.8%), but also discordance (69.6%), in terms of identifying students who performed moderate physical activity ≥ 5 d·wk−1, was achieved when the accumulated time definition was used to create the Actigraph measure. Overall, the percentage of discordant pairs ranged from 24.0% for the sustained Actigraph measure to 70.4% for the measure of accumulated minutes. Concordance for meeting the recommendation for moderate physical activity declined in a consistent fashion from accumulated (20.8%) to a 5-min bout (12.8%) to a 10-min bout (5.6%) to a sustained definition (0.0%) of activity. No students met the recommended level of moderate physical activity according to their Actigraph data when the criterion was a sustained 30 min of activity. Kappa coefficients between the four Actigraph measures and the YRBS measure ranged from −0.05 to 0.03 (data not shown). All of these values are considered poor according to Landis and Koch's adjectival rating system (14). The percent concordance differed for girls and boys, and the decline in percent concordance for meeting activity recommendations across Actigraph measures was steeper for girls than for boys.



In all cases, sensitivity of the YRBS moderate physical activity item was low (Table 3). Specificity ranged between 0.74 and 0.92 for the four Actigraph measures and was highest for the measure of accumulated minutes.

Validity of YRBS vigorous activity item.

As shown in Figure 2, about 85% of boys and 69% of girls reported engaging in vigorous physical activity ≥ 3 d·wk−1 (χ2 = 4.4, P = 0.04). A significantly higher proportion of girls (6.2%) than boys (0.0%) met the recommended level of vigorous activity on the basis of a sustained (≥ 20 min) definition of Actigraph minutes (χ2 = 3.81, P = 0.05). There were no statistically significant differences between boys and girls in the prevalence of vigorous physical activity on the basis of the other three Actigraph outcomes. Among the overall study sample, the prevalence of vigorous physical activity for the Actigraph measures ranged from a high of 22.4% (accumulated minutes criterion) to a low of 3.2% (20-min sustained criterion).



The highest level of concordance between the YRBS and Actigraph measures in terms of classifying students as meeting recommendations for vigorous physical activity occurred using the Actigraph measure of accumulated minutes (19.2%) (Table 4). The percentage of discordant pairs for the overall sample ranged from 60.8 to 75.2% for the four Actigraph measures. These discordant pairs primarily comprised students who reported performing vigorous activity ≥ 3 d·wk−1 on the YRBS item but did not attain this level of activity according to their accelerometer data. Similar to results for moderate physical activity, the kappa values for the vigorous physical activity measures were all low, ranging from −0.002 to 0.06 (data not shown). Overall, there were slight differences between girls and boys for concordance between the YRBS and Actigraph measures.



In contrast to the moderate physical activity item, sensitivity of the YRBS vigorous item was relatively high (above 0.80 for three Actigraph measures and 0.75 for the fourth). Conversely, specificity was low (ranging between 0.23 and 0.26). Sensitivity of the YRBS vigorous physical activity item was higher for boys than for girls on three of the four Actigraph measures.


We assessed the test-retest reliability and validity of two YRBS items for moderate and vigorous physical activity in a sample of middle school students. The purpose was to examine the utility of the YRBS items for determining prevalence of physical activity in this population. Overall, these items demonstrate moderate reliability but poor validity for male and female students in the sixth and seventh grades.

Key findings were that the YRBS moderate physical activity item grossly underestimated the proportion of students who performed moderate activity ≥ 30 min·d−1, ≥ 5 d·wk−1, as indicated by students' accelerometer data; that is, the item has poor sensitivity. Sensitivity of the YRBS moderate item was not improved by varying how the objective Actigraph measure was defined. In contrast, the YRBS vigorous physical activity question overestimated the proportion of students performing vigorous activity ≥ 20 min·d−1, ≥ 3 d·wk−1. This was reflected in the relatively high sensitivity and low specificity of the item, irrespective of whether the Actigraph outcome was defined as accumulated minutes in a day or duration obtained through bouts. A higher proportion of boys than girls self-reported engaging in vigorous physical activity at recommended levels (by about 16%), whereas the Actigraph data indicated a statistically significant gender difference in vigorous activity only when a sustained definition of activity was used, and, in this case, girls met the recommendation, not boys.

The present study's estimates of daily moderate and vigorous physical activity from accelerometry (e.g., about 78 and 11 min, respectively) are somewhat consistent with two other recent studies reporting on samples of MA youth (6,20) using objective data. For example, Cradock and colleagues found that middle school students monitored with TriTrac accelerometers had daily estimates of moderate and vigorous physical activity of approximately 59 and 7 min, respectively (6). Also, the overall proportions of students in this study meeting recommendations for moderate physical activity on the basis of accumulated minutes of Actigraph data (about 88%) and for vigorous activity on the basis of bouts ≥ 20 min (about 3%) are quite similar to estimates Pate et al. (20) reported for seventh- to ninth-grade students in MA on the basis of Actigraph measurement (93% and 1-3%, respectively).

Among youth, validation studies of self-report measures using accelerometer data are limited. In a recent review, Kohl and colleagues (12) have identified one physical activity self-report validation study in youth (11) that used the Actigraph accelerometer as a criterion measure. In that study, investigators found low to moderate correlations (r = 0.03-0.51) with different questionnaire measures of physical activity and Actigraph data among 7- to 15-yr-old youth. We are aware of only one published study that has assessed the validity of items similar to the YRBS physical activity items using Actigraph data as the validation measure (21). The authors of that study found a modest but statistically significant correlation (r = 0.36) between Actigraph data and self-reported vigorous physical activity for the previous 7 d, but they did not find a significant correlation for a moderate physical activity item (r = 0.26). Meaningful comparisons between our findings and these studies are difficult to make because we did not use a correlational approach to assessing validity. However, one consistent conclusion for all of these studies, including ours, is that self-reported physical activity items that assess intensity and duration have, at best, modest validity among youth.

Comparable assessments of the relative validity of self-report items in adults have indicated that brief physical activity surveillance items can be used to classify activity levels at the population level. Using Actigraph data as one criterion measure, Marshall and colleagues (15) have found that percentage agreement (between criterion and self-report) was 67 and 74%, respectively, for items that measured moderate and vigorous activity; these levels of agreement are substantially higher than those found in the present study. Matthews et al. (17) have found similar correct classification for two close-ended surveillance items measuring moderate and vigorous physical activity (70 and 81%, respectively).

Overall, we found moderate test-retest reliability for both the moderate- and vigorous-intensity YRBS items. Kohl and colleagues (12) have reviewed a number of studies that assessed test-retest reliability of physical activity self-report measures in youth. Studies with comparable age groups and test-retest time periods generally had higher intraclass and Pearson correlation coefficients (ranging from 0.70 to 0.96) than we found. Prochaska and colleagues (21) also found test-retest ICC for items that assessed vigorous and moderate physical activity during 7 d that were higher (0.66 and 0.64, respectively) than those reported here. However, subjects in that study were, on average, about 2 yr older than our study population. Recently, in an Australian study assessing test-retest reliability of an adolescent physical activity survey among eighth graders, ICC were in the range of 0.30-0.64 for calculated energy expenditure, with most below 0.50 (1).

Our study has several limitations that may have affected our results. First, because of logistic constraints of participating schools, the interval between surveys exceeded the 2-wk protocol in 28% of students. Thus, our test-retest reliability findings could reflect modest item reliability or an underestimate of reliability attributable to actual changes in moderate or vigorous physical activity from the first survey administration to the second (12). In fact, the latter explanation is relevant to all subjects in this study, even those with retest periods under 2 wk, because they were reporting on physical activity for two different time periods (because the YRBS question asks about the previous 7 d), and activity patterns could change. Data were collected during the spring in Massachusetts, when both temperature and precipitation can be quite variable from day to day. Both of these environmental factors could affect the amount of moderate and vigorous physical activity being performed by students from 1 wk to the next.

Several factors could affect the accuracy of accelerometer estimates of moderate and vigorous physical activity. In both adult and youth populations, there continues to be active research (23) and a lack of consensus as to which accelerometer regression equation to use to predict time spent at different intensities of activity. We used a formula developed by Freedson and colleagues (9), consistent with the methods recently employed by other researchers (26). Nevertheless, an overestimation of moderate-intensity physical activity or an underestimation of vigorous activity (e.g., occurring if count levels representing vigorous activity were classified as moderate intensity because of a faulty algorithm) from the Actigraph would have resulted in a less favorable assessment of YRBS item validity. Although use of a 1-min epoch in the present study was consistent with other studies in youth, use of a shorter sampling interval (e.g., 30 s) likely would have resulted in higher estimates of accumulated moderate and vigorous activity. We do not believe that this increase in Actigraph estimates would have changed the overall conclusion that the YRBS moderate and vigorous items underestimate and overestimate activity, respectively. We conclude this because, according to self-reports, the prevalence of students meeting recommendations was very low for moderate activity (22%) and high for vigorous activity (77%).

Systematic bias introduced by variable subject compliance during different types of physical activities is another potential concern. Overall, compliance with wearing the accelerometers in our study seemed acceptable. However, as noted earlier, a review of log sheets indicated that 38 students (30% of our sample) reported 75 instances where they removed the monitor before playing sports or some other activity of mostly moderate or vigorous intensity. Preferential Actigraph removal during vigorous physical activity would erroneously reduce the concordance between Actigraph and YRBS vigorous activity measures. However, given that only about a third of the students reported removing the monitor, and that those students did so an average of fewer than two times during the monitoring period (which averaged 6 d), it is reasonable to assume that removal of monitors did not substantially bias our findings. Another recent study using TriTrac accelerometers in similarly aged youth did not find that missing accelerometer data biased overall physical activity estimates (6).

Finally, certain moderate- to vigorous-intensity physical activities such as walking on an incline, stair climbing, or carrying/pushing objects may be underestimated by uniaxial monitors mounted at the hip (23), leading to a possible underestimation of the prevalence of these activities by the Actigraph. In our study, self-reported measures of moderate-intensity activities from the YRBS were consistently lower than those measured by the Actigraph, whereas self-reported vigorous activities were substantially higher than those measured by the Actigraph. Given the magnitude of the differences between self-report and objective measures for vigorous activity, we do not believe an upward adjustment in the vigorous physical activity estimates from the Actigraphs would have substantially changed our findings, because the overall levels of objectively measured vigorous physical activity were extremely low (e.g., averaging only 10 min·d−1). In summary, we would agree with the conclusion of Brener and colleagues (2) that accelerometers may not provide a "gold standard" validation measure for self-reported physical activity and might miss certain activities in which youth engage. Nevertheless, despite their limitations, we believe that accelerometers provide useful objective information about important elements of the free-living activity patterns in youth and, therefore, can provide valuable insight into the measurement properties of physical activity self-reports.

The YRBS items for moderate and vigorous physical activity that we tested have moderate reliability and poor validity among middle school students when compared with data obtained from accelerometry. Neither item demonstrated validity in terms of its ability to correctly classify middle school students with respect to meeting current physical activity recommendations.

For surveillance among this population of preadolescent youth, objective monitoring with accelerometers seems to be the preferable measurement methodology. At a minimum, improved self-report items are needed for middle school youth. In this age group, items may perform better if they are limited to asking youth about frequency of specific physical activities; this should be examined in further studies. The CDC has continued to update the YRBS self-report physical activity measures for middle school youth. The 2007 middle school YRBS includes one item tailored to the current 60-min physical activity recommendation (29) that asks students to "add up all the time" they spend in moderate to vigorous physical activity during each of the past 7 d.

The YRBS items we evaluated in middle school youth had the potential to result in misguided allocation of public health resources and physical activity promotion efforts. For example, objective data from this study and others indicate that middle school youth typically obtain insufficient amounts of vigorous-intensity physical activity but that they are engaged in levels of moderate activity that meet the Healthy People 2010 recommendations. Therefore, it is likely that public health messages and intervention strategies need to concentrate on creating opportunities for middle school youth to include more vigorous-intensity physical activity in their daily lives. Validation of the new YRBS question for middle school youth is still needed, but it has the potential to improve the accuracy of surveillance estimates.

The authors acknowledge Maria Bettencourt, Vanessa Cavallaro, Kathleen Grattan, Solomon Mezgebu, Wee Lock Ooi, and Julie Robarts at the Massachusetts Department of Public Health for their participation in the design and implementation of the 5-2-1 Go! study. This research was supported through a subcontract from the Massachusetts Department of Public Health under a CDC cooperative agreement with MDPH, and the Centers for Disease Control and Prevention (Prevention Research Centers Cooperative Agreement U48/CCU115807). The authors are not employees of the Massachusetts Department of Public Health, which is not responsible for the accuracy of the reported results or the views expressed by the authors.

1. Booth, M. L., A. D. Okely, T. N. Chey, and A. Bauman. The reliability and validity of the Adolescent Physical Activity Recall Questionnaire. Med. Sci. Sports Exerc. 34:1986-1995, 2002.
2. Brener, N. D., J. O. Billy, and W. R. Grady. Assessment of factors affecting the validity of self-reported health-risk behavior among adolescents: evidence from the scientific literature. J. Adolesc. Health 33:436-457, 2003.
3. Brener, N. D., J. L. Collins, L. Kann, C. W. Warren, and B. I. Williams. Reliability of the youth risk behavior survey questionnaire. Am. J. Epidemiol. 141:575-580, 1995.
4. Brener, N. D., L. Kann, T. McManus, S. A. Kinchen, E. C. Sundberg, and J. G. Ross. Reliability of the 1999 youth risk behavior survey questionnaire. J. Adolesc. Health 31:336-342, 2002.
5. Centers for Disease Control and Prevention. Surveillance Summaries, May 21, 2004. MMWR 2004:53(No. SS-2).
6. Cradock, A. L., J. L. Wiecha, K. E. Peterson, A. M. Sobol, G. A. Colditz, and S. L. Gortmaker. Youth recall and TriTrac accelerometer estimates of physical activity levels. Med. Sci. Sports Exerc. 36:525-532, 2004.
7. Freedson, P. S., E. Melanson, and J. Sirard. Calibration of the Computer Science and Applications, Inc. accelerometer. Med. Sci. Sports Exerc. 30:777-781, 1998.
8. Freedson, P. S., and K. Miller. Objective monitoring of physical activity using motion sensors and heart rate. Res. Q. Exerc. Sport 71:S21-S29, 2000.
9. Freedson, P. S., J. Sirard, E. Debold, et al. Calibration of the Computer Science and Applications, Inc. (CSA) accelerometer. Med. Sci. Sports Exerc. 29(Suppl.):S45, 1997.
10. Grunbaum, J. A., L. Kann, S. A. Kinchen, et al. Youth risk behavior surveillance-United States, 2001. MMWR Surveill. Summ. 51:1-62, 2002.
11. Janz, K. F., J. Witt, and L. T. Mahoney. The stability of children's physical activity as measured by accelerometry and self-report. Med. Sci. Sports Exerc. 27:1326-1332, 1995.
12. Kohl, H. W. III, J. E. Fulton, and C. J. Caspersen. Assessment of physical activity among children and adolescents: a review and synthesis. Prev. Med. 31:S54-S76, 2000.
13. Kolbe, L. J., L. Kann, and J. L. Collins. Overview of the youth risk behavior surveillance system. Public Health Rep. 108(Suppl 1): 2-10, 1993.
14. Landis, J. R., and G. G. Koch. The measurement of observer agreement for categorical data. Biometrics 33:159-174, 1977.
15. Marshall, A. L., B. J. Smith, A. E. Bauman, and S. Kaur. Reliability and validity of a brief physical activity assessment for use by family doctors. Br. J. Sports Med. 39:294-297, 2005.
16. Massachusetts Department of Education. 2001 Massachusetts Youth Risk Behavior Survey Results. Boston, MA: Massachusetts Department of Education, pp. 1-126, 2002.
17. Matthews, C. E., B. E. Ainsworth, C. Hanby, et al. Development and testing of a short physical activity recall questionnaire. Med. Sci. Sports Exerc. 37:986-994, 2005.
18. Melanson, E. L., Jr., and P. S. Freedson. Validity of the Computer Science and Applications, Inc. (CSA) activity monitor. Med. Sci. Sports Exerc. 27:934-940, 1995.
19. Nichols, J. F., C. G. Morgan, L. E. Chabot, J. F. Sallis, and K. J. Calfas. Assessment of physical activity with the Computer Science and Applications, Inc., accelerometer: laboratory versus field validation. Res. Q. Exerc. Sport 71:36-43, 2000.
20. Pate, R. R., P. S. Freedson, J. F. Sallis, et al. Compliance with physical activity guidelines: prevalence in a population of children and youth. Ann. Epidemiol. 12:303-308, 2002.
21. Prochaska, J. J., J. F. Sallis, and B. Long. A physical activity screening measure for use with adolescents in primary care. Arch. Pediatr. Adolesc. Med. 155:554-559, 2001.
22. Rosenbloom, A. L., J. R. Joe, R. S. Young, and W. E. Winter. Emerging epidemic of type 2 diabetes in youth. Diabetes Care 22:345-354, 1999.
23. Strath, S. J., D. R. Bassett Jr., and A. M. Swartz. Comparison of MTI accelerometer cut-points for predicting time spent in physical activity. Int. J. Sports Med. 24:298-303, 2003.
24. Strauss, R. S., and H. A. Pollack. Epidemic increase in childhood overweight, 1986-1998. JAMA 286:2845-2848, 2001.
25. Troiano, R. P., K. M. Flegal, R. J. Kuczmarski, S. M. Campbell, and C. L. Johnson. Overweight prevalence and trends for children and adolescents. The National Health and Nutrition Examination Surveys, 1963 to 1991. Arch. Pediatr. Adolesc. Med. 149:1085-1091, 1995.
26. Trost, S. G., R. R. Pate, J. F. Sallis, et al. Age and gender differences in objectively measured physical activity in youth. Med. Sci. Sports Exerc. 34:350-355, 2002.
27. U.S. Department of Health and Human Services. Healthy People 2010. Understanding and Improving Health. 2nd ed. Washington, DC: U.S. Department of Health and Human Services, pp. 22-3-22-32, 2000.
28. U.S. Department of Health and Human Services. Physical Activity and Health: A Report of the Surgeon General. Atlanta, GA: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion, pp. 61-172, 1996.
29. U.S. Department of Health and Human Services and U.S. Department of Agriculture. Dietary Guidelines for Americans, 2005, 6th Ed. Washington DC: U.S. Government Printing Office, pp. 19-22, 2005.
30. U.S. Department of Health and Human Services, U.S. Department of Education. Promoting better health for young people through physical activity and sports, a report to the President from the Secretary of Health and Human Services and the Secretary of Education. Washington, DC: U.S. Department of Health and Human Services, U.S. Department of Education, pp. 1-38, 2000.
31. Westerterp, K. R. Physical activity assessment with accelerometers. Int. J. Obes. Relat. Metab. Disord. 23(Suppl 3):S45-S49, 1999.


©2007 The American College of Sports Medicine