
BASIC SCIENCES: Epidemiology

Reliability and Validity of the Instrument Used in BRFSS to Assess Physical Activity


Medicine & Science in Sports & Exercise: August 2007 - Volume 39 - Issue 8 - p 1267-1274
doi: 10.1249/mss.0b013e3180618bbe


The target of one of the Healthy People 2010 (HP 2010) objectives is that 50% of the U.S. population engage in regular physical activity (24). Progress toward meeting objectives for physical activity in HP 2010 is measured on the national level using the National Health Interview Survey, and state-level progress is measured with the Behavioral Risk Factor Surveillance System (BRFSS) (24).

The BRFSS is a survey of health behaviors and conditions that is administered yearly by telephone in all 50 states and the District of Columbia, Guam, Puerto Rico, and the U.S. Virgin Islands (8). Before 2001, the BRFSS physical activity questions consisted of two questions: "What type of physical activity or exercise did you spend the most time doing during the past month?" and "What other type of physical activity gave you the next most exercise during the past month?" (4). Respondents answered an activity for each question, and the interviewer coded that activity using a previously developed list of 56 leisure-time sports and recreation activities (4). The current questions on physical activity were implemented in 2001 and were designed to measure leisure-time, household, and transportation activities in which a respondent participates in a usual week, with additional questions for occupational activity and detail about walking and strengthening activities (17). These questions ask about frequency (d·wk−1) and duration (min·d−1) of moderate- and vigorous-intensity physical activity and walking in a usual week, and they allow the respondent to describe the intensity (moderate or vigorous) of their activities (4). Unlike the previous BRFSS questions, the current surveillance questions assess physical activity levels recommended by the 1995 CDC/ACSM recommendations (19).

Little is known about the reliability and validity of the data obtained from the questions used before 2001. Reliability (measured by the kappa (κ)) ranged from 0.50 to 0.77 for the question on sedentary lifestyle (20,21), from −0.07 to 0.64 for regular aerobic exercise (21), and from 0.26 to 0.30 for regular moderate- or vigorous-intensity activity (6,14). Although validity was not assessed for data from the original BRFSS questions, the validity of data from similar survey questions (e.g., Minnesota Leisure-Time Physical Activity Survey and Harvard Alumni Activity Survey) ranged from correlations of 0.30 to 0.50 (7,18,25). Several reports of reliability and validity for data from the current BRFSS questions have been published. In a national study conducted by Evenson et al. (12) among women from diverse racial and ethnic groups, the test-retest reliability from the current BRFSS questions (using intraclass correlation coefficients (ICC)) was ICC = 0.69 to determine level of physical activity (met recommendations, insufficiently active, inactive). A larger reliability study found κ = 0.40 for classifying respondents into activity levels (active, insufficient, and inactive), ICC = 0.39 for vigorous-intensity activity, and ICC = 0.44 for moderate-intensity activity (5). A validity study comparing the recommended level of physical activity with an objective measure that combined heart rate monitoring and accelerometry using individually calibrated patterns of energy expenditure found validity of κ = 0.61 (22).

The development of new surveillance questions is a long-term process requiring both qualitative (e.g., focus groups, cognitive testing) and quantitative (e.g., reliability and validity testing) assessment. Developing questions that allow accurate reporting of physical activity is complicated because respondents have difficulty in recalling the details of participation in the various types of activity, including the intensity, duration, and frequency of activity. In addition, failure to remember during the time between the activity and the survey, the tendency to give socially desirable answers, and personal characteristics (such as level of education) may differentially affect a person's ability to provide reliable, valid responses (11,14). The 2001 BRFSS questions were developed using two studies that employed cognitive testing (1,14). After each of these developmental studies, the questions were modified to clarify meaning and improve comprehension. For example, feedback from focus groups prompted researchers to reverse the order of questions from that used in earlier versions, thus placing questions on moderate intensity before those on vigorous intensity.

The physical activity questions currently used in the BRFSS were tested for reliability and validity in the BRFSS Physical Activity Study (BPAS). The purpose of this paper is to report the findings for reliability and validity of the questions that were developed for the BRFSS.


Study design.

A convenience sample of 60 participants (30 men and 30 women) was recruited from September 2000 to May 2001 from the campus of the University of South Carolina (USC) and the Columbia, SC metropolitan area. Participants were recruited through public service announcements and were paid a $25 stipend. Before participating, each person read and signed an informed consent form, which had been approved by the USC institutional review board for protection of human subjects.

Participants were observed for 22 d and visited the study center four times. During visit 1 (day 1), the participants completed the informed consent and a demographic form to record their age, sex, race/ethnicity, and educational attainment. They were measured for height and weight using a wall-mounted tape measure and a portable Seca Model 770 scale (Shorr Productions, Olney, MD). Visit 2 was on day 8, with participants receiving a physical activity log, an accelerometer, and a pedometer. Participants were instructed to wear the accelerometer and pedometer concurrently from day 9 to day 15 and to complete the log at the end of each day during this time frame. The pedometer steps accumulated each day were self-recorded in the log. On visit 3 (day 16), the participants returned the physical activity log, accelerometer, and pedometer. Finally, on day 22 (visit 4), the participants returned to the study center to receive their $25 for participating. In addition to the activities described, participants were surveyed about their physical activity three times by telephone between the hours of 9 a.m. and 7 p.m. Participants were called twice during the first week (surveys 1 and 2) and once during the third week (survey 3).

Physical activity survey.

The physical activity survey covered a variety of activities: occupational activity (N = 1), walking (N = 3), muscle-strengthening (N = 1), and moderate- (N = 3) and vigorous-intensity (N = 3) physical activity. (Survey questions are available from the author.) The final version of these questions can be found online (4). Five standards of physical activity were created from the survey. The standard for walking, including walking for occupational reasons, recreation, exercise, or transportation, was defined as walking ≥ 30 min·d−1 on ≥ 5 d·wk−1. The standard for strengthening was defined as any muscle-strengthening activity on ≥ 2 d·wk−1. Meeting the standard for moderate-intensity activity was defined as engaging in such activity ≥ 30 min·d−1 on ≥ 5 d·wk−1, and meeting the standard for vigorous-intensity activity was defined as engaging in this activity ≥ 20 min·d−1 on ≥ 3 d·wk−1. The fifth standard, which had three levels, was used to classify participants in terms of recommended activity: (i) recommended (participants met the criteria for moderate- or vigorous-intensity activity), (ii) irregular (engaged in ≥ 10 min·wk−1 of activity but at less than recommended levels), or (iii) inactive (< 10 min·wk−1 of moderate- or vigorous-intensity activity). Because only one participant was classified as inactive by the survey, we collapsed the classification into two levels (recommended and less than recommended) for analyses. The standards for recommended, vigorous, and strengthening physical activity were created to correspond to the HP 2010 physical activity objectives 22-2 through 22-4, respectively (24).
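The five standards defined above can be expressed as simple threshold rules. The following Python sketch is illustrative only and is not the authors' analysis code: the `WeeklyReport` container and its field names are hypothetical, and weekly minutes are approximated as days multiplied by minutes per day. Exactly 10 min·wk−1 is treated as irregular, which is an assumption about the boundary.

```python
from dataclasses import dataclass


@dataclass
class WeeklyReport:
    """Hypothetical container for one respondent's usual-week answers."""
    moderate_days: int = 0       # d/wk of moderate-intensity activity
    moderate_minutes: int = 0    # min/d on those days
    vigorous_days: int = 0       # d/wk of vigorous-intensity activity
    vigorous_minutes: int = 0    # min/d on those days
    walking_days: int = 0        # d/wk of walking
    walking_minutes: int = 0     # min/d on those days
    strengthening_days: int = 0  # d/wk of muscle-strengthening activity


def meets_walking(r: WeeklyReport) -> bool:
    # Walking standard: >= 30 min/d on >= 5 d/wk.
    return r.walking_minutes >= 30 and r.walking_days >= 5


def meets_strengthening(r: WeeklyReport) -> bool:
    # Strengthening standard: any such activity on >= 2 d/wk.
    return r.strengthening_days >= 2


def meets_moderate(r: WeeklyReport) -> bool:
    # Moderate standard: >= 30 min/d on >= 5 d/wk.
    return r.moderate_minutes >= 30 and r.moderate_days >= 5


def meets_vigorous(r: WeeklyReport) -> bool:
    # Vigorous standard: >= 20 min/d on >= 3 d/wk.
    return r.vigorous_minutes >= 20 and r.vigorous_days >= 3


def activity_level(r: WeeklyReport) -> str:
    """Three-level recommended-activity classification."""
    if meets_moderate(r) or meets_vigorous(r):
        return "recommended"
    # Assumption: weekly minutes are days x minutes per day.
    weekly = (r.moderate_days * r.moderate_minutes
              + r.vigorous_days * r.vigorous_minutes)
    # Assumption: exactly 10 min/wk counts as irregular.
    return "irregular" if weekly >= 10 else "inactive"


def two_level(r: WeeklyReport) -> str:
    """Collapsed classification used for the analyses."""
    if activity_level(r) == "recommended":
        return "recommended"
    return "less than recommended"
```

For example, a respondent reporting vigorous activity for 20 min·d−1 on 3 d·wk−1 and nothing else would be classified as recommended, whereas one reporting moderate activity for 20 min·d−1 on 2 d·wk−1 would be classified as irregular.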

Physical activity log.

The self-administered physical activity log, which was designed to track daily activities, consisted of 43 moderate- or vigorous-intensity items organized into six categories of activity: household (4 items), transportation (2 items), occupation (6 items), conditioning (11 items), sports (12 items), and leisure (8 items). Additional spaces were provided for participants to write in any activities not on the list. Participants were asked to complete the log at the end of the day, by circling the activities they performed that day (if lasting ≥ 10 min) and recording the total time (hours and minutes) they spent in each activity. The log has been used in a previous study to validate participation levels in physical activity (log is available on request) (1).

Standards of physical activity were derived from the log for walking, strengthening, moderate, vigorous, and recommended activity using the same definitions as in the physical activity survey. Activities from the log were classified as moderate-intensity (three to six metabolic equivalents (METs)) or vigorous-intensity (> 6 METs) according to the compendium of physical activities (2). The log included activity choices for sports and recreation, household, strengthening, walking for exercise, and transportation. In addition, the entries on walking in the log were used to calculate the average days per week and minutes per day for walking.

Accelerometer.

Participants wore ActiGraph accelerometers (model 7164, ActiGraph, LLC, Fort Walton Beach, FL) to monitor physical activity during the 7-d monitoring period. After being shown how to use the accelerometer, participants wore the monitors in a carrying pouch affixed to a belt worn on the waist over the right anterior axillary line. Data were collected each day during waking hours except while bathing or swimming until participants returned the devices on day 8 or 9 of data collection; data were analyzed for the first 7 d. Moderate- and vigorous-intensity bouts of activity were defined as periods of time in which at least 8 of 10 consecutive minutes were of a single intensity (moderate or vigorous). Two minutes of a different intensity (light, moderate, vigorous) were allowed within every 10-min period. Bouts were separated by at least three consecutive minutes of a different intensity. Data from minutes with more than 20,000 counts per minute were examined as potential outliers. Data from two participants were excluded because the accelerometers malfunctioned during several days of the week. Bouts of at least 10 min were classified using the cut points (moderate: 2020-5998 counts per minute; vigorous: ≥ 5999 counts per minute) that were selected for initial reporting of NHANES accelerometer data (23). Data from the accelerometer were classified using the same standards as the survey data for moderate and vigorous activity and for meeting recommendations.
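The cut points and the 8-of-10-minute rule described above can be illustrated with a short Python sketch. It is a simplified reading of the procedure, not the software used in the study: it labels each minute by the stated cut points and counts every minute that falls inside at least one qualifying 10-min window, including up to 2 interrupting minutes. It does not implement the 3-min separation rule or the outlier screening, and the function names are hypothetical.

```python
MODERATE_MIN = 2020   # counts per minute, lower bound for moderate intensity
VIGOROUS_MIN = 5999   # counts per minute, lower bound for vigorous intensity


def minute_intensity(counts):
    """Label one minute of accelerometer counts using the stated cut points."""
    if counts >= VIGOROUS_MIN:
        return "vigorous"
    if counts >= MODERATE_MIN:
        return "moderate"
    return "light"


def bout_minutes(counts_per_minute, intensity, window=10, required=8):
    """Count minutes lying inside any qualifying window.

    A window of `window` consecutive minutes qualifies when at least
    `required` of them are at the requested intensity. Assumption: all
    minutes of a qualifying window are credited to the bout.
    """
    at_intensity = [minute_intensity(c) == intensity for c in counts_per_minute]
    covered = [False] * len(at_intensity)
    for start in range(len(at_intensity) - window + 1):
        if sum(at_intensity[start:start + window]) >= required:
            for i in range(start, start + window):
                covered[i] = True
    return sum(covered)


# Hypothetical record: 8 moderate minutes, then 12 light minutes.
example = [3000] * 8 + [100] * 12
print(bout_minutes(example, "moderate"))  # minutes credited to moderate bouts
```

Summing the credited minutes over the 7-d monitoring period would give a weekly total that could then be compared against the same standards as the survey data.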

Pedometer.

To monitor their ambulatory activity, participants wore a Yamax Digiwalker (model SW-200, Yamax, Tokyo, Japan) pedometer for seven consecutive days during week 2 during waking hours, except while bathing or swimming. The pedometer was attached to the waistband on the hip opposite the ActiGraph. Participants recorded the time and the number of daily steps taken, and they reset the pedometer to zero at the end of each day.

Statistical analysis.

Descriptive statistics for the age and racial distributions of the study participants, and for each of the physical activity standards, were reported. Reliability and validity for walking, strengthening, moderate-intensity, vigorous-intensity, and recommended physical activity were calculated. Test-retest reliability was calculated using Cohen's κ (9,16) with 95% confidence intervals (95% CI) to compare results from the first administration of the survey with the results from its second and third administrations. To determine whether the surveys accurately measured the behavior of interest, several validity tests were performed on these data. Cohen's κ (9,16) was used to evaluate how strongly the physical activity measures agreed between the log and the first and third telephone surveys, and between the accelerometer and the first and third telephone surveys. Kappa statistics are described as almost perfect (κ = 0.81-1.0), substantial (0.61-0.80), moderate (0.41-0.60), fair (0.21-0.40), and poor (0.0-0.20) (16). Spearman's rank-order correlations were also calculated, but because the results were the same as the κ statistics, they are not shown in the tables. Continuous measures of minutes per week for moderate- and vigorous-intensity physical activity and walking on the surveys, log, and accelerometer were compared using Pearson correlation coefficients and intraclass correlation coefficients (ICC). Mean pedometer steps among those meeting the walking standard of at least 30 min·d−1 on at least 5 d·wk−1 were compared with those not meeting the walking standard using generalized linear models, with statistical significance assessed at the P ≤ 0.05 level. Analyses were performed using SAS version 8.2 (SAS Institute, Cary, NC) and SPSS 13 (SPSS Inc, Chicago, IL).
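As an illustration of the agreement statistic described above, the following Python sketch computes Cohen's κ for two sets of categorical classifications (for example, met or did not meet a standard on two surveys) and attaches the descriptive label from the scale cited in the text. It is a minimal sketch rather than the SAS or SPSS procedure used in the study: confidence intervals are omitted, values are rounded to two decimals before labeling, and labeling negative values as poor is an assumption.

```python
def cohens_kappa(first, second):
    """Cohen's kappa for two equal-length sequences of category labels."""
    first, second = list(first), list(second)
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(first)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal proportions.
    categories = set(first) | set(second)
    expected = sum((first.count(c) / n) * (second.count(c) / n)
                   for c in categories)
    if expected == 1:
        return float("nan")  # kappa is undefined when only one category occurs
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Descriptive label using the bands quoted in the text (16)."""
    k = round(kappa, 2)
    if k > 0.80:
        return "almost perfect"
    if k > 0.60:
        return "substantial"
    if k > 0.40:
        return "moderate"
    if k > 0.20:
        return "fair"
    return "poor"  # assumption: negative values are also labeled poor


# Hypothetical data: 1 = met the standard, 0 = did not.
survey_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
survey_2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
kappa = cohens_kappa(survey_1, survey_2)
print(round(kappa, 2), agreement_label(kappa))
```

In this hypothetical example the two surveys agree on 8 of 10 respondents and chance agreement is 0.5, so κ = (0.8 − 0.5) / (1 − 0.5) = 0.6, which falls in the moderate band.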


The 60 participants had a mean ± SD age of 44.5 ± 15.7 yr; 85% were white (N = 51), 15% were African American (N = 9), and 95% (N = 57) had more than a high school education (Table 1). In survey 1, 42% of participants met the standard for moderate-intensity activity, 57% met the standard for vigorous-intensity activity, and 73% met the standard for recommended activity (Table 2). The median time spent in moderate-intensity physical activity was 180 min·wk−1 on survey 1, 154 min·wk−1 on the log, and 106 min·wk−1 on the accelerometer. The median time spent in vigorous-intensity physical activity was 113 min·wk−1 on survey 1, 32 min·wk−1 on the log, and 60 min·wk−1 on the accelerometer. On survey 1, just under half (48%) of the 60 participants reported walking ≥ 30 min·d−1 on ≥ 5 d·wk−1, and 67% reported strengthening activities on ≥ 2 d·wk−1.

Table 1. Descriptive characteristics of Behavioral Risk Factor Surveillance System Physical Activity Study participants.

Table 2. Summary of physical activity in the Behavioral Risk Factor Surveillance System Physical Activity Study.

Test-retest reliability between the first and second surveys (taken 1-5 d apart, mean = 3.0 days) across all physical activities was higher than reliability between the first and third surveys (taken 10-19 d apart, Table 3). For the retest that occurred within 5 d, the agreement was κ = 0.53 for moderate intensity, κ = 0.86 for vigorous intensity, and κ = 0.84 for the two-level recommended standard. Agreement for the walking standard was κ = 0.56, and agreement for the strengthening standard was κ = 0.92. Significant differences were seen in agreement (data not shown) between men and women in comparing survey 1 with survey 2 for moderate-intensity activity (men: κ = 0.19, 95% CI: 0.0-0.54; women: κ = 0.86, 95% CI: 0.67-1.0) and vigorous-intensity activity (men: κ = 0.72; 95% CI: 0.45-0.97; women: κ = 1.0; 95% CI: 1.0-1.0). Differences between men and women were not significant for any other comparison.

Table 3. Reliability of physical activity.

Validity (κ) as measured by comparing results from the survey with those from the accelerometer for the moderate-intensity, vigorous-intensity, and recommended activity measures is shown in Table 4. On survey 1, κ = 0.31 for moderate-intensity activity, κ = 0.17 for vigorous-intensity activity, and κ = 0.19 for recommended activity. There were no significant differences by sex.

Table 4. Criterion validity of physical activity measures comparing the Actigraph accelerometer with the survey (N = 55).

When the physical activity log was compared with the survey, the κ values for validity for vigorous and recommended activity (Table 5) were generally higher than those from comparing the survey with the accelerometer. Agreement was fair to poor for the moderate-intensity standard, with κ = 0.07 using survey 1 and κ = 0.25 using survey 3. For vigorous-intensity activity, κ = 0.51 for survey 1. Also on survey 1, κ = 0.40 for recommended activity, κ = 0.19 for walking, and κ = 0.40 for strengthening. No significant differences in agreement by gender were noted. Validity also was calculated using the second survey, but results are not shown because they were very similar to those for the first survey.

Table 5. Concurrent validity of physical activity measures comparing the physical activity log with the survey.

Correlation coefficients of time spent in activity reported on the survey as compared with the log, accelerometer, and pedometer are reported in Table 6. Correlations for moderate-intensity activity were always lower than for vigorous-intensity activity. Correlations between mean pedometer steps per day and mean minutes per week on the surveys ranged from ρ = 0.03 to 0.28 for moderate-intensity physical activity and from ρ = 0.27 to 0.41 for vigorous-intensity physical activity (Table 6). Time spent in moderate and vigorous activity within each survey had low correlations (ICC ρ ≤ 0.11).

Table 6. Pearson correlation coefficients for time spent (min·wk−1) in moderate- and vigorous-intensity physical activity comparing three surveys against the physical activity log, Actigraph accelerometer, and Digiwalker pedometer.

Median walking minutes reported on the survey were lower than reported on the log (Table 2), and correlation coefficients were < 0.10 (data not shown). The mean daily step count was 6357 for people walking less than the standard amount (≥ 30 min·d−1 on ≥ 5 d·wk−1) and 10,506 (F = 24.7, P < 0.0001) for people meeting the standard. Those who met physical activity recommendations and walked ≥ 30 min·d−1 on ≥ 5 d·wk−1 reported walking an average of 642.5 min·wk−1 (SD = 726.6) on survey 3. In contrast, those who met physical activity recommendations and walked less than the standard amount reported walking an average of 259.4 min·wk−1 (SD = 302.3) on survey 3. The average number of days walked per week was 5.4 (SD = 1.7) on the log and 4.9 (SD = 2.8) on the survey; ρ = 0.12 (P = 0.36). The correlation between minutes per week walking on survey 2 and pedometer steps was ρ = 0.23 (P = 0.08).


This study examined the evidence for reliability of scores and validity of inferences of the physical activity questions implemented in the BRFSS in 2001. In this analysis of the BRFSS physical activity questions, evidence for test-retest reliability was fair to moderate for moderate-intensity activity (κ = 0.35-0.53) and substantial for vigorous-intensity and recommended activity (κ = 0.67-0.86). For the walking measure, evidence for reliability was fair to moderate (κ = 0.34-0.56), and evidence for the reliability for the strengthening measure was excellent (κ = 0.85-0.92). Interestingly, the reliability scores for moderate- and vigorous-intensity activity were consistent for women but not for men. These reliability results are similar to the reliability results of these BRFSS questions tested among a racially and ethnically diverse sample of women (ICC = 0.5-0.8) (12) and among a diverse sample of men and women (ICC = 0.44) (13); they also are similar to the results of a study of the BRFSS in an Australian population (κ = 0.40) (5).

Evidence for the validity of inferences from comparing the self-report surveys with the accelerometer data was fair to poor (κ ≤ 0.31 for all measures). Evidence for the validity of inferences from comparing the self-report survey with the physical activity log was moderate for vigorous-intensity activity and recommended activity (κ = 0.40-0.51), fair to poor for moderate-intensity activity (κ ≤ 0.25), poor to fair (κ = 0.19-0.23) for the walking measure, and moderate (κ = 0.40-0.52) for the strengthening measure. Correlations between time spent (min·wk−1) from the surveys and other physical activity assessments were higher for vigorous-intensity activity (ρ = 0.60-0.68) than for moderate-intensity activity (ρ = 0.05-0.21), and accelerometer and log results were similar. As a comparison, the International Physical Activity Questionnaire had a median validity of ρ = 0.3 for minutes per week of total physical activity from the self-report instrument compared with accelerometers (10). This evidence for validity of inferences for moderate-intensity, vigorous-intensity, and recommended physical activity is similar to results from previous studies of physical activity questionnaires (3,15).

Ainsworth et al. (1), who tested reliability and validity for a developmental version of these BRFSS questions, found modest to good correlations between the survey questions and physical activity logs for moderate-intensity activity (ρ = 0.26) and for walking (ρ = 0.38), low correlations between the survey questions and physical activity logs for vigorous-intensity activity (ρ = 0.09), and low to moderate correlations between the survey questions and accelerometer data (ρ = 0.0-0.41, depending on the analytical method and questionnaire item). By comparison, time spent in activity on these revised questions compared with the log shows lower validity for moderate-intensity activity and higher validity for vigorous-intensity activity. It is important to note that these questions were designed to be operationalized as categorical variables when using the BRFSS data. We found in this survey, with data categorized into Healthy People 2010 levels (24), that validity was moderate when the survey was compared with the activity log for vigorous-intensity activity (survey 1, κ = 0.5; survey 3, κ = 0.44). This improvement may be related to the survey question order. The questionnaire used in the study by Ainsworth et al. (1) placed questions regarding vigorous-intensity activity before questions regarding moderate-intensity activity, whereas in this survey respondents recalled moderate before vigorous activity. Because the order of questions on a survey influences the prevalence of physical activity reported, with a positive bias toward reporting more vigorous activities when the questions on these activities precede those for moderate intensity, the order was switched (14).

The current BRFSS physical activity questions were designed to provide statistics for the Healthy People 2010 objectives for regular physical activity. As such, the questions inquire about activity in three domains: leisure time, household, and transportation. These questions are cognitively more challenging for respondents to answer than questions about only leisure-time physical activity. First, respondents are asked about three domains of physical activity instead of one (leisure time), potentially increasing the number of activities that must be held in working memory to answer the questions. Second, these questions include examples of activities that people may not consider to be physical activity, thus leading to confusion as respondents reconcile their expectations about activity types with unexpected examples in the questions. Third, interviewers asked respondents to sum the time spent in several activities rather than report a single activity, as had been done in the pre-2001 version of the BRFSS, thus increasing the difficulty of the mental arithmetic required to compute accurate responses. Fourth, the concepts of moderate and vigorous physical activity are unfamiliar to many adults. The primary reasons for designing such difficult survey questions were that the physical activity domains were those recommended in the CDC/ACSM recommendations and that question design was constrained by the requirements of the BRFSS, which had limited survey space for physical activity questions on the core questionnaire. It is difficult to determine whether measurement properties improved from the pre-2001 survey questions to the current ones. Reliability has improved compared with the pre-2001 questions (6), but because no validity data exist for the pre-2001 questions, it is unknown whether validity has improved.

These questions are most appropriately used for population monitoring rather than for an assessment of individual physical activity levels. This study was conducted in a population that was mostly white, highly educated, and active. Validity of these questions had been previously tested in a white, highly educated sample (22). Reliability and validity results cannot be extrapolated to population groups that either had small representation or that were not represented in validity studies. Future reliability and validity tests should include a larger cross-section of the population and should examine subjects' comprehension of the questions.

Reliability scores were higher when comparing survey 1 with survey 2 (1- to 5-d retest) than when comparing survey 1 with survey 3 (10- to 19-d retest). Comparisons of surveys 1 and 2 are essentially comparisons of scores covering overlapping time periods. People may have better recall of their recent activity during a short time frame than they would during several weeks. Differences between surveys 1 and 3 suggest that some people might have reported the previous week's activity instead of the habitual activity implied by the time frame of "a usual week."

Several limitations should be considered regarding this study. First, results from this study group may have limited generalizability to other demographic groups, because respondents were recruited as a convenience sample from a university setting. This sample also had a high education level, which may have led them to have a better understanding of the survey questions, physical activity logs, and accelerometers. Second, as with all self-report instruments, bias in recall may have been inherent with use of the physical activity log or answering the survey. The log was completed at the end of each day, and participants may not have remembered all of their daily activities. The intense record keeping required by the log, or recording the pedometer steps, may have affected responses given to survey 3. People might not limit their activities to those lasting 10 min or more as requested on this survey. They also might mentally aggregate data or include rest periods or intermittent activities, or they might round times during recall or reporting. Further, the log inquired about specific activities but not about the intensity at which people did those activities. Because some activities could be done at either moderate or vigorous intensity (e.g., bicycling), these activities were given a general intensity level according to the compendium of physical activities (2). Third, the κ statistic may be less stable with unbalanced cells that occur with low and high participation rates, because chance agreement increases with uneven population distribution. Fourth, the survey questions used in the BPAS asked for a "usual week" of activity, whereas the log and accelerometer measured a specific week of activity, allowing for dissonance in the reports. It is not known what respondents consider a usual week; it could be the last week, or it could be a week from the past month or the past year. Finally, accelerometers do not measure activities such as swimming or weight lifting, and this may account for some dissonance in the results.
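The third limitation, that chance agreement rises when participation is very high or very low, can be illustrated numerically. The Python sketch below uses two hypothetical 2 × 2 agreement tables (the cell counts are invented for illustration and are not study data) with the same observed agreement of 90% but different marginal distributions.

```python
def kappa_from_table(both_yes, yes_no, no_yes, both_no):
    """Return (observed, expected, kappa) for a 2 x 2 agreement table.

    both_yes: met the standard on both measures
    yes_no:   met on the first measure only
    no_yes:   met on the second measure only
    both_no:  met on neither measure
    """
    n = both_yes + yes_no + no_yes + both_no
    observed = (both_yes + both_no) / n
    p_first = (both_yes + yes_no) / n    # proportion meeting on measure 1
    p_second = (both_yes + no_yes) / n   # proportion meeting on measure 2
    # Chance agreement from the marginal proportions.
    expected = p_first * p_second + (1 - p_first) * (1 - p_second)
    return observed, expected, (observed - expected) / (1 - expected)


# Hypothetical balanced sample: half of respondents meet the standard.
balanced = kappa_from_table(45, 5, 5, 45)
# Hypothetical unbalanced sample: 90% of respondents meet the standard.
unbalanced = kappa_from_table(85, 5, 5, 5)

for name, (obs, exp, kappa) in [("balanced", balanced),
                                ("unbalanced", unbalanced)]:
    print(f"{name}: observed={obs:.2f} expected={exp:.2f} kappa={kappa:.2f}")
```

With identical observed agreement, chance agreement is 0.50 in the balanced table and 0.82 in the unbalanced one, so κ falls from 0.80 to about 0.44. A sample in which most participants meet a standard can therefore yield a lower κ without any change in raw agreement.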

In summary, the 2001 BRFSS questions on physical activity have moderate to substantial (κ = 0.34-0.92) evidence for test-retest reliability. Evidence for validity scores is poor to fair (κ = 0.07-0.25) for moderate-intensity activity and walking and is fair to moderate (κ = 0.40-0.52) for vigorous-intensity, recommended, and strengthening activities. It seems that these questions can classify a group of adults into levels of physical activity as defined by the Healthy People 2010 objectives, and classification into levels of vigorous, recommended, or strengthening activity is better than having a classification of moderate activity. A single administration of a self-report questionnaire may not be adequate to measure habitual physical activity, because the time period in question may not reflect a person's normal levels of activity. However, consistent use of these questions over time, as in a surveillance instrument such as the BRFSS, will help to identify trends in physical activity. Future work should concentrate on producing additional evidence for the validity of interpretations associated with these questions in various age and racial/ethnic subpopulations and on improving measurement of moderate-intensity activity.

The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the CDC. The results of the present study do not constitute endorsement of the product by the authors or ACSM.


1. Ainsworth, B. E., D. R. Bassett Jr., S. J. Strath, et al. Comparison of three methods for measuring the time spent in physical activity. Med. Sci. Sports Exerc. 32(9 Suppl.):S457-S464, 2000.
2. Ainsworth, B. E., W. L. Haskell, M. C. Whitt, et al. Compendium of physical activities: an update of activity codes and MET intensities. Med. Sci. Sports Exerc. 32(9 Suppl.):S498-S504, 2000.
3. Ainsworth, B. E., D. R. Jacobs Jr., and A. S. Leon. Validity and reliability of self-reported physical activity status: the Lipid Research Clinics questionnaire. Med. Sci. Sports Exerc. 25:92-98, 1993.
4. BRFSS Web site. Available at: Accessed November 20, 2006.
5. Brown, W. J., S. G. Trost, A. Bauman, K. Mummery, and N. Owen. Test-retest reliability of four physical activity measures used in population surveys. J. Sci. Med. Sport. 7:205-215, 2004.
6. Brownson, R. C., A. A. Eyler, A. C. King, Y. L. Shyu, D. R. Brown, and S. M. Homan. Reliability of information on physical activity and other chronic disease risk factors among US women aged 40 years or older. Am. J. Epidemiol. 149:379-391, 1999.
7. Cauley, J. A., R. E. LaPorte, R. B. Sandler, M. M. Schramm, and A. M. Kriska. Comparison of methods to measure physical activity in postmenopausal women. Am. J. Clin. Nutr. 45:14-22, 1987.
8. Centers for Disease Control and Prevention. 2003 Behavioral Risk Factor Surveillance System State Questionnaire, v. 1.5. Atlanta, GA: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2002.
9. Cohen, J. A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20:37-46, 1960.
10. Craig, C. L., A. L. Marshall, M. Sjostrom, et al. International physical activity questionnaire: 12-country reliability and validity. Med. Sci. Sports Exerc. 35:1381-1395, 2003.
11. Durante, R., and B. E. Ainsworth. The recall of physical activity: using a cognitive model of the question-answering process. Med. Sci. Sports Exerc. 28:1282-1291, 1996.
12. Evenson, K. R., A. A. Eyler, S. Wilcox, J. L. Thompson, and J. E. Burke. Test-retest reliability of a questionnaire on physical activity and its correlates among women from diverse racial and ethnic groups. Am. J. Prev. Med. 25(3 Suppl. 1):15-22, 2003.
13. Evenson, K. R., and A. P. McGinn. Test-retest reliability of adult surveillance measures for physical activity and inactivity. Am. J. Prev. Med. 28:470-478, 2005.
14. Ham, S. A., C. A. Macera, D. A. Jones, B. E. Ainsworth, and K.M. Turczyn. Preliminary considerations for physical activity research: variations on a theme. J. Phys. Act. Health 1:98-113, 2004.
15. Jacobs, D. R. Jr., B. E. Ainsworth, T. J. Hartman, and A. S. Leon. A simultaneous evaluation of 10 commonly used physical activity questionnaires. Med. Sci. Sports Exerc. 25:81-91, 1993.
16. Landis, J. R., and G. G. Koch. The measurement of observer agreement for categorical data. Biometrics 33:159-174, 1977.
17. Macera, C. A., D. A. Jones, M. M. Yore, et al. Prevalence of physical activity, including lifestyle activities among adults-United States, BRFSS 2000-2001. MMWR Morb. Mortal. Wkly. Rep. 52:764-769, 2003.
18. Nelson, D. E., D. Holtzman, J. Bolen, C. A. Stanwyck, and K.A. Mack. Reliability and validity of measures from the Behavioral Risk Factor Surveillance System (BRFSS). Soz. Praventivmed. 46(Suppl. 1):S3-S42, 2001.
19. Pate, R. R., M. Pratt, S. N. Blair, et al. Physical activity and public health. A recommendation from the Centers for Disease Control and Prevention and the American College of Sports Medicine. JAMA 273:402-407, 1995.
20. Shea, S., A. D. Stein, R. Lantigua, and C. E. Basch. Reliability of the behavioral risk factor survey in a triethnic population. Am. J. Epidemiol. 133:489-500, 1991.
21. Stein, A. D., R. I. Lederman, and S. Shea. The Behavioral Risk Factor Surveillance System questionnaire: its reliability in a statewide sample. Am. J. Public Health 83:1768-1772, 1993.
22. Strath, S. J., D. R. Bassett Jr., S. A. Ham, and A. M. Swartz. Assessment of physical activity by telephone interview versus objective monitoring. Med. Sci. Sports Exerc. 35:2112-2118, 2003.
23. Troiano, R. Accelerometer-measured physical activity prevalence in NHANES 2003-2004. Med. Sci. Sports Exerc. 38(Suppl.):40, 2006.
24. U.S. Department of Health and Human Services. Healthy People 2010, 2nd ed. With Understanding and Improving Health and Objectives for Improving Health. 2 vols. Washington, DC: U.S. Government Printing Office, 2000.
25. Washburn, R. A., and H. J. Montoye. The assessment of physical activity by questionnaire. Am. J. Epidemiol. 123:563-576, 1986.


© 2007 The American College of Sports Medicine