SPECIAL COMMUNICATIONS: Methodological Advances

Accelerometer Validation of Questionnaires Used in Clinical Settings to Assess MVPA


Medicine & Science in Sports & Exercise: July 2015 - Volume 47 - Issue 7 - p 1538-1542
doi: 10.1249/MSS.0000000000000565


Lack of physical activity (PA) is associated with increased all-cause mortality and, as such, is a major public health concern (6,11). Despite this, accelerometry estimates suggest that as few as 2%–15% of adults age ≥20 yr are meeting the PA guidelines (3,16). Although recognized as a risk factor, PA status is not routinely assessed by clinicians (11). It is possible that the reluctance of clinicians to assess PA status stems from a lack of standardized PA assessment methods (15). As such, a decision matrix was proposed recently to help guide physicians and researchers in selecting the best PA assessment method based on their desired outcomes (15). The decision matrix indicates that questionnaire-based methods would be better suited to the needs of a clinician. Certainly, the ease of administration, concision, and outcome (e.g., classification as active or inactive) associated with PA questionnaires make them desirable for use during brief office visits with clinicians. However, data from PA questionnaires tend to lack the detail gained from objective measures (e.g., accelerometry), especially when light-to-moderate PA intensities are examined (15).

The increasing awareness of PA as a risk factor has resulted in the development of initiatives such as the Exercise is Medicine® (EIM) campaign to promote more PA, particularly by urging physicians to prescribe PA, and thereby improve each patient’s health. However, as with other risk factors (e.g., hypertension), physicians should have data on a patient’s status (e.g., resting blood pressure) so that individualized recommendations can be made. Nonetheless, the medical community at large has not implemented the assessment of PA during routine clinical visits (10).

Some clinicians have begun using simple questionnaires to gain insight into their patients’ PA habits. Two questionnaire methods have been developed for use in the primary care setting. In the United States, the exercise vital sign (EVS) has been proposed and initially validated for use by clinicians to assess patient PA levels (2), whereas in the United Kingdom, the General Practice Physical Activity Questionnaire (GPPAQ) is used in the primary care setting to assess patient PA levels (7). The EVS is a three-item questionnaire that reports the patient’s frequency and duration of exercise sessions. The GPPAQ uses seven questions to categorize patients into one of four groups (inactive, moderately inactive, moderately active, or active) termed the Four-Level Physical Activity Index. From these questionnaires, the clinician gains an estimate of a patient’s PA in an effort to identify those who are not meeting the PA guidelines. Although these simple, brief questionnaires are appealing for use by clinicians, data comparing the outcomes of these tools to a valid measure of PA are lacking.

An accepted method of objectively assessing free-living PA habits is through the use of accelerometers (15). Accelerometry can provide a range of detail on PA behaviors, including time spent completing sedentary, light, moderate, and vigorous activities, as well as step counts and counts per minute. One benefit of accelerometry is the determination of minutes of moderate-to-vigorous PA (MVPA) in bouts, allowing for a simple comparison to the PA guidelines. Previous studies have found the prevalence of those considered “adherent” to the PA guidelines to be lower when measured by accelerometry than when measured by self-report methods (16). Despite this, it is unknown whether the EVS and GPPAQ can accurately determine PA status.

Therefore, the primary purpose of this study was to compare the accuracy of the EVS and GPPAQ, using accelerometry as the criterion measure, in identifying subjects who are not meeting the PA guidelines. The secondary and tertiary aims of this study were to compare estimated MVPA between the EVS and the ActiGraph GT3X+ (ActiGraph, Pensacola, FL) and to compare the number of subjects meeting the PA guidelines between the US and the UK populations, respectively. It was hypothesized that the two questionnaires would overestimate the number of subjects meeting the PA guidelines, as well as the minutes of MVPA per week.


A total of 88 subjects were recruited from Ball State University, US (BSU, N = 45), and the University of Worcester, UK (UW, N = 43), to participate in the present study. A total of 12 subjects were removed from the analysis for the present study for the following reasons: 1) invalid accelerometer wear time (BSU, n = 5; UW, n = 4) and 2) incomplete PA questionnaires (BSU, n = 2; UW, n = 1). The data from the remaining 76 subjects (BSU, n = 38, age 49 ± 20 yr; UW, n = 38, age 43 ± 21 yr) were used in the analysis for the present study. All subjects provided written informed consent in accordance with the Institutional Review Board at BSU and Ethics Committee at UW. Subjects had to be ≥18 yr of age, free of any known cardiovascular disease, and able to ambulate without any limitations. Subjects were excluded from participating in the study if they had a history of arthritis or orthopedic conditions, which limited their ability to ambulate, had other medical conditions limiting PA (e.g., diabetic neuropathy), or were pregnant. Subjects reported to the respective laboratory (BSU or UW) on two separate occasions. On the first visit, subjects completed a health history questionnaire and were given an initialized accelerometer and a PA log sheet. At the second visit (7 d later), subjects returned the accelerometer and PA log sheet and then completed both the GPPAQ and EVS questionnaires. Subject characteristics are presented in Table 1.

Table 1. Subject characteristics.

Data collection

Accelerometer data were collected using the ActiGraph GT3X+ (ActiGraph), which has been shown to be both valid and reliable in measuring PA (5,12). All subjects wore a GT3X+ on their right hip, at the waist in line with the midline of the knee, for a 7-d period. GT3X+ monitors were initialized using a 60-Hz sampling rate. Subjects were instructed to wear the GT3X+ during all waking hours, removing the unit for water-based activities only (e.g., showering, bathing, and swimming).

Data handling

Raw triaxial accelerometer data were downloaded using ActiLife version 6.8.0 and subsequently scored using a 60-s epoch. To validate the wear time, GT3X+ files were scanned for a minimum wear time of 10 h·d−1 on a minimum of 4 d across the week (including at least one weekend day). Nonwear time was defined as ≥60 min of consecutive zero counts. Subjects whose GT3X+ data did not meet these minimum criteria were excluded from the analysis. The vector magnitude of the three uniaxial accelerations was scored to quantify MVPA using the Santos-Lozano age-specific cut-points (adults (18–64 yr): ≥3208 counts per minute; older adults (≥65 yr): ≥2751 counts per minute) (13). Subjects were classified as meeting the PA guidelines if they accumulated ≥150 min of MVPA in ≥10-min bouts over the week of observation. Subjects who did not accumulate ≥150 min of MVPA in ≥10-min bouts were classified as inactive. The questionnaires were scored using previously published methods (2,7), and those scored as moderately active or active by the GPPAQ were considered to be meeting the PA guidelines. Subject responses from the EVS were multiplied (days engaged in moderate-to-strenuous exercise × minutes engaged in exercise at this level) to estimate the number of MVPA minutes per week, and subjects reporting ≥150 min of MVPA in ≥10-min bouts were considered active. Subjects reporting <150 min of MVPA in ≥10-min bouts were classified as inactive.
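The scoring rules described above can be illustrated with a minimal Python sketch. It is not the ActiLife implementation: the function names are hypothetical, the input is assumed to be a list of 60-s vector magnitude counts, and bouts are assumed to be strictly consecutive minutes at or above the cut-point (the text does not state whether brief interruptions within a bout were tolerated).

```python
from typing import List, Tuple

ADULT_CUTPOINT = 3208        # counts/min, adults 18-64 yr (from the text)
OLDER_ADULT_CUTPOINT = 2751  # counts/min, older adults >= 65 yr (from the text)


def minutes_in_runs(flags: List[bool], min_len: int) -> int:
    """Total minutes that fall inside runs of >= min_len consecutive True epochs."""
    total = run = 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= min_len:
                total += run
            run = 0
    if run >= min_len:  # close a run that reaches the end of the record
        total += run
    return total


def wear_minutes(vm_counts: List[int], nonwear_run: int = 60) -> int:
    """Wear time: all epochs minus runs of >= 60 consecutive zero counts."""
    nonwear = minutes_in_runs([c == 0 for c in vm_counts], nonwear_run)
    return len(vm_counts) - nonwear


def valid_week(days: List[Tuple[int, bool]]) -> bool:
    """days holds (wear minutes, is_weekend) pairs, one per recorded day.

    Valid if >= 4 days reach 10 h (600 min) of wear, at least one a weekend day.
    """
    valid = [is_weekend for wear, is_weekend in days if wear >= 600]
    return len(valid) >= 4 and any(valid)


def bouted_mvpa_minutes(vm_counts: List[int], age_yr: int, min_bout: int = 10) -> int:
    """MVPA minutes accumulated in bouts of >= min_bout consecutive minutes.

    Assumption: strict bouts, with no allowance for interruptions.
    """
    cut = OLDER_ADULT_CUTPOINT if age_yr >= 65 else ADULT_CUTPOINT
    return minutes_in_runs([c >= cut for c in vm_counts], min_bout)


def meets_guidelines(weekly_bouted_minutes: int) -> bool:
    """Classified as meeting the PA guidelines at >= 150 min of bouted MVPA per week."""
    return weekly_bouted_minutes >= 150
```

Under these assumptions, a 12-min run above the cut-point followed by a 9-min run contributes 12 min, because the second run falls short of the 10-min bout requirement.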

Statistical analysis

The sensitivity (ability of the EVS or GPPAQ to identify subjects not meeting the PA guidelines) and specificity (ability of the EVS or GPPAQ to identify subjects meeting the PA guidelines) were calculated as a measurement of validity. To determine the sensitivity and specificity, the following definitions were used: 1) true active: both accelerometer and questionnaire identified the subject as meeting the PA guidelines; 2) false active: the questionnaire identified the subject as meeting the PA guidelines, but the accelerometer identified the subject as not meeting the PA guidelines; 3) true sedentary: both the questionnaire and accelerometer identified the subject as not meeting the PA guidelines; and 4) false sedentary: the questionnaire identified the subject as sedentary, but the accelerometer identified the subject as meeting the PA guidelines. Sensitivity was calculated by dividing the number of true sedentary subjects by the number of true sedentary + false active subjects, whereas specificity was calculated by dividing the number of true active subjects by the number of true active + false sedentary subjects. To test the level of agreement between the GT3X+ and the EVS, Bland–Altman plots were generated using the GT3X+ as the criterion assessment tool. Limits of agreement were set at ±1.96 SD of the difference scores, as described previously (1). Subject descriptive characteristics were compared by gender and country using a 2 × 2 ANOVA. The effect of PA assessment method (within-subjects effect), as well as gender and country (between-subjects effects), on MVPA was compared using a 2 × 2 × 2 repeated measures ANOVA. Data were analyzed using SPSS v20 (SPSS Inc., Chicago, IL), and statistical significance was set at P < 0.05 for all tests.
Comparisons of MVPA per week between the GPPAQ and the GT3X+ could not be performed because the GPPAQ does not give an estimate of minutes of MVPA per week; rather, it categorizes data into one of four categories.
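The sensitivity, specificity, and limits-of-agreement definitions given in this section can be sketched in Python as follows. This is an illustration only, not the SPSS analysis used in the study: the function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) for the difference scores is an assumption.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def sensitivity_specificity(
    questionnaire_meets: Sequence[bool], accel_meets: Sequence[bool]
) -> Tuple[float, float]:
    """Sensitivity and specificity with accelerometry as the criterion.

    The "positive" condition is NOT meeting the PA guidelines, as in the text.
    Raises ZeroDivisionError if either criterion group is empty.
    """
    true_sed = false_act = true_act = false_sed = 0
    for q, a in zip(questionnaire_meets, accel_meets):
        if a and q:
            true_act += 1    # both methods: meeting guidelines
        elif a and not q:
            false_sed += 1   # questionnaire says sedentary, accelerometer says active
        elif q:
            false_act += 1   # questionnaire says active, accelerometer says sedentary
        else:
            true_sed += 1    # both methods: not meeting guidelines
    sensitivity = true_sed / (true_sed + false_act)
    specificity = true_act / (true_act + false_sed)
    return sensitivity, specificity


def limits_of_agreement(
    questionnaire_min: Sequence[float], accel_min: Sequence[float]
) -> Tuple[float, float, float]:
    """Mean difference (questionnaire minus criterion) and the +/-1.96 SD limits."""
    diffs = [q - a for q, a in zip(questionnaire_min, accel_min)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # assumption: sample SD of the differences
    return bias, bias - half_width, bias + half_width
```

A toy call such as `sensitivity_specificity([True, True, False], [True, False, False])` returns `(0.5, 1.0)`: one of the two criterion-inactive subjects was correctly flagged, and the single criterion-active subject was correctly identified.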


When the GT3X+ method was used for scoring subject PA data, 46 of the 76 subjects did not meet the PA guidelines. However, when the EVS and GPPAQ methods were used, 34 and 36 subjects were identified as not meeting the PA guidelines, respectively. The sensitivity of the EVS and GPPAQ in identifying subjects categorized as not meeting the PA guidelines by the GT3X+ was 59% and 46%, respectively. The specificity of the EVS and GPPAQ in identifying subjects categorized as meeting the PA guidelines by the GT3X+ was 77% and 50%, respectively.
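The figures reported above can be cross-checked against one another. The Python sketch below back-calculates the four cells of each 2 × 2 classification table from the reported totals and sensitivities. The cell counts are an inference made for illustration and are not reported in the text, and the function name `reconstruct_cells` is hypothetical.

```python
def reconstruct_cells(n_inactive, n_active, flagged_inactive, sensitivity):
    """Back-calculate 2 x 2 cell counts from reported summary figures.

    n_inactive / n_active: criterion (GT3X+) group sizes.
    flagged_inactive: subjects the questionnaire classed as not meeting guidelines.
    sensitivity: reported sensitivity as a proportion (rounded in the source).
    """
    true_sed = round(sensitivity * n_inactive)
    false_act = n_inactive - true_sed
    false_sed = flagged_inactive - true_sed
    true_act = n_active - false_sed
    return {
        "true_sedentary": true_sed,
        "false_active": false_act,
        "false_sedentary": false_sed,
        "true_active": true_act,
    }


N_TOTAL = 76
N_INACTIVE = 46                  # not meeting guidelines by GT3X+
N_ACTIVE = N_TOTAL - N_INACTIVE  # 30 meeting guidelines by GT3X+

evs = reconstruct_cells(N_INACTIVE, N_ACTIVE, flagged_inactive=34, sensitivity=0.59)
gppaq = reconstruct_cells(N_INACTIVE, N_ACTIVE, flagged_inactive=36, sensitivity=0.46)

# Implied specificity, to compare with the reported 77% and 50%
evs_specificity = evs["true_active"] / N_ACTIVE        # 23 / 30
gppaq_specificity = gppaq["true_active"] / N_ACTIVE    # 15 / 30
```

Under this reconstruction the implied specificities round to 77% for the EVS and 50% for the GPPAQ, so the reported totals, sensitivities, and specificities are mutually consistent.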

Time spent in MVPA as quantified by the GT3X+ and EVS is presented in Table 2. Time (min) spent in bouts of MVPA per week recorded by the EVS averaged 66 min (49%) more than that recorded by the GT3X+ (F = 17.2, P < 0.001). The upper and lower limits of agreement between these two PA assessment tools were 349 and −217 min, respectively, as illustrated by the Bland–Altman plot in Figure 1. There was a significant interaction between PA assessment method and gender (F = 8.8, P < 0.01), which is likely driven by the similar PA estimates between methods for UK women. There was also a significant interaction between method, gender, and country (F = 4.4, P < 0.05), as highlighted in Table 2. The most notable finding was that US men and women and UK men all overestimated MVPA on the EVS.
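Because the limits of agreement are defined as the mean difference ± 1.96 SD, the reported limits imply both the mean difference and the SD of the difference scores. The short Python sketch below recovers them. The SD is a derived value and is not reported in the text, and the function names are hypothetical.

```python
def bias_from_limits(lower, upper):
    """The mean difference sits at the midpoint of the two limits."""
    return (upper + lower) / 2


def sd_from_limits(lower, upper, z=1.96):
    """The limits span 2 * z standard deviations of the difference scores."""
    return (upper - lower) / (2 * z)


LOWER_LIMIT = -217  # min per week, from the text
UPPER_LIMIT = 349   # min per week, from the text

implied_bias = bias_from_limits(LOWER_LIMIT, UPPER_LIMIT)  # 66.0 min
implied_sd = sd_from_limits(LOWER_LIMIT, UPPER_LIMIT)      # about 144.4 min
```

The implied mean difference of 66 min matches the reported overestimate by the EVS, and the implied SD of roughly 144 min indicates that individual differences between the two methods were large relative to the 150-min guideline threshold.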

Figure 1. Bland–Altman plot for minutes per week spent in bouts of MVPA assessed by the GT3X+ and the EVS.
Table 2. MVPA comparison between US and UK subjects using the GT3X+ and the EVS.


The primary finding from the present study was that the EVS and GPPAQ had a similar ability to identify those subjects not meeting the PA guidelines (59% vs 46%, respectively) when the GT3X+ was used as the criterion measure. The implication of this finding is that clinicians who are using the EVS and/or GPPAQ to identify patients in need of lifestyle interventions would be missing ∼50% of that population. Furthermore, although these simple questionnaires provide a more feasible option than accelerometry, clinicians must be familiar with their potential shortcomings (i.e., false-negative identification of patients). For example, a blood test that detected hypercholesterolemia in only 50% of patients who have it would likely not be accepted by clinicians.

Owing to the simple integration of both the EVS and the GPPAQ into the clinical setting, it is important to compare the validity of the two questionnaires. Doing so should aid clinicians in identifying which questionnaire is best for a given population. Our results indicate that the EVS has greater specificity (77% vs 50%) and sensitivity (59% vs 46%) than the GPPAQ in both of the cohorts we studied. The specificity and sensitivity of the GPPAQ in the current study are considerably lower than those previously reported by Puig et al. (8) in Spanish subjects (70% and 82%, respectively). However, the present study used the GT3X+ as the criterion measure, whereas Puig et al. used the short version of the International PA Questionnaire as the criterion measure. Given the inherent biases associated with PA questionnaires, and that Puig et al. used two questionnaires, the higher specificity and sensitivity values found in the Puig et al. study relative to the present study, where objective and subjective methods were compared, are not surprising.

In a study of the face and discriminant validity of the EVS, Coleman et al. (2) reported 36% of their cohort were completely inactive (0 min of exercise per week), and a further 33% were insufficiently active (>0 and <150 min of exercise per week). Coleman et al. concluded that the EVS had good face and discriminant validity because their data from the EVS showed similar trends to data from national surveys (2). However, in that study, the EVS was the only measure of PA and no criterion measure, such as accelerometry, was used. The present study compared the EVS to the GT3X+, using the latter as a criterion measure. Accelerometry has become the de facto tool for PA assessment and thus serves as a criterion measure for validating the EVS. Given the results of the present study, it is clear that the EVS has better specificity and sensitivity than the GPPAQ in identifying individuals who are not meeting the PA guidelines. However, given the inherent flaws with questionnaire-based methods, it is not surprising that the specificity and sensitivity were less than perfect.

The present study found that the EVS overestimated MVPA per week (in ≥10-min bouts) in US men and women and UK men. As a group, UK women did not overestimate their MVPA per week on the EVS. The magnitude of the overestimation for both genders and countries is presented in Table 2. Men tended to overestimate their MVPA to a larger degree than women (86% vs 17%), although the mean overestimates are confounded by the UK women. Although the reason for this different finding in UK women is not clear, they were younger, had lower body mass index (BMI), and had greater MVPA per week (as measured by the GT3X+) than the other groups. However, no strong evidence exists suggesting that age, BMI, or PA status would affect an individual’s ability to estimate their MVPA per week. It is possible that the overestimation by the EVS is due in part to the inclusion of ≥10-min bouts in the accelerometry data scoring. Despite the overestimation of MVPA by the EVS, it remains marginally better than the GPPAQ in terms of dichotomizing subjects as meeting or not meeting the current PA guidelines.

Although the EVS performed slightly better, its use in the clinical setting would result in 41% (∼4 in 10) of patients who are not meeting the PA guidelines being falsely classified as meeting them. The ramification is that these patients would miss out on PA counseling and services to increase PA. Conversely, using the EVS in the clinical setting would result in 23% of patients who are currently meeting the PA guidelines (according to accelerometry data) being classified as not meeting them. This may result in ∼2 in 10 of these patients receiving PA counseling/services to increase PA when they are already meeting the PA guidelines. Although the latter would not cause any health detriment to those patients, it would mean that healthcare costs and resources were being spent on patients who need only encouragement to continue their lifestyle, rather than a meeting with a counselor to increase PA. Although the administration costs may be negligible, physicians should use the EVS with caution, if indeed they choose to use it at all. It is evident that either improvements to these questionnaires are needed before their widespread adoption can be justified or physicians need better methods for PA assessment. The EIM approach should be advocated such that all physicians provide at least some recommendation for their patients to meet the PA guidelines. Considering the scarce data currently available on the validity of the EVS and GPPAQ, more studies are needed to assess the validity of these questionnaires in the clinical setting.

PA questionnaires have previously been shown to overestimate the prevalence of those meeting the PA guidelines relative to accelerometry (16). Given this flaw with PA questionnaires, clinicians should use their clinical judgment for providing lifestyle interventions to individuals they feel may not be meeting the national PA guidelines. Indeed, clinical judgment alongside questionnaire use for monitoring changes in PA may prove fruitful because it has previously been suggested that PA questionnaires should be used for monitoring changes in PA and that detailed interpretation (e.g., minutes of MVPA) of questionnaire methods may not be valid (14). Ideally, objective measures of PA (e.g., accelerometer) should be used if an absolute quantification of PA is desired (i.e., when directly comparing PA levels to PA guidelines) (9).

The Bland–Altman plot has several outliers beyond the upper limit of agreement, indicating the systematic bias of the EVS. It is likely that the systematic bias of the EVS is due to the gross overestimation of PA relative to the GT3X+ (Table 2). Despite this systematic bias, it is possible that the EVS yields a slightly better estimate of PA status than the GPPAQ because it is shorter and seeks to quantify purposeful activity, which may be easier to recall (2,7). Furthermore, the GPPAQ may be less accurate because it includes activities associated with PA (e.g., gardening/DIY and housework/childcare) as opposed to exercise (i.e., planned, structured, and repetitive movements), and these types of PA may be classified as light PA when accelerometry is used.

The tertiary aim of the present study was to compare the number of subjects who met the PA guidelines between the US and the UK populations we studied. According to the accelerometry data, only 30% and 50% of US and UK subjects met the PA guidelines, respectively. This is reflected by the average minutes of MVPA per week for US and UK subjects (108 ± 113 vs 164 ± 120, respectively). These figures are notably greater than those from a larger population study (N = 2357), which found that between 3% and 7% of US men and women age 16–59 yr met the PA guidelines when accelerometry data were analyzed (16). Similarly, a population-level study in the United Kingdom (N = 2133) found only 6% of men and 4% of women met the PA guidelines (4). It is pertinent to note that both communities from which data were sampled (Muncie, IN, and Worcester, UK) offer similar opportunities for PA (e.g., university-based fitness programs, abundant recreational space, and dedicated cycle paths). Although both sites attempted to recruit subjects with a range of PA levels, it is apparent that our university settings attracted a more active group of subjects. Future studies should consider recruiting from other settings in the community. Furthermore, although the communities have different extremes in climate (i.e., Muncie has hotter summers and colder winters), the authors do not believe this would have influenced the PA of the subjects.

To the authors’ knowledge, this is the first study to directly compare PA classifications between EVS, GPPAQ, and GT3X+. Because the data were collected across two countries and a range of ages and clinical diagnoses (Table 1), the findings have good generalizability. However, only one non-Caucasian subject volunteered to participate in the present study and an interesting avenue for future studies may be to examine the potential differences in subjective and objective PA classifications across ethnicities. Furthermore, although the present study and one other study have reported reasonable validity of the EVS, it is not clear if the EVS is a reliable instrument (2). It would be beneficial to assess the validity and reliability of the EVS across a large range of ages (e.g., 18–90 yr olds) and disease states. It seems feasible that different PA questionnaires would yield better validity and reliability within a certain age group based on the language used, type of activities reported (e.g., gardening vs sport), and number of questions.

In conclusion, the present study demonstrated important limitations of the EVS and GPPAQ, including the overestimation of minutes of MVPA per week and the misclassification of patients as meeting the PA guidelines. If physicians choose to use a PA assessment tool, then they must be cognizant of its limitations and use caution when attempting to interpret the results. These limitations need to be addressed before these questionnaires can be adopted for widespread use in the clinical setting. Although the assessment of a patient’s PA is crucial when trying to provide interventions to those patients who are not meeting the PA guidelines, a better approach may be for physicians to encourage all of their patients to adopt an active lifestyle, including the achievement of ≥150 min of MVPA per week.

The authors wish to thank the graduate assistants in the Clinical Exercise Physiology program at BSU, as well as Jamie Simmonds and James Tutty at the Institute for Sport and Exercise Science at UW for their assistance with data collection.

No funding sources were used for this study. None of the authors have any conflicts of interest to declare.

The results of the present study do not constitute endorsement by the American College of Sports Medicine.


1. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986; 1 (8476): 307–10.
2. Coleman KJ, Ngor E, Reynolds K, et al. Initial validation of an exercise “vital sign” in electronic medical records. Med Sci Sports Exerc. 2012; 44 (11): 2071–6.
3. Colley RC, Garriguet D, Janssen I, et al. Physical activity of Canadian adults: accelerometer results from the 2007 to 2009 Canadian Health Measures Survey. Health Rep. 2011; 22 (1): 15–23.
4. Craig R, Mindell J, Hirani V. Volume 1: Physical activity and fitness. In: Health Survey for England 2008. 2009; 1: 8–395.
5. Jarrett H, Fitzgerald L, Routen AC. Inter-instrument reliability of the ActiGraph GT3X+ ambulatory activity monitor during free-living conditions in adults. J Phys Act Health. 2015; 12 (3): 382–7.
6. Paffenbarger RS, Hyde RT, Wing AL, et al. Physical activity, all-cause mortality, and longevity of college alumni. N Engl J Med. 1986; 314 (10): 605–13.
7. Physical Activity Policy and Health Improvement Directorate. The General Practice Physical Activity Questionnaire (GPPAQ): a screening tool to assess adult physical activity levels, within primary care. [Internet]. 2009. [Accessed 2014 Feb]. Available from:
8. Puig RA, Peña CO, Romaguera BM, et al. How to identify physical inactivity in primary care: validation of the Catalan and Spanish versions of 2 short questionnaires. Aten Primaria. 2012; 44 (8): 485–93.
9. Sallis JF, Saelens BE. Assessment of physical activity by self-report: status, limitations, and future directions. Res Q Exerc Sport. 2000; 71 (2 Suppl): S1–14.
10. Sallis R. Developing healthcare systems to support exercise: exercise as the fifth vital sign. Br J Sports Med. 2011; 45 (6): 473–4.
11. Sallis RE. Exercise is medicine and physicians need to prescribe it! Br J Sports Med. 2009; 43 (1): 3–4.
12. Santos-Lozano A, Marin PJ, Torres-Luque G, et al. Technical variability of the GT3X accelerometer. Med Eng Phys. 2012; 34 (6): 787–90.
13. Santos-Lozano A, Santín-Medeiros F, Cardon G, et al. ActiGraph GT3X: validation and determination of physical activity intensity cut points. Int J Sports Med. 2013; 34 (11): 975–82.
14. Shephard RJ. Limits to the measurement of habitual physical activity by questionnaires. Br J Sports Med. 2003; 37 (3): 197–206.
15. Strath SJ, Kaminsky LA, Ainsworth BE, et al. Guide to the assessment of physical activity: clinical and research applications: a scientific statement from the American Heart Association. Circulation. 2013; 128 (20): 2259–79.
16. Troiano RP, Berrigan D, Dodd KW, et al. Physical activity in the United States measured by accelerometer. Med Sci Sports Exerc. 2008; 40 (1): 181–8.


© 2015 American College of Sports Medicine