There has been extraordinary progress in the field of physical activity epidemiology in the last 50 yr. The lack of participation in moderate–vigorous exercise (38) and, more recently, prolonged time spent in sedentary behavior (or sitting) have been associated with increased risk for mortality and chronic diseases (e.g., [30,45]), including certain cancers (e.g., ). Clearly, the exposure assessments used in these studies, typically questionnaires that estimate usual amounts of physically active and sedentary behaviors (e.g., past year), have been successful in identifying many strong behavior–disease associations. On the other hand, these same questionnaires probably contain a substantial amount of measurement error (10,32), which leads to a loss of statistical power to test etiologic hypotheses, attenuation of the strength of the associations observed, and difficulties characterizing dose–response relationships that are critical in the development of evidence-based recommendations (42).
Better measurements are needed to address the limitations of traditional questionnaire-based tools, and recent summaries of the current state of the art of exposure assessment provide insight into why both device-based (12) and self-report methods (6) can and should play complementary roles in future studies. Recent systematic reviews have noted that few physical activity questionnaires have validity coefficients (correlations) greater than 0.5 (19,34,49) compared with objective measures, and sedentary behavior questionnaires have been noted to have a similarly modest level of validity (3,17).
Short-term recalls (e.g., diaries, previous-day recalls [PDR]) have been suggested as an alternative to traditional questionnaires that typically require longer-term recall of behavior (32,35,48). New technologies such as Web-based surveys and mobile devices coupled with emerging measurement error correction techniques (11,35) now make it feasible to use such methods as a primary exposure assessment tool in large-scale studies. PDRs offer several advantages over questionnaire-based estimates of usual activity and sedentary behavior. First, they allow respondents to rely on episodic memory to generate reports about time spent in specific activity-related behaviors rather than on estimation strategies and long-term averaging (25). Thus, the information reported on the PDR may be more accurate. Second, PDRs capture more detailed information about different types of activities and offer a unique opportunity to assess body posture (i.e., sitting vs standing) as well as information about behavioral context (e.g., location and purpose) not available from other measures. Hence, PDRs may be particularly valuable for studies interested in posture-based estimates of sedentary behavior, or studies that require information about where and why physically active and sedentary behaviors occur.
An important first step in establishing the proof of principle for PDR for use in future studies is to test the validity of the method. Accordingly, the purpose of this report is to evaluate the validity of an interviewer-administered PDR of physically active and sedentary behaviors in free-living adolescents and adults compared with the activPAL, an accurate and precise reference measure for distinguishing between active and sedentary behaviors (15,23). To provide insight into the measurement properties of the PDR compared with another instrument, we conducted a parallel analysis evaluating the validity of the ActiGraph monitor compared with the activPAL. In secondary analyses, we also evaluated PDR measures of light and moderate–vigorous physical activity using common ActiGraph cut points.
MATERIALS AND METHODS
During the 7-d study period, adolescents (12–17 yr) and middle-age adults (18–71 yr) from Amherst, Massachusetts, and Nashville, Tennessee, wore two activity monitors and received three unannounced telephone-administered PDRs (two weekdays and one weekend day). Eligible participants for the study were 12–75 yr of age and were free of debilitating chronic diseases (e.g., heart failure, severe claudication, and terminal cancer), major cognitive or psychiatric disorders (e.g., dementia, and schizophrenia), and major orthopedic problems. They were also fluent in English and agreed to be available by phone during the study period. Our study population was enrolled as a convenience sample rather than a random sample from the general population. Height and weight were measured, and surveys were completed to gather demographic information. Social desirability, or the tendency to avoid criticism and portray one’s self in a more favorable manner (46), was measured using two scales. In adolescents, we used the nine-item Revised Children’s Manifest Anxiety Scale–Lie Scale (39), which was developed specifically for 6- to 19-yr-olds. It has established psychometric properties (39) and has been associated with reporting bias in diet and physical activity in 8- to 10-yr-old girls enrolled in an intervention (22). In adults, we used the 33-item Marlowe–Crowne Social Desirability Scale that has established psychometric properties (24) and has been linked to reporting biases in diet and physical activity in adults (1,18). Higher scores on both scales indicate higher levels of social desirability. A social desirability bias would be observed if the scales were associated with underreporting sedentary time and overreporting physically active time. Informed consent and/or assent was signed by 224 participants (and parents of adolescents), and 213 of these individuals (95%) provided information for the measurements being evaluated. 
The institutional review boards at Vanderbilt University and the University of Massachusetts approved all study activities.
The recall used was an updated version of our 24-Hour Physical Activity Recall that has been evaluated as a measure of physical activity (8) and used as a reference instrument in an earlier study (27). We use the name PDR here because, in addition to physical activity, the instrument now gathers more detailed information about sedentary behaviors. Interviewers were certified to complete the recalls using a standard training protocol composed of didactic and experiential training sessions designed to develop interviewing skills, expertise in interacting with the computer interface, and the integration of these two skills. During the study, interviewers led participants chronologically through the previous day (midnight to midnight) using a semistructured interview based on methods developed and refined for the 7-Day Physical Activity Recall (41). Interviewers gathered information about specific active and sedentary behaviors reported in three segments of the recall day (i.e., morning, afternoon, and evening). Individual behaviors lasting at least 5 min in a given period were recorded and coded, and the duration of the activities was entered directly into a database. Each behavior reported was coded as physically active or sedentary using reported body position and activity type (i.e., all exercise and sports pursuits were classified as “active”) and by the location and purpose of the activity. Additional information about the PDR is provided (see Appendix 1, Supplemental Digital Content, http://links.lww.com/MSS/A255; Previous-Day Recall Protocol). After completing each recall, interviewers assessed the overall reliability of the interview. Interviewers classified recalls as unreliable if the respondent was clearly unable to complete the recall or provide useful information for most of the recall day. Sixteen of the 635 recalls (2.5%) were judged by the interviewers to be unreliable and were excluded from analysis.
We defined sedentary behaviors as any behavior that was performed while sitting, reclining, or lying down during the waking day and that did not require substantial energy expenditure (typically <1.8 METs) (36). In contrast, physically active behaviors were defined as standing activities, or activities performed in any position that resulted in higher MET levels (typically > 1.8 METs). Exercise, sports, and active recreation pursuits were classified as active regardless of body position. Each activity in the database was derived from the compendium of physical activities, along with the associated MET values (2). To summarize the recall data, we summed the duration estimates of the individual sedentary and active behaviors that were reported (h·d−1), typically 15–30 different activities per recall. For the physically active behaviors, we also calculated time reported in light-intensity (<3 METs), moderate-intensity (3–5.9 METs), and vigorous-intensity (6+ METs) activity.
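The coding rules in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's software: the function names, the dictionary layout, and the MET values in the usage example are hypothetical, the handling of exactly 1.8 METs (treated here as active) is not specified in the text, and the location and purpose information used during coding is ignored.

```python
# Illustrative sketch only; names and data layout are hypothetical.
SEDENTARY_POSTURES = {"sitting", "reclining", "lying"}
SEDENTARY_MET_LIMIT = 1.8  # "typically <1.8 METs"; exactly 1.8 treated as active (assumption)


def classify_behavior(posture, mets, is_exercise=False):
    """Return 'sedentary' or 'active' for one reported behavior."""
    if is_exercise:
        # Exercise, sports, and active recreation count as active in any posture.
        return "active"
    if posture in SEDENTARY_POSTURES and mets < SEDENTARY_MET_LIMIT:
        return "sedentary"
    return "active"


def intensity_category(mets):
    """Light (<3 METs), moderate (3-5.9 METs), or vigorous (6+ METs)."""
    if mets < 3.0:
        return "light"
    if mets < 6.0:
        return "moderate"
    return "vigorous"


def summarize_recall(entries):
    """Sum reported durations (hours) by behavior type and, for active time, by intensity."""
    totals = {"sedentary": 0.0, "active": 0.0,
              "light": 0.0, "moderate": 0.0, "vigorous": 0.0}
    for entry in entries:
        label = classify_behavior(entry["posture"], entry["mets"],
                                  entry.get("is_exercise", False))
        hours = entry["minutes"] / 60.0
        totals[label] += hours
        if label == "active":
            totals[intensity_category(entry["mets"])] += hours
    return totals
```

For example, a recall containing 120 min of seated television viewing at a hypothetical 1.3 METs and 60 min of standing food preparation at a hypothetical 2.0 METs would be summarized as 2 h sedentary and 1 h active, all of it light intensity.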
The activPAL (PAL Technologies, Glasgow, Scotland) is worn on the mid–right thigh and uses information about thigh position to estimate time spent in different body positions (horizontal = lying or sitting; vertical = standing or stepping). To do so, the instrument records the start and stop time of each individual bout (or event) of lying or sitting, standing, and stepping. Participants wore the device during waking hours, exclusive of bathing and swimming. They were asked to record the time they got out of/into bed and the times they wore the monitor each day. For the activPAL, we defined sedentary behavior as time spent sitting or lying during the waking day and physically active behavior as the sum of time spent standing or stepping. The device also estimates the energy cost of ambulatory activities using a prediction equation that uses stepping cadence and duration as the predictor variables (MET-hours = (1.4 × duration [h]) + (4 − 1.4) × (cadence [steps per minute] / 120) × duration [h]) (37). For descriptive purposes, we also calculated time recorded in moderate–vigorous stepping activities (i.e., 3+ METs). The activPAL accuracy for measuring body posture in laboratory settings is 95%–100% (15), and Kozey Keadle et al. (23) reported strong agreement for posture (R² = 0.94) between activPAL and direct observation (DO) in a free-living study. In an internal validation study, we examined 27 participants over 47 free-living periods of DO. Linear mixed models, which included a subject-specific random intercept, revealed a strong linear relationship between the activPAL and the DO. For sedentary time (min·d−1), the regression equation was activPAL = −1.43 + 1.00DO; and for active time (min·d−1), the regression equation was activPAL = 1.41 + 1.00DO. The correlation for both measures was r = 0.98 (unpublished observations).
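The stepping energy cost equation quoted above can be written as a short function. This is a sketch of the equation as printed in the text, not the device firmware; the function names are hypothetical. Dividing MET-hours by duration gives the average MET level of a stepping bout, which equals 4 METs at a cadence of 120 steps per minute and reaches the 3-MET moderate threshold at roughly 74 steps per minute.

```python
# Sketch of the printed equation; function names are hypothetical.
def stepping_met_hours(duration_h, cadence_spm):
    """MET-hours for a stepping bout of given duration (h) and cadence (steps/min)."""
    return 1.4 * duration_h + (4.0 - 1.4) * (cadence_spm / 120.0) * duration_h


def stepping_met_level(cadence_spm):
    """Average MET level implied by the equation for a given cadence."""
    return stepping_met_hours(1.0, cadence_spm)


def is_moderate_vigorous_stepping(cadence_spm, threshold_mets=3.0):
    """True when the implied MET level is 3+ METs."""
    return stepping_met_level(cadence_spm) >= threshold_mets
```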
The ActiGraph (model GT3X) is a triaxial accelerometer that was secured to the right hip using an elastic belt. The monitor was initialized to record vertical acceleration in 1-s epochs using the low-frequency extension. Sedentary time was defined as the sum of hours below 100 counts per minute (cpm), and active time was defined as time spent at or above 100 cpm (17,28). Light-intensity activity was estimated as time recorded between 100 and 759 cpm, and moderate–vigorous time was estimated using two cut points. We used the 760-cpm cut point that was calibrated to capture a broad range of lifestyle and ambulatory activities with an energy expenditure of 3 METs or greater (26). This cut point has been cross validated in free-living studies against indirect calorimetry (26), pattern recognition monitors (50), and time-use diaries (48). We also used the moderate–vigorous cut point of Freedson (1952 cpm) that was calibrated to capture walking and running behaviors (13) and that has been cross-validated against indirect calorimetry (14,20), an activity diary (43), and other accelerometers (50). Because of the small amount of time recorded in vigorous activity (5725+ cpm), we combined vigorous with moderate activity time for analysis.
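The cut point definitions above amount to a simple threshold classifier, sketched below. The sketch assumes the counts have already been aggregated to counts per minute (the text reports 1-s epochs for data collection), and the names are hypothetical. When the 1952-cpm cut point is used, the sketch treats everything from 100 to 1951 cpm as light; the text only states the light band explicitly for the 760-cpm cut point, so that choice is an assumption.

```python
# Illustrative threshold classifier; names are hypothetical.
SEDENTARY_CUT = 100        # cpm; values below this are sedentary
LIFESTYLE_MVPA_CUT = 760   # cpm; lifestyle and ambulatory activities at 3+ METs
FREEDSON_MVPA_CUT = 1952   # cpm; calibrated to walking and running


def classify_minute(cpm, mvpa_cut=LIFESTYLE_MVPA_CUT):
    """Label one minute of ActiGraph data by its counts per minute."""
    if cpm < SEDENTARY_CUT:
        return "sedentary"
    if cpm < mvpa_cut:
        return "light"
    return "moderate-vigorous"


def hours_by_category(cpm_per_minute, mvpa_cut=LIFESTYLE_MVPA_CUT):
    """Count minutes in each category and convert the totals to hours."""
    minutes = {"sedentary": 0, "light": 0, "moderate-vigorous": 0}
    for cpm in cpm_per_minute:
        minutes[classify_minute(cpm, mvpa_cut)] += 1
    return {label: count / 60.0 for label, count in minutes.items()}
```

Running the same minute-level series through both cut points shows why the two moderate–vigorous estimates differ: minutes between 760 and 1951 cpm move from the moderate–vigorous category to the light category when the higher cut point is applied.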
Activity monitor summary and wear time estimation.
To determine monitor wear time for both devices, we used a combination of the wear log information and the automated wear time estimate of Choi et al. (9). The algorithm was set to use any nonzero value of activity counts (ActiGraph) or device movement (activPAL), the time window of consecutive minutes with zero counts (or zero movement) was set at 60 min, and the artifact movement detection was set to allow interruptions of 2 min or less. Minimum wear time for a “valid” day was 10+ h. For analysis, we calculated time spent in sedentary and active behaviors in terms of absolute duration (h·d−1) and as a proportion of total wear time (% wear).
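The valid-day rule and the % wear metric described above are simple to express in code. The sketch below covers only those two steps; it does not implement the wear time algorithm of Choi et al., and the function names are hypothetical.

```python
# Sketch of the valid-day rule and the % wear metric; names are hypothetical.
MIN_WEAR_HOURS = 10.0  # minimum wear time for a "valid" day


def is_valid_day(wear_hours):
    """A monitor day counts as valid when wear time is 10 h or more."""
    return wear_hours >= MIN_WEAR_HOURS


def percent_of_wear(behavior_hours, wear_hours):
    """Express time in a behavior as a percentage of total wear time."""
    if wear_hours <= 0:
        raise ValueError("wear_hours must be positive")
    return 100.0 * behavior_hours / wear_hours
```

For example, 8 h of sedentary time recorded during 16 h of wear corresponds to 50% of wear time.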
Participants eligible for this analysis (N = 213) provided 619 valid PDR days, 1178 valid activPAL days, and 1277 valid ActiGraph days. We first matched each instrument by date of assessment. Next, because our data collection protocol allowed for shorter valid assessment days for the monitors (minimum 10 h) compared with the PDR (no minimum), and more than 90% of PDR days had 12+ h of waking time reported, we also matched each instrument on daily observation time (±2 h). Of the 448 PDR–activPAL date matches, 345 d of assessment were within ± 2 h·d−1 for each method (n = 179 participants). From the 1029 ActiGraph-activPAL date matches, 915 d of assessment were within ± 2 h·d−1 (n = 185 participants). To investigate the possibility that our decision to minimize the effect of extraneous variation in daily observation time between measures of absolute duration by matching on PDR observation and/or monitor wear time could have influenced our results, we conducted sensitivity analyses. First, we fitted the measurement error models described below to the 448 d of PDR–activPAL observation not matched on observed/wearing time (n = 197) as well as the 1029 d of unmatched ActiGraph–activPAL data using the absolute duration values (h·d−1). Next, we fitted models for the PDR–activPAL comparisons using the percentage of sedentary time estimates (i.e., % observed and % wear) on both the matched and the unmatched days.
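The two-stage matching described above (first by date, then by daily observation time within ±2 h) can be sketched as follows. The data layout, a dictionary keyed by participant and date that holds hours of observation, is a hypothetical choice for illustration, and treating a difference of exactly 2 h as a match is an assumption.

```python
# Illustrative matching sketch; data layout and names are hypothetical.
from datetime import date


def match_days(test_days, reference_days, tolerance_h=2.0):
    """Return (participant, date) keys present in both instruments whose
    daily observation times differ by no more than tolerance_h hours."""
    matched = []
    for key in sorted(test_days.keys() & reference_days.keys()):
        if abs(test_days[key] - reference_days[key]) <= tolerance_h:
            matched.append(key)
    return matched


pdr = {("p01", date(2011, 5, 2)): 16.0,
       ("p01", date(2011, 5, 7)): 15.5,
       ("p02", date(2011, 5, 3)): 17.0}
activpal = {("p01", date(2011, 5, 2)): 14.5,   # within 2 h: matched
            ("p01", date(2011, 5, 7)): 11.0,   # differs by 4.5 h: dropped
            ("p02", date(2011, 5, 4)): 15.0}   # different date: dropped
matched_keys = match_days(pdr, activpal)
```

Passing a very large `tolerance_h` reproduces the date-only matching used in the sensitivity analyses.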
Measurement error modeling.
Ideally, we would want to assess the level of agreement between the true, but unobserved, hours of time spent in active and sedentary behaviors on a given day with the corresponding values estimated by the PDR and ActiGraph instruments. Specifically, we would inquire about the relationships between $S_{ij}$ and $S_{ij}^{PDR}$ and between $S_{ij}$ and $S_{ij}^{ActiGraph}$, where $S_{ij}$, $S_{ij}^{PDR}$, and $S_{ij}^{ActiGraph}$ are the hours individual $i$ spent in sedentary behaviors on day $j$ in truth, as estimated by the PDR, and as estimated by the ActiGraph. Similarly, we would inquire about the relationships between $A_{ij}$ and $A_{ij}^{PDR}$ and between $A_{ij}$ and $A_{ij}^{ActiGraph}$, where $A_{ij}$, $A_{ij}^{PDR}$, and $A_{ij}^{ActiGraph}$ are the hours spent in active behavior in truth, and as estimated by the PDR and ActiGraph. For our purposes, as fully explained in the supplementary material (see Appendix 2, Supplemental Digital Content 2, http://links.lww.com/MSS/A221, Full Description of Measurement Error Modeling Methods and Assumptions), we chose to treat the activPAL measures of sedentary and active behavior, $S_{ij}^{activPAL}$ and $A_{ij}^{activPAL}$, respectively, as error-free estimates of the truth. Therefore, we model the desired relationships as

$$S_{ij}^{T} = \beta_0 + \beta_1 S_{ij}^{activPAL} + r_i + \varepsilon_{ij} \tag{1}$$

$$A_{ij}^{T} = \beta_0 + \beta_1 A_{ij}^{activPAL} + r_i + \varepsilon_{ij} \tag{2}$$

where the general superscript $T$ can be replaced by either PDR or ActiGraph, $r_i$ is the person-specific bias, and $\varepsilon_{ij}$ is the random error for the test instrument. We further assume that $r_i$ and $\varepsilon_{ij}$ are independent and normally distributed with mean 0 and variances $\sigma_r^2$ and $\sigma_\varepsilon^2$, respectively.
The four parameters describing the quality of the test instrument are $\beta_0$, $\beta_1$, $\sigma_r^2$, and $\sigma_\varepsilon^2$. The intercept, $\beta_0$, and slope, $\beta_1$, indicate whether the test instrument, on average, correctly estimates the duration of a given behavior. The ideal values of $\beta_0$ and $\beta_1$ would be 0 and 1, respectively, indicating that the time reported by the test instrument measure is, on average, proportional to the reference instrument. The variance, $\sigma_r^2$, of the person-specific bias (i.e., between-individual variance) measures the magnitude of systematic over- or underestimation, while the variance, $\sigma_\varepsilon^2$, of the random error (i.e., within-individual variance) reflects nonsystematic or random measurement error. Ideally, both variances should be near 0. To obtain estimates of the desired coefficients, we fitted the models to the individual days of observation using linear mixed models by lmer from the lme4 package in R and calculated standard errors from 1000 bootstrapped samples. In addition, the mean difference of each participant’s average values (i.e., mean of available days), the SD for those differences (SDdif), and the coefficient of variation for those differences ($CV_{dif}\% = 100 \times SD_{dif} / \bar{x}$, where $\bar{x}$ is the average value) were also compared. Further comparisons of these data were made by the Bland–Altman approach (5) and using Spearman correlations.
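The difference-based summaries named at the end of this paragraph can be illustrated with the Python standard library. This is a sketch under stated assumptions rather than the analysis code: the function name is hypothetical, the sample (n − 1) standard deviation is assumed, the coefficient of variation is scaled by the mean of the reference measure (the text says only "the average value"), and the limits of agreement use the conventional mean difference ± 1.96 SD.

```python
# Illustrative agreement summary; names and scaling choices are assumptions.
from statistics import mean, stdev


def agreement_summary(test_means, reference_means):
    """Summarize agreement between per-participant averages of two instruments."""
    diffs = [t - r for t, r in zip(test_means, reference_means)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the differences (assumption)
    reference_average = mean(reference_means)  # denominator choice is an assumption
    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "cv_diff_pct": 100.0 * sd_diff / reference_average,
        "limits_of_agreement": (mean_diff - 1.96 * sd_diff,
                                mean_diff + 1.96 * sd_diff),
    }


# Hypothetical per-participant averages in h/d, for illustration only.
summary = agreement_summary([9.0, 10.0, 11.0, 8.0], [8.0, 10.0, 10.0, 8.0])
```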
In secondary analyses, we evaluated the PDR reports of light- and moderate–vigorous-intensity activity. To do so, we used ActiGraph estimates of these metrics as the reference measure using the structure of equation 2 but replaced the activPAL measure of active time with the corresponding ActiGraph estimate of light or moderate–vigorous activity. This approach assumes that the estimate of light and moderate–vigorous activity from the ActiGraph is an unbiased and precise estimate of the truth. Given the uncertainty regarding this assumption in free-living conditions, estimates of the four parameters of interest for the PDR, in this case, could be biased away from their true values. We also report the mean differences and 95% confidence intervals between the PDR and the ActiGraph estimates of light and moderate–vigorous activity.
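The roles of the four model parameters (intercept, slope, person-specific bias variance, and random error variance) can be made concrete by simulating test-instrument values from reference values. The sketch below only generates data under the stated model; it does not fit the model, which the study did with lmer in R. The function name, the data layout, and the default seed are hypothetical.

```python
# Illustrative data generator for the measurement error model; names are hypothetical.
import random


def simulate_test_instrument(reference, beta0, beta1, sd_person, sd_error, seed=0):
    """Generate test-instrument hours from reference hours.

    reference maps each participant to a list of daily reference values (h/d).
    Each participant receives one person-specific bias draw; each day receives
    its own random error draw.
    """
    rng = random.Random(seed)
    simulated = {}
    for person, days in reference.items():
        person_bias = rng.gauss(0.0, sd_person)  # shared across that person's days
        simulated[person] = [
            beta0 + beta1 * x + person_bias + rng.gauss(0.0, sd_error)
            for x in days
        ]
    return simulated
```

With an intercept of 0, a slope of 1, and both standard deviations set to 0, the simulated values equal the reference values, which is the ideal case described in the text. Increasing `sd_person` shifts all of a participant's days together, whereas increasing `sd_error` scatters individual days.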
RESULTS
Descriptive characteristics of our study sample are presented in Table 1.
The level of agreement between the PDR and the activPAL is reported in Table 2, listing both the estimated coefficients for the mixed model and their correlations. Agreement between PDR and activPAL was high in the adults and boys. Among adults, the slopes of the regression of PDR on activPAL, for both sedentary and active time, were approximately 1 ($\beta_1$ = 0.97–1.13), and the correlation between relevant pairs of measures was high ($\rho$ = 0.77–0.81). The decomposition of the error variance in the recalls revealed that random errors ($\sigma^2_\varepsilon$) tended to be larger than person-specific biases ($\sigma^2_r$). Among boys, slope values were also close to 1 ($\beta_1$ = 0.88 and 0.96), and the correlations were similar to those for adults ($\rho$ = 0.75 and 0.80). Agreement was slightly weaker in girls. Girls had lower slope values ($\beta_1$ = 0.64 and 0.80) and lower correlations ($\rho$ = 0.52 and 0.60) in comparison with adults and boys. Girls also had the lowest person-specific bias.
Results comparing the ActiGraph and the activPAL are presented in Table 3. For all groups, slope terms were less than 1 ($\beta_1$ = 0.61–0.73). Correlations were high for adults ($\rho$ = 0.74–0.79) but slightly lower for adolescents ($\rho$ = 0.57–0.70). The decomposition of the error variance for the ActiGraph measures revealed that device-specific bias ($\sigma^2_r$) tended to be larger than random errors ($\sigma^2_\varepsilon$), particularly among males.
The t-tests for mean differences and Bland–Altman results are reported in Table 4. The evaluation of PDR versus activPAL revealed no statistically significant mean differences in time spent in active behaviors (all P ≥ 0.16), but reported sedentary time was greater than activPAL sedentary time in all groups (P ≤ 0.01). The CVdif% ranged from 15% to 32%, and the limits of agreement were wide. Spearman correlations for the difference scores between measures and the average of both measures were generally positive and were statistically significant in adults only. The evaluation of mean differences between ActiGraph and activPAL revealed no statistically significant differences in adults, but the ActiGraph underestimated sedentary time and overestimated active time in adolescents (both P < 0.01). The CVdif% ranged from 12% to 35%, and the limits of agreement were wide. Spearman correlations between the ActiGraph difference scores tended to be negative and were statistically significant in women.
Sensitivity analyses of the measurement error models on data not matched on PDR observation or monitor wear time revealed that in unmatched analyses, there was a substantial increase in the amount of random error ($\sigma^2_\varepsilon$) and modest reductions of 0.1–0.2 units in the slope terms ($\beta_1$) and the correlations ($\rho$) for both the PDR (see Table 1, Supplemental Digital Content 3, http://links.lww.com/MSS/A218, Results for Previous-Day Recall without Matching) and ActiGraph (see Table 2, Supplemental Digital Content 4, http://links.lww.com/MSS/A219, Results for ActiGraph without Matching) compared with matched analyses presented in Tables 2 and 3, respectively. The evaluation of the PDR–activPAL data for days matched and unmatched on observation time using % sedentary indices, another method to control for differences in observation time, revealed only minimal variation in results for the slope, correlation, and random error terms by matching status (see Table 3, Supplemental Digital Content 5, http://links.lww.com/MSS/A220, Results for Previous-Day Recall % Sedentary Time with and without Matching).
Correlates of reporting errors in PDR.
The unexplained differences between PDR and activPAL (i.e., residuals) were not significantly correlated with age, sex, body mass index (BMI), or social desirability in either adults or adolescents. For example, the Spearman correlations between PDR residuals for time reported in active behaviors and BMI (kg·m−2) and social desirability were 0.03 (P = 0.81) and −0.002 (P = 0.98) in adults, and 0.01 (P = 0.94) and −0.05 (P = 0.63) in adolescents, respectively. Spearman correlations between PDR residuals for time reported in sedentary behaviors and BMI and social desirability were 0.02 (P = 0.85) and −0.03 (P = 0.79) in adults and 0.08 (P = 0.44) and 0.14 (P = 0.20) in adolescents, respectively.
Estimates of light and moderate–vigorous physical activity by PDR.
We also evaluated PDR-reported light- and moderate–vigorous-intensity activity using common ActiGraph cut points. The comparison of mean differences revealed that PDR reports of light activity tended to be lower than ActiGraph (100–759 cpm), but there were no significant differences in moderate–vigorous activity by PDR and the ActiGraph (760+ cpm) (i.e., the 95% confidence intervals include 0; Fig. 1). Approximately 1–1.5 h more moderate–vigorous activity was reported on the PDR than recorded by ActiGraph 1952+ cpm estimates. Results of the evaluation of PDR-reported overall, light, and moderate–vigorous activity using measurement error models with the ActiGraph as the reference measure are reported in Table 5. Briefly, for light activity, the slope terms were less than 1 ($\beta_1$ = 0.34–0.84), and correlations were $\rho$ = 0.41–0.63 in adults and boys but lower in girls ($\rho$ = 0.18). Using the 760+ cpm moderate–vigorous activity cut point as reference, among adults and boys, the slope terms were approximately 1 ($\beta_1$ = 0.88–1.14) and the correlations were $\rho$ = 0.49–0.63. Both indicators were lower in girls ($\beta_1$ = 0.67 and $\rho$ = 0.39). In all models, person-specific bias ($\sigma^2_r$) tended to be less than random error ($\sigma^2_\varepsilon$).
DISCUSSION
In this study of free-living adolescents and adults, we found self-report of time spent in physically active and sedentary behaviors by PDR to be strongly correlated with activPAL measures, particularly in adults, and that random reporting errors were larger than person-specific biases. Consistent with our finding of relatively low amounts of person-specific bias, or a person’s proclivity to systematically over- or underreport physical activity, we also found no correlation between age, BMI, or social desirability and reporting errors on the PDR. Notably, the validity of the PDR was comparable with that of the ActiGraph when each was evaluated against the activPAL for physically active and sedentary behaviors. The PDR also appeared to provide useful estimates of light- and moderate–vigorous-intensity activity in comparison with commonly used ActiGraph cut points. Collectively, results from this study indicate that PDR-based estimates of physically active and sedentary time are valid and unbiased measures of time reported in active and sedentary pursuits. PDR may be a valuable alternative to traditional questionnaire-based measures of these behaviors in future epidemiological studies, particularly those interested in measuring time spent in different body postures (i.e., sitting vs standing/active), in specific types of behavior, as well as the location and purpose of activity-related behaviors.
In contrast to most physical activity questionnaires that are designed to assess usual activity levels (e.g., past year), which typically have validity coefficients of 0.3–0.5 when compared with doubly labeled water or accelerometer-based measures (34,49), we found much higher levels of validity in adults and boys (0.75–0.81) and somewhat better results for girls (0.52–0.60) for overall time in physically active and sedentary behavior. Several studies that have examined the validity of various short-term recall approaches are consistent with our results. Hart et al. (16) compared activPAL sitting time to estimates from a physical activity log completed throughout the day and reported a strong correlation (r = 0.87) and no mean differences between measures. van der Ploeg et al. (48) compared diary-based time-use estimates of nonoccupational time on two separate days to the ActiGraph using 100 and 760 cpm cut points and reported correlations that were similar to our results for sedentary time (r = 0.57–0.59), light activity (r = 0.27–0.39), and moderate–vigorous activity (r = 0.57–0.69). Ridley et al. (40) compared a computer-based PDR to accelerometer counts in youth and, for those 11 yr or older, reported correlations of 0.57 for overall physical activity level and 0.41 for moderate–vigorous activity reported on the recall. Calabro et al. (8) compared an earlier version of the PDR to two pattern recognition monitors and reported strong correlations (r = 0.89–0.91) with total energy expenditure (kcal·kg−1·d−1) and slightly lower correlations for moderate–vigorous activity (r = 0.57–0.70). Consistent with our findings indicating little correlation between reporting bias and BMI and social desirability for physical activity, Adams et al. (1) reported no evidence of an association between reporting bias and these correlates for multiple PDRs compared with doubly labeled water in 81 postmenopausal women.
The present study extends the finding of an apparent absence of social desirability bias on the PDR to sedentary behaviors, as well as to men and adolescents. In contrast, Klesges et al. (22) found a positive correlation between physical activity reporting errors and social desirability in African American girls (8–10 yr) enrolled in an intervention. Differences between our studies may be due to the physical activity measures used, cultural factors, additional demands associated with being in an intervention, or the younger age group in the Klesges et al. (22) report.
In contrast to our findings of small mean differences and the predominance of random reporting errors in self-report, Nusser et al. (35) reported that estimates of total energy expenditure (TEE, kcal·d−1) derived from a PDR were greater than values estimated by a pattern recognition activity monitor among 171 women. Also using measurement error models, they reported that person-specific biases were three to four times greater than random error in the recalls. There are several possible explanations for the differences between our studies. First, the use of TEE values (kcal·d−1) as a proxy measure of “physical activity” is susceptible to errors in estimating resting energy expenditure—the major component of TEE. A positive bias in TEE by recall may have been introduced if resting expenditure was estimated as 1 MET because this method overestimates this quantity, and the bias gets larger as BMI level increases (7). This type of bias, unrelated to participant reporting, may have inflated their estimates of person-specific bias. Second, the relatively strict modeling assumption that the reference instrument is unbiased (32) may not have been adequately fulfilled in their study. The reference measure used has been found to systematically underestimate TEE at higher levels of expenditure (21,44), and it may be that an underestimate of TEE by the reference instrument could result in the appearance of larger amounts of person-specific bias in the study (35), regardless of actual reporting accuracy. Additional analyses of these data using different indices of physical activity and considering potential limitations of the available reference instrument would be valuable to further our understanding of the similarities and differences in results for our respective studies.
The present study had several strengths that merit comment. Our study sample was relatively large (n = 180), it was composed of both adolescents and adults, and we were able to evaluate the validity of the most basic information reported on the PDR (i.e., activity type [active and sedentary] and duration) compared with the activPAL instrument that used similar definitions of these constructs. Our inclusion of the parallel analysis of the validity of the ActiGraph, an established instrument (28,47), provides a benchmark against which the PDR results can be compared. Results suggest that the PDR was comparable with the ActiGraph as a measure of physically active and sedentary behaviors with respect to the linear relation, strength of correlation, and low levels of systematic error (person-specific bias) when compared with the activPAL. We also evaluated time reported in light and moderate–vigorous activity in comparison with estimates from the ActiGraph using a cut point (760 cpm) that was consistent with the scope of the PDR (i.e., assessment of the full range of moderate–vigorous-intensity activities). Results provide some evidence for the value of light and moderate–vigorous activity reported on the PDR. The similarity of PDR-reported moderate–vigorous activity time with the ActiGraph 760 cpm cut point values supports the accuracy of both instruments for this metric, and this finding is largely consistent with several free-living studies (8,48,50). Our finding that PDR reports of moderate–vigorous activity were greater than estimates derived from the ActiGraph cut point calibrated only to walking and running (1952 cpm) is also consistent with reports indicating underrecording of moderate–vigorous time by monitors calibrated in this way, in comparison with other accelerometers (50), activity diaries (29,43), and indirect calorimetry (14).
An additional strength was our detailed sensitivity analyses to evaluate key aspects of our matching approach to minimizing the known variation in daily observation time between measures. Results suggested that variation in observation time between the instrument being tested and the reference measure was a substantial source of random error that may have exerted a modest negative influence on the slope terms and the correlations between measures and thus supported our decision to use matching to minimize this source of variability.
There are also limitations to the present study that must be considered. First, our study population was a convenience sample of adolescents and adults who were primarily Caucasian, well educated, and largely working or going to school during their time of study participation. Results may be different in study populations with different demographic characteristics and work and school schedules. Second, our study design limited our ability to investigate possible differences between weekdays and weekend days using measurement error models. To do so would require replicate measures on both weekdays and weekend days, and we only assessed weekend days once. Future studies should examine this issue more closely because measures of behavior on both types of day may be needed to generate useful long-term averages (e.g., per week or year) (35). In addition, we did not address the important question regarding the number of replicate PDR measures that are needed to account for seasonal and day of the week effects as well as true variation in behavior from day to day (e.g., ). In the context of large-scale epidemiologic investigations, we have recently shown that a relatively small number of replicate recalls (e.g., three to four recalls) obtained using random sampling can substantially reduce the effect of day-to-day variation on behavior–disease associations (32), but more research is needed to enhance our knowledge in this area (e.g., ). Another limitation of our study was the lack of an accurate and precise measure of physical activity intensity for use as a reference measurement. Although the convergence of the PDR and ActiGraph (760 cpm) estimates for moderate–vigorous activity lends some support for both instruments, uncertainty remains regarding the precision of the ActiGraph measures, and this source of error could lead to underestimation of the apparent validity of the PDR.
Indeed, we observed that the correlations with overall active time were 15%–25% lower, and the variance estimates for person-specific bias and random error were larger, when the ActiGraph was used to evaluate the PDR (see Table 5), compared with results using the activPAL for reference (see Table 2). Thus, the use of the ActiGraph in this context may modestly underestimate the validity of PDR-reported light- and moderate–vigorous-intensity activity. Clearly, future studies are needed to extend our understanding of the validity of reports for different activity intensities using better reference measures as well as the validity of the contextual information reported on the PDR (i.e., location, purpose of activity).
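The point made above about replicate recalls can be illustrated with a minimal sketch. It assumes a classical measurement error model in which each recall equals a person's usual behavior plus independent day-to-day error, so that a regression slope estimated from the mean of k recalls is attenuated by the factor between-person variance / (between-person variance + within-person variance / k). The function name and the variance values are hypothetical and chosen only for illustration; they are not estimates from this study, and the sketch ignores person-specific bias and any correlation between errors.

```python
def attenuation_factor(between_var: float, within_var: float, n_recalls: int) -> float:
    """Attenuation of a slope when exposure is the mean of n_recalls replicates.

    Assumes classical (independent, unbiased) day-to-day error; this is an
    illustrative simplification, not the measurement error model of the study.
    """
    if n_recalls < 1:
        raise ValueError("n_recalls must be at least 1")
    if between_var <= 0 or within_var < 0:
        raise ValueError("between_var must be > 0 and within_var must be >= 0")
    # Averaging n_recalls days divides the day-to-day error variance by n_recalls.
    return between_var / (between_var + within_var / n_recalls)


if __name__ == "__main__":
    # Hypothetical variances: day-to-day variation twice the between-person variation.
    between, within = 1.0, 2.0
    for k in (1, 2, 3, 4, 7):
        print(f"{k} recall(s): attenuation factor = {attenuation_factor(between, within, k):.2f}")
```

Under these assumed variances the factor rises from about 0.33 with one recall to about 0.67 with four, which is one way to see why a small number of replicate recalls may recover much of the attenuation caused by day-to-day variation.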
This report provides proof of principle that the PDR may be a valid method for measurement of physically active and sedentary behaviors in epidemiologic studies that seek to rank-order individuals by level of a given behavioral exposure. The PDR may be particularly useful for studies that aim to assess the full range of human behavior and body position and to gather details about where and why these behaviors occur. Future studies are needed to replicate these findings among larger, more ethnically diverse study populations and to evaluate whether the interviewer-based PDR can be translated to a self-administered PDR suitable for large-scale studies (e.g., Internet-based instruments, mobile devices).
This research was supported by funding from the National Institutes of Health (grant no. R01NR011477) to Drs. Fowke and Freedson. The Intramural Research Program of the National Institutes of Health supported Dr. Matthews' work on this project.
The authors thank Cara Hanby, Mary Kay Fadden, Stacey Peterson, and Sara Hollis for their integral work in helping develop and refine the PDR method and the initial infrastructure for the present study.
Patty S. Freedson is a member of the ActiGraph Scientific Advisory Board. No other potential conflicts of interest are declared.
The results of the present study do not constitute endorsement by the American College of Sports Medicine.
1. Adams SA, Matthews CE, Moore CG, Cunningham JE, Fulton J, Hebert JR. The effect of social desirability and social approval on self-reports of physical activity. Am J Epidemiol. 2005; 161 (4): 389–98.
2. Ainsworth B, Haskell W, Whitt M, et al. Compendium of physical activities: an update of activity codes and MET intensities. Med Sci Sports Exerc. 2000; 32 (9): S498–516.
3. Atkin AJ, Gorely T, Clemes SA, et al. Methods of measurement in epidemiology: sedentary behaviour. Int J Epidemiol. 2012; 41 (5): 1460–71.
4. Baranowski T, Masse LC, Ragan BWG. How many days was that? We’re still not sure, but we’re asking the question better! Med Sci Sports Exerc. 2008; 40 (1 suppl): S544–9.
5. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986; 1 (8476): 307–10.
6. Bowles HR. Measurement of active and sedentary behaviors: closing the gaps in self-report methods. J Phys Act Health. 2012; 9 (Suppl 1): S1–4.
7. Byrne NM, Hills AP, Hunter GR, Weinsier RL, Schutz Y. Metabolic equivalent: one size does not fit all. J Appl Physiol. 2005; 99 (3): 1112–9.
8. Calabro MA, Welk GJ, Carriquiry AL, Nusser SM, Beyler NK, Matthews CE. Validation of a computerized 24-Hour Physical Activity Recall (24PAR) instrument with pattern-recognition activity monitors. J Phys Act Health. 2009; 6: 211–20.
9. Choi L, Liu Z, Matthews CE, Buchowski MS. Validation of accelerometer wear and nonwear time classification algorithm. Med Sci Sports Exerc. 2011; 43: 357–64.
10. Ferrari P, Friedenreich C, Matthews CE. The role of measurement error in estimating levels of physical activity. Am J Epidemiol. 2007; 166 (7): 832–40.
11. Freedman LS, Schatzkin A, Midthune D, Kipnis V. Dealing with dietary measurement error in nutritional cohort studies. J Natl Cancer Inst. 2011; 103: 1086–92.
12. Freedson PS, Bowles HR, Troiano R, Haskell WL. Assessment of physical activity using wearable monitors: recommendations for monitor calibration and use in the field. Med Sci Sports Exerc. 2012; 44 (1 suppl): S1–4.
13. Freedson PS, Melanson E, Sirard J. Calibration of the Computer Science and Applications, Inc. accelerometer. Med Sci Sports Exerc. 1998; 30 (5): 777–81.
14. Freedson PS, Lyden K, Kozey-Keadle S, Staudenmayer J. Evaluation of artificial neural network algorithms for predicting METs and activity type from accelerometer data: validation on an independent sample. J Appl Physiol. 2011; 111 (6): 1804–12.
15. Grant PM, Ryan CG, Tigbe WW, Granat MH. The validation of a novel activity monitor in the measurement of posture and motion during everyday activities. Br J Sports Med. 2006; 40 (12): 992–7.
16. Hart TL, Ainsworth BE, Tudor-Locke C. Objective and subjective measures of sedentary behavior and physical activity. Med Sci Sports Exerc. 2011; 43 (3): 449–56.
17. Healy GN, Clark B, Winkler EAH, Gardiner PA, Brown WJ, Matthews CE. Measurement of adults’ sedentary time in population-based studies. Am J Prev Med. 2011; 41 (2): 216–27.
18. Hebert JR, Ebbeling CB, Matthews CE, et al. Social desirability and approval-related biases in middle-aged women’s estimates of energy intake: comparing structured dietary questionnaires to total energy expenditure from doubly labeled water. Ann Epidemiol. 2002; 12: 577–86.
19. Helmerhorst HJ, Brage S, Warren J, Besson H, Ekelund U. A systematic review of reliability and objective criterion-related validity of physical activity questionnaires. Int J Behav Nutr Phys Act. 2012; 9 (103): 1–55.
20. Hendelman D, Miller K, Baggett C, Debold E, Freedson P. Validity of accelerometry for the assessment of moderate intensity physical activity in the field. Med Sci Sports Exerc. 2000; 32 (Suppl 9): 442–9.
21. Johannsen DL, Calabro MA, Stewart J, Franke W, Rood JC, Welk GJ. Accuracy of armband monitors for measuring daily energy expenditure in healthy adults. Med Sci Sports Exerc. 2010; 42 (11): 2134–40.
22. Klesges LM, Baranowski T, Beech B, et al. Social desirability bias in self-reported dietary, physical activity and weight concerns measures in 8- to 10-year-old African-American girls: results from the Girls Health Enrichment Multisite Studies (GEMS). Prev Med. 2004; 38 (Suppl): S78–87.
23. Kozey-Keadle S, Libertine A, Lyden K, Staudenmayer J, Freedson P. Validation of wearable monitors for assessing sedentary behavior. Med Sci Sports Exerc. 2011; 43 (8): 1561–7.
24. Marlowe D, Crowne DP. Social desirability and response to perceived situational demands. J Consult Psychol. 1961; 25 (2): 109–15.
25. Matthews CE. Techniques for physical activity assessment: self-report instruments. In: Welk G, Dale D, editors. Physical Activity Assessments for Health-Related Research. Champaign (IL): Human Kinetics; 2002. pp. 107–23.
26. Matthews CE. Calibration of accelerometer output for adults. Med Sci Sports Exerc. 2005; 37 (11 suppl): S512–22.
27. Matthews CE, Ainsworth BE, Hanby C, et al. Development and testing of a short physical activity recall questionnaire. Med Sci Sports Exerc. 2005; 37 (6): 986–94.
28. Matthews CE, Chen KY, Freedson PS, et al. Amount of time spent in sedentary behaviors—United States 2003–2004. Am J Epidemiol. 2008; 167: 875–81.
29. Matthews CE, Freedson PS. Field trial of a three-dimensional activity monitor: comparison with self-report. Med Sci Sports Exerc. 1995; 27 (7): 1071–8.
30. Matthews CE, George SM, Moore SC, et al. Amount of time spent in sedentary behaviors and cause-specific mortality in US adults. Am J Clin Nutr. 2012; 95: 437–45.
31. Matthews CE, Hebert JR, Freedson PS, et al. Sources of variance in daily physical activity levels in the seasonal variation of blood cholesterol study. Am J Epidemiol. 2001; 153 (10): 987–95.
32. Matthews CE, Moore SC, George SM, Sampson J, Bowles HR. Improving self-reports of active and sedentary behaviors in large epidemiologic studies. Exerc Sports Sci Rev. 2012; 40 (3): 118–26.
33. Moore SC, Gierach GL, Schatzkin A, Matthews CE. Physical activity, sedentary behaviours, and the prevention of endometrial cancer. Br J Cancer. 2010; 103 (7): 933–8.
34. Neilson HK, Robson PJ, Friedenreich CM, Csizmadi I. Estimating activity energy expenditure: how valid are physical activity questionnaires? Am J Clin Nutr. 2008; 87 (2): 279–91.
35. Nusser SM, Beyler NK, Welk GJ, Carriquiry AL, Fuller WA, King BMN. Modeling errors in physical activity recall data. J Phys Act Health. 2012; 9 (Suppl 1): S56–67.
36. Owen N, Sparling PB, Healy GN, Dunstan DW, Matthews CE. Sedentary behavior: emerging evidence for a new health risk. Mayo Clin Proc. 2010; 85 (12): 1138–41.
37. PAL Technologies. activPAL Operating Guide. Appendix A—Technical Description. p. 15–17. 2010.
38. Physical Activity Guidelines Advisory Committee. Physical Activity Guidelines Advisory Committee Report. Washington (DC): U.S. Department of Health and Human Services; 2008. pp. A1–10.
39. Reynolds CR, Paget KR. National normative and reliability data for the Revised Children’s Manifest Anxiety Scale. School Psych Rev. 1983; 12: 324–36.
40. Ridley K, Olds T, Hill A. The multimedia activity recall for children and adolescents (MARCA): development and evaluation. Int J Behav Nutr Phys Act. 2006; 3 (1): 10.
41. Sallis JF. A collection of physical activity questionnaires for health-related research: Seven-day physical activity recall. In: Kriska AM, Casperson CJ, editors. A Collection of Physical Activity Questionnaires. Med Sci Sports Exerc. 1997; 29 (6 suppl): S89–103.
42. Schatzkin A, Subar AF, Moore S, et al. Observational epidemiologic studies of nutrition and cancer: the next generation (with better observation). Cancer Epidemiol Biomarkers Prev. 2009; 18 (4): 1026–32.
43. Sirard JR, Melanson EL, Li L, Freedson PS. Field evaluation of the Computer Science and Applications, Inc. physical activity monitor. Med Sci Sports Exerc. 2000; 32 (3): 695–700.
44. St-Onge M, Mignault D, Allison DB, Rabasa-Lhoret R. Evaluation of a portable device to measure daily energy expenditure in free-living adults. Am J Clin Nutr. 2007; 85 (3): 742–9.
45. Thorp AA, Owen N, Neuhaus M, Dunstan DW. Sedentary behaviors and subsequent health outcomes in adults: a systematic review of longitudinal studies, 1996–2011. Am J Prev Med. 2011; 41 (2): 207–15.
46. Tourangeau R, Rips LJ, Rasinski K. Editing of responses: reporting about sensitive topics. In: Tourangeau R, Rips LJ, Rasinski K, editors. The Psychology of Survey Response. 1st ed. Cambridge: Cambridge University Press; 2000. pp. 257–8.
47. Troiano RP, Berrigan D, Dodd KW, Masse LC, Tilert T, McDowell M. Physical activity in the United States measured by accelerometer. Med Sci Sports Exerc. 2008; 40 (1): 181–8.
48. van der Ploeg HP, Merom D, Chau JY, Bittman M, Trost SG, Bauman AE. Advances in population surveillance for physical activity and sedentary behavior: reliability and validity of time use surveys. Am J Epidemiol. 2010; 172 (10): 1199–206.
49. van Poppel MNM, Chinapaw MJM, Mokkink LB, van Mechelen W, Terwee CB. Physical activity questionnaires for adults: a systematic review of measurement properties. Sports Med. 2010; 40 (7): 565–600.
50. Welk GJ, McClain JJ, Eisenmann JC, Wickel EE. Field validation of the MTI Actigraph and BodyMedia armband monitor using the IDEEA monitor. Obesity. 2007; 15 (4): 918–28.
Keywords: EXPOSURE ASSESSMENT; MEASUREMENT ERROR; PHYSICAL ACTIVITY; BEHAVIORAL EPIDEMIOLOGY
© 2013 American College of Sports Medicine