Accurate assessment of physical activity is critical when examining the associations between physical activity and chronic diseases and disabilities. Accelerometer-based monitors have emerged as valid tools to directly quantify movement that results from physical activity and sedentary behaviors (1,2). However, this rapidly evolving technology has led to a need for population-based studies to constantly keep pace with the latest accelerometer models and best practices, particularly related to data reduction and processing techniques. ActiGraph LLC (Pensacola, FL) is one of the leading manufacturers of research-grade accelerometers. Since the company’s inception, multiple models have been released for use in studies, including the AM 7164 (uniaxial), GT1M (uniaxial), and GT3X series (triaxial), which includes the GT3X+ (extra battery storage) and wGT3X-BT (extra battery storage plus wireless Bluetooth capabilities).
Prior population-based studies have used the ActiGraph 7164 to assess physical activity and sedentary behaviors (3,4); however, the older-generation models are no longer commercially available and have since been replaced with newer-generation devices. Little is known about the comparability of the old- and new-generation ActiGraphs. This gap in knowledge must be addressed to make accurate comparisons both within studies that examine change in activity patterns over time using two different types of instruments and between studies. Some evidence suggests that the ActiGraph 7164 and GT3X+ models are comparable (when using the low-frequency extension for the GT3X+) for time spent sedentary and in light- (LPA), moderate- (MPA), and vigorous-intensity physical activity (VPA) (5); however, findings are not consistent (6,7).
The Coronary Artery Risk Development in Young Adults (CARDIA) study included accelerometer-based measures of physical activity and sedentary time in 2005–2006 (year 20) and again in 2015–2016 (year 30) using the ActiGraph 7164 and wGT3X-BT models, respectively. The primary objective of assessing physical activity at two separate examinations was to evaluate the 10-yr changes in physical activity and sedentary behaviors from early to late midlife in relation to cardiovascular disease risk. However, to do so, it is first necessary to examine the comparability of these two monitors. For this purpose, a subset of participants who had worn the ActiGraph 7164 at the year 20 examination were asked to simultaneously wear the ActiGraph 7164 and the wGT3X-BT for 7 consecutive days at the year 30 examination.
The primary objective of the present study was to examine the comparability, including agreement, between the ActiGraph 7164 and wGT3X-BT (using the low-frequency extension) in wear time, count-based estimates (vertical axis), and average time spent per day in sedentary, LPA, MPA, VPA, and moderate-to-vigorous physical activity (MVPA), assessed during the year 30 CARDIA examination. If differences were observed, our secondary objective was to develop and test a calibration formula to enhance comparability of the two monitors. Together, these steps outline a methodological approach that could be applied to other research scenarios where accelerometer data harmonization is necessary. In exploratory analyses, we also examined differences in comparability by race/sex group, as well as body mass index (BMI) category.
CARDIA is an ongoing prospective cohort study of 5115 black and white men and women recruited for an in-person clinical examination in 1985–1986 (year 0) at the ages of 18–30 yr from four centers: Birmingham, Alabama; Minneapolis, Minnesota; Chicago, Illinois; and Oakland, California. Participants completed additional in-person clinic examinations in 1987–1988 (year 2), 1990–1991 (year 5), 1992–1993 (year 7), 1995–1996 (year 10), 2000–2001 (year 15), 2005–2006 (year 20), 2010–2011 (year 25), and 2015–2016 (year 30). Details on eligibility criteria, methods of participant selection, and follow-up have been previously reported (8).
As part of the CARDIA Activity Study, an ancillary study conducted in conjunction with the year 30 CARDIA examination, 100 ambulatory participants (25 from each of the four race/sex groups: black men, black women, white men, white women) from the Oakland, CA, site were asked to simultaneously wear the 7164 and wGT3X-BT accelerometers for 7 d. The wGT3X-BT was included as part of the general CARDIA Activity Study protocol that was conducted at each of the four CARDIA sites. A total of 91 participants conformed to wear rules. Four devices malfunctioned (all 7164, which had been purchased in 2004); therefore, data from 87 participants were available for analyses. The institutional review board at each center approved all study protocols for the core CARDIA examination and the CARDIA Activity Study. Written informed consent was obtained at each examination, separately for the core and ancillary study.
Data handling and processing
Both the 7164 and wGT3X-BT were initialized to start collecting data at 12:00 AM on day 1 of data collection (day after the in-person examination) using the participant’s unique study ID number. The monitors were worn at the waist on the same elastic belt, approximately 10 cm apart. The 7164 model was initialized to collect data in 60-s epochs, and for the wGT3X-BT devices, raw triaxial data were sampled at 40 Hz. When the accelerometers and wear-time log were returned to the Oakland site, data from both devices were downloaded and prepared for processing and analysis. Raw data from the wGT3X-BT were reintegrated to 60-s epochs with the low-frequency extension applied using ActiLife software to maximize compatibility with the 7164 data. Files from both devices were screened for wear time using the Troiano algorithm via modified SAS programs developed for processing National Health and Nutrition Examination Survey 2003–2004 accelerometer data (4,9). The low-frequency extension was applied to the data based on previous work indicating that this option has greater sensitivity at lower-intensity activities and results in more comparable output with older accelerometer models (5,6). Weekly summary physical activity and sedentary behavior estimates were averaged (across days) for all participants with at least 4 valid days of 10 or more hours of wear time based on criteria established for National Health and Nutrition Examination Survey 2003–2004 and 2005–2006 cycles (4). Total and average accelerometer counts per day were calculated using summed counts detected over wear periods and time (i.e., minutes) spent in different intensity levels using standardized cut-point threshold values (4,10,11). For this analysis, Freedson cut-point threshold values were applied. In counts per minute, sedentary time was defined as <100, LPA as 100–1951, MPA as 1952–5724, and VPA as ≥5725 (11).
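The intensity classification and valid-day rule described above can be illustrated with a minimal Python sketch. It is not the SAS code used in the study: the function names are hypothetical, the input is assumed to be a list of 60-s epoch counts that have already been screened for wear time (the Troiano non-wear algorithm itself is not implemented), and only the cut points and the rule of at least 4 days with 10 or more hours of wear come from the text.

```python
from typing import Dict, List, Optional

# Freedson cut points in counts per minute, as stated in the text.
SEDENTARY_MAX = 100   # sedentary: < 100
MPA_MIN = 1952        # LPA: 100-1951, MPA: 1952-5724
VPA_MIN = 5725        # VPA: >= 5725


def classify_epoch(cpm: float) -> str:
    """Assign one 60-s epoch to an intensity category."""
    if cpm < SEDENTARY_MAX:
        return "sedentary"
    if cpm < MPA_MIN:
        return "LPA"
    if cpm < VPA_MIN:
        return "MPA"
    return "VPA"


def minutes_by_intensity(wear_counts: List[float]) -> Dict[str, int]:
    """Count minutes per category for one day of wear-time epochs."""
    totals = {"sedentary": 0, "LPA": 0, "MPA": 0, "VPA": 0}
    for cpm in wear_counts:
        totals[classify_epoch(cpm)] += 1
    totals["MVPA"] = totals["MPA"] + totals["VPA"]
    return totals


def weekly_average(
    days: List[List[float]],
    min_days: int = 4,
    min_wear_minutes: int = 600,
) -> Optional[Dict[str, float]]:
    """Average daily minutes over valid days (>= 10 h of wear).

    Returns None when fewer than `min_days` valid days are available.
    Each inner list is assumed to hold wear-time epochs only.
    """
    valid = [d for d in days if len(d) >= min_wear_minutes]
    if len(valid) < min_days:
        return None
    daily = [minutes_by_intensity(d) for d in valid]
    return {k: sum(day[k] for day in daily) / len(daily) for k in daily[0]}
```

Because the thresholds are compared with `<` rather than matched against integer ranges, the same function also handles non-integer counts, which arise once epoch counts are rescaled.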
Descriptive variables include self-reported race, sex, and age. BMI was calculated using measured height and weight (kg·m−2). Smoking behaviors were self-reported via a tobacco use questionnaire (previously validated by a study using serum cotinine levels) (12).
Minute-by-minute within-person Pearson correlations of counts per minute per day were examined. Next, paired t-tests or Wilcoxon signed ranks tests (VPA only) compared the values for the 7164 and wGT3X-BT in estimates from the vertical axis, including the following: wear time, count-based estimates, and average minutes per day in sedentary, LPA, MPA, VPA, and MVPA. Count-based estimates systematically differed between the two accelerometers; therefore, a calibration formula was applied to the wGT3X-BT values. We chose to calibrate the wGT3X-BT values, rather than the 7164 values, because the Freedson cut points for physical activity intensities were originally developed using the 7164 accelerometer (11). Linear regression of the wGT3X-BT on the 7164 counts per minute per day passed very close to the origin; therefore, the slope for the 7164 counts per minute per day (1.088) was used as the calibration proportionality factor. The wGT3X-BT count-based estimates and average minutes per day in sedentary, LPA, MPA, VPA, and MVPA were calibrated by dividing the count in each 60-s epoch by 1.088. Then, paired t-tests or Wilcoxon signed ranks tests were repeated to compare values for count-based estimates, sedentary, LPA, MPA, VPA, and MVPA between the 7164 and calibrated wGT3X-BT (see Figure, Supplemental Digital Content 1, Conceptual schematic of the calibration method, http://links.lww.com/MSS/B207).
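The calibration step can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the text reports an ordinary regression whose line passed close to the origin, whereas the hypothetical `slope_through_origin` helper below forces the line through the origin as a simplification. Only the factor 1.088 and the rule of dividing each 60-s epoch count by it come from the text.

```python
from typing import List, Sequence

CALIBRATION_FACTOR = 1.088  # slope reported in the text


def slope_through_origin(
    x_7164: Sequence[float], y_wgt3x: Sequence[float]
) -> float:
    """Least-squares slope of y on x with the intercept fixed at zero.

    x_7164 and y_wgt3x are per-participant counts per minute per day.
    Fixing the intercept at zero is an assumption made for this sketch.
    """
    if len(x_7164) != len(y_wgt3x) or len(x_7164) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    sxx = sum(x * x for x in x_7164)
    if sxx == 0:
        raise ValueError("x values must not all be zero")
    sxy = sum(x * y for x, y in zip(x_7164, y_wgt3x))
    return sxy / sxx


def calibrate_epochs(
    wgt3x_counts: Sequence[float], factor: float = CALIBRATION_FACTOR
) -> List[float]:
    """Rescale wGT3X-BT epoch counts onto the 7164 scale."""
    return [c / factor for c in wgt3x_counts]
```

Rescaling happens at the epoch level, before intensity classification, so epochs near a cut point can change category. For example, a wGT3X-BT epoch of 2000 counts becomes about 1838 counts after division by 1.088, which moves it from MPA to LPA under the Freedson thresholds.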
Agreement between the models, using both the original and calibrated wGT3X-BT values, was further evaluated by calculating intraclass correlation coefficients (ICC), with values of <0.5, 0.5–0.75, 0.75–0.9, and >0.9 indicative of poor, moderate, good, and excellent reliability (13). Bland–Altman plots were created in Microsoft Excel using both the original and calibrated wGT3X-BT values to assess the limits of agreement, which were set at ±2 SD of the difference scores. The difference between the 7164 and calibrated wGT3X-BT values by race/sex group and BMI category was evaluated using Kruskal–Wallis tests. Statistical analyses were conducted using SAS 9.4 (SAS Institute, Inc., Cary, NC) and SPSS 24 (IBM Corp., Chicago, IL). All tests were two-sided, with statistical significance set at P < 0.05.
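As an illustration of the agreement statistics described above, the following Python sketch computes the mean difference (bias) and limits of agreement at ±2 SD of the difference scores, and maps an ICC value to the reliability labels given in the text. The function names are hypothetical, the use of the sample SD is an assumption, and the handling of the boundary values 0.5, 0.75, and 0.9 is also an assumption because the text does not specify it.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman_limits(
    a_7164: Sequence[float], b_wgt3x: Sequence[float], k: float = 2.0
) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired values.

    Differences are taken as 7164 minus wGT3X-BT, so a negative bias
    means the 7164 reported lower values. Uses the sample SD (n - 1).
    """
    if len(a_7164) != len(b_wgt3x):
        raise ValueError("inputs must be of equal length")
    diffs = [a - b for a, b in zip(a_7164, b_wgt3x)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - k * sd, bias + k * sd


def icc_label(icc: float) -> str:
    """Map an ICC to the reliability categories used in the text."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"
```

With limits defined this way, the bias sits at the midpoint of the two limits, which gives a quick consistency check when reading limits of agreement reported alongside a mean difference.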
As seen in Table 1, participants’ age averaged 55.4 ± 3.4 yr (range, 48.0–61.0 yr). Average BMI (kg·m−2) was 28.3 ± 5.3 for black men, 28.0 ± 3.9 for white men, 30.6 ± 7.6 for black women, and 25.4 ± 8.6 for white women. Participants wore the two accelerometers for an average of 6.7 ± 1.1 d (range, 4.0–9.0 d) and 14.7 ± 1.3 h·d−1 (range, 11.9–19.0 h·d−1). Before calibration, the minute-by-minute within-person correlations of counts per minute per day identified four dysfunctional 7164 devices (average correlation = 0.03). After exclusion of these devices, the minute-by-minute correlations of counts per minute per day in the final sample (N = 87) averaged 0.74 (r = 0.34–0.49, n = 12; r = 0.50–0.74, n = 31; r ≥ 0.75, n = 44). Although not statistically significant, in comparison to those with high minute-by-minute correlations (r ≥ 0.75), individuals with lower correlations (r < 0.75) wore the accelerometers for fewer days (6.9 ± 0.7 vs 6.5 ± 1.3, P = 0.08) and for fewer hours per day (14.9 ± 1.3 vs 14.4 ± 1.3, P = 0.06).
Comparisons between 7164 and original wGT3X-BT values
There were no significant differences between the 7164 and original wGT3X-BT values for total wear time (mean difference, 1.3 ± 33.5 min·d−1; P = 0.72; see Table 2). The 7164 reported significantly lower values for total accelerometer counts per day (−23,924.6, P < 0.001), average counts per minute per day (−27.5, P < 0.001), LPA (−6.8, P = 0.02), MPA (−3.8, P < 0.001), and MVPA (−4.1, P < 0.001), and significantly higher values for sedentary minutes per day (12.1, P < 0.001) as compared with the wGT3X-BT. No differences were observed for VPA minutes per day (0.0, P = 0.22). The overall difference between monitors in total counts per day was approximately 7% (mean counts per day difference between monitors using original wGT3X-BT data divided by mean counts per day for the 7164).
Comparisons between 7164 and calibrated wGT3X-BT values
Calibrated wGT3X-BT values were obtained by dividing counts per minute per day by 1.088 on the basis of regression of the wGT3X-BT on the 7164. The value 1.088 was similar to the value obtained (1.081) when only examining the 44 people with minute-by-minute count correlations of ≥0.75. When comparing the 7164 and calibrated wGT3X-BT values, there were no significant differences between the two devices in total accelerometer counts per day (3098.9, P = 0.48), average counts per minute per day (3.0, P = 0.54), sedentary (3.6, P = 0.23), LPA (−3.5, P = 0.22), MPA (0.7, P = 0.31), or MVPA minutes per day (1.2, P = 0.13). However, there was a significant difference between the two monitors in VPA, with the 7164 reporting more minutes per day as compared with the wGT3X-BT (0.1, P < 0.01), although the absolute difference between monitors was marginal. There were 52 participants who reached the threshold for VPA with the 7164 accelerometer compared with 31 for the wGT3X-BT (using calibrated data). Significant differences in VPA minutes per day remained when including only those who reached the threshold for VPA on both monitors (N = 28, 0.4, P = 0.01; data not shown). Using the calibrated wGT3X-BT data, the overall difference between monitors in total counts per day was less than 1%. There were no significant differences by race/sex group (see Table, Supplemental Digital Content 2, Activity comparisons by race/sex group, http://links.lww.com/MSS/B208) or BMI category (see Table, Supplemental Digital Content 3, Activity comparisons by BMI category, http://links.lww.com/MSS/B209) when examining the difference in accelerometer values between monitors.
Agreement between the 7164 and original wGT3X-BT values for estimating total wear time, count-based estimates, and average time spent in physical activity of varying intensities was excellent (ICC range, 0.948–0.975; see Table 3). All ICC values increased when examining the agreement between the 7164 and calibrated wGT3X-BT values (ICC range, 0.973–0.986). When examining only those with high minute-by-minute correlations (r ≥ 0.75, n = 44), there were additional improvements in ICC, using both the original (ICC range, 0.957–0.979) and calibrated wGT3X-BT values (ICC range, 0.979–0.990; data not shown).
In Figure 1, the Bland–Altman plots show the agreement between the 7164 and wGT3X-BT, using the original or calibrated values, for total wear time, count-based estimates, and average time spent in sedentary, LPA, MPA, VPA, and MVPA. The plots indicate that the differences between monitors, whether using original or calibrated values for the wGT3X-BT, were generally within 2 SD, with no clear pattern of disagreement observed across the x-axis, with few exceptions. For example, the limits of agreement for MVPA ranged from −21.9 to 13.8 min·d−1 using the original wGT3X-BT data, and from −12.8 to 15.1 min·d−1 for the calibrated wGT3X-BT data. The mean difference in MVPA using original data was −4.1 min·d−1 compared with 1.2 min·d−1 for the calibrated values, indicating that bias was reduced after calibration.
In this comparison of two different ActiGraph accelerometer models worn simultaneously by 87 CARDIA participants at the year 30 examination, significant differences between the older-generation 7164 and newer-generation wGT3X-BT existed for count-based estimates and physical activity of different intensities, with the 7164 reporting lower values for counts per day, counts per minute per day, LPA, MPA, and MVPA, and higher values for sedentary minutes per day. No significant differences were observed for VPA. However, after applying a calibration formula to the wGT3X-BT, there were no significant differences in count-based estimates or time spent in sedentary, LPA, MPA, and MVPA between the two accelerometers. There was a difference between the two monitors in VPA after calibration, with the 7164 reporting on average 0.1 min·d−1 more than the wGT3X-BT. Agreement as assessed by ICC between the two accelerometers using either original or calibrated wGT3X-BT values was excellent. Bland–Altman plots also showed reasonable agreement across measures. Although agreement between the 7164 and the original wGT3X-BT was high, significant differences in the output before calibration indicate that the two monitors should be compared after calibration.
Our findings that the 7164 and wGT3X-BT values (before calibration) were significantly different across measures, with the 7164 reporting lower values for total physical activity (−23,925 counts per day), average physical activity (−27.5 counts per minute per day), LPA (−6.8 min·d−1), MPA (−3.8 min·d−1), and MVPA (−4.1 min·d−1), and higher values for sedentary time (12.1 min·d−1), are not consistent with the existing literature. Sandroff and Motl (7) also reported significant differences in counts per day between the 7164 and GT3X among 41 participants with multiple sclerosis and 41 matched controls; however, in their study, the 7164 reported higher, rather than lower, values as compared with the GT3X (12,487 counts per day). It is worth noting that Sandroff and Motl used the GT3X and not the GT3X+; therefore, they did not have the ability to use the low-frequency extension option, which has been shown to enhance comparability between the older- and newer-generation monitors (5,6).
Prior studies examining the comparability of the 7164 and GT3X+ using the low-frequency extension option, as done in the present study, are limited and findings are mixed. Cain et al. (5) reported nonsignificant differences between the 7164 and the GT3X+, using the Freedson accelerometer cut points, for sedentary time (3.5 min·d−1), LPA (4.3 min·d−1), MPA (0.3 min·d−1), and VPA (0.4 min·d−1) among 25 adults who wore the monitors consecutively for 3 d. Ried-Larsen et al. (6) examined the comparability of the two monitors among 20 adults over 24 h, reporting nonsignificant differences between the two monitors for mean physical activity (3 counts per minute per day), sedentary time (2.5 min·d−1), and LPA (2.2 min·d−1). However, significant differences were observed for MPA (−3.0 min·d−1) and VPA (1.3 min·d−1). A consistent finding between our study and that by Ried-Larsen et al. was the significant difference between monitors in MPA, with the 7164 reporting lower values in both studies. Overall, both Cain et al. and Ried-Larsen et al. reported more comparable outputs between the two monitors than observed in the present study, particularly for lower-intensity activities. These discrepancies in findings between our study and others using the low-frequency extension option may be due in part to differences in total observation period (1–3 d as compared with at least 4 of 7 d in the present study), smaller sample sizes (N = 20 and N = 25 as compared with N = 87), and differences in average ages of the cohorts. The samples in the studies by Cain et al. and Ried-Larsen et al. were younger than in our study, which may potentially reflect different movement patterns. Differences in methods for processing and scoring the accelerometer data may also explain discrepancies in findings.
For example, Ried-Larsen and colleagues initialized data with a 10-s epoch (compared with a 60-s epoch as done in the present study and in the study by Cain et al.), and they used different accelerometer cut points for sedentary time and LPA, although Freedson cut points were used for MPA and VPA. Furthermore, the processing software differed across the three studies (ActiLife vs MeterPlus vs Propero). The age of the 7164 device may have played a role in the discrepancy in counts per minute per day from the wGT3X-BT, given that the minute-by-minute count correlations were close to zero for four devices. However, the calibration formula was almost identical when derived only from those devices that had high minute-by-minute count correlations compared with deriving it from all 87 functioning 7164 devices.
Interestingly, there was no difference between the monitors for VPA before calibration; however, after calibration, this difference became statistically significant. Overall levels of VPA were very low, with a median of 0.2 min·d−1 with the 7164 and 0.0 min·d−1 with the wGT3X-BT (original and calibrated data). The median difference between monitors was also small, at 0.0 min·d−1 using the original wGT3X-BT data and 0.1 min·d−1 (or 6 s·d−1) using the calibrated data. Therefore, it is difficult to draw meaningful conclusions about the comparability of the monitors for VPA given the low levels observed in this sample and small observed differences between monitors, which may not be physiologically relevant. Furthermore, it is unlikely that this difference between monitors and how they detect VPA would lead to different observed measures of association with health outcomes or change the overall interpretations of these associations.
Because of observed differences in count-based estimates between monitors, we developed and applied a calibration factor to the wGT3X-BT data. Before calibration, the net difference between devices for total physical activity (counts per day) was 7%; after calibration, this difference decreased to 1%. Our findings indicate that the 7164 and wGT3X-BT are not interchangeable in their original state, and researchers should be aware of these differences. However, it is possible to make valid comparisons between the monitors after calibration. It is important to note that the overall agreement between monitors as assessed by ICC was excellent before calibration, with small improvements after calibration. Therefore, if researchers are interested in overall agreement and do not plan to compare activity intensity categories, then calibration may not be necessary.
This methodological approach could apply to a variety of real-world scenarios. For example, longitudinal studies assessing accelerometer-measured physical activity often need to change the types of monitors they use to stay current with evolving technology. In less than 20 yr, accelerometers have evolved from a uniaxial piezoelectric system to a triaxial capacitive microelectromechanical system (14). These rapid changes in technology greatly deter efforts to evaluate patterns of physical activity change across important life-course transitions, such as midlife in CARDIA. Our calibration approach can be used to more accurately assess physical activity change within studies using different devices over time. The importance of calibration in this scenario can be illustrated by examining the Bland–Altman plots, using MVPA as an example. We found greater bias when using the original wGT3X-BT data as compared with the calibrated data (mean MVPA difference = −4.1 min·d−1 vs 1.2 min·d−1, respectively). Therefore, if we were to assess change in activity, first using the 7164 and later using the wGT3X-BT, then change in MVPA over time would be exaggerated without calibration, possibly leading to biased associations with health outcomes.
An additional scenario where this methodological approach could be implemented is in community-based studies with limited resources, where physical activity is measured using multiple types of monitors based on monitor availability. A third potential use for this approach is in harmonizing accelerometer data across studies using different devices. For example, calibration provides researchers the ability to harmonize and pool data from various cohort studies using different types of monitors. Using this methodological approach in the three scenarios described previously would require a small substudy where participants wear both types of monitors to determine the appropriate calibration coefficient, rather than using the calibration coefficient reported in this study because of concerns about generalizability. It would also be necessary for the different devices to provide the same type of output (e.g., counts per minute per day or minutes per day) to calculate the calibration proportionality and to have the same activity categories (e.g., sedentary, LPA, MPA, and VPA) to directly compare the monitors. It would be feasible, for example, to calibrate the ActiGraph and Actical using the approach described, because both devices provide information on minutes per day in sedentary, LPA, MPA, and VPA. However, the ActivPAL classifies time spent sitting, standing, and walking, and therefore would not be directly comparable to an ActiGraph (or Actical) using this methodological approach.
Strengths of this study include the larger sample size and longer duration of wear time as compared with the existing literature. Furthermore, a unique aspect of our study was the ability to examine differences between monitors by race/sex group and BMI category. We found no differences across groups, indicating that the accelerometers, after calibration, are comparable regardless of body dimensions. This article makes an important contribution by calculating and applying a calibration formula that allows for greater comparability between the older- and newer-generation accelerometers, and can be applied to multiple real-world scenarios.
Limitations of the study include the 10-yr age and previous use of the 7164 accelerometers. Earlier work has shown that the 7164 output changes over time when the device is used repeatedly (15). Therefore, the differences observed between the 7164 and original wGT3X-BT values may be due to changes in the mechanical properties of the older-generation accelerometer. In addition, VPA in our sample was very low; therefore, it is difficult to draw conclusions about the nonsignificant difference observed before calibration, or the significant difference observed after calibration. Finally, although our ability to examine differences between monitors by race/sex group and BMI category is a strength of our study, the findings should be interpreted with caution because of the small sample sizes across groups.
In conclusion, we found that the ActiGraph 7164 and wGT3X-BT accelerometers had high levels of agreement but significantly different outputs for count-based estimates and physical activity of different intensities. However, after applying a calibration factor to the wGT3X-BT, the differences in monitors were attenuated across all measures, with the exception of VPA. Comparison of values between the older-generation 7164 and newer-generation wGT3X-BT accelerometers should only be done after calibration.
The authors thank the investigators, the staff, and the participants of the Coronary Artery Risk Development in Young Adults Study for their valuable contributions.
The Coronary Artery Risk Development in Young Adults Study is supported by contracts HHSN268201300025C, HHSN268201300026C, HHSN268201300027C, HHSN268201300028C, HHSN268201300029C, and HHSN268200900041C from the National Heart, Lung, and Blood Institute (NHLBI), the Intramural Research Program of the National Institute on Aging, and an intra-agency agreement between the National Institute on Aging and the NHLBI (AG0005). Accelerometer data collection was supported by grants R01 HL078972 and R56 HL125423 from the NHLBI. K. M. W. was supported by T32 HL007779 from the NHLBI.
The authors report no conflicts of interest. The results of the study are presented clearly, honestly, and without fabrication, falsification, or inappropriate data manipulation. The results of the present study do not constitute endorsement by the American College of Sports Medicine.