# Refined Two-Regression Model for the ActiGraph Accelerometer

Purpose: The purpose of this study was to refine the 2006 Crouter two-regression model to eliminate the misclassification of walking or running when starting an activity in the middle of a minute on the ActiGraph clock.

Methods: Forty-eight participants (mean [SD] age = 35 [11.4] yr) performed 10-min bouts of various activities ranging from sedentary behaviors to vigorous physical activity. Eighteen activities were divided into three routines, and 20 participants performed each routine. Participants wore an ActiGraph accelerometer on the hip, and a portable indirect calorimeter was used to measure energy expenditure. Forty-five routines were used to develop the refined two-regression model, and 15 routines were used to cross validate the model. Coefficient of variation (CV) was used to classify each activity as continuous walking or running (CV ≤ 10) or intermittent lifestyle activity (CV > 10).

Results: An exponential regression equation and a cubic equation using the natural log of the 10-s counts were developed to predict METs every 10 s for walking or running and intermittent lifestyle activities, respectively. The refined method examines each 10-s epoch and all combinations of the surrounding five 10-s epochs to find the lowest CV. In the cross-validation group, the refined method was not significantly different from measured METs for any activity (*P* > 0.05), except cycling (*P* < 0.05). In addition, the 2006 and the refined two-regression models had similar accuracy and precision for estimating energy expenditure during structured activities.

Conclusion: The refined two-regression model should eliminate the misclassification of transitional minutes when changing activities that start and stop in the middle of a minute on the ActiGraph clock, thus improving the estimate of free-living energy expenditure.

^{1}University of Massachusetts Boston, Boston, MA; ^{2}Michigan State University, East Lansing, MI; ^{3}Cornell University, Ithaca, NY; ^{4}University of South Carolina, Columbia, SC; and ^{5}The University of Tennessee Knoxville, Knoxville, TN

Submitted for publication April 2009.

Accepted for publication September 2009.

Address for correspondence: Scott Crouter, Ph.D., Department of Exercise and Health Sciences, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, MA 02125; E-mail: scott.crouter@umb.edu.

Accelerometers are objective measurement tools that allow researchers to track the frequency, the intensity, and the duration of physical activity bouts that individuals perform. The ActiGraph (formerly Manufacturing Technology Incorporated ActiGraph and Computer Science Applications Inc.) accelerometer is a commonly used device for assessing physical activity. Since the first version of the ActiGraph was created, over 15 different regression equations have been developed relating ActiGraph counts to energy expenditure (EE). The original ActiGraph equations used a single linear regression line to predict EE from the ActiGraph counts per minute. These single regression equations were developed using either walking and running (^{3,9,10,12,14,16}) or moderate-intensity lifestyle activities (^{10,15}). In general, regression equations developed on walking and jogging slightly overestimate the energy cost of walking and light activities while greatly underestimating the energy cost of moderate-intensity lifestyle activities. In contrast, regression equations developed using moderate-intensity lifestyle activities provide a closer estimate of EE for moderate-intensity activities but greatly overestimate the energy cost of sedentary and light activities and underestimate the energy cost of vigorous activities (^{1,5}).

To overcome the limitations of single regression equations, Crouter et al. (^{7}) developed a two-regression model for the ActiGraph that distinguishes between continuous walking or running and intermittent lifestyle activity on the basis of the variability in the accelerometer counts. Specifically, the 2006 Crouter two-regression model for the ActiGraph incorporated three parts: 1) an inactivity threshold so that when the ActiGraph recorded ≤50 cpm, the individual was credited with 1 MET; 2) when the counts per minute were >50 and the coefficient of variation (CV) of six consecutive 10-s epochs was ≤10% (indicating that the individual was performing continuous walking or running), an exponential regression equation was used; and 3) when the counts per minute were >50 and the CV was >10% (indicating that the individual was performing an intermittent lifestyle activity), a cubic regression equation was used. By differentiating continuous walking or running from intermittent lifestyle activities, the 2006 Crouter two-regression model provided a substantial improvement compared with single regression equations for estimating EE and time spent in light (<3 METs), moderate (3-6 METs), and vigorous (≥6 METs) physical activity during structured activity bouts (^{7}).

Recently, Kuffel et al. (^{11}) demonstrated that the 2006 Crouter two-regression model has a problem in detecting continuous walking and running bouts when the activity bout starts in the middle of a minute on the ActiGraph clock. The 2006 Crouter two-regression model was developed using structured activity bouts and assumes that each bout of activity starts exactly at the start of a minute on the ActiGraph clock. When an activity bout instead starts in the middle of a minute, that minute contains a mixture of rest and activity counts, so the CV will be greater than 10%. That minute will therefore be misclassified as a lifestyle activity, resulting in an overestimation of EE and activity level.
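The effect can be illustrated with a small numerical sketch. The count values below are invented for illustration only, and the use of the sample standard deviation in the CV is an assumption, because the text does not state which form of the SD was used.

```python
from statistics import mean, stdev

def cv_percent(counts):
    """Coefficient of variation (%) of a list of 10-s counts.

    Assumes the sample standard deviation; returns None when the
    mean is zero and the CV cannot be calculated.
    """
    m = mean(counts)
    if m == 0:
        return None
    return 100.0 * stdev(counts) / m

# Hypothetical minute of steady walking aligned with the clock.
aligned = [500, 510, 495, 505, 498, 502]

# Hypothetical minute in which the same walk starts 30 s into the minute.
misaligned = [0, 0, 0, 505, 498, 502]

print(round(cv_percent(aligned), 1))     # small CV, well under 10%
print(round(cv_percent(misaligned), 1))  # large CV, well over 10%
```

The second minute contains the same steady walking as the first, yet its CV exceeds 10% only because three rest epochs share the minute.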

The purpose of this study was to refine the 2006 Crouter two-regression model to eliminate the misclassification of walking or running when starting the activity in the middle of a minute on the ActiGraph clock. We hypothesized that by examining each 10-s epoch and all combinations of the surrounding five 10-s epochs, it could be determined if each 10-s epoch was part of a continuous walking or running bout lasting at least 1 min. This method should eliminate the misclassification of walking or running bouts that begin partway through a minute on the ActiGraph clock.

## METHODS

#### Subjects.

Twenty-four men (mean [SD] age = 36 [12.8] yr, height = 177.8 [7.1] cm, body mass = 83.9 [20.2] kg; BMI = 25.9 [5.2] kg·m^{−2}; resting V˙O_{2} = 3.6 [0.8] mL·kg^{−1}·min^{−1}) and 24 women (mean [SD] age = 35 [10.3] yr, height = 165.4 [5.8] cm, body mass = 62.3 [12.3] kg, BMI = 22.7 [4.0] kg·m^{−2}, resting V˙O_{2} = 3.4 [0.8] mL·kg^{−1}·min^{−1}) from the University of Tennessee, Knoxville, and surrounding community volunteered to participate in the study. The procedures were reviewed and approved by the University of Tennessee institutional review board before the start of the study. Each participant signed a written informed consent and completed a Physical Activity Readiness Questionnaire before participating in the study. Participants were excluded from the study if they had any contraindications to exercise or were physically unable to complete the activities.

#### Procedures.

This study was part of a larger study using the same participants, and the methods are published in more detail elsewhere (^{4-7}). In addition, the data used in this study were used for the development and cross validation of the 2006 Crouter two-regression model (^{7}). Before testing, participants had their height and weight measured (in light clothing, without shoes) using a stadiometer (Seca Corp., Columbia, MD) and a physician's scale (Health-o-meter, Inc., Bridgeview, IL), respectively. Participants performed various lifestyle and sporting activities that were divided into three routines (Table 1). Twenty participants performed each routine, with two participants completing all three routines and eight participants completing two routines. Participants performed each activity in a routine for 10 min, with a 1- to 2-min break between each activity, which were performed in order from the lowest energy cost to the highest energy cost. Oxygen consumption (V˙O_{2}) was measured continuously throughout the routine by indirect calorimetry (Cosmed K4b^{2}, Rome, Italy). Participants wore an ActiGraph accelerometer on the right hip for the duration of the routine. For the Cosmed K4b^{2} and ActiGraph, 2 kg was added to account for the added weight of the devices. Routine 1 was performed in the applied physiology laboratory, routine 2 was performed at university facilities, and routine 3 was performed at either the participant's home or the investigator's home. The participants who did not perform routine 1 were asked to sit quietly for 5 min before the start of the routine so that a resting V˙O_{2} could be measured.

#### Indirect calorimetry.

The Cosmed K4b^{2} weighs 1.5 kg, including the battery and a specially designed harness. The Cosmed K4b^{2} has been shown to be a valid device when compared with the Douglas bag method during cycle ergometry (^{13}). Before each test, the oxygen and the carbon dioxide analyzers were calibrated according to the manufacturer's instructions. During each test, a gel seal was used to help prevent air leaks from the face mask. For more details, see Crouter et al. (^{7}).

#### ActiGraph accelerometer.

The ActiGraph accelerometer (model 7164) is a small (2.0 × 1.6 × 0.6 inches) and lightweight (42.5 g) uniaxial accelerometer and can measure accelerations in the range of 0.05*g*-2*g* and a band limited frequency of 0.25-2.5 Hz. These values correspond to the range in which most human activities are performed. An 8-bit analog-to-digital converter samples at a rate of 10 Hz, and these values are then summed for the specified period (epoch). The ActiGraph was worn at waist level at the right anterior axillary line in a nylon pouch that was attached to a belt. The ActiGraph was initialized using 1-s epochs, and the time was synchronized with a digital clock so the start time could be synchronized with the Cosmed K4b^{2}. At the conclusion of the test, the ActiGraph data were downloaded to a laptop computer for subsequent analysis. A total of three ActiGraph accelerometers were used during the study. For each subject, one of the three accelerometers was chosen at random to be used. The ActiGraph accelerometers were calibrated at the start and end of the study. On both occasions, the calibration fell within ±3.5% of the reference value, which is within the manufacturer's standards.

#### Data analysis.

Breath-by-breath data were collected by the Cosmed K4b^{2} and averaged over 1-min periods. For each activity, the V˙O_{2} (mL·min^{−1}) was converted to V˙O_{2} (mL·kg^{−1}·min^{−1}) and then to METs by dividing it by 3.5. For each activity, the MET values for minutes 4-9 were averaged and used for the subsequent analysis.
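The conversion described above can be sketched as follows. The function names are hypothetical, and the example values are invented; only the division by body mass, the division by 3.5, and the averaging of minutes 4-9 come from the text.

```python
def vo2_to_mets(vo2_ml_per_min, body_mass_kg):
    """Convert absolute VO2 (mL/min) to METs.

    Step 1: divide by body mass to get mL/kg/min.
    Step 2: divide by 3.5 mL/kg/min (1 MET).
    """
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return vo2_ml_per_min / body_mass_kg / 3.5

def steady_state_mets(minute_mets):
    """Average minutes 4-9 of a 10-min bout (1-indexed minutes)."""
    window = minute_mets[3:9]  # list indices 3..8 hold minutes 4..9
    return sum(window) / len(window)

# Illustrative values only: 1050 mL/min at 75 kg is 14 mL/kg/min.
print(vo2_to_mets(1050, 75))
```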

The ActiGraph accelerometer data were collected in 1-s epochs and were converted to counts per 10 s using a Visual Basic program. We chose to use 1-s epochs to allow greater flexibility during our data analysis, but to apply the newly developed method, data can be collected in 10-s epochs.
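The authors used a Visual Basic program for this step. The Python sketch below is only an illustration of the same summation, not their code; the function name is hypothetical, and dropping a trailing partial epoch is an assumption.

```python
def to_10s_counts(counts_1s):
    """Sum consecutive 1-s counts into counts per 10 s.

    Assumption: any trailing block shorter than 10 s is dropped.
    """
    usable = len(counts_1s) - len(counts_1s) % 10
    return [sum(counts_1s[i:i + 10]) for i in range(0, usable, 10)]

# 25 one-second values give two complete 10-s epochs.
print(to_10s_counts([1] * 25))
```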

#### Statistical treatment.

Statistical analyses were carried out using the Statistical Package for the Social Sciences for Windows (version 16.0; SPSS Inc., Chicago, IL). For all analyses, an alpha level of 0.05 was used to indicate statistical significance. All values are reported as mean (SD). Independent *t*-tests were used to examine the difference between genders for anthropometric variables.

The same 45 tests that were used to develop the 2006 Crouter two-regression model were used to develop the refined two-regression model, and the same 15 tests used for the cross validation of the 2006 Crouter two-regression model were used to cross validate the refined two-regression model. This allowed direct comparisons to be made between the 2006 and the refined two-regression models. Because waist-mounted accelerometers are not able to detect cycling activity, cycling was not used in the development of the refined two-regression model.

The same general principles were used to develop the refined two-regression model as were used in the development of the 2006 two-regression model. Each activity performed by an individual was classified into groups on the basis of the CV value of the 10-s counts: CV from 0.1% to 10% (CV ≤ 10) and CV of >10% or not able to be calculated (CV > 10). During walking and running, the CV was almost always less than 10%, whereas for the other activities, the CV was almost always greater than 10%. One exception was during activities such as lying, sitting, and standing, where the counts per minute could be zero for a full minute; in that case, the CV cannot be calculated, and it was defined as a CV of zero. These cases were placed in the CV > 10% group for the purpose of developing the regression equation.

To overcome the problem of misclassifying walking activity when starting and stopping in the middle of a minute, each 10-s epoch and all combinations of the five surrounding 10-s epochs were examined to determine whether each 10-s epoch was part of a continuous walking or running bout or an intermittent lifestyle activity bout. Specifically, each 10-s epoch and the surrounding five 10-s epochs were examined in the following manner: 10-s epoch of interest and 1) the five 10-s epochs before, 2) the four 10-s epochs before and one 10-s epoch after, 3) the three 10-s epochs before and two 10-s epochs after, 4) the two 10-s epochs before and three 10-s epochs after, 5) the one 10-s epoch before and four 10-s epochs after, and 6) the five 10-s epochs that followed (Tables 1-3, Appendix). After the CV was calculated for each possible condition, the lowest CV from the six possible strings of 10-s epochs was used. If this CV was less than or equal to 10%, the 10-s epoch was classified as part of a rhythmic locomotor activity (i.e., walking or running), and the walk or run regression equation was applied to the 10-s epoch. If the lowest CV was greater than 10%, the 10-s epoch was classified as part of an intermittent lifestyle activity, and the lifestyle equation was applied to the 10-s epoch. By examining each 10-s epoch in this manner, it could be determined if each 10-s epoch was part of a continuous walking or running bout lasting 1 min or longer, and the start and finish of that walking or running bout could be determined to the nearest 10 s, regardless of where the bout of activity started on the ActiGraph clock. Regression analyses were then used to predict METs from the counts per 10 s for the CV ≤ 10 and CV > 10 groups.
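A minimal sketch of the six-window search described above is given here. It is an illustration, not the authors' implementation. Several choices are assumptions: the function names are hypothetical, the sample standard deviation is used, windows that would run past the start or end of the recording are skipped, and a window whose CV cannot be calculated (mean of zero) is ignored.

```python
from statistics import mean, stdev

def cv_percent(window):
    """CV (%) of a window of 10-s counts; None if the mean is zero."""
    m = mean(window)
    if m == 0:
        return None
    return 100.0 * stdev(window) / m

def lowest_cv(counts_10s, i, window_len=6):
    """Lowest CV over the six 6-epoch windows that contain epoch i.

    The windows run from "five epochs before" through "five epochs
    after". Windows falling outside the recording are skipped
    (an assumption about edge handling).
    """
    best = None
    for before in range(window_len - 1, -1, -1):  # 5, 4, ..., 0 epochs before
        start = i - before
        end = start + window_len
        if start < 0 or end > len(counts_10s):
            continue
        cv = cv_percent(counts_10s[start:end])
        if cv is not None and (best is None or cv < best):
            best = cv
    return best

def is_locomotion(counts_10s, i, threshold=10.0):
    """True if epoch i belongs to a walk/run bout (lowest CV <= 10%)."""
    cv = lowest_cv(counts_10s, i)
    return cv is not None and cv <= threshold

# Invented data: 30 s rest, 90 s steady walking, 30 s rest.
counts = [0, 0, 0, 500, 510, 495, 505, 498, 502, 499, 503, 501, 0, 0, 0]
print([is_locomotion(counts, i) for i in range(len(counts))])
```

In this invented series, only the nine walking epochs are flagged as locomotion, even though the bout neither starts nor stops on a minute boundary.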

A one-way repeated-measures ANOVA was used to compare actual and predicted METs (2006 Crouter two-regression model and refined two-regression model) for each activity using the cross-validation group. In addition, a one-way repeated-measures ANOVA was used to compare actual and predicted METs for all 18 activities combined. Pairwise comparisons with Bonferroni adjustments were performed to locate significant differences when necessary.

Modified Bland-Altman plots were used to graphically show the variability in individual error scores (actual METs minus estimated METs) (^{2}). This allowed for the mean error score and the 95% prediction interval (95% PI) to be shown. Devices that display a tight prediction interval around zero are deemed accurate. Data points below zero signify an overestimation, whereas points above zero signify an underestimation.
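The summary statistics behind such a plot can be sketched as below. The text does not state how the 95% interval was computed; the sketch assumes the conventional mean error ± 1.96 × SD of the error scores, and the function name and data are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(actual, estimated):
    """Return (mean error, lower limit, upper limit).

    Error scores are actual minus estimated METs, so negative values
    indicate overestimation. Limits assume mean +/- 1.96 * SD.
    """
    errors = [a - e for a, e in zip(actual, estimated)]
    bias = mean(errors)
    half_width = 1.96 * stdev(errors)
    return bias, bias - half_width, bias + half_width

# Invented MET values for illustration only.
bias, lower, upper = bland_altman([3.0, 4.0, 5.0, 6.0], [2.5, 4.5, 5.0, 5.0])
print(bias, lower, upper)
```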

## RESULTS

Data for one participant in the developmental group (routine 3) were missing because of an error that occurred during the downloading process.

Similar to the 2006 Crouter two-regression model, for activities with a CV ≤ 10% (walking and running), an exponential curve was used for the refined two-regression model. For activities where the CV was >10%, however, a cubic curve was used for the natural log (ln) of the counts per 10 s. For the inactivity threshold, we propose using a threshold of eight counts per 10 s, which approximates the threshold of 50 cpm used with the 2006 two-regression model (^{7}) to distinguish inactivity (e.g., sitting and lying) from light activity. Thus, when the value is eight counts per 10 s or less, an individual will be credited with 1.0 MET because this more accurately predicts these sedentary activities.

The refined two-regression model to predict gross EE (METs) from the ActiGraph counts would consist of three parts (two-regression model with an inactivity threshold):

1. If the counts per 10 s are ≤8, EE = 1.0 MET.
2. If the counts per 10 s are >8
   - a. and the CV of the counts per 10 s is ≤10, then EE (METs) = 2.294275 × exp(0.00084679 × ActiGraph counts per 10 s) (*R*^{2} = 0.739; SEE = 0.250),
   - b. or the CV of the counts per 10 s is >10, then EE (METs) = 0.749395 + 0.716431 × ln(ActiGraph counts per 10 s) − 0.179874 × (ln(ActiGraph counts per 10 s))^{2} + 0.033173 × (ln(ActiGraph counts per 10 s))^{3} (*R*^{2} = 0.840; SEE = 0.863).
3. Finally, once a MET value has been calculated for each 10-s epoch within a minute on the ActiGraph clock, the average MET value of six consecutive 10-s epochs within each minute is calculated to obtain the average MET value for that minute (see Table 4, Appendix).
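The three parts of the model can be sketched in code as follows. The coefficients are those reported above; the function names are hypothetical, and the CV is assumed to be supplied as the lowest CV (%) found for the epoch, with None standing for a CV that could not be calculated.

```python
import math

def mets_for_epoch(counts_10s, cv):
    """Predict METs for one 10-s epoch with the refined model."""
    # Part 1: inactivity threshold.
    if counts_10s <= 8:
        return 1.0
    # Part 2a: walk/run (exponential) equation when CV <= 10.
    if cv is not None and cv <= 10:
        return 2.294275 * math.exp(0.00084679 * counts_10s)
    # Part 2b: lifestyle (cubic in ln counts) equation when CV > 10.
    ln_c = math.log(counts_10s)
    return (0.749395 + 0.716431 * ln_c
            - 0.179874 * ln_c ** 2
            + 0.033173 * ln_c ** 3)

def minute_mets(epoch_mets):
    """Part 3: average the six 10-s MET values within one clock minute."""
    if len(epoch_mets) != 6:
        raise ValueError("expected six 10-s epochs per minute")
    return sum(epoch_mets) / 6

# The same count value under each branch (illustrative inputs).
print(mets_for_epoch(500, 5.0))   # walk/run equation
print(mets_for_epoch(500, 40.0))  # lifestyle equation
print(mets_for_epoch(5, 40.0))    # inactivity threshold
```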

Table 2 shows the measured METs and estimated METs for the cross-validation group using the 2006 Crouter two-regression model and the refined two-regression model. The 2006 Crouter two-regression model and the refined two-regression model were within 0.75 and 0.89 METs, respectively, of mean measured METs for all activities (*P* > 0.05), except cycling (*P* < 0.05). For the slow walk, racquetball, and slow run, there were small but significant differences between the 2006 Crouter two-regression model and the refined two-regression model. In addition, the correlation between the predicted METs from the refined two-regression model and the measured METs was high (*r* = 0.97, *P* < 0.001), which was similar to the correlation between the 2006 two-regression model and the measured METs (*r* = 0.96, *P* < 0.001).

The Bland-Altman plots show that the 2006 Crouter two-regression model and the refined two-regression model had similar accuracy during structured 10-min bouts (Fig. 1). The refined two-regression model had a mean bias of 0.10 METs (95% PI = −1.28 to 1.48), whereas the 2006 Crouter two-regression model had a mean bias of 0.08 METs (95% PI = −1.38 to 1.54). Similar accuracy between the 2006 Crouter two-regression model and the refined two-regression model was also confirmed by the root mean square error (RMSE) values, which were not significantly different between the 2006 and the refined two-regression models (*P* > 0.1).
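For reference, the RMSE can be computed as in the sketch below. The text does not give the formula, so the standard definition (square root of the mean squared difference between measured and predicted METs) is assumed; the function name and data are hypothetical.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Invented MET values for illustration only.
print(rmse([1.0, 2.0], [2.0, 4.0]))
```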

## DISCUSSION

This study describes a refinement to the 2006 two-regression model to predict EE using the ActiGraph accelerometer. The refined two-regression model examines each 10-s epoch and the surrounding five 10-s epochs to determine whether the 10-s epoch is part of a continuous walking or running bout lasting 1 min or longer. In addition, the refined two-regression model estimates EE every 10 s. These changes were needed to eliminate the misclassification of walking or running activity and overestimation of EE when walking or running bouts started and stopped in the middle of a minute on the ActiGraph clock, which was a limitation of the 2006 Crouter two-regression model. Lastly, the refined two-regression model has accuracy and precision similar to those of the 2006 Crouter two-regression model during structured activities.

The 2006 Crouter two-regression model provided a significant improvement for estimating EE compared with single linear regression models (^{7}), but the 2006 method was developed on 10-min structured bouts of physical activity that always started exactly in synchronization with the ActiGraph clock. We later found that when a walking bout started in the middle of a minute, it was incorrectly classified as intermittent lifestyle activity for that minute because of the high CV (^{11}). Thus, the refined two-regression model was developed to overcome this issue by examining each 10-s epoch and the surrounding five 10-s epochs. Figure 2 shows an example of three different activities, with the transitions from one activity to the next occurring in the middle of a minute on the ActiGraph clock. It can be seen that with the 2006 Crouter two-regression model, the transitional minutes from rest to walk, walk to rest, and rest to vacuuming have much higher CV than the other minutes for each respective activity (Fig. 2B). This is due to there being a mixture of rest and activity of interest and is not a true reflection of the type of activity performed. With the refined two-regression model, the CV is calculated for each 10-s epoch; thus, when the transition from one activity to the next occurs, the new activity can be detected within 10 s, resulting in an improved ability to recognize the general type of activity performed (Fig. 2C).

The refined two-regression model has some distinct differences from the original 2006 Crouter two-regression model that should be discussed. The major change is that the CV and the METs are calculated for each 10-s epoch, which has several implications. First, the inactivity threshold is now examined for each 10-s epoch; therefore, when the counts per 10 s are ≤8, the individual is credited with 1.0 MET, whereas when using the 2006 two-regression model, an inactivity threshold of ≤50 cpm was used. Second, with the 2006 two-regression model, the CV was examined for six consecutive 10-s epochs within each minute on the ActiGraph clock, which did not consider changing of activities that may occur within the minute on the ActiGraph clock. The refined two-regression model now examines each 10-s epoch and all combinations of the surrounding five 10-s epochs. By examining all combinations of the surrounding five 10-s epochs, it can be determined if each 10-s epoch is part of a continuous walking or running bout, regardless of where it starts on the ActiGraph clock. Third, the refined two-regression model uses the natural log of the counts per 10 s for the cubic equation used to predict METs for the lifestyle equation.

The refined two-regression model has important improvements over the 2006 Crouter two-regression model. First, the refined method provides a closer estimate of EE to measured EE during free-living activities. Recently, we have shown that the refined two-regression model significantly improved upon the 2006 Crouter two-regression model for estimating METs and time spent in light, moderate, and vigorous physical activity during 6 h of free-living activity (^{8}). Specifically, the refined two-regression model predicted on average a mean (SD) of 2.09 (1.5) METs for the 6-h period compared with a mean (SD) of 2.33 (1.6) METs for the 2006 Crouter two-regression model and 1.91 (1.2) METs for the indirect calorimetry. In addition, the 2006 Crouter two-regression model and the Freedson, Hendelman, and Swartz equations were all significantly different from indirect calorimetry for time spent in light, moderate, and vigorous physical activity, whereas the refined two-regression equation was not significantly different. Specifically, the measured time in MVPA was 51.6 (56.3) min during the 6-h period, whereas the refined two-regression model predicted 74.4 (62.8) min and the 2006 Crouter two-regression model predicted 94.9 (70.5) min. Because individuals start and stop activities at will in a free-living environment, most of these transitions would occur partway through a minute on the ActiGraph clock. Thus, the 2006 Crouter two-regression model would frequently misclassify the first and the last minutes of walking or running bouts as intermittent lifestyle activity, resulting in an overestimation of EE. The refined method should correct this overestimation and provide a closer estimate of actual EE during free-living activity.

A second improvement is that the refined equation allows for walking or running bouts as brief as 1 min in duration to be detected, whereas with the 2006 Crouter two-regression model, if a walking bout started in the middle of a minute, it would need to be at least 2 min long to be detected. It should be noted, however, that METs are predicted every 10 s, and walking or running bouts could be estimated to the nearest 10 s with the refined two-regression model, which becomes important considering that the majority of activities performed in a free-living setting will not start and stop exactly in synchronization with the beginning and end of a minute on the ActiGraph clock. Thus, although the bout duration needed to detect a walking bout for the refined and 2006 two-regression models is not drastically different, the ability to detect when the walking or running bout started to the nearest 10 s using the refined two-regression model is an important improvement.

For reporting purposes, we have chosen to average the MET values each minute on the ActiGraph clock for a single summary MET value for each minute of activity, but researchers should not feel this is the only way to present EE using this equation. For each 10-s epoch, the CV and the MET values are calculated; thus, depending on the outcomes of interest, the METs could be reported in different ways. For example, a researcher interested only in walking behavior could average the MET values during the walking bouts rather than each minute to get an average MET value for the walking bouts. On the basis of preliminary results from a separate study in which 57 women wore an ActiGraph accelerometer for 5 d, the refined two-regression model predicted 26.1 (17.1) min per day of continuous moderate and vigorous walking or running compared with 18.1 (13.9) min per day for the 2006 Crouter two-regression model (unpublished data). This translates into 8.0 min per day of continuous walking or running time being misclassified as lifestyle activity with the 2006 Crouter two-regression model. Assuming an average count per minute value of 4000, this would mean that the 2006 Crouter two-regression model would overestimate each misclassified minute by approximately 3.5 METs. Future research should investigate these issues to better understand how the data should be reported.

In conclusion, the refined two-regression model, which examines each 10-s epoch and all combinations of the surrounding five 10-s epochs, improves upon the 2006 Crouter two-regression model. The refined two-regression model examines each 10-s interval and determines whether it is part of a string of six consecutive 10-s epochs with consistent accelerometer counts. It then predicts the EE (METs) of the 10-s epoch on the basis of a regression equation representing either intermittent lifestyle activity or rhythmic locomotion. This eliminates the misclassification of transitional minutes when activities change, with the greatest effect seen on transitions between rest and walking or running bouts that start and stop partway through a minute. In addition, the refined two-regression model has similar accuracy to the 2006 Crouter two-regression model for predicting MET values of structured bouts of activity. Further research is needed to validate the refined two-regression model in free-living environments.

This research was supported by the Charlie and Mai Coffey Endowment in Exercise Science and an NIH grant no. 01R21 CA122430-01. No financial support was received from any of the activity monitor manufacturers, importers, or retailers.

The results of the present study do not constitute endorsement by the American College of Sports Medicine.

## REFERENCES

1. *Med Sci Sports Exerc*. 2000;32(9 suppl):S471-80.
2. *Lancet North Am Ed*. 1986;1:307-10.
3. *Med Sci Sports Exerc*. 2003;35(8):1447-54.
4. *Br J Sports Med*. 2008;42(3):217-24.
5. *Eur J Appl Physiol*. 2006;98(6):601-12.
6. *Eur J Clin Nutr*. 2008;62:704-11.
7. *J Appl Physiol*. 2006;100(4):1324-31.
8. *Med Sci Sports Exerc*. 2009;41(5 suppl):S129.
9. *Med Sci Sports Exerc*. 1998;30(5):777-81.
10. *Med Sci Sports Exerc*. 2000;32(9 suppl):S442-9.
11. *Med Sci Sports Exerc*. 2008;40(5 suppl):S415.
12. *Int J Sports Med*. 2003;24(1):43-50.
13. *Int J Sports Med*. 2001;22(4):280-4.
14. *Res Q Exerc Sport*. 2000;71(1):36-43.
15. *Med Sci Sports Exerc*. 2000;32(9 suppl):S450-6.
16. *Med Sci Sports Exerc*. 2003;35(2):320-6.

## APPENDIX

**Keywords:**

MOTION SENSOR; PHYSICAL ACTIVITY; OXYGEN CONSUMPTION; ACTIVITY COUNTS VARIABILITY