Validation of the SenseWear Pro Armband Algorithms in Children : Medicine & Science in Sports & Exercise



Validation of the SenseWear Pro Armband Algorithms in Children


Medicine & Science in Sports & Exercise 41(9):p 1714-1720, September 2009. | DOI: 10.1249/MSS.0b013e3181a071cf


Physical activity is a complex behavior that is quite difficult to assess in field-based research. Accelerometry-based activity monitors have become the most widely used strategy for assessing physical activity under free-living conditions, but despite considerable work, many challenges remain (17). Researchers interested in youth's physical activity have additional challenges to overcome including the complexities associated with the more sporadic and intermittent physical activity patterns in children (7) and the inherent variability due to growth and maturation (21).

The recent development of pattern-recognition activity monitors offers considerable promise for improving the accuracy of physical activity assessment techniques. The SenseWear Pro Armband® (SWA; BodyMedia, Pittsburgh, PA) monitor, for example, integrates motion sensor data with a variety of heat-related sensors to estimate the energy cost of free-living activity. The SWA is similar in cost to most accelerometers (∼$400) and can be worn comfortably on the upper arm for extended periods. An advantage of this multichannel approach is that the heat-related sensors provide additional information that cannot be obtained solely from movement sensors. The heat-related sensors, for example, provide a way to assess the energy cost of complex, nonambulatory activities. The sensors can also detect the increased work required to walk up a grade or to carry a load (12). A final advantage of the SWA is that it automatically reports actual wear time (thereby avoiding the considerable challenge in determining whether a monitor was, in fact, worn as directed). These features provide several advantages over traditional uniaxial accelerometers for assessing physical activity in the field. The validity of energy expenditure (EE) estimates from the SWA has been supported in studies using both indirect calorimetry (IC) (8,9,11) and doubly labeled water (13,16). Furthermore, recent research has demonstrated potential advantages in accuracy when compared with traditional accelerometry-based monitors (20).

Although the validity studies in adults are fairly consistent, results of validation studies in youth have been more equivocal. Arvidsson et al. (1) reported that the SWA significantly underestimated EE for a variety of standardized physical activities in a sample of 20 children. In contrast, Dorminy et al. (6) reported consistent overestimation of EE with the SWA in a sample of 21 youth. The nature of the discrepancies in these results is not clear but is likely due to the use of algorithms that were not developed specifically for children (personal communication, November 2008). A unique characteristic of the SWA is that the company continually upgrades the algorithms as new data become integrated into the pattern recognition system. The purpose of this study was to assess the validity of new proprietary algorithms that were developed specifically from children's data. Comparisons are made between estimates from the old algorithms (version 4.2 or earlier) and the recently released algorithms (version 6.1) to clarify the nature of the errors reported in previous studies.



Twenty-two healthy children (15 boys, 7 girls) were recruited from a summer youth fitness camp hosted by the local university. The camp provides activity programming to youth in the summer (as a form of daycare) and tends to attract participants from diverse cultural (28% minority) and socioeconomic backgrounds. Although it is an activity-based program, the participants show a wide range in habitual activity levels and sports participation because it is viewed by parents as a daycare option for the summer (rather than a sports program). Approval from the institutional review board was obtained before the beginning of the study. Written parental consent and children's assent were obtained after parents and children were informed about the procedures and purposes of the study. One of the participants had to be excluded from the analysis because of faulty metabolic analyzer data.

Description of the SWA.

The SWA is a wireless, noninvasive, multisensor activity monitor that is worn over the triceps muscle. The SWA armband monitor integrates data from five sensors including a two-axis accelerometer (the newer SWA version available includes a triaxial accelerometer), heat flux sensor, galvanic skin response (GSR) sensor, skin temperature sensor, and a near-body ambient temperature sensor to estimate EE under free-living conditions. The heat-related sensors provide additional information about the energy cost of activity because periods of increased work are associated with increased heat production. The GSR sensor may also contribute to EE estimation because it detects changes in the skin's electrical properties due to sweat gland activity and psychological stimulus (periods of increased stimulus are associated with increased skin conductance). The direct contributions of heat indices and GSR in the prediction algorithms are not shared by the company, but all five channels are used in estimations of EE (BodyMedia, personal communication).

A unique aspect of the SWA monitor is that the company continues to upgrade and enhance the software as new training data become integrated into the pattern recognition algorithms. The manufacturer recently released new versions of the algorithms on the basis of data collected from three independent research teams (including our group) to improve the accuracy of algorithms for children. The present study compared the estimates from the old version of the software (version 4.2) with the recently released (new) algorithms (software version 6.1) on an independent sample. The inherent goal was to determine whether the new algorithms improve the accuracy for assessing EE.

Data collection.

Participants were instructed about the procedures of the study before signing assent documents. Anthropometric measures were obtained at the beginning of the data collection session. Standing height was measured to the nearest 0.1 cm with a wall-mounted Harpenden stadiometer (Harpenden, London, United Kingdom) using standard procedures. Body mass was measured to the nearest 0.1 kg on an electronic scale (Seca 770) with participants barefoot and in light clothes. The body mass index (BMI) was calculated as body mass (kg) divided by height squared (m²).
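The BMI calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example child (35.0 kg, 140.0 cm) are hypothetical and do not come from the study data.

```python
def body_mass_index(mass_kg: float, height_cm: float) -> float:
    """Return BMI (kg/m^2) from body mass in kg and standing height in cm."""
    if mass_kg <= 0 or height_cm <= 0:
        raise ValueError("mass and height must be positive")
    height_m = height_cm / 100.0  # the stadiometer reading is in cm
    return mass_kg / height_m ** 2


# Hypothetical participant, not a value from the study.
example_bmi = body_mass_index(35.0, 140.0)
print(f"BMI = {example_bmi:.1f} kg/m^2")
```

Classifying a child as overweight or obese additionally requires age- and sex-specific reference percentiles, which this sketch does not attempt to reproduce.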

All activities took place in a pediatric exercise laboratory with two researchers guiding and supervising each trial. After completing the anthropometric measurements, participants were fitted with the SWA monitor and a pediatric mask for the assessment of IC via a metabolic analyzer (TrueMax 2400; ParvoMedics, Sandy, UT). The metabolic analyzer was calibrated before each trial following the manufacturer's recommendations. The participants were instructed about the protocol and were asked to complete a 41-min activity protocol designed to simulate a variety of typical activities for children. The protocol consisted of seven activity stages (5 min each) separated by 1-min transition intervals, which were used to guide the participant from one activity to the next one (intermittent walking, standing, and sitting). Descriptions of each activity are provided below:

  • Resting. The participant rested in a supine position on a clinical examination table for the 5 min of the stage. Lights were kept on, and participants were instructed not to talk during this stage. Researchers avoided making noises or talking to maintain a relaxed atmosphere in the room.
  • Coloring. The participant selected animal drawings from a group of different figures and colored them with crayons. The participant set the coloring pace, and the activity lasted 5 min regardless of whether the drawings were finished.
  • Computer Games. Participants played computer games on a desktop personal computer for the duration of the stage. The games, selected from a Web site, involved pushing keys on the keyboard and did not require a joystick or similar device.
  • Walking Paces (n = 3). Participants completed three walking paces of 2.0, 2.5, and 3.0 mph on a treadmill (Trackmaster TMX 425C). They were instructed not to use the treadmill handrails during walking. Each stage lasted 5 min and was separated from the next by the 1-min transition, during which the participants straddled the treadmill belt.
  • Biking. The seat height of the cycle ergometer (Monark Ergomedic 828E) was adjusted to the participant's leg length. Participants were instructed to pedal at 60 rpm, following the rhythm of a metronome, with 0.5 kp of resistance.
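The 41-min total follows from seven 5-min stages plus six 1-min transitions (7 × 5 + 6 × 1 = 41). A minimal sketch of such a minute-by-minute timeline is shown below. It assumes the stages were performed in the order listed above, which the text does not state explicitly, and the stage labels are illustrative.

```python
# Stage order is an assumption taken from the order of the descriptions above.
STAGES = [
    "resting",
    "coloring",
    "computer games",
    "walking 2.0 mph",
    "walking 2.5 mph",
    "walking 3.0 mph",
    "biking",
]
STAGE_MIN = 5        # minutes per activity stage
TRANSITION_MIN = 1   # minutes between consecutive stages


def build_timeline(stages, stage_min=STAGE_MIN, transition_min=TRANSITION_MIN):
    """Return one label per minute, with transitions only between stages."""
    timeline = []
    for index, stage in enumerate(stages):
        timeline.extend([stage] * stage_min)
        if index < len(stages) - 1:  # no transition after the final stage
            timeline.extend(["transition"] * transition_min)
    return timeline


timeline = build_timeline(STAGES)
print(len(timeline))                 # 41
print(timeline.count("transition"))  # 6
```

A label per minute of this kind makes it straightforward to select stage minutes for the activity-specific comparisons while keeping the transition minutes in the whole-trial totals.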

Data processing.

Breath-by-breath data from the metabolic cart were downloaded and aggregated to provide average minute-by-minute values to facilitate integration with the SWA data. The SWA data were downloaded using the (old) SenseWear Software (version 4.2). The present study was conducted before the release of the new software, so the values reflecting the "new" estimates (from version 6.1) were obtained by sending the raw data files (.swd) to the company. The company provided a corresponding minute-by-minute estimate with the newly developed algorithm, and these were merged with the metabolic data and the data obtained directly from the software (version 4.2).
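The aggregation and merging step can be illustrated with a short Python sketch. It assumes breath records arrive as (elapsed seconds, EE in kcal·min−1) pairs and that each source is keyed by minute index; these structures and function names are hypothetical and are not the export format of the metabolic cart or the SenseWear software. A simple arithmetic mean per minute is used here, whereas a time-weighted mean would be another reasonable choice.

```python
from collections import defaultdict
from statistics import mean


def minute_averages(breaths):
    """Average breath-by-breath EE values within each elapsed minute.

    breaths: iterable of (time_in_seconds, ee_kcal_per_min) pairs.
    Returns a dict mapping minute index (0-based) to the mean EE.
    """
    buckets = defaultdict(list)
    for t_sec, ee in breaths:
        buckets[int(t_sec // 60)].append(ee)
    return {minute: mean(values) for minute, values in sorted(buckets.items())}


def merge_by_minute(ic_by_min, swa_old_by_min, swa_new_by_min):
    """Keep only the minutes that are present in all three sources."""
    common = sorted(set(ic_by_min) & set(swa_old_by_min) & set(swa_new_by_min))
    return [
        (m, ic_by_min[m], swa_old_by_min[m], swa_new_by_min[m]) for m in common
    ]


# Made-up values for illustration only.
ic = minute_averages([(5, 1.0), (30, 2.0), (65, 3.0)])
rows = merge_by_minute(ic, {0: 2.0, 1: 3.5, 2: 1.0}, {1: 3.1})
print(ic)
print(rows)
```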

Data analyses.

Traditional measurement agreement analyses were used to evaluate the validity of the two SWA algorithms. The primary statistical analyses involved evaluating overall group differences in EE estimates from the three methods across the whole monitoring period (41-min trial). Many validation studies have focused comparisons on evaluations of point estimates of individual physical activities, but consideration also needs to be given to the overall accuracy during a sustained period of monitoring. These analyses compared the total EE for the whole 41-min period to better reflect actual use under real-world conditions.

After examining the overall effects, we conducted supplemental analyses to evaluate the accuracy for each stage. These analyses help to determine how errors in individual activities may impact the overall estimates. Mixed-model ANOVAs were used to account for the possible correlation across repeated observations taken on the same individuals in the study. The models used participant within gender as a person-level random effect term and the residual variance as a second random effect term. These analyses assume a common variance for among-person and within-person random effects. The fixed effects included in the models for EE were Gender and Method (IC, SWAnew, SWAold). F-tests were used to determine whether factors were statistically significant, and Tukey-Kramer paired comparisons tests were used to test for differences among levels of fixed effects. Least squares means and SE for all effects were estimated within the model, and these values are reported in the descriptive tables.

Additional analyses were conducted to evaluate overall measurement agreement. Minute-by-minute Pearson correlations were computed for each participant's set of data to examine individual variability in associations. The mean and SD of the individual correlation coefficients were computed across individuals to reflect the overall consistency of associations in the sample. We also used Bland-Altman graphical procedures (2) to examine agreement across the range of intensities. The mean of the two estimates (x-axis) is plotted against the difference between the two estimates (y-axis) to allow for the detection of systematic forms of bias in the estimates. The limits of agreement were defined as the mean difference ± 1.96 SD. All analyses were conducted using SAS version 9.0.
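The per-participant correlation summary can be sketched as follows. The original analyses were run in SAS. This Python version is only an illustration of the same idea, and the function names and input layout (a dict of participant id to paired minute-by-minute series) are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def summarize_correlations(per_participant):
    """Compute one r per participant, then the mean and SD across participants.

    per_participant: dict mapping participant id -> (ic_series, swa_series).
    """
    rs = {pid: pearson_r(ic, swa) for pid, (ic, swa) in per_participant.items()}
    values = list(rs.values())
    return rs, mean(values), stdev(values)
```

Computing the coefficient within each child first, and only then averaging, keeps between-child differences in overall EE level from inflating the association.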


Descriptive statistics for the sample population are provided in Table 1. The sample was predominantly white (72%), but the demographics were typical of the surrounding community and more diverse than the state as a whole. Approximately 14% were overweight (BMI > 85th and <95th age and sex percentiles) and 5% were characterized as obese (BMI > 95th age and sex percentiles).

Table 1. Sample characteristics (means ± SD).

The primary analyses involved method comparisons of the overall EE estimates. Mixed-model analyses revealed a significant method effect (F(2, 2393) = 66.81, P < 0.001). Post hoc tests revealed significant differences between the EE estimate from IC and the EE estimate from the SWAold algorithm (t = −9.68, P < 0.0001). Least squares mean differences revealed that the SWAold algorithm differed from the IC value by an average of more than 0.5 kcal·min−1 (0.52 ± 0.05). Expressed differently, this equates to an average overestimation of approximately 32%.

Mixed-model analyses also revealed significant differences between the estimates from the SWAold algorithm and the SWAnew algorithm (t = −10.31, P < 0.0001). The new values were significantly lower, and this correction led to nonsignificant differences between the SWAnew estimates and the criterion IC measure (t = 0.63, P = 0.53). The least squares mean difference between IC and the SWAnew values was only 0.03 ± 0.05 kcal·min−1 (error of approximately 1.7%).
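The conversion from an absolute difference to a percent error can be sketched as below. The criterion mean of 1.60 kcal·min−1 used in the example is a hypothetical value chosen for illustration (it is not reported in the text), so the printed figure only roughly mirrors the approximately 32% overestimation noted for the old algorithm.

```python
def percent_error(estimate: float, criterion: float) -> float:
    """Signed percent error of an estimate relative to the criterion value.

    Positive values indicate overestimation, negative values underestimation.
    """
    if criterion == 0:
        raise ValueError("criterion must be nonzero")
    return 100.0 * (estimate - criterion) / criterion


# Hypothetical criterion mean (kcal/min); only the 0.52 difference is from the text.
ic_mean = 1.60
swa_old_mean = ic_mean + 0.52
print(f"{percent_error(swa_old_mean, ic_mean):.1f}%")
```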

Subsequent analyses examined the agreement for individual activities performed during the 41-min protocol. The average absolute difference in EE estimates between methods for the various activities was 16.3 ± 12.4% with the SWAnew algorithm. The stage-specific EE (kcal·min−1) differences were 0.25, 0.05, 0.05, 0.02, −0.02, −0.09, and 1.00 for resting, coloring, computer games, walking on a treadmill (2.0, 2.5, and 3.0 mph), and biking, respectively. Biking was the only activity for which EE estimates were significantly different (P < 0.001; Table 2). Data were processed separately by gender, and the differences were similar for both males and females (data not reported). Previous research in adults (14) demonstrated some differences in accuracy for lean and overweight samples, so we also evaluated possible differences due to weight status. The average percent errors from the IC estimates were similar for the overweight (16.0 ± 13.2%) and normal weight (16.7 ± 12.5%) subsamples. However, the differences in EE estimation between IC and the SWAnew values were correlated (r = −0.33) with BMI percentile, although not significantly. This suggests that errors may be somewhat related to body weight.

Table 2. Mean (N = 21) EE (kcal·min−1) values for each stage (means ± SD), and differences (means ± SD) between methods (kcal·min−1).

To evaluate overall measurement agreement, we computed minute-by-minute correlations across the 41-min trial for each participant (Table 3). The average minute-by-minute correlation ranged from 0.36 to 0.87 with the SWAnew algorithm (mean = 0.71). The values for the SWAold algorithm exhibited a similar range (from 0.46 to 0.96) and average (mean = 0.72), suggesting that the revised algorithms shifted the estimates in a consistent way across individuals. Figure 1 shows a plot of the average minute-by-minute correlation for both the SWAold and SWAnew algorithms compared with the measured EE. The plot shows that both algorithms track the overall pattern of EE, but the estimates from the new algorithm exhibited lower error and improved fit. Examination of the individual values showed no systematic differences in the magnitude of the correlations across participants.

Table 3. Individual minute-by-minute Pearson moment correlations between the SWA estimates (new and old) and IC estimates.

Figure 1. Average minute-by-minute correlation for both the old and new SenseWear Armband (SWA) algorithm compared with the measured EE.

The Bland-Altman plots in Figure 2 provide a more detailed view of the differences in measurement agreement between the measures. The plot shows a tighter clustering of data points about the mean for the SWAnew algorithm and less overall error compared with the measured EE values. A cluster of points at the upper right of the plot shows the continued underestimation of EE for biking. There was no evidence of any systematic bias across the range of EE values measured in the study.

Figure 2. Comparison between the old and the new SWA algorithms (kcal·min−1).
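The quantities behind a Bland-Altman plot (pairwise means, pairwise differences, bias, and limits of agreement at the mean difference ± 1.96 SD) can be sketched as follows. The function name is hypothetical, and the direction of the difference (first method minus second) is an assumption; the text does not state which way the subtraction was made.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b, z=1.96):
    """Return the points and limits of agreement for a Bland-Altman plot.

    Differences are computed as method_a - method_b (an assumed convention).
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    diffs = [a - b for a, b in zip(method_a, method_b)]        # y-axis values
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]  # x-axis values
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the differences
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - z * sd,
        "upper": bias + z * sd,
    }


# Made-up EE values (kcal/min) for illustration only.
result = bland_altman([1.0, 2.0, 3.0], [1.5, 2.0, 2.5])
print(result["bias"], result["lower"], result["upper"])
```

Plotting `means` against `diffs` with horizontal lines at `bias`, `lower`, and `upper` gives the usual figure, and a trend in the scatter would indicate bias that depends on intensity.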


This study examined the agreement in EE estimates from the SWA monitor in children. A unique aspect of the study is that we directly compared an old version of the algorithms with the newly developed version that is currently available in the software. The results demonstrated that the new algorithms yield more accurate estimates than the previous versions (version 4.2 or earlier).

Two previous studies reported limitations with the use of the SWA in children, but the results were inconsistent. Arvidsson et al. (1) showed a tendency for underestimation, whereas Dorminy et al. (6) revealed a tendency for overestimation. The nature of the sample and the specific selection of activities can influence the estimates of EE in these types of studies. Our results are more consistent with the findings of Dorminy et al. (6), which indicated a tendency for the SWAold algorithms to overestimate EE in youth. We found average overestimations of approximately 32% across the 41-min protocol. The effect was consistent across most of the activities and individuals.

The manufacturer of the SWA developed the original algorithms with a predominantly adult population and used extrapolations to create estimates for youth (personal communication, November 2008). The newly developed algorithms (version 6.1) were created on the basis of data obtained from three independent laboratories using different protocols and activities. The results from this study show that the newly developed algorithms provide accurate estimates of EE for most activities in children. The only activity tested that had values significantly different from the EE obtained from IC was biking. However, this activity has been notoriously difficult to assess with accelerometers. Although the effect size for the difference in biking was large (0.74), the error (25% average error) is lower than typically reported with other monitors (e.g., accelerometers) for biking activities. Jakicic et al. (10), for example, reported large EE differences (69% average error) when estimating the energy cost of biking with the TriTrac accelerometer. Treuth et al. (18) reported similarly large discrepancies in Actigraph counts for biking in adolescent girls and excluded this activity from the resulting prediction equation. The present results reveal limitations of the SWA for estimating the energy cost of cycling, but refinement of data from the heat-related sensors may ultimately help to improve predictions for biking and other complex movement tasks.

Overall, the new algorithms assessed in this study resulted in improved EE estimation for a variety of sedentary, light-, and moderate-intensity activities. Arvidsson et al. (1) reported average error values of −18.6% for resting activities and −35.7% for light-intensity game playing. The values in the present study show underestimation for resting activities (−20.7%) but low amounts of error for other light activities (−4.0% for coloring and −4.9% for computer games). Dorminy et al. (6) reported overestimation of EE for resting activity (21.2%) and also for other sedentary activities (21.1%). During walking, Arvidsson et al. (1) reported average errors of 0.8% (1.9 mph), −8.6% (2.5 mph), and −9.7% (3.1 mph). In contrast, Dorminy et al. (6) reported an average error of 14.2% during walking at speeds ranging from 2.5 to 4.5 mph. In the current study, the average errors for the different speeds were considerably lower (−0.89%, 0.64%, and 3.45% for 2.0, 2.5, and 3.0 mph walking, respectively). The versions of the software used in those previous studies by Arvidsson et al. (1) and Dorminy et al. (6) were 5.1 and 4.1, respectively. These results demonstrate improvements in the accuracy of the new SWA (version 6.1) algorithms in assessing physical activity in children.

In general, researchers have found it more difficult to accurately capture the EE cost of activity in youth compared with adults. A recent study by Trost et al. (19) assessed the accuracy of different Actigraph equations developed to estimate EE in children. During walking at 3 and 4 mph, the average errors ranged from 13.3% to 23.3% with the Trost equation, 23.5% to 29.4% with the Freedson equation for children, and 0.6% to 13.3% with the Puyau equation. A study by Corder et al. (4) showed that the Actigraph overestimated EE during flat walking by 42.6% compared with IC. In the same study, the Actical monitor overestimated the energy cost of flat walking by 33.3%, whereas the Actiheart monitor (which combines accelerometry with HR) overestimated by 5.6%. The SWAnew algorithms yielded errors ranging from −0.89% to 3.45% for the three walking paces used in this study. Thus, these results compare favorably with the Actiheart estimates.

Accelerometry-based activity monitors have also been limited in their ability to assess low-intensity activities and lifestyle activities. A recent study by our group (21) reported a significant error when the Freedson equation was used to estimate the energy cost of low-intensity activities. Recently developed pattern recognition approaches offer potential to improve the predictive accuracy of accelerometers (5,15), but these have not been developed for use with children at this point. A recent study by Corder et al. (3) evaluated the accuracy of eight different EE-prediction models to estimate EE for six different activities (two sedentary and four nonsedentary) in children. The results revealed systematic errors for models incorporating accelerometry alone as well as for the combination of accelerometry and HR. The systematic error was more pronounced in the accelerometry-alone models, showing that HR improved the accuracy, particularly for the lower-intensity activities. HR provides additional information to improve the prediction of EE, but the results from the present study demonstrate that the revised SWA algorithms can produce accurate EE estimates using a noninvasive armband that collects supplemental data from heat sensors rather than from HR. Although HR data can provide useful physiological information, HR monitoring is typically not reliable across extended periods. The artifact and missing data common in HR monitoring are almost never an issue with the SWA.

A novel aspect of the SWA is that the multiple sensors may also enhance the accuracy of estimates for intermittent activities performed throughout the day. Most studies have focused on point estimates of specific activities, but the plot shown in Figure 1 demonstrates that the SWA estimates closely mirror the measured values during the transition periods between activities. The ability of the SWA to estimate the EE cost of light activities and to adjust to changes in the intensity of the activity is possibly due to the contributions from the heat-related sensors. These sensors may pick up subtle changes that cannot be inferred from accelerometer or HR information. Because the algorithms are proprietary, it is not possible to directly determine the unique contributions from the different sensors. The output from the individual sensors is available, so future work could examine the relative changes in values from the various data channels used in the SWA.

In conclusion, this study demonstrates that the recently developed algorithms (version 6.1) used in the SWA produce valid estimates of EE for assessing physical activity in children. The nonsignificant difference in EE across the full 41-min protocol is important because it provides an indicator of the ability of the monitor to assess EE across extended periods (reflective of ability to assess habitual free-living daily activity). We observed significant reductions in the differences between EE estimates with the new algorithms (0.13 kcal·min−1) compared with the old one (−0.39 kcal·min−1). Limits of agreement from the Bland-Altman plots (mean ± 2 SD) were also improved with the new algorithms (new = 1.04, −0.78 vs old = 0.69, −1.46).

These results support the iterative data-based efforts by BodyMedia to make incremental improvements in the accuracy of the algorithms. Investigators using traditional accelerometry-based monitors (e.g., MTI Actigraph) rely on published equations to interpret and process data, but the diverse array of available equations has contributed more confusion than clarity in the literature. BodyMedia has adopted a different model that allows periodic upgrades in pattern recognition algorithms to be released to investigators with simple software upgrades. This approach, however, also has limitations because it makes it difficult to compare results of research conducted over time. A strength of the present study is that we directly compared estimates from the previous version with those from the newly developed one to determine the differences in estimates between these versions. Researchers using the BodyMedia monitor are encouraged to report software versions and to compare estimates obtained from sequential versions to evaluate potential improvements over time. The algorithms may or may not change with new releases of the software, so researchers are encouraged to check for changes in the algorithms through the BodyMedia Web site.

A limitation of the study is that we did not evaluate other higher-intensity (vigorous) activities such as running. The goal of the study was to evaluate the accuracy of the monitor under simulated real-world conditions, so emphasis was placed on selecting activities that were more reflective of a child's typical activity level. Most validation studies have focused on assessing specific physical activities, but for energy balance research, it is important for monitors to be able to assess a range of intensities including sedentary and light activities because these account for the bulk of the day. Additional work is clearly needed to examine the validity of the SWA across a wider range of activities, over longer periods, and in different populations (including possible differences between normal weight and overweight individuals).

Funding for the project was provided by Bodymedia Inc.

The results of the present study do not constitute endorsement by ACSM.


1. Arvidsson D, Slinde F, Larsson S, Hulthen L. Energy cost of physical activities in children: validation of SenseWear Armband. Med Sci Sports Exerc. 2007;39(11):2076-84.
2. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;8:307-10.
3. Corder K, Brage S, Mattocks C, et al. Comparison of two methods to assess PAEE during six activities in children. Med Sci Sports Exerc. 2007;39(12):2180-8.
4. Corder K, Brage S, Wareham NJ, Ekelund U. Comparison of PAEE from combined and separate HR and movement models in children. Med Sci Sports Exerc. 2005;37(10):2076-84.
5. Crouter SE, Clowers KG, Bassett DR Jr. A novel method for using accelerometer data to predict energy expenditure. J Appl Physiol. 2006;100:324-31.
6. Dorminy CA, Choi L, Akohoue SA, Chen KY, Buchowski MS. Validity of a multisensor armband in estimating 24-h energy expenditure in children. Med Sci Sports Exerc. 2008;40(4):699-706.
7. Freedson P, Pober D, Janz KF. Calibration of accelerometer output for children. Med Sci Sports Exerc. 2005;37(11):S523-30.
8. Fruin ML, Rankin JW. Validity of a multi-sensor armband in estimating rest and exercise energy expenditure. Med Sci Sports Exerc. 2004;36(6):1063-69.
9. Jakicic JM, Marcus M, Gallagher KI, et al. Evaluation of the SenseWear Pro Armband to assess energy expenditure during exercise. Med Sci Sports Exerc. 2004;36(5):897-904.
10. Jakicic JM, Winters C, Lagally K, Ho J, Robertson RJ, Wing RR. The accuracy of the TriTrac-R3D accelerometer to estimate energy expenditure. Med Sci Sports Exerc. 1999;31(5):747-54.
11. King GA, Torres N, Potter C, Brooks TJ, Coleman KJ. Comparison of activity monitors to estimate energy cost of treadmill exercise. Med Sci Sports Exerc. 2004;36(7):1244-51.
12. McClain JJ, Welk GJ, Wickel EE, Eisenmann JC. Accuracy of energy expenditure estimates from the Bodymedia Sensewear® Pro 2 Armband. Med Sci Sports Exerc. 2005;37(5 suppl):S116-7.
13. Mignault D, St Onge M, Karelis AD, Allison DB, Rabasa-Lhoret R. Evaluation of the portable HealthWear armband. Diabetes Care. 2005;28:225-7.
14. Papazoglou D. Evaluation of a multisensor armband in estimating energy expenditure in obese individuals. Obesity. 2006;14:2217-23.
15. Pober DM, Staudenmayer J, Raphael C, Freedson PS. Development of novel techniques to classify physical activity mode using accelerometers. Med Sci Sports Exerc. 2006;38(9):1626-34.
16. St Onge M, Mignault D, Allison DB, Rabasa-Lhoret R. Evaluation of a portable device to measure daily energy expenditure in free-living adults. Am J Clin Nutr. 2007;85:742-9.
17. Troiano RP. A timely meeting: objective measurement of physical activity. Med Sci Sports Exerc. 2005;37(11):S487-9.
18. Treuth MS, Schmitz K, Catellier DJ, et al. Defining accelerometer thresholds for activity intensities in adolescent girls. Med Sci Sports Exerc. 2004;36(7):1259-66.
19. Trost SG, Way R, Okely AD. Predictive validity of three Actigraph energy expenditure equations for children. Med Sci Sports Exerc. 2005;38(2):380-7.
20. Welk GJ, McClain JJ, Eisenmann JC, Wickel EE. Field validation of the MTI Actigraph and BodyMedia Armband monitor using the IDEEA monitor. Obesity. 2007;15:918-28.
21. Wickel EE, Eisenmann JC, Welk GJ. Predictive validity of an age-specific MET equation among youth of varying body size. Eur J Appl Physiol. 2007;101:555-63.


© 2009 The American College of Sports Medicine