Low levels of physical activity (PA) are believed to contribute to the current increase in childhood obesity (12), and may be independently associated with cardiovascular risk factors and the related metabolic disturbances (11). The impact of physical activity on these risk factors has not been extensively investigated in young children for lack of an appropriate means of measurement.
There are several approaches to the measurement of PA, but few are suitable for use in young children. Observation techniques rely on a third party recording the child throughout the day, but this is a costly approach in terms of observer time (10). Children are unlikely to make accurate self-report assessments (17), especially when they are as young as five. Heart rate monitoring has rarely been used successfully, as small children find the equipment uncomfortable to wear (14). Electronic activity monitors can record intensity, frequency, and duration of PA with minimal intervention. The CSA (Computer Science & Applications, Inc., MTI, Fort Walton Beach, FL) device is small, tamperproof, and can store data for a 7-d period, the duration recommended for the assessment of PA (15).
Validation of activity monitors for clinical studies is important. The CSA activity monitor has been evaluated as a tool for recording energy expenditure by comparing the scores recorded with indirect calorimetry. Correlation coefficients between these two measures have been impressive, ranging from 0.82 to 0.94 (7,9,16), and match those obtained with other accelerometer-based activity monitors (18). Treadmill studies have concluded that the CSA monitor is reproducible (16), but as treadmills incur their own error they cannot reveal the intrinsic performance of the monitor. It is important to identify and quantify the different sources of variation present in any method of measurement, as steps can then be taken to reduce them. Also, if the measurement error intrinsic to the activity monitor is relatively small, the researcher can focus on other sources of variation. These include “position worn on the body” (7), “day-to-day” (5,6,15), and “week-to-week” variation. The only report to date of a technical trial on a motion sensor involved a simple mechanical pedometer and showed it to be reproducible on test-retest with a coefficient of variation (CV) = ± 1.5% (1). None of the studies using the CSA activity monitor has previously sought to establish its technical reliability. This study has completed a series of technical/bench trials to determine the intra- and inter-instrument variability of the CSA activity monitor as a motion-sensing instrument, and the effects of wearing the monitor at off-axis angles.
The EarlyBird Study is a nonintervention, prospective cohort study which aims to describe the factors, including PA, responsible for the development of obesity and its related metabolic disturbances. The trials described here were devised to measure the technical reliability of the 23 CSA monitors that would be used in the study. Ethical permission was obtained from the local Research Ethics Committee and informed written consent from the parents of the children involved.
Reliability incorporates accuracy and precision (reproducibility), both intra- and inter-instrument. The experiments were designed to reproduce conditions that might be obtained in the field, specifically variation in acceleration and position of the accelerometer relative to the vertical. The principal requirement was to impart a stable, controlled change of acceleration to the device along its axis of measurement. The conditions were achieved by securing the activity monitors to a horizontal turntable with mounts that allowed them to be tilted relative to the plane of the turntable. This arrangement is equivalent to the off-vertical position of the device that could be worn on a subject’s waist/hip.
The CSA activity monitor.
The CSA monitor (model 7164, Computer Science and Applications Inc, Shalimar, FL) is a uniaxial accelerometer that measures acceleration using a cantilevered beam with a mass attached to the unfixed end. The rate of change of acceleration is sampled 10 times per second and the data, in “monitor units,” summed into epochs. An epoch duration of one min is generally used in the field, and this setting was chosen for the trials reported here. The data were downloaded onto a PC using an infrared optical serial connection and reader interface unit. The monitor weighs 43 g and measures 50 × 41 × 15 mm. It is worn around the waist on an elasticized belt that passes through two slots on the sides of the outer casing, helping to maintain the axis of measurement parallel to that of the wearer. The monitor measures acceleration in a single plane, so the effect of wearing it vertically on the trunk (waist or hip) is to measure change in acceleration of the whole body in the vertical direction.
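The epoch summation described above can be sketched as follows. This is a minimal illustration, assuming that each epoch value is the plain sum of the 10 Hz samples falling within it; the function name `sum_into_epochs` and the sample values are hypothetical, and the monitor's internal filtering and scaling are not modeled.

```python
def sum_into_epochs(samples, sample_rate_hz=10, epoch_seconds=60):
    """Sum raw samples into fixed-length epochs, dropping any incomplete trailing epoch."""
    per_epoch = sample_rate_hz * epoch_seconds  # 600 samples in a 1-min epoch at 10 Hz
    n_full = len(samples) // per_epoch
    return [sum(samples[i * per_epoch:(i + 1) * per_epoch]) for i in range(n_full)]

# Hypothetical input: two full minutes of constant readings plus 30 s of extra data.
samples = [1] * 600 + [2] * 600 + [5] * 300
epochs = sum_into_epochs(samples)
print(epochs)  # [600, 1200]
```

The trailing 30 s is discarded here because it does not fill a complete epoch.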
The turntable measured 202 mm in diameter, with four positions to secure the activity monitors, one in each quadrant. Each of the activity monitors was mounted with its axis of measurement along the radius of the turntable. The distance from the center of the turntable (pivot point) to the cantilever was a constant 60 mm. The turntable was driven by a DC electric motor via a power amplifier. The amplifier input was provided by a signal generator that allowed amplitude and frequency of the drive signal to be accurately and reproducibly set. The amplitude and frequency corresponded respectively to constant acceleration and rate of change of acceleration, sensed by the devices when attached to the turntable. Thus, changes could be simulated for acceleration at different rates of change and based on different baseline accelerations.
The tests were all carried out using a sinusoidal waveform, with a fixed positive offset voltage, as the driving signal. The baseline and range of accelerations were respectively proportional to the DC (offset) and AC (amplitude) voltages set by the signal generator, and the rate of change of acceleration by the frequency. Sinusoidal waveforms were used to provide smooth changes in amplitude and direction of the acceleration, although this meant that the rate of change of acceleration was continuously changing. Triangular waveforms could have been used that would, theoretically, have generated constant change of acceleration with a sudden change at each peak and trough. The sinusoidal waveform more accurately mimics the real life situation where most changes of acceleration are smooth and instantaneous changes are rare, given the “damping” effect of the body’s musculoskeletal system.
To test the effect of changes of acceleration that are not along the measurement axis of the monitor, readings were taken with the device at an angle to the turntable. There were two ways in which the monitor could have been angled to the axis. One would have been to rotate it on the face of the turntable about its radius, equivalent to the device turning on the attachment belt. The method adopted, however, was to tilt the monitor up from the turntable along the radius, equivalent to the body surface not being vertical at the point of attachment. This was the more likely way the device would be angled in practice. The test was only intended to show whether the variation was greater or less than mathematically predicted, not to produce a correction factor. The system was designed to allow three tilt angles from the horizontal (15°, 30°, and 45°) in addition to the horizontal (0°) position, as shown in Figure 1. The tilted mounts were fitted so that the pivot to cantilever distance was the same as for the horizontal (0°) position. Angulation of the device allowed us to compare the monitor’s recordings resulting from nonaxial movement with those expected of a theoretically zero-mass cantilever with a point-source mass attached. In the case of nonaxial movement, the deflection would be proportional to the cosine of the angle between the axis of acceleration and the axis for maximum deflection.
The monitors were purchased new, and the tests were carried out within the first 8 wk of the children wearing them in the field. The turntable was allowed to rotate for at least 15 min at the beginning of each testing session to ensure that the mechanics were sufficiently warmed up. For all tests, the monitors were secured to the turntable using a rigid attachment that also located them accurately every time. The identification notch was always set facing outwards and upwards. This consistency was essential as the sensor element is not central within the device, thus the distance from the pivotal point (center of the turntable) to the sensor would be different if the notch on the monitor were facing otherwise.
Tests were carried out at two different speed settings, fast (baseline 120 rev/min, range ± 84 rev/min) and medium (baseline 72 rev/min, range ± 60 rev/min). The settings were established after a small trial that timed six adults and six 5-yr-old children walking and running over a set distance, three times each, wearing an activity monitor. Adults running at speeds of approximately 9.0–13.0 mph corresponded to monitor results ranging from 10,000–13,000 units per minute epoch. Children walking at speeds of 2.5–3.5 mph corresponded to monitor results ranging from 3,000–5,000 units. The trial therefore tested the monitors at outputs corresponding to high intensity and medium intensity of movement.
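To give a sense of scale for these settings, the sketch below converts rotation speed into the acceleration along the radius of the turntable. It assumes that the acceleration sensed by a radially mounted monitor is purely centripetal (a = ω²r) and uses the 60 mm pivot-to-cantilever distance given earlier; the function name `radial_acceleration` is hypothetical, and the modulation frequency of the drive signal, which is not reported, is ignored.

```python
import math

RADIUS_M = 0.060  # pivot-to-cantilever distance from the text (60 mm)

def radial_acceleration(rev_per_min, radius_m=RADIUS_M):
    """Centripetal acceleration (m/s^2) at a given rotation speed (assumed model)."""
    omega = 2 * math.pi * rev_per_min / 60.0  # angular velocity in rad/s
    return omega ** 2 * radius_m

# (baseline, range) in rev/min for the two speed settings
settings = {"fast": (120, 84), "medium": (72, 60)}
for name, (baseline, spread) in settings.items():
    speeds = (baseline - spread, baseline, baseline + spread)
    print(name, [round(radial_acceleration(s), 2) for s in speeds])
```

Under this assumption the fast baseline of 120 rev/min corresponds to roughly 9.5 m/s², just under 1 g.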
Eight of the 24 monitors were selected at random and tested in two batches of four. Each was tested three times on each of the four quadrants (all at 0°) on the turntable at two different speeds (12 runs at each speed altogether). Each test lasted 10 min, but the first and last minutes were ignored in the analysis to ensure sampling always captured eight full epochs. One monitor failed and was returned to the manufacturers for repair.
The remaining 23 of the original 24 monitors were tested in six batches of four (one batch had three working monitors and one broken monitor to keep the turntable balanced). Each batch was tested on a different day as we only ever had four available monitors not being worn by a child on any one day. Each monitor was tested three times on the same quadrant of the turntable (the intra-instrument tests showed there to be no quadrant effect), alternating between high and medium speeds. Again, each test lasted 10 min to obtain 8 min of full data.
Six randomly chosen monitors were tested (two batches of three), again at both speeds. The turntable was modified so that one position remained horizontal (0°) but the other three were tilted at 15°, 30°, and 45°. All monitors had already been tested at 0°, so a nonsampling monitor was secured in the horizontal position while three sampling monitors filled the three tilted positions.
Data were captured and sorted in Excel v4, then imported into SPSS v9 for analysis. The coefficient of variation (CV = standard deviation/mean) was calculated as one measure of variability, and intra-class correlation coefficients (ICC) (13) were calculated as another. The CV expresses the standard deviation (SD) as a percentage of the sample mean, eliminating the potential problem of heteroscedasticity when comparing variations at the two different speeds, and it was used in conjunction with ICC for ease of interpretation. The lower the CV, the better the repeatability. The ICC was analyzed as a two-way random effects model. An ICC close to 1 represents good repeatability. Level of significance was set at P < 0.05 for all hypothesis testing. Analysis at each speed was undertaken separately, but identically.
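The text specifies a two-way random effects model but not the exact form of ICC that was computed. As a minimal sketch, the code below assumes the single-measure, absolute-agreement form, often written ICC(2,1), derived from the two-way ANOVA mean squares. The function name `icc_2_1` and the scores are hypothetical and are not data from this study.

```python
def icc_2_1(table):
    """Single-measure, absolute-agreement ICC for a two-way random effects layout.

    table: list of rows (one per target), each holding one score per repeat.
    """
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    # Partition the total sum of squares into rows, columns, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical activity scores: rows are targets, columns are repeats.
scores = [
    [10050, 10110, 9980],
    [10400, 10390, 10450],
    [9800, 9760, 9850],
    [10200, 10260, 10190],
]
print(round(icc_2_1(scores), 3))
```

Because the row mean square sits in both numerator and denominator, the ICC falls toward zero when the rows differ little relative to the error and column terms, even if the absolute variation is small.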
Univariate analysis of variance was used to determine whether there were significant differences between the four quadrants. There were two factors: one fixed (quadrant, four levels) and one random (monitor number, seven levels).
For each of the seven monitors, the mean activity score was calculated for each of the 12 runs. The CV for each monitor was calculated by dividing the SD of the 12 means by the overall mean of the 12 runs. The mean CV and 95% CI (confidence interval) across all seven monitors were also calculated. The ICC was calculated to test whether the activity scores were consistent between runs.
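The intra-instrument calculation can be sketched as below, assuming each run is a list of ten 1-min epoch scores from which the first and last are discarded. The function names and the example scores are hypothetical. The 95% CI is taken here to be t-based, which the text does not state; the critical value must be supplied by the caller (about 2.447 for the 6 degrees of freedom that seven monitors give).

```python
from statistics import mean, stdev

def trim_run(epochs):
    """Drop the first and last epochs of a 10-min test, keeping 8 full epochs."""
    return epochs[1:-1]

def monitor_cv(runs):
    """CV (%) for one monitor: SD of the run means over the mean of the run means."""
    run_means = [mean(trim_run(r)) for r in runs]
    return 100.0 * stdev(run_means) / mean(run_means)

def mean_cv_with_ci(cvs, t_crit):
    """Mean CV with a t-based 95% CI; t_crit must match len(cvs) - 1 degrees of freedom."""
    m = mean(cvs)
    half_width = t_crit * stdev(cvs) / len(cvs) ** 0.5
    return m, (m - half_width, m + half_width)

# Hypothetical monitor: three runs whose partial first and last epochs are ignored.
runs = [
    [4000] + [10000] * 8 + [3000],
    [4200] + [10100] * 8 + [2500],
    [3900] + [9900] * 8 + [3100],
]
print(monitor_cv(runs))  # 1.0
```

With equal-length runs, the mean of the run means equals the overall mean of all retained epochs, so the two readings of "overall mean" coincide.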
Mean activity scores were calculated from the three 8-min runs for each of the 23 monitors. Univariate analysis of variance was used to test whether there were significant differences between monitors and/or batches. The overall mean activity score (with 95% CI) was calculated from all 23 monitors. An inter-instrument CV was calculated from the mean and SD of all monitor means. ICC were used to test whether the mean activity scores were consistent between monitors.
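A simplified sketch of the between-monitor comparison follows. It assumes a one-way layout with monitor as the only factor, which is a simplification of the SPSS analysis described above because the batch factor is left out. The function names and scores are hypothetical.

```python
from statistics import mean, stdev

def inter_instrument_cv(monitor_means):
    """CV (%) across monitors: SD of the monitor means over their overall mean."""
    return 100.0 * stdev(monitor_means) / mean(monitor_means)

def one_way_f(groups):
    """F statistic for a one-way ANOVA: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical run means for three monitors, three runs each.
groups = [
    [9890, 9900, 9910],
    [9990, 10000, 10010],
    [10090, 10100, 10110],
]
monitor_means = [mean(g) for g in groups]
print(inter_instrument_cv(monitor_means))  # 1.0
print(one_way_f(groups))
```

In this example a between-monitor CV of only 1% still gives a very large F statistic, because the runs within each monitor are so tightly grouped.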
The means and 95% CI for the six monitors were calculated for each angle (0°, 15°, 30°, and 45°), and at both speeds. Activity scores for these four positions were expressed as a percentage of the score obtained in the horizontal (0°) position. Theoretical scores, again as percentages, were also calculated by the following formula:

$$\text{theoretical score (\%)} = 100 \times \cos\theta$$

where θ = 15°, 30°, and 45° for the purposes of comparison with test results.
Table 1 shows no statistically significant differences in intra-instrument variation for the seven randomly-selected monitors at either speed (fast: CV 0.65%–1.26%, mean CV 0.81%; medium: CV 1.03%–1.83%, mean CV 1.40%). This was reinforced by high ICC at both speeds (fast: ICC = 0.93, F = 153.3, df = 66, P < 0.001; medium: ICC = 0.84, F = 63.7, df = 66, P < 0.001). There was no quadrant effect: recordings by the same seven monitors, mounted in different sections of the turntable, showed no statistically significant differences in their means at either speed (fast: F = 0.69, df = 56, P = 0.569; medium: F = 1.96, df = 56, P = 0.157).
The overall mean score of all 23 monitors is reported in Table 2a. As the differences between the batches in terms of mean activity scores neared significance at both speeds (fast: F = 2.80, df = 17, P = 0.050; medium: F = 2.76, df = 17, P = 0.053), both CV and ICC were calculated for each batch separately and overall. There were also statistically significant differences between monitors in terms of mean scores (fast: F = 95.0, df = 46, P < 0.001; medium: F = 54.9, df = 46, P < 0.001). Although the pooled CV for all monitors was good for both speeds (fast: 4.6% and medium: 5.0%), it was the respective ICC of 0.30 (F = 10.8, df = 44, P < 0.001) and −0.02 (F = 0.63, df = 44, P = 0.535) that reflected these batch differences. However, the ICC based on each batch of four monitors separately were never less than 0.87 at fast and 0.71 at medium. (See Table 2b.)
Table 3 shows that the devices read on average 6% lower when angled at 15°, 16% lower when angled at 30°, and 29% lower when angled at 45°. The signals recorded by the activity monitor should, theoretically, diminish with the cosine of the angle between the vertical and the axis of the device. However, this would be strictly true only if the accelerometer within the activity monitor used a cantilever arm of infinitely small thickness with a point source mass at its end. As the real accelerometer uses a flat plate beam of finite thickness with a weight attached to the end, the effect of off-axis accelerations is to bend the beam in two planes simultaneously. In practice, the recordings for angles between 0° and 30° were lower than theoretically predicted by an average 3%, but the results at 45° were within the expected range. The discrepancy was consistent across all of the devices tested and at both speeds.
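The comparison with theory can be reproduced from the average reductions quoted above. The sketch below assumes the ideal-cantilever model, in which the score falls with the cosine of the tilt angle; the function name `theoretical_percent` is hypothetical, and the observed values are the rounded averages from the text rather than the full Table 3 data.

```python
import math

def theoretical_percent(theta_deg):
    """Predicted score as a percentage of the 0-degree score for an ideal cantilever."""
    return 100.0 * math.cos(math.radians(theta_deg))

# Average percentage reductions reported in the text for each tilt angle.
observed_drop = {15: 6, 30: 16, 45: 29}

for theta, drop in observed_drop.items():
    predicted_drop = 100.0 - theoretical_percent(theta)
    # Columns: angle, predicted drop (%), observed drop (%), observed minus predicted
    print(theta, round(predicted_drop, 1), drop, round(drop - predicted_drop, 1))
```

Under this model the observed reductions at 15° and 30° exceed the cosine prediction by about 2.6 percentage points each, consistent with the average shortfall of about 3% noted above, while the 45° value lies within 0.3 percentage points of the prediction.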
The proportion of overweight and obese children is rising rapidly in the UK as elsewhere (2,4), and physical inactivity is thought to be a key factor (12). In view of the importance of understanding the role of physical activity in reducing fat mass, both in young children and adults (3,8), it is perhaps surprising that none of the many studies using the CSA activity monitor has sought to report its technical performance. Although the treadmill can provide a ‘working’ validation of the monitor, the overall variance does not discriminate between treadmill variability, biological variability, and the variability of the monitor itself. Quantifying the variation of a measurement allows for better interpretation of any physical activity related findings. Physical activity group differences or correlations between physical activity and other factors may not be deemed statistically significant because of the noise within the measure.
Reliability has been analyzed here by two different statistical methods: coefficient of variation (CV) and intra-class correlation coefficient (ICC). Both methods are applicable, but neither in isolation. The ICC seems simple to interpret; the closer to 1, the better the reliability. Yet it cannot be interpreted clinically because the ICC gives no indication of the magnitude of disagreement between measurements (13). Furthermore, the ICC is affected by variation within each monitor, which is not ideal when calculating the variation between monitors in batches of four. The CV, on the other hand, is not affected by within-monitor variation, as it is based on the means of the runs. Bland and Altman repeatability tests were not used because they become very complex if there are more than two measurers or runs. They are primarily used for test-retest reliability, whereas the nature of a laboratory trial allows for many repeat runs.
The reliability trials reported here reveal impressive precision. Intra-instrument coefficient of variation did not exceed 2%, and the inter-instrument coefficient of variation never exceeded 5%, both within the limits of precision generally accepted for biomedical studies. However, the intra-class correlation coefficients based on all 23 monitors were poor for both speeds because the variation between batches of monitors was large relative to the low intra- and inter-monitor variation. As each batch was tested on a different day, batch variation is likely to reflect slight day-to-day variation of the testing equipment. This was confirmed by the high between-monitor ICC of each batch of four monitors analyzed separately.
The activity monitor is intended to be worn vertically on the waist so the cantilever is at 90° to the force acting upon it. Ideally, every child would wear the monitor at exactly the same angle, but in practice this is not achievable. However, our tests showed a loss in mean score of only 6% when the monitor was tilted from 0° to 15°. If care is taken to secure the belt firmly around the waist, this source of variance can be kept to a minimum.
The test speeds chosen corresponded to the range of physical activity from a child walking (medium speed) to an adult running (fast speed). Sinusoidal variation represents consistent change in acceleration, whereas in the field changes in acceleration will often be irregular. These tests did not set out to assess the robustness of the monitors, but only one of the original 24 monitors failed and was returned to the manufacturer for repair. The remaining 23 have been in constant use in the field for over 12 months with no further mechanical problems. This report does not evaluate the stability of the monitors over time, but regular quality-control checks are made during the course of the EarlyBird study and will be reported at a later stage.
In summary, the present trials have allowed us to examine the intra- and inter-instrument reliability of the CSA activity monitor under conditions that might be obtained in the field. The type of study for which the activity monitors are being used will determine the optimum way to assign them to individuals. If the study were primarily interested in measuring longitudinal change in physical activity, it would be ideal if each child always wore the same monitor. If the study were interested in group changes, such as cross-sectional designs, then it would not matter which monitor each child wore. However, it is not yet known how much the performance of each monitor will change after a year in the field. Once this has been established we will be better qualified to advise on whether or not wearing the same monitor is important.
The authors are grateful to Roche Products, Smith’s Charity (London), The Child Growth Foundation, and the London Law Trust for their generous support of the EarlyBird Study.
Address for correspondence: Brad Metcalf, University Medicine, Level 7, Derriford Hospital, Plymouth PL6 8DH, UK; E-mail: [email protected]
1. Bassey, E. J., H. M. Dallosso, P. H. Fentem, J. M. Irving, and J. M. Patrick. Validation of a simple mechanical accelerometer (pedometer) for the estimation of walking activity. Eur. J. Appl. Physiol. Occup. Physiol. 56: 323–330, 1987.
2. Bundred, P., D. Kitchiner, and I. Buchan. Prevalence of overweight and obese children between 1989 and 1998: population based series of cross sectional studies. BMJ 322: 326–328, 2001.
3. Despres, J. P., M. C. Pouliot, S. Moorjani, et al. Loss of abdominal fat and metabolic response to exercise training in obese women. Am. J. Physiol. 261: E159–167, 1991.
4. Dietz, W. H. The obesity epidemic in young children: reduce television viewing and promote playing. BMJ 322: 313–314, 2001.
5. Gretebeck, R. J., and H. J. Montoye. Variability of some objective measures of physical activity. Med. Sci. Sports Exerc. 24: 1167–1172, 1992.
6. Janz, K. F. Validation of the CSA accelerometer for assessing children’s physical activity. Med. Sci. Sports Exerc. 26: 369–375, 1994.
7. Melanson, E. L. Jr., and P. S. Freedson. Validity of the Computer Science and Applications, Inc. (CSA) activity monitor. Med. Sci. Sports Exerc. 27: 934–940, 1995.
8. Moore, L. L., U. S. Nguyen, K. J. Rothman, L. A. Cupples, and R. C. Ellison. Preschool physical activity level and change in body fatness in young children. The Framingham Children’s Study. Am. J. Epidemiol. 142: 982–988, 1995.
9. Nichols, J. F., C. G. Morgan, L. E. Chabot, J. F. Sallis, and K. J. Calfas. Assessment of physical activity with the CSA accelerometer: laboratory versus field validation. Res. Q. Exerc. Sport 71: 36–43, 2000.
10. Pate, R. R. Physical activity assessment in children and adolescents. Crit. Rev. Food Sci. Nutr. 33: 321–326, 1993.
11. Poehlman, E. T., R. V. Dvorak, W. F. Denino, M. Brochu, and P. A. Ades. Effects of resistance training and endurance training on insulin sensitivity in nonobese, young women: a controlled randomized trial. J. Clin. Endocrinol. Metab. 85: 2463–2468, 2000.
12. Prentice, A. M., and S. A. Jebb. Obesity in Britain: gluttony or sloth? BMJ 311: 437–439, 1995.
13. Rankin, G., and M. Stokes. Reliability of assessment tools in rehabilitation: an illustration of appropriate statistical analyses. Clin. Rehabil. 12: 187–199, 1998.
14. Rowlands, A. V., R. G. Eston, and D. K. Ingledew. Relationship between activity levels, aerobic fitness, and body fat in 8-to 10-yr-old children. J. Appl. Physiol. 86: 1428–1435, 1999.
15. Trost, S. G., R. R. Pate, P. S. Freedson, J. F. Sallis, and W. C. Taylor. Using objective physical activity measures with youth: how many days of monitoring are needed? Med. Sci. Sports Exerc. 32: 426–431, 2000.
16. Trost, S. G., D. S. Ward, S. M. Moorehead, P. D. Watson, W. Riner, and J. R. Burke. Validity of the Computer Science and Applications (CSA) activity monitor in children. Med. Sci. Sports Exerc. 30: 629–633, 1998.
17. Welk, G. J., C. B. Corbin, and D. Dale. Measurement issues in the assessment of physical activity in children. Res. Q. Exerc. Sport 71: S59–S73, 2000.
18. Westerterp, K. R. Physical activity assessment with accelerometers. Int. J. Obes. Relat. Metab. Disord. 23: S45–49, 1999.