
APPLIED SCIENCES

Activity Recognition in Youth Using Single Accelerometer Placed at Wrist or Ankle

MANNINI, ANDREA; ROSENBERGER, MARY; HASKELL, WILLIAM L.; SABATINI, ANGELO M.; INTILLE, STEPHEN S.

Medicine & Science in Sports & Exercise: April 2017 - Volume 49 - Issue 4 - p 801-812
doi: 10.1249/MSS.0000000000001144

Abstract

Traditional methods of measurement of physical activity include using a wearable device on the hip over a 7-d period of activity to classify participants into activity categories. Accelerometer-based activity monitors that output activity “count” values aggregate the motion of the device over a short window of time (“epoch”). The count value does provide a useful summary of overall motion along with some indication of gross motion and ambulation when placed at the hip. A recent trend, however, is to move the activity monitor from the hip to the wrist location to increase wear-time compliance and capture sleep-related behavior, as in the UK BioBank and US National Health and Nutrition Examination Survey studies (26,30). This trend extends to physical activity estimation in youth using wrist-worn accelerometers (7). Interpretation of wrist-based count values is challenging due to hand gesturing, which may confound the mapping between overall body ambulation and motion of the sensor (20). More detailed information about wrist motion captured in raw accelerometer data sampled at a high rate, however, may permit automatic identification of classes of activity types, such as ambulation versus sedentary behavior. This information could be used directly to characterize activity, or perhaps to improve accuracy of accelerometer-based energy expenditure estimation (1,8).

Prior work detecting activity type from raw accelerometer data from a variety of sensor locations on the body has primarily focused on detecting the activity of adults (e.g., 2,16,34). Recent work on the detection of activity type from raw accelerometer data on the wrist has also focused mainly on adults (16,24,34), with the only exception being the work by Trost et al. (29). In this study, we tested whether an activity recognition algorithm based on ankle or wrist raw accelerometer data, previously developed for and validated in adults (16), is applicable to youth age 11 to 15 yr. The type, amount, and intensity of activity of children and youth may differ from those of adults (5,27). Algorithms developed for adults may not work well on children for either of two reasons: activities that children perform may not be represented in the adult models, and features used by the models for adults may not adequately capture important distinctions between activities in children if children perform those activities in dramatically different ways. Although physical activity assessment in children and youth using activity count values obtained using accelerometry is common (7,21), researchers are still exploring whether the same models that worked for adults can be applied to children.

Activity Recognition in Youth: Related Work

Previous studies involving automatic activity recognition in youth from accelerometers are summarized in Table 1. Four studies use activity count values. One such study using classification features extracted from activity counts (1-s epochs) gathered from 41 youth age 10.8 ± 1.3 yr classified activity into 10 categories (stationary, biking, crawling, walking, scooter, horseback riding, jumping and floor exercise) with 67% accuracy using monitor data from the hip and wrist simultaneously (23). De Vries et al. (9) tested single-sensor solutions on 58 participants, using ankle and hip sensors worn by 9- to 12-yr-old youth that output 1-s count values. When classifying seven activity types (sitting, standing, walking, running, rope skipping, playing soccer, and cycling) from 20 min of data per person, they obtained an overall accuracy of 68% using ankle data and 77% using hip data. More recently, Trost et al. attempted the classification of five activity types (sedentary, walking, running, light intensity household activities or games, moderate-to-vigorous intensity games, or sports) from activity count data with 1-s epochs, evaluating the algorithm on data from 100 participants (5 to 15 yr), with 2 min of classified data for each activity (28). An overall accuracy of 88.4% from a hip-worn sensor was reported. The most recent article to include 1-s epoch count processing for activity recognition was by Hagenbuchner et al. (12), who studied 11 preschool children (3 to 6 yr, with a total of 264 min of classified data). Four classes were recognized (sedentary activities, light activities, moderate to vigorous activities, walking, and running), with 82.6% accuracy.

TABLE 1: Studies involving automatic activity recognition in youth from accelerometers.

An alternative approach to using count values is to use the raw accelerometer data and compute a richer set of features that may help differentiate specific activities. Four recent studies have explored this approach for activity type detection in children. In Hikihara et al. (13), 32 Hz data from a waist-worn triaxial accelerometer were used to distinguish between two nonlocomotive and locomotive activities in 68 children age 6–12 yr with approximately 1 h of data from each child, classifying 99.1% of examples correctly. Nam and Park (19) proposed a method using a waist-worn accelerometer and a barometric pressure sensor to classify 11 classes (wiggling, rolling, standing still, standing up, sitting down, walking, toddling, crawling, climbing up, climbing down, and stopping) of 10 1.3- to 2.4-yr-old toddlers. With a total of 50 h of acquired data, they obtained 88.3% classification accuracy using the accelerometer alone and 98.4% using both the accelerometer and barometric pressure sensors. Del Rosario et al. (10) performed activity classification using a smartphone-embedded accelerometer, gyroscope, and barometric pressure sensor, evaluating performance on 20 young adults (21.9 ± 1.7 yr) and 37 older adults (83.9 ± 3.4 yr). The same feature set was used in both age groups to classify nine activities (stand, sit, lie, walk, walk upstairs, walk downstairs) from 10 to 30 min of data per person collected when the smartphone was kept in a person's trouser front pocket. Overall recognition accuracies of 79.9% and 82.0% for young and older adults, respectively, were reported. Finally, Trost et al. (29) proposed a method based on a wrist or hip accelerometer for recognizing 12 activities merged into seven categories (lying down, sitting, standing, walking, running, basketball, and dancing). A total of 52 children (13.7 ± 3.1 yr) were included in tests, and 2 min of each activity were classified, obtaining 91% and 88.4% accuracy on average at the hip and at the wrist, respectively.

Previous work did not test activity classification methods designed for adult activity on youth data. In this work, we fill this gap by testing an activity recognition algorithm we previously developed for adults (16) on data from youth performing a similar set of activities. The objective behind this approach is to check whether existing methods can be applied to this different age group and whether results can be improved with a general solution that effectively processes data from both age groups. Although it is theoretically possible to create algorithms tuned for all age ranges, that would require large age-specific data collection efforts, with somewhat arbitrary age cutoffs. Moreover, deployment of algorithms would be simpler in large surveillance studies if the same algorithms could be used for both adults and children. Even if the algorithms have to be tuned differently based on age, using the same set of features for adults and children might ultimately allow algorithms to be developed that adapt smoothly to different age groups, rather than using a hard (and unrealistic) age threshold to process data using two entirely different algorithms and models. Both the youth and adult data in our work were acquired simulating free-living conditions: participants were asked to do things as naturally as possible while doing a set of 23 activities for youth and 26 activities for adults classified among four general classes. Sensors were again placed at the ankle and wrist (each sensor was tested independently). Modifications to the algorithm were proposed, specifically in the selection of features computed from the raw data and fed to the classification algorithm, that allow better detection of differences in the activities being performed by the youth. The resulting method was tested on the adults' data as well as the combined data set.

METHODS

Data sets

Participants performed a set of simulated daily activities in a laboratory environment while wearing a suite of synchronized sensors, following a protocol similar to that used when collecting data from 33 adults in prior work (16). Twenty youth (12 boys and 8 girls, age 13 ± 1.3 yr) were recruited from a Stanford, California community. The Stanford University institutional review board approved the data collection protocol, and written informed consent was obtained before participation. Triaxial Wocket accelerometers (14) were secured to the wrist and ankle positions on the body using custom Velcro bands. Wockets are small, thin, and lightweight devices (43 × 30 × 7 mm, 13 g), making them particularly suitable for long-term physical activity monitoring studies. Raw acceleration data (range ± 4 g) were acquired at 90 Hz and sent using the Bluetooth wireless protocol to a smartphone. The wrist sensor was placed on the dorsal aspect of the dominant wrist midway between the radial process and the ulnar process. The ankle sensor was placed on the outside of the ankle, just above the lateral malleolus. The ankle placement site was chosen because it is an ideal site for ambulation detection (16). The wrist is a practical site for long-term monitoring, because sensors can be attached using watch-like bands, can be worn during sleep comfortably, and do not need to be removed when changing clothes.

Participants were asked to perform a guided sequence of laboratory-based physical activities and common daily activities lasting 3–5 min each. Activities were annotated during the execution of tasks using a voice recorder, and then timings on the voice recording were used to annotate start/stop times for specific activities being observed. Data and annotation were synchronized using custom software (14). Data collected from the youth using this procedure will be identified as the youth (Y) data set. Similar data collected in prior work from adults will be identified as the adult (A) data set. Table 2 summarizes the list of available activities in each data set. Activities were grouped into four classes: sedentary, cycling, ambulation, and other activities. Multitasking behaviors were not included, except for the activity “walking-carrying a load” in A. Activity categories were reduced to the four categories used in previous work on wrist/ankle activity recognition in adults (16). Activities that were done in an upright posture but that can be physically demanding, such as cleaning and wall painting, were included in other activities because they did not seem appropriate for the sedentary, ambulation, and cycling classes. By collapsing activities into categories, the machine learning algorithms had more training data; future work with much larger data sets could explore detection of specific activities as well. To facilitate comparison with past work, new sport and leisure activities of the Y data set involving movement in the upright position and not represented in A were included in the other activities class.

TABLE 2: Activities in the adult and youth data sets, grouped into four broad classes.

Data preprocessing and feature evaluation

Three-axis raw accelerometer data were preprocessed to extract the signal magnitude (SM) vector:

\[ SM = \sqrt{acc_x^2 + acc_y^2 + acc_z^2} \]

where acc indicates the recorded raw data in g-units (1g = 9.81 m·s−2). The resulting 90-Hz SM signal was independent of the orientation of the sensing node. The SM signal was low-pass filtered using a 15-Hz cutoff fourth-order Butterworth filter to limit the bandwidth of the signal to the frequencies common in human motion (3). To classify data within the four defined activity classes, the SM data were divided into 12.8-s nonoverlapping windows. This window size was proposed by Zhang et al. (34) and also applied in Mannini et al. (16). Although prior work has shown that other window sizes (e.g., 4 s) can be used with only modest degradation of performance (16), using the same window length as in two prior studies allows a direct comparison between the present results and those previously obtained for activity classification in adults (16).
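The magnitude and windowing steps described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the 15-Hz Butterworth filtering step is omitted, and the nominal 90-Hz rate is assumed to hold exactly.

```python
import math

FS_HZ = 90        # nominal sampling rate reported in the text
WINDOW_S = 12.8   # window length reported in the text


def signal_magnitude(samples):
    """Return the orientation-independent magnitude of (x, y, z) samples in g."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def split_windows(sm, fs_hz=FS_HZ, window_s=WINDOW_S):
    """Split SM into nonoverlapping windows; a trailing partial window is dropped
    (an assumption of this sketch)."""
    n = int(round(fs_hz * window_s))  # 1152 samples at 90 Hz
    return [sm[i:i + n] for i in range(0, len(sm) - n + 1, n)]
```

At the nominal rate, each 12.8-s window holds 1152 samples, so a 2500-sample recording would yield two full windows under these assumptions.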

The data sets used in this work were collected in the laboratory but include semistructured activities labeled in real time, introducing small errors in annotation of activity transitions due to reaction time and the difficulty of labeling activities when transitions occur quickly. For this reason, one window (12.8 s) was discarded before and after each label transition. Another type of annotation error is that some short activity changes during semistructured activities were not labeled at all. For example, the data set contains examples where participants stop briefly during non-treadmill walking, such as at a door that had to be opened. In such cases, even though a participant is standing still briefly, the label for the data is still ambulation. Some of these errors can be detected using the ankle acceleration recordings because, in data labeled as ambulation, the SM of the ankle sensor is expected to vary substantially. Therefore, the ankle sensor SM was used both to identify these labeling errors and to correct labels indicating ambulation. In particular, 2-s windows labeled as ambulation with a standard deviation less than 0.1g were marked as labeling errors and discarded. This value was set by empirical observation so that ambulation with a cadence of one impact per window (i.e., nearly any movement) would be detected with an ankle sensor. Data loss due to Bluetooth wireless transmission errors was handled by discarding windows with fewer than 80% of the expected number of samples at the nominal 90-Hz sampling rate. In such cases, a new window was started at the end of the data gap. Some fluctuations in the sampling rate may occur in the remaining windows due to the wireless connection. Before extracting frequency domain features, the SM in each window was linearly interpolated to obtain the same number of samples in every window.
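The three data-cleaning rules in this paragraph can be sketched as below. The helper names are hypothetical, the population standard deviation is an assumption (the text does not say which estimator was used), and the interpolation treats the received samples as evenly spaced, whereas a real pipeline would presumably use timestamps.

```python
import statistics


def has_enough_samples(n_received, fs_hz=90, window_s=12.8, min_fraction=0.8):
    """True if the window holds at least 80% of the samples expected at 90 Hz."""
    return n_received >= min_fraction * fs_hz * window_s


def looks_like_mislabeled_ambulation(ankle_sm, sd_threshold_g=0.1):
    """True if an 'ambulation' window shows almost no ankle motion (SD < 0.1 g)."""
    return statistics.pstdev(ankle_sm) < sd_threshold_g


def resample_linear(values, n_out):
    """Linearly interpolate values onto n_out evenly spaced points."""
    if n_out < 2 or len(values) < 2:
        raise ValueError("need at least two input and two output samples")
    step = (len(values) - 1) / (n_out - 1)
    out = []
    for k in range(n_out):
        pos = k * step
        lo = min(int(pos), len(values) - 2)  # left neighbour, clamped at the end
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[lo + 1] * frac)
    return out
```

For a 12.8-s window, the 80% rule corresponds to roughly 922 of the 1152 expected samples.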

Initially, the feature set defined in previous work was tested (16). Those features, listed in Table 3, encoded both temporal and frequency domain information, computed from acceleration SM. As in our prior work (16), the feature vectors (computed on every 12.8-s window) were used as input for a support vector machine (SVM) with radial basis function kernel (31). Also, as in prior work (16), the results were evaluated using cross-validation with leave-one-subject-out (LOSO). Finally, the parameters of the radial basis function used for SVM classification were retained from the previous study (upper complexity bound C = 100 and γ = 0.1) for evaluating the previous version of the algorithm and then optimized by running a grid search using classification accuracy as the optimization criterion. As described below, our initial testing suggested that the youth data set recognition could be improved by adding some additional features to the algorithm. The adult data set on which the algorithm was developed did not include sports activities, such as basketball, soccer, and tennis that were included in the youth data. To capture such activities that are more frequent in youth than in adults, the signal power at frequencies higher than 3.5 Hz, normalized by the total power, was included as a new frequency domain feature. The 3.5-Hz cutoff frequency was selected because previous studies pointed out that most of the energy of human movement during daily activities lies in the 0.3–3.5 Hz band (25). High-frequency components, which are present especially in lower limb recordings, are mainly from high impacts (4); therefore, their presence suggests that the movement being performed involves high impact, as is common in ambulation or sport.
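As an illustration of the new frequency domain feature, the sketch below estimates the fraction of signal power above 3.5 Hz in one window. It is not the authors' code: the function name is hypothetical, removing the window mean before the transform is an assumption (the text does not say whether the DC term counts toward total power), and a naive discrete Fourier transform is used only to keep the example dependency-free; an FFT routine would be the practical choice.

```python
import cmath
import math


def high_frequency_power_ratio(window, fs_hz=90.0, cutoff_hz=3.5):
    """Fraction of one-sided spectral power above cutoff_hz (0.0 for a flat signal)."""
    n = len(window)
    mean = sum(window) / n
    centered = [v - mean for v in window]  # assumption: drop the DC (gravity) term
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):  # positive-frequency bins only
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        power = abs(coeff) ** 2
        total += power
        if k * fs_hz / n > cutoff_hz:  # bin k sits at k * fs / n Hz
            high += power
    return high / total if total > 0 else 0.0
```

A window dominated by slow, smooth movement would give a ratio near 0, while impact-rich movement would push it toward 1.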

TABLE 3: Feature sets considered in this work (shaded are new relative to prior work (16)). Features indicated with check marks are those retained by the automatic selection strategy.

New features were introduced to capture information about starts and stops of short activity bouts that may be common for children (21,22). Evaluating relevant acceleration bouts within the window allows extraction of information about the activity fragmentation while keeping the same window length used in previous work (16,34). The fragmentation of the acceleration SM was evaluated as it is done in evaluating onset of electromyographic signals (18):

  • SM were rectified by subtracting a constant value corresponding to gravitational acceleration (1g) and removing the sign of the result.
  • Windowed data were then low-pass filtered (Butterworth filter, 5-Hz cutoff frequency).

A threshold (Th = 0.2g; i.e., 1.96 m·s−2) to identify active samples was then applied. In previous studies, periods with unfiltered acceleration SM lower than 0.4 m·s−2 are considered static (32). Our threshold results from preliminary observation of rectified and filtered SM; it is significantly different, because our aim is to identify with high specificity periods of relevant activity, as opposed to periods with little motion.

Four different activity fragmentation features were evaluated as follows:

  1. Fragmentation, active samples:

\[ FS = \frac{\#\text{samples\_over\_threshold}}{W} \]

where W is the window length in samples. This feature identifies the amount of activity within the window that is over the threshold, thereby providing a rough estimate of the amount of relevant activity being recorded in the window. This feature could distinguish between activities that result in relevant acceleration in most of the window and activities in which the relevant activity takes place for only a portion of the window.

  2. Fragmentation, number of activations:

\[ FAN = \frac{\#\text{threshold\_exceedings}}{FS} \]

This feature identifies the number of threshold crossings within the window (rising edges only), normalized to the number of active samples FS, thereby capturing movement fragmentation within the window. This feature could discriminate impulsive events from longer-lasting acceleration events, because it quantifies how many times within the window the acceleration passed from the inactive to the active condition.

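  3. Fragmentation, mean activation interval duration:

\[ FAM = \frac{1}{W} \cdot \frac{1}{N} \sum_{i=1}^{N} d_i \]

where d_i is the duration, in samples, of the i-th activation interval and N is the number of activation intervals in the window.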

This feature captures the mean duration of activation intervals within the window, normalized by the window length. An activation interval is defined as the number of samples between two consecutive threshold crossings. This feature provides information on movement bout fragmentation within the window that could help discriminate between activities that involve stable movements, such as those in natural walking, and those with short bouts, such as sports.

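  4. Fragmentation, activation interval duration variability:

\[ FAV = \frac{1}{W} \operatorname{std}(d_1, \ldots, d_N) \]

where d_i is the duration, in samples, of the i-th activation interval and N is the number of activation intervals in the window.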

This feature captures the standard deviation of the duration of activation intervals within the window, normalized by the window length, thereby providing information on uniformity of activation intervals within the window. This feature may help discriminate between activities with cyclic movements with a stable ratio between active and inactive phases, and more random activities. A stable cyclic movement would result in lower variability of activation intervals that repeat themselves within the window. Fast and aperiodic movements, however, such as those in recreational activities, would result in highly variable activation bouts.

If no samples over the threshold were observed, FS, FAN, FAM, and FAV were set to zero. FAV was set to zero also in the case of #threshold_exceedings less than 2.
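The four fragmentation features can be sketched from the verbal definitions above. Several choices here are assumptions rather than details confirmed by the text: the 5-Hz low-pass filter is omitted, an activation interval is taken to be a run of consecutive over-threshold samples, a run that is already active at the first sample counts as one activation, and the sample standard deviation is used for FAV. The function names are hypothetical.

```python
import statistics

TH_G = 0.2  # activity threshold reported in the text


def rectify(sm):
    """Subtract 1 g and drop the sign, as in the first preprocessing step."""
    return [abs(v - 1.0) for v in sm]


def fragmentation_features(rectified, th=TH_G):
    """Return FS, FAN, FAM, and FAV for one window of rectified SM."""
    w = len(rectified)
    active = [v > th for v in rectified]
    n_active = sum(active)
    if n_active == 0:  # no activity: all features are zero, as stated in the text
        return {"FS": 0.0, "FAN": 0.0, "FAM": 0.0, "FAV": 0.0}
    durations = []  # lengths of runs of consecutive active samples
    run = 0
    for is_active in active:
        if is_active:
            run += 1
        elif run:
            durations.append(run)
            run = 0
    if run:
        durations.append(run)
    fs = n_active / w
    fan = len(durations) / fs  # one activation per run (rising edge)
    fam = statistics.mean(durations) / w
    fav = statistics.stdev(durations) / w if len(durations) >= 2 else 0.0
    return {"FS": fs, "FAN": fan, "FAM": fam, "FAV": fav}
```

Under these assumptions, a steady cyclic movement would produce many runs of similar length (low FAV), whereas sporadic sport-like bursts would produce runs of very different lengths (high FAV).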

In summary, six new features specifically developed for activities common in children were computed and available to the classification algorithm. All features are computationally efficient, permitting real-time recognition in future systems.

Feature selection, classification, and validation strategies

In prior work, manual testing was used to assess the contribution of some group of features to overall algorithm performance (16). Here, new features were added to the feature space, and an automatic sequential forward search feature selection process (15) was adopted, where the LOSO validation output was used as the selection criterion. Sequential forward search is a suboptimal algorithm for feature selection that cannot guarantee the optimality of the selected set (15). However, it is a computationally acceptable (linear time complexity) dimensionality reduction strategy that may improve classification results by discarding redundant features before the classification step (15). SVM classifier training does not rank features (6). Therefore, feature set dimensionality reduction is effective for SVM classifiers. Reducing feature space dimensionality typically reduces the number of training examples required to obtain reliable recognition results, because there is a lower risk of overfitting examples and better generalization (15,33). Moreover, by reducing the number of features, the complexity of the classifier is reduced (fewer parameters are needed) and the computational cost of both training and real-time recognition is reduced (33). In prior work, the same feature sets were used for both the ankle and wrist algorithm training (16). Here, however, the algorithm could select different feature sets for the two sites; automatic feature selection was run on each site separately to obtain a location-specific feature set, given that signals may have different characteristics at different body sites (3,17).
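A generic sequential forward search can be sketched as follows. The scoring callback stands in for the LOSO accuracy used as the selection criterion in the text; the function name and the stopping rule (stop as soon as no candidate improves the score) are assumptions of this sketch, not details given by the authors.

```python
def sequential_forward_selection(features, score_fn):
    """Greedily add the feature that most improves score_fn(selected_features).

    score_fn is a hypothetical callback, e.g. one that trains a classifier on
    the given feature subset and returns its cross-validated accuracy.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining:
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        cand_score, cand = max(scored, key=lambda pair: pair[0])
        if cand_score <= best_score:  # no candidate helps: stop (assumed rule)
            break
        selected.append(cand)
        remaining.remove(cand)
        best_score = cand_score
    return selected, best_score
```

Running the routine once per sensor site, with a site-specific score_fn, would yield the kind of location-specific feature sets described above.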

The LOSO cross-validation approach was preferred over standard n-fold cross-validation. In standard n-fold cross-validation, data are mixed from all subjects, and held-out data are randomly selected; LOSO, alternatively, prevents similar data collected from the same participant at about the same time from ending up in both the training and test data sets (11,16). Therefore, LOSO results are more likely to demonstrate how a method may work under realistic conditions, where a new participant, not included in the training data, is tested. LOSO is particularly challenging if testing occurs across different populations of people, such as training on adults and testing on youth, as done here. Most previous studies on activity classification did not apply LOSO cross-validation (13,19,23,28,29). A few studies did use this type of cross-validation (9,10,12), but they limited their testing and evaluation to a homogeneous pool of healthy adult users, with the exception of Del Rosario et al. (10) who involved two age groups: 37 elderly (average age, 84 yr) and 20 young adults (average age, 22 yr).
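The difference from n-fold cross-validation is easiest to see in how the splits are built. The sketch below generates LOSO splits from a list of per-window subject identifiers; the function name and the identifier format are hypothetical.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids[i] is the participant who produced window i, so no window from
    the held-out participant can appear in the training indices.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test
```

With 20 youth participants this would produce 20 train/test rounds, each testing on one participant whose data the classifier has never seen.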

The two data sets used in this work, adult activity (A) and youth activity (Y), were collected as part of two separate experiments. All activities in both data sets were chosen to represent common activities, and these are significantly different in an adult and youth population, as shown in Table 2. In LOSO validation using data from a single age group, all activities for that class listed in Table 2 were used. For LOSO validation using both age groups, classes were removed if no training data were available for that class. For example, when training on A and testing on Y, all the activities in A were trained, but those activities not available in A but in Y were removed from Y (i.e., the sport activities in the other activities class and the video gaming activities in the sedentary class were removed from Y for testing, because no training data were available for them). Alternatively, when training on Y and testing on A, activities in A that did not have training data were removed (i.e., painting with roller and painting with brush).
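The activity-removal rule for cross-group testing amounts to a simple filter, sketched below. The dictionary layout of a window and the function name are hypothetical; the activity names in the usage note are taken from the text.

```python
def keep_trainable(test_windows, train_activities):
    """Drop test windows whose specific activity has no training examples.

    Each window is assumed to be a dict with an 'activity' key.
    """
    allowed = set(train_activities)
    return [w for w in test_windows if w["activity"] in allowed]
```

For example, when training on Y and testing on A, windows labeled "painting with roller" or "painting with brush" would be filtered out of the adult test set because the youth data contain no such activity.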

Testing proceeded as follows:

  • Experiment 1: Test the algorithm originally developed for A on Y. This first experiment was aimed at evaluating if the previous existing methodology could be extended without modification to Y by running a LOSO cross-validation on Y.
  • Experiment 2: Extend the feature set to incorporate additional features intended to capture more information about the youth activities, and train and then test using LOSO validation separately on the Y and A data sets. This was done to check whether the proposed modification to the methodology improves the recognition accuracy with respect to the original algorithm on both data sets.
  • Experiment 3: Cross-test the algorithm by training on A and testing on Y, and training on Y and testing on A. Both the original feature set and the newly proposed set are evaluated. As mentioned above, activities not represented in the training set (i.e., without training data) were removed from the test set. The goal of this experiment was to assess whether information learned from one of the groups could be used to recognize similar activities in the other.
  • Experiment 4: Test the algorithm with both data sets combined. LOSO validation is performed on A + Y, without removing any activities from either data set. This test was performed to check if it is possible to obtain a general classifier that works equally well on both groups.

RESULTS

Experiment 1: test the algorithm originally developed for A on Y

The first test consisted of running a LOSO validation on the youth data set, using the previously existing feature set (16). Ankle and wrist classifications were correct in 85.9% and 89.7% of cases, respectively. Results are reported in Table 4, part A.

TABLE 4: Wrist and ankle classification confusion matrices for the four target activity groups using the SVM classifier with LOSO cross-validation.

Experiment 2: extend the feature set

Table 3 shows the feature sets obtained for wrist and ankle activity classification after 1) adding the new features, 2) applying the feature selection approach, and 3) optimizing SVM parameters.

The feature selection procedure was run independently for the wrist and ankle, selecting nine wrist features and seven ankle features. These location-optimized feature sets (including features designed to capture more information about the youth activities) led to improved results, as summarized in Table 4, part B, and in Table 5. By using the new feature set with the automatic feature selection, an improvement in overall classification accuracy of 2.7% was obtained for the ankle and 5.1% for the wrist. The detailed classification results for each type of activity are reported in Table 5.

TABLE 5: Wrist and ankle classification details showing category recognition for each specific activity type.

SVM parameter optimization was used to find radial basis function parameters for the ankle (C = 16, γ = 0.25) and wrist (C = 128, γ = 0.0625) classifiers. However, relative to the previously proposed configuration (C = 100, γ = 0.1), the optimized parameters improved overall accuracy by less than 0.5% for both ankle-based and wrist-based classifications.

Solution robustness with respect to different window sizes (8, 6.4, 4, and 3.2 s) was tested. Classification accuracy decreased when reducing the window length. However, even in the worst case, the accuracy remained higher than 80% (83.5% at the wrist and 86.9% at the ankle for the smallest window).

Experiments 3 and 4: tests on both A and Y data sets

Table 6 shows results obtained by training on one data set and testing on the other (experiment 3). Both the new and old feature sets were tested. Classification accuracies varied in this case from 79.3% to 87.4% at the ankle and from 58.8% to 71.8% at the wrist.

TABLE 6: Accuracy results are summarized for all the experiments conducted (single group LOSO CV tests, crossed test and LOSO CV on the merged data set).

Table 4, part C, shows the results of running LOSO cross-validation with the combined data sets including all 53 participants (experiment 4). The classifier was trained without respect to age group. The overall accuracy for this validation test reached 88.5% at the wrist and 91.6% at the ankle. The contributions of the youth and adult results to the overall accuracy are reported in Table 6. Table 6 also summarizes accuracy results for all the tested experiments, showing the overall recognition accuracy and the recognition accuracy of each activity class.
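For readers reproducing such summaries, overall and per-class accuracy can be derived from a confusion matrix as sketched below. The function name is hypothetical, and treating per-class accuracy as the fraction of each true class that was correctly recognized is an assumption about how the reported figures were computed.

```python
def accuracies(confusion, labels):
    """Return (overall_accuracy, per_class_accuracy) from a confusion matrix.

    confusion[i][j] counts windows of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    per_class = {
        label: (confusion[i][i] / sum(confusion[i]) if sum(confusion[i]) else 0.0)
        for i, label in enumerate(labels)
    }
    return correct / total, per_class
```

Because the overall figure weights classes by their window counts, a class with few windows (such as cycling) can have low per-class accuracy without lowering the overall accuracy much.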

DISCUSSION

Feature sets

The extension of the methodology presented in Mannini et al. (16) to the Y data set (experiment 1), using the same feature set and classification approach used on the original A data set, produced classification results comparable to those in most previous studies (see Table 4, part A, and Table 1). However, a significant improvement on this new group was obtained in experiment 2 by adding four (wrist) or three (ankle) new features and pruning eight of the old features, as identified by automatic selection (Table 4, part B). In addition to features related to basic signal structure (mean, standard deviation, acceleration range), activity fragmentation features appear to be an important source of information (see Table 3) for both ankle and wrist sensing sites. The ankle classifier exploited the ratio between the dominant frequency of the currently tested window and the dominant frequency of the previous window (i.e., the prior 12.8 s). This feature, used in the previous adult-only study as well (16), captures temporal information that is useful for identification of consistently periodic behaviors, such as ambulation or cycling at the ankle site. This feature was discarded at the wrist site; the activities tested here may exhibit more constant, periodic motion at the ankle than at the wrist.

The activity fragmentation features that were included to capture information about some of the youth activity, such as sports, encode information about signal power, taking into account the amplitude of the signal, the time duration and frequency of significant acceleration episodes within the window. Accordingly, when these features were available, several power-related features were discarded by the feature selection strategy. Similarly, the introduction of the “range” feature, jointly with the “maximum value” feature, led to the algorithm discarding the minimum-value feature at both ankle and wrist. The minimum may be less informative than the maximum acceleration, given that a low SM depends upon slight variations of the SM around 1g. Such values can result from noise or from downward accelerations that compensate for the gravitational acceleration measured by the sensor. Such direction of movement at the ankle site is necessarily followed by an impact, which is already observed by the “maximum value” feature.

Not all the frequency domain features proposed previously (34) and confirmed in our previous work (16) were actually selected by the automatic selection strategy. In particular, the first dominant frequency was retained for both the ankle and wrist classifiers, but the feature selector discarded the second dominant frequency and the dominant frequency in the band 0.6–2.6 Hz. The number of activity fragments within the window may capture the information missed by discarding those features: a large number of activations within the window should be associated with fast movements that involve frequent accelerations and decelerations; a low number of activations may be associated with slow or sporadic movement episodes in which the amplitude of the acceleration is smoother. At the wrist site, a newly introduced frequency domain feature, the power at frequency components higher than 3.5 Hz, was selected.

Activity classification using youth data

When using the new feature set on the newly available youth data set, both ankle and wrist overall accuracies exceeded 90% (see Tables 4A and B). Updating the feature set resulted in a significant improvement in youth activity classification accuracy, especially for the other activities class. This result was expected after the introduction of the new features, which can extract information about movement fragmentation, typical of recreational activities. Moreover, misclassifications are consistent with intuition. With the ankle classifier, for example, 24.5% of basketball-passing windows were misclassified as sedentary, which is consistent with someone temporarily keeping their feet still while doing this task (for a short 12.8-s window). Similarly, cleaning room and tennis ball: throwing-catching were misclassified as sedentary from ankle data in 16.1% and 20.2% of cases, respectively, likely as a result of short bouts of no leg motion. Wii games were misclassified as other activities using ankle data in approximately 10% of cases. Despite the sedentary nature of video gaming, some Wii games can actually be similar to sport activities, such as those included in other activities, if players engage in full-body motion when playing. When assigning activities to classes, we clustered the Wii games with sedentary video gaming in the “sedentary” class instead of grouping them with sport activities in the “other” class because Wii games can be played without moving the feet, and even while sitting, and some children were observed doing so. At the ankle, misclassifications also occurred between ambulation and other activities, between basketball dribbling and ambulation, and between walking natural and the other categories. In all instances, the relatively small percentage of errors can be explained by the variability in the behaviors being studied.

Table 5 illustrates challenges with wrist-only activity recognition. Approximately one fourth of walking natural data were misclassified as other activities, whereas slow speed treadmill walking (2 mph) had the highest number of misclassifications in the ambulation class. Wrist movement during slow treadmill walking may not be significant. Exercise bike pedaling was classified as sedentary in almost all cases (88.9%). During this activity, the wrist was placed on the exercise bike handlebar, and its movement could be negligible. Conversely, outdoor cycling was characterized by larger and more variable wrist movement, together with vibrations that may result from real bicycle riding. Three of the youth participants were not comfortable with riding a bike, and so they only used the exercise bike. As a consequence, their cycling data were misclassified (see Table 5). Sport activities may involve ambulation and active bouts followed by quasisedentary periods. This resulted in higher error rates for basketball passing, soccer dribbling, and tennis ball fielding. As with the ankle, Wii activities at the wrist were sometimes classified as other activities instead of sedentary as expected.

Cross-tests (experiment 3)

When training the SVM classifier on all available adult data and testing on all available youth data (or vice versa, see Table 5), ambulation and sedentary activities were generally correctly recognized, whereas more errors were found in the other activities. Cycling detection using wrist data was problematic if the model was trained on youth data and tested on adults. Both the ankle and wrist classifiers obtained overall recognition accuracies higher than 75% for sedentary and ambulation classes, even if the tested population was completely different in terms of age.
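The per-class figures quoted for a cross-test can be computed from the true and predicted window labels of the held-out population. The Python sketch below shows only this scoring step, under the assumption that per-class accuracy means the fraction of windows of each true class that were labeled correctly; the classifier itself is not reproduced, and the function name is hypothetical.

```python
from collections import Counter

def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of windows of each true class that received the correct label."""
    totals, correct = Counter(), Counter()
    for truth, predicted in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    return {label: correct[label] / totals[label] for label in totals}
```

In a cross-test, `predicted_labels` would come from a model fitted on one age group only (for example, adults) and applied to every window of the other group.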

Merged data set LOSO tests (experiment 4)

Merging both data sets into a single, larger data set allowed us to use LOSO validation to verify the classifier independently of the age group (Table 4, part C, and Table 6). When training data reflect testing data well (in this case, by including examples from both adults and youth), better performance is expected, and we confirmed that here. For the wrist classifier, the correct classification rate for cycling in adults remained low in the merged data set as well. This was because most of the adult cycling data were acquired on an exercise bike. As stressed before, the wrist movement in exercise bike pedaling may be negligible.
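The LOSO procedure on a merged data set can be sketched in Python as a generator of train/test index splits, one per participant. This is a generic illustration rather than the study's Matlab code; the function name and the subject identifiers used in the usage note are hypothetical.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one split per subject.

    subject_ids[i] is the participant who produced window i. All windows of the
    held-out subject go to the test set and none of them to the training set.
    """
    for subject in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        yield subject, train, test
```

For a merged list such as `["A01", "A01", "Y01", "Y02", "Y02"]`, each split trains on windows from both adults and youth (where available) while keeping the tested participant entirely out of the training set.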

Overall, the results confirm that the activity recognition method proposed previously (16) can be applied to recognize activity on youth data as well as on adults. The LOSO validation results on the complete data set (with both youth and adult data) show that training classifiers with data from different age groups does not significantly reduce activity classification performance (Table 4C). Merging data from the two groups into a single data set results in an accurate classifier that is not specific to one age group.

Comparison to literature

Prior work on activity classification using accelerometers in youth uses different activity sets, age ranges, numbers of participants, amounts of processed data, experimental setups, and validation approaches (see Table 1), making direct comparison of results challenging. This study demonstrates a solution with overall accuracy higher than 90% for both wrist and ankle in a four-class activity problem, tested on structured and semistructured activities. Unlike prior work, adults and children are considered together. Del Rosario et al. (10) included activities from elderly and young adults, classified with 82.0% and 79.9% accuracy, respectively, but from a smartphone in the trouser pocket. Only two previous studies focus on wrist activity classification in youth. Ruch et al. (23), who used wrist and hip sensors simultaneously, obtained an accuracy of 67.0% on 10 classes. In that study, activity counts were used instead of raw data to extract features. Trost et al. (29) recently achieved wrist activity classification in youth with five-class accuracy results similar to those presented here (88.4% ± 3.0% at wrist). In that study, a smaller number of activities were merged into five broad categories. LOSO cross-validation was not conducted; however, the authors applied a modified version of n-fold cross-validation to prevent data of the subject being tested from being included in the training set. Two previous studies that did not use LOSO cross-validation report recognition accuracies higher than ours. In Hikihara et al. (13), the classification was a two-class problem, discriminating between locomotion and nonlocomotion using raw accelerometer readings from the waist, with 99.1% accuracy. Nam and Park (19) report 98.4% accuracy using sensors including an accelerometer and a barometer, but in this case, the target population was significantly younger (1.3–2.4 yr), resulting in a very different activity vocabulary.

The data set and Matlab code used in this study are available to interested researchers (https://mhealth.ccs.neu.edu/data/).

CONCLUSIONS

In this work, good overall classification results were obtained using previously defined methods and features. However, given the nature of activities that are much more frequent in youth than in adults (such as video gaming or sports), the accuracy of the classifier may benefit from the introduction of a different set of dedicated features. When validating methods with a LOSO approach starting from a merged A + Y data set, it was confirmed that the accuracy can be improved by using data from both groups, even if the tested subject was not included in the training set. At the same time, the algorithm preserves all the advantages of the previously proposed method in terms of suitability for real-time implementation and high comfort for the user, given the proposed single-sensor wrist- or ankle-worn configuration.

In conclusion, because large surveillance studies include young participants in their evaluations with wrist-worn monitors, it will be possible to use previously available methods, provided that data from young participants are included in the definition of classification rules. Important accuracy improvement would be obtained if the feature set also includes features that capture the fragmented nature of many youth activities.

The authors have no conflicts of interest to disclose.

This study was funded by the National Heart, Lung and Blood Institute, National Institutes of Health award 5UO1HL091737 to the Massachusetts Institute of Technology and Northeastern University (Stephen Intille, PI) with a sub award to Stanford University (William Haskell, PI). Part of the study was funded by the Italian Ministry of Education and Research (MIUR). Dr. Rosenberger was partially supported by a National Institute on Aging award (R37-AG008816, PI Laura Carstensen). The present study does not constitute endorsement by ACSM. The results of the study are presented clearly, honestly, and without fabrication, falsification, or inappropriate data manipulation.

REFERENCES

1. Albinali F, Intille SS, Haskell W, Rosenberger M. Using wearable activity type detection to improve physical activity energy expenditure estimation. In: Proceedings of the Int'l Conf on Ubiquitous Computing (UbiComp). Copenhagen, Denmark: 2010. pp. 311–20.
2. Bao L, Intille SS. Activity recognition from user-annotated acceleration data. Proc of Pervasive. 2004;301:1–17.
3. Bhattacharya A, McCutcheon EP, Shvartz E, Greenleaf JE. Body acceleration distribution and O2 uptake in humans during running and jumping. J Appl Physiol Respir Environ Exerc Physiol. 1980;49(5):881–7.
4. Bouten CV, Koekkoek KT, Verduin M, Kodde R, Janssen JD. A triaxial accelerometer and portable data processing unit for the assessment of daily physical activity. IEEE Trans Biomed Eng. 1997;44(3):136–47.
5. Caspersen CJ, Pereira MA, Curran KM. Changes in physical activity patterns in the United States, by sex and cross-sectional age. Med Sci Sports Exerc. 2000;32(9):1601–9.
6. Chen Y, Lin C. Combining SVMs with various feature selection strategies. Feature Extraction. 2006;207(1):1–10.
7. Crouter SE, Flynn JI, Bassett DR Jr. Estimating physical activity in youth using a wrist accelerometer. Med Sci Sports Exerc. 2015;47(5):944–51.
8. Crouter SE, Horton M, Bassett DR Jr. Use of a two-regression model for estimating energy expenditure in children. Med Sci Sports Exerc. 2012;44(6):1177–85.
9. de Vries SI, Engels M, Garre FG. Identification of children's activity type with accelerometer-based neural networks. Med Sci Sports Exerc. 2011;43(10):1994–9.
10. Del Rosario MB, Wang K, Wang J, et al. A comparison of activity classification in younger and older cohorts using a smartphone. Physiol Meas. 2014;35(11):2269.
11. Esterman M, Tamber-Rosenau BJ, Chiu Y-C, Yantis S. Avoiding non-independence in fMRI data analysis: leave one subject out. Neuroimage. 2010;50(2):572–6.
12. Hagenbuchner M, Cliff DP, Trost SG, Van Tuc N, Peoples GE. Prediction of activity type in preschool children using machine learning techniques. J Sci Med Sport. 2015;18(4):426–31.
13. Hikihara Y, Tanaka C, Oshima Y, Ohkawara K, Ishikawa-Takata K, Tanaka S. Prediction models discriminating between nonlocomotive and locomotive activities in children using a triaxial accelerometer with a gravity-removal physical activity classification algorithm. PLoS One. 2014;9(4):e94940.
14. Intille SS, Albinali F, Mota S, Kuris B, Botana P, Haskell WL. Design of a wearable physical activity monitoring system using mobile phones and accelerometers. In: Conf Proc IEEE Eng Med Biol Soc. Boston, MA; 2011. pp. 3633–9.
15. Jain AK, Duin RPW, Mao J. Statistical pattern recognition: a review. IEEE Trans Pattern Anal Mach Intell. 2000;22(1):4–37.
16. Mannini A, Intille SS, Rosenberger M, Sabatini AM, Haskell W. Activity recognition using a single accelerometer placed at the wrist or ankle. Med Sci Sports Exerc. 2013;45(11):2193–203.
17. Mannini A, Sabatini AM, Intille SS. Accelerometry-based recognition of the placement sites of a wearable sensor. Pervasive and Mobile Computing. 2015;21:62–74.
18. Morey-Klapsing G, Arampatzis A, Brüggemann GP. Choosing EMG parameters: comparison of different onset determination algorithms and EMG integrals in a joint stability study. Clin Biomech (Bristol, Avon). 2004;19(2):196–201.
19. Nam Y, Park JW. Child activity recognition based on cooperative fusion model of a triaxial accelerometer and a barometric pressure sensor. IEEE J Biomed Health Inform. 2013;17(2):420–6.
20. Rosenberger ME, Haskell WL, Albinali F, Mota S, Nawyn J, Intille S. Estimating activity and sedentary behavior from an accelerometer on the hip or wrist. Med Sci Sports Exerc. 2013;45(5):964.
21. Rowlands AV. Accelerometer assessment of physical activity in children: an update. Pediatr Exerc Sci. 2007;19(3):252–66.
22. Ruch N, Melzer K, Mäder U. Duration, frequency, and types of children's activities: potential of a classification procedure. J Exerc Sci Fit. 2013;11(2):85–94.
23. Ruch N, Rumo M, Mäder U. Recognition of activities in children by two uniaxial accelerometers in free-living conditions. Eur J Appl Physiol. 2011;111(8):1917–27.
24. Siirtola P, Laurinen P, Haapalainen E, Roning J, Kinnunen H. Clustering-based Activity Classification with a Wrist-worn Accelerometer Using Basic Features. In: Proceedings of the IEEE Symp on Computational Intelligence and Data Mining. Nashville, TN; 2009. pp. 95–100.
25. Sun M, Hill J. A method for measuring mechanical work and work efficiency during human activities. J Biomech. 1993;26(3):229–41.
26. Troiano R, McClain J. Objective measures of physical activity, sleep, and strength in U.S. National Health and Nutrition Examination Survey (NHANES) 2011–2014. In: Proceedings of the 8th Internat Conf on Diet and Activity Methods. Roma, Italy; 2012. p. 24.
27. Trost SG, Pate RR, Sallis JF, et al. Age and gender differences in objectively measured physical activity in youth. Med Sci Sports Exerc. 2002;34(2):350–5.
28. Trost SG, Wong WK, Pfeiffer KA, Zheng Y. Artificial neural networks to predict activity type and energy expenditure in youth. Med Sci Sports Exerc. 2012;44(9):1801–9.
29. Trost SG, Zheng Y, Wong WK. Machine learning for activity recognition: hip versus wrist data. Physiol Meas. 2014;35(11):2183–9.
30. UBC. UK Biobank Coordinating Centre. Category 2 enhanced phenotyping at baseline assessment visit in last 100–150,000 participants. Stockport Cheshire. 2009. pp. 1–33.
31. Vapnik V. The Nature of Statistical Learning Theory. New York: Springer-Verlag; 2000. pp. 1–314.
32. Veltink PH, Bussmann HB, de Vries W, Martens WL, Van Lummel RC. Detection of static and dynamic activities using uniaxial accelerometers. IEEE Trans Rehabil Eng. 1996;4(4):375–85.
33. Weston J, Mukherjee S, Chapelle O, Pontil M, Poggio T, Vapnik V. Feature selection for SVMs. In: Proceedings of the NIPS; 2000. pp. 668–74.
34. Zhang S, Rowlands AV, Murray P, Hurst TL. Physical activity classification using the GENEA wrist-worn accelerometer. Med Sci Sports Exerc. 2012;44(4):742–8.
Keywords:

ACTIVITY CLASSIFICATION; ACTIVITY IN CHILDREN; WEARABLE SENSOR; INERTIAL SENSOR; LEAVE-ONE-SUBJECT-OUT CROSS-VALIDATION; ENERGY EXPENDITURE

© 2017 American College of Sports Medicine