Minimal Detectable Change for TUG and TUDS Tests for Children With Down Syndrome

Martin, Kathy PT, DHSc; Natarus, Michael DPT; Martin, Jeremy DPT; Henderson, Sarah DPT

doi: 10.1097/PEP.0000000000000333


Down syndrome (DS) is a genetic condition in which an extra copy of the 21st chromosome is present.1 The rate of acquisition of motor skills in children with DS is delayed throughout childhood when compared with children developing typically, and these delays become more apparent as children get older and skills become more complex.2 Individuals with DS typically have some degree of intellectual disability, usually mild to moderate1; ligamentous laxity; and general hypotonia, which may contribute to decreased postural and gait stability.3

Outcome measures are tests that measure a person's ability on a given task so that conclusions can be drawn about that person's performance. They are useful for objectively measuring change in a person's function over time. Two outcome measures that can be used to quantitatively assess coordination and functional mobility are the Timed Up and Go (TUG) test and the Timed Up and Down Stairs (TUDS) test. The TUG involves the participant rising from a chair, walking a distance of 3 m, and returning to a seated position in the chair.4 The TUG is a valid measure when used with children and adolescents with DS.5 A faster time is indicative of greater functional mobility and is moderately correlated with better overall gross motor function.5 The TUDS involves an individual ascending and descending a predetermined number of steps and is considered complete when both feet return to the start point.6 The TUDS has excellent reliability with children with cerebral palsy,6 but its use with children and adolescents with DS has not yet been evaluated.

These outcome measures become more useful in a clinical setting when the test-retest reliability and minimal detectable change (MDC) are known for the population of interest. Test-retest reliability describes how consistently a participant scores from one testing occasion to the next, and the MDC is defined by Huang et al7 as “the smallest amount of difference in individual scores that represents true change” (beyond random measurement error).(p114) A known test-retest reliability and MDC help a clinician determine whether a change in patient performance reflects true change or may be due to measurement error. There is no established test-retest reliability value or MDC for the TUG or TUDS with children and adolescents with DS.

The TUG has been studied with children developing typically,5,8 children with cerebral palsy and spina bifida,9 and children with DS.5 Reference data for the TUG with children developing typically have been published,5,8 as has intrasession reliability with children with DS.5 The MDC of the TUG has been evaluated in adults with neurological disabilities and is clinically useful.7,10 The TUDS has only been studied with children with cerebral palsy.6 Our aim was to determine the test-retest reliability and MDC of the TUG and TUDS tests in children and adolescents with DS. We hypothesized that the MDC for the TUG and TUDS tests would be small enough to be considered clinically relevant in the population of children and adolescents with DS, making these tests both useful and applicable in therapy clinics. This study was approved by the Institutional Review Board at the University of Indianapolis.



Twelve children and adolescents with DS, ages 3 to 17 years, were recruited for this study. Inclusion criteria were (1) age of 3 to 17 years with DS; (2) able to walk a distance of 6 m independently for 3 consecutive trials; (3) able to ascend and descend 15 steps independently twice, with a handrail as needed; and (4) able to follow simple verbal directions, per parent report. Participants were excluded if they (1) had a musculoskeletal injury that required medical attention within the past 12 months or (2) had an uncorrected visual deficit.


In each testing session, participants were asked to perform the TUG 3 times and the TUDS 2 times. A second session occurred 7 days after the first session. Four researchers collected the data at 2 locations. For the TUG, participants were given 1 practice test to assess their comprehension of the procedure. Because the TUDS is a physically demanding test, only 2 formal trials of the TUDS were performed with no practice trial to avoid fatigue. Instructions and a demonstration were provided. Because of the cognitive impairments that accompany DS, verbal cueing was provided as needed throughout the trials to maintain focus on the task to ensure participants' best effort and the most accurate results.9 Examples of verbal cues provided by the researchers included repetition of directions and positive encouragement to promote better performance.

TUG-Specific Methods

The TUG protocol was adapted from the modified TUG protocol for children.9 This adaptation to the standard TUG has been used by others5 to validate the TUG with children with DS. These modifications have been used to minimize the effects of complicated instructions on the results of the test.9 A bench with no armrests was placed 3 m from a wall. The bench height was adjusted to allow the participant to sit with both feet flat on the ground and an approximate 90° flexion of the knees. On the wall, an 8.5″-by-11″ target was placed centered to the bench at eye level of the participant in standing. The target was a picture of a children's character that the child preferred. Participants completed the trials wearing functional attire including usual orthoses and shoes to simulate community ambulation. Participants were given the option of a 1-minute break between trials.

A research team member explained the test by stating, “When I say go I want you to walk, not run, as quickly as you can to touch that target on the wall, come back and sit exactly as you are now,” and then demonstrated the test while the participant observed. The participant was seated on a bench with both feet planted firmly on the ground. A second researcher began recording time when the child lifted off the seat (not on the word go) and stopped when the child was seated again. This ensured that movement time and not reaction time was evaluated.9 Verbal cues to “walk fast, don't run” were given as needed throughout the test to help the child stay on task.

TUDS-Specific Methods

The stairs consisted of 15 steps (height 20 cm per step) and bilateral handrails. Because of the width of the flight of stairs, only 1 handrail was within reach at a given time. The participants ascended the flight of stairs in any way they wished, whether it was using a marking time pattern (step-to), reciprocal pattern (step-through), running, skipping stairs, and/or using a handrail on either side.6 Participants were instructed to face forward when ascending and descending the stairs. A researcher walked along beside the participant to ensure safety and to provide verbal encouragement as needed.

For each participant, a research team member explained and demonstrated the test. The participant started at a fixed distance of 30 cm from the base of the first step. A research team member prompted the participant using a “ready, set, go” protocol. A second research team member began the timing of the trial upon the word “go” as it was done in the pilot study.6 The participant ascended the standard flight of stairs until they reached the top step. The participant then turned and descended the stairs. Timing was concluded when both of the participant's feet returned to the ground floor. The child was not required to return to the starting mark. At the conclusion of the first TUDS trial, the participant was provided an optional 1-minute rest break before the second trial began. In contrast to Zaino, Marchese, and Westcott,6 we allowed the participants to wear their usual orthoses and our flight of stairs had 15 stairs compared with the 14 stairs in the original study.6

Data Analysis

All statistics were calculated using SPSS (version 23) software. The α level was 0.05. Means of the 3 TUG and 2 TUDS trials were used for analysis. Test-retest reliabilities of the 3 trials of the TUG and 2 trials of the TUDS were estimated using intraclass correlation coefficients (ICC 2,k) and the standard error of measurement (SEM). The MDCs for the TUG and TUDS tests were calculated at the 95% confidence level using the following equation: MDC95 = 1.96 × √2 × SEM. The correlations between participant age and times on both the TUG and TUDS were examined using the Pearson correlation coefficient.
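
The MDC95 equation above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in the study. The article does not state how the SEM was derived, so the helper `sem_from_icc` assumes the commonly used relation SEM = SD × √(1 − ICC). Both function names and the example inputs are hypothetical.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a score SD and an ICC.

    Assumption: SEM = SD * sqrt(1 - ICC), a common convention that the
    article itself does not spell out.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem: float) -> float:
    """MDC at the 95% confidence level: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the arithmetic.
    sem = sem_from_icc(sd=2.0, icc=0.91)
    print(f"SEM = {sem:.2f} s, MDC95 = {mdc95(sem):.2f} s")
```

The √2 term reflects that a change score combines the measurement error of two sessions, and 1.96 is the two-sided 95% z value.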


Twelve participants between the ages of 3 and 17 (mean 9.5 years, standard deviation 4.4 years) completed the trials for the TUG test data collection (Table 1). The TUG had high test-retest reliability in this population (TUG ICC 2,k; r = 0.923). The MDC was 1.26 seconds (Table 2). Eight of the 12 participants also completed the TUDS trials (Table 1). Four were unable to participate in the TUDS test because the alternate testing location did not have access to a flight of stairs. The TUDS test had high test-retest reliability (TUDS ICC 2,k; r = 0.974). The MDC was 12.52 seconds (Table 2). Differences in times from session 1 to session 2 (1 week later) are graphed in Figures 1 (TUG) and 2 (TUDS). Age was negatively correlated with TUG times, but the correlation was not significant (r = −0.525, P = 0.08). Age was negatively correlated with TUDS times, and the correlation was significant (r = −0.759, P = 0.029).
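
For readers who want to check the significance of a Pearson correlation from r and the sample size alone, the following is a minimal sketch using only the Python standard library. It converts r to a t statistic with n − 2 degrees of freedom and integrates the Student t density numerically with Simpson's rule. The function names are hypothetical, and this is not the SPSS routine used in the study.

```python
import math


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def pearson_p_value(r: float, n: int, steps: int = 2000) -> float:
    """Two-tailed p-value for H0: rho = 0, given a sample r and sample size n."""
    if n <= 2 or not -1.0 < r < 1.0:
        raise ValueError("need n > 2 and -1 < r < 1")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the integral of the t density from 0 to |t|.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 <= T <= |t|)
    return 1 - 2 * central


if __name__ == "__main__":
    # r and n as reported for the age correlations in this study.
    for label, r, n in [("TUG", -0.525, 12), ("TUDS", -0.759, 8)]:
        print(label, round(pearson_p_value(r, n), 3))
```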

Fig. 1. Mean Timed Up and Go test scores.
Fig. 2. Mean Timed Up and Down Stairs test scores.
TABLE 1 - Mean Scores for TUG and TUDS
Participant       Age, y      TUG Day 1, s   TUG Day 2, s   TUDS Day 1, s   TUDS Day 2, s
1                 11          7.54           6.74           21.98           39.90
2                 15          7.18           6.97           22.44           21.70
3                 16          4.70           4.38           12.55           11.95
4                 7           6.80           5.08           37.02           30.08
5                 7           8.33           7.51           101.44          91.72
6                 6           7.28           7.85           66.67           57.09
7                 16          7.12           7.26           19.00           16.58
8                 9           6.02           4.49           40.25           35.10
9                 12          4.79           4.76
10                6           8.67           7.06
11                5           5.94           6.42
12                4           10.32          11.16
Group mean (SD)   9.5 (4.4)   7.06 (1.60)    6.64 (1.89)    40.17 (30.05)   38.01 (25.99)
Abbreviations: SD, standard deviation; TUDS, Timed Up and Down Stairs; TUG, Timed Up and Go.

TABLE 2 - Test-Retest Reliability and MDC for TUG and TUDS
Test   N    Mean (SD) Day 1, s   Mean (SD) Day 2, s   Test-Retest ICC (2,k)   MDC, s
TUG    12   7.06 (1.60)          6.64 (1.89)          0.932                   1.26
TUDS   8    40.17 (30.05)        38.01 (25.99)        0.974                   12.52
Abbreviations: ICC, intraclass correlation coefficient; MDC, minimal detectable change; SD, standard deviation; TUDS, Timed Up and Down Stairs; TUG, Timed Up and Go.


We believe that the MDC for the TUG test of 1.26 seconds is small enough to be considered clinically relevant with children and adolescents with DS. This finding, combined with the high test-retest reliability, confirmed our hypothesis that the TUG test would be both reliable and useful in a clinical setting. Our hypothesis for the TUDS test, however, was rejected as the MDC of 12.52 seconds is too large in our opinion to be clinically relevant for this population.
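
As a rough illustration of how a clinician might apply these MDC values, the sketch below flags whether an observed change in time exceeds the MDC95 reported here. The function name, the strict "greater than" comparison, and the example times are assumptions made for illustration and are not part of the study protocol.

```python
# MDC95 values (seconds) reported in this study.
MDC95_SECONDS = {"TUG": 1.26, "TUDS": 12.52}


def exceeds_mdc(test: str, baseline_s: float, follow_up_s: float) -> bool:
    """True if the absolute change in time is larger than the test's MDC95."""
    return abs(follow_up_s - baseline_s) > MDC95_SECONDS[test]


if __name__ == "__main__":
    # Hypothetical times (seconds) for one child at two assessments.
    print(exceeds_mdc("TUG", 8.40, 6.90))   # change of 1.50 s
    print(exceeds_mdc("TUDS", 40.0, 32.0))  # change of 8.0 s
```

A change smaller than the MDC95 cannot be distinguished from measurement error at the 95% confidence level. Exceeding it does not by itself show that the change is clinically important.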

Sample size can affect reliability coefficients, and our sample was relatively small, especially for the TUDS. However, there was wide variability in our sample. Times for the TUG ranged from 4.38 to 11.16 seconds, and the times for the TUDS had an even larger range (11.95-101.44 seconds). Reliability is based upon the proportion of a sample's variance that is the result of error.11 Because there was wide variability in the sample, the proportion of the variance that was attributable to error was smaller than it would be in a more homogeneous sample. Thus, the size of our sample likely had less effect on the reliability coefficients than it otherwise would have.
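
The variance argument above can be written compactly. The expression below is the usual classical formulation of a reliability coefficient, given as an illustration only. The symbols are not taken from the article: \sigma^2_{\text{true}} stands for between-participant (true score) variance and \sigma^2_{\text{error}} for error variance.

```latex
\text{reliability} \;=\; \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}
\;=\; 1 - \frac{\sigma^2_{\text{error}}}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}
```

For a fixed amount of error variance, a more variable sample has a larger \sigma^2_{\text{true}}, so the error share of the total shrinks and the coefficient moves toward 1.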

Sample size likely had an effect on the correlation of performance to age for the TUG and TUDS. For the TUG, the large variability in times may have masked a potential relationship. For example, the second youngest participant (#11) had the third fastest time on the TUG whereas the second oldest participant's (#2) time was close to the group mean. A larger sample would be needed to identify true correlations with age given this variability. In contrast, the performance on the TUDS had a significant correlation with age. This result should be interpreted cautiously as it is based on the performance of 8 participants. In our experience, age is not always the best indicator of motor performance for children with DS. Other factors, such as cognition and motivation, are relevant but were not measured in this study.

The TUG test is useful for clinicians due to its ease of administration in a clinical setting. Because children with DS receive physical therapy in a variety of settings such as outpatient clinics, school settings, or home therapy, there is a need for a reliable, yet simple tool to document change. The advantages of the TUG are its simplicity in setup and the minimal time to complete, resulting in an efficient way to detect clinically relevant changes in functional mobility. Nicolini-Panisson and Donadio5 correlated TUG scores with both Gross Motor Function Measure (GMFM) total scores and GMFM Dimension E (Walking, Running, and Jumping) scores and found moderate negative correlations for both (r = −0.49 total GMFM and r = −0.55 Dimension E). They conclude that because decreased TUG times correlated with better gross motor ability, the TUG could be a useful screening test for functional mobility for children with DS.5 The explanation for the usefulness of the TUG as a screening test may be in data presented by Rigoldi et al12 regarding gait development across the lifespan of individuals with DS. These authors reported that children with DS showed “movement uncertainty”(p161) and used compensatory strategies, such as an increased base of support and increased movement in the frontal plane, to compensate for decreased stability. Because the TUG requires both gait and movement transitions, it addresses key challenges for individuals with DS, and our results confirm that the TUG is reliable and may be sensitive to change over time.

Our results and the results from Nicolini-Panisson and Donadio5 are in direct contrast to the results of Villamonte et al,13 who reported very poor reliability for the TUG (r < 0.25) with children and adults with DS. However, Villamonte et al13 used a distance of 9 m instead of 3 m and did not report details regarding instructions or whether verbal cues were adapted to meet the needs of individuals with intellectual disability, making comparisons across the studies difficult.

Our conclusion that the TUG is both a clinically relevant and a reliable outcome measure for this population is supported by previous research, which found an ICC of 0.82 for the TUG for children with DS,5 compared with our ICC of 0.923. Children with DS required increased time (mean of 9.17 seconds) compared with their peers who were developing typically (mean of 5.61 seconds).5 Our participants with DS had TUG times that ranged from 4.38 to 11.16 seconds, and the group mean was 7.06 seconds on the first day and 6.64 seconds on the second day, 1 week later. In both our study and the study by Nicolini-Panisson and Donadio,5 children were told to walk fast and not run; however, we used repeated verbal cues as necessary to keep the child on task. We used the mean time from 3 trials whereas Nicolini-Panisson and Donadio5 used the shortest time of 3 trials. One additional comparison from this study is that Nicolini-Panisson and Donadio5 commented that a change in TUG test time of approximately 2 seconds could be considered clinically important, compared with our calculated MDC of 1.26 seconds. Nicolini-Panisson and Donadio5 used z-scores from the data of their participants who were developing typically and stated that “significant variations usually include changes in 2 z-score units”(p496) as the basis for their statement regarding clinically important changes. Because these authors did not use the traditional formula for calculating the MDC, the results are not directly comparable to ours.

Our modifications to the TUG protocol were based on Williams et al9 and described in the Methods section earlier. The modification using a target on a wall was effective in preventing the participants from veering away from the anticipated path and allowed for less overall variability between trials due to distractions or impaired comprehension of the protocol. Hanging a participant-selected picture to touch on the wall provided additional motivation and compliance, yielding better results than what would likely have been observed if the traditional TUG instructions to walk around a mark on the floor had been used.

Based on the large MDC for the TUDS, we question the clinical usefulness of this tool because only large changes in performance can be recognized as true change in this population. If this study were repeated with a larger sample size, the MDC might be smaller and might indicate that the TUDS is a more useful measure with children and adolescents with DS. No previous research has been done to determine the MDC or reliability of this outcome measure with children and adolescents with DS.

Another factor that may have contributed to the large MDC for this test was that the TUDS protocol asks participants to quickly ascend and descend a flight of stairs. For this population with decreased postural stability and decreased motor performance compared with their peers who are developing typically, asking them to go as quickly as possible on stairs may have been a novel task. When learning how to ascend and descend stairs, participants were likely taught to focus on safety more than speed. The large variation in times on the TUDS may reflect the participants' lack of experience in attempting to move quickly on the stairs. Participants may have been fearful about increasing their speed during a task in which safety is often emphasized. The issue of fear of falling seemed to be especially relevant for participant #5, whose times on the TUDS were slower than those of the other participants. Participant #5 needed frequent verbal encouragement and reassurance to complete the task.

To ensure that the data we collected provided a reliable reassessment, the TUG and TUDS sessions were conducted 1 week apart. Although it is not typical in a clinical setting to use an outcome measure for every session or even on a week-to-week basis, we believed this protocol would give us a more trustworthy result. When trials are not completed on the same day, the ability of a participant to learn the skills due to repetition, also referred to as repeated testing error, decreases. However, with increased time between assessment sessions, changes observed may be due to maturity and growth of the participant and not due to clinical interventions. To balance the possibility of repeated testing error and maturation, we held the trials 1 week apart. Previous studies establishing the MDC in other patient populations collected the data in 1 session.5,10 We performed multiple trials on 2 days to establish a more accurate average of performances, and to further limit behavioral and cognitive influences on the test results.

After giving the initial instructions for both the TUG and the TUDS, we did not set specific regulations regarding the type and frequency of verbal cueing to be used during the trials of either outcome measure. Verbal cueing was used at the discretion of the research team throughout each trial. The general purpose of verbal cues was to keep each participant focused on the task to get their best effort. Participants varied in their abilities to remain focused on either task. Some participants were highly motivated and others required more verbal prompting from the researchers. For a few participants, we limited cueing to avoid distracting the participant.

Although the variation in verbal cues provided could have influenced our results, we believe not having a specific cueing protocol is more realistic when comparing our results to clinical practice. The freedom to use verbal cueing when necessary allowed us to motivate, repeat directions, and regain the focus of the participants. Although Rockwood et al14 stated that the TUG should not be used with individuals with cognitive deficits because of poor reliability, Ries et al10 reported that use of verbal cueing was the key in obtaining consistent performance on the TUG with adults with Alzheimer disease. Our justification for our procedure was that by modifying cueing to the participant, we were able to obtain more accurate results regarding physical ability as opposed to testing the ability of the participants to follow directions. All other aspects of the TUG and TUDS tests were standardized across participants.

As is common in most pediatric studies conducted outside of a clinical setting, our ability to recruit a large sample was limited. We did not complete a power analysis before the start of this study to determine a minimum sample size requirement. As our initial participant recruitment yielded minimal response, additional data collection was conducted at a secondary site as an effort to increase sample size for the TUG test. As there was not a staircase present at the alternate facility, the TUDS test was not completed with the additional participants. The findings of this study would be strengthened with a larger sample size for both the TUG and TUDS tests. Our protocol allowed variation in verbal cueing, making it more like clinical practice, but this also hinders comparison to other similar studies.


The TUG was determined to be a reliable and clinically relevant outcome measure to be used with children and adolescents with DS, with an MDC of 1.26 seconds. Due to the large MDC for the TUDS test, there is limited clinical application as the child would need to demonstrate a large difference in performance for the results to be useful in documenting change. Navigating stairs quickly seemed to be a novel task for many of our participants, and this likely contributed to the variability between trials and the overall increased MDC. To increase the utility of the TUG for the pediatric population with DS, a follow-up study should be performed to determine the minimal clinically important difference for this population. Knowledge of the minimal clinically important difference would allow clinicians to quantify whether a change in performance can be translated to meaningful differences in the patient's mobility and dynamic balance throughout their daily function.


We would like to thank all of the participants and their parents for their willingness and enthusiasm to participate in this study. We would also like to thank Gigi's Playhouse-Indianapolis for allowing us to use their facility. Finally, we thank Dr Stephanie Combs-Miller, PT, PhD, NCS, for her assistance with data analysis.


1. National Down Syndrome Society. Down syndrome facts. Accessed June 7, 2016.
2. Palisano RJ, Walter SD, Russell DJ, et al. Gross motor function of children with Down syndrome: creation of motor growth curves. Arch Phys Med Rehabil. 2001;82:494–500.
3. Shumway-Cook A, Woollacott MH. Dynamics of postural control in the child with Down syndrome. Phys Ther. 1985;65:1315–1322.
4. Podsiadlo D, Richardson S. The timed “Up & Go”: a test of basic functional mobility for frail elderly persons. J Am Geriatr Soc. 1991;39:142–148.
5. Nicolini-Panisson RD, Donadio MVF. Normative values for the timed “up and go” test in children and adolescents and validation for individuals with Down syndrome. Dev Med Child Neurol. 2014;56:490–497.
6. Zaino CA, Marchese VG, Westcott SL. Timed up and down stairs test: preliminary reliability and validity of a new measure of functional mobility. Pediatr Phys Ther. 2004;16:90–98.
7. Huang S-L, Hsieh C-L, Wu R-M, Tai C-H, Lin C-H, Lu W-S. Minimal detectable change of the timed “up & go” test and the dynamic gait index in people with Parkinson disease. Phys Ther. 2011;91:114–121.
8. Itzkowitz A, Kaplan S, Doyle M, et al. Timed up and go: reference data for children who are school age. Pediatr Phys Ther. 2016;28:239–246.
9. Williams EN, Carroll SG, Reddihough DS, Phillips BA, Galea MP. Investigation of the timed “up & go” test in children. Dev Med Child Neurol. 2005;47:518–524.
10. Ries JD, Echternach JL, Nof L, Gagnon Blodgett M. Test-retest reliability and minimal detectable change scores for the timed “up and go” test, the six-minute walk test, and gait speed in people with Alzheimer disease. Phys Ther. 2009;89:569–579.
11. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 3rd ed. Upper Saddle River, NJ: Pearson Prentice Hall; 2009.
12. Rigoldi C, Galli M, Albertini G. Gait development during lifespan in subjects with Down syndrome. Res Dev Disabil. 2011;32:158–163.
13. Villamonte R, Vehrs PR, Feland JB, Johnson AW, Seeley MK, Eggett D. Reliability of 16 balance tests in individuals with Down syndrome. Percept Mot Skills. 2011;11:530–542.
14. Rockwood K, Awalt E, Carver D, MacKnight C. Feasibility and measurement properties of the functional reach and the timed up and go tests in the Canadian Study of Health and Aging. J Gerontol A Biol Sci Med Sci. 2000;55:M70–M73.

Down syndrome; minimal detectable change; reliability; TUG test

© 2017 Wolters Kluwer Health, Inc. and Academy of Pediatric Physical Therapy of the American Physical Therapy Association