Repeatability of Aerobic Capacity Measurements in Parkinson Disease


Medicine & Science in Sports & Exercise:
doi: 10.1249/MSS.0b013e31822432d4
Applied Sciences

Purpose: Maximal or peak aerobic capacity (V˙O2peak) during a maximal-effort graded exercise test is considered by many to be the “gold standard” outcome for assessing the effect of exercise training on cardiorespiratory fitness. The reliability of this measure has not been established in Parkinson disease (PD), where the degree of motor impairment can vary greatly and is influenced by medications. This study examined the reliability of V˙O2peak during a maximal-effort graded exercise test in subjects with PD.

Methods: Seventy healthy middle-aged and older subjects with PD Hoehn and Yahr stage 1.5–3 underwent a screening/acclimatization maximal-effort treadmill test followed by two additional maximal-effort treadmill tests with repeated measurements of V˙O2peak. A third V˙O2peak test was performed in a subset of 21 subjects.

Results: The mean V˙O2peak measurement was 2.4% higher in the second test compared with the first test (21.93 ± 4.50 vs 21.42 ± 4.30 mL·kg−1·min−1, mean ± SD, P = 0.03). V˙O2peak expressed either as milliliters per kilogram per minute or as liters per minute was highly reliable, with intraclass correlation coefficients (ICC) of 0.90 and 0.94, respectively. The maximum HR (ICC of 0.91) and final speed achieved during the tests (ICC of 0.94) were also highly reliable, with the respiratory quotient being the least reliable of the parameters measured (ICC of 0.65).

Conclusions: Our results demonstrate that measurement of V˙O2peak is reliable and repeatable in subjects with mild to moderate PD, thereby validating use of this parameter for assessing the effects of exercise interventions on cardiorespiratory fitness.

Author Information

1Geriatrics Research Education and Clinical Center, Baltimore Veterans Affairs Medical Center, Baltimore, MD; 2Division of Gerontology and Geriatric Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD; 3Department of Neurology, University of Maryland School of Medicine, Baltimore, MD; 4Maryland Exercise and Robotics Center of Excellence, Veterans Affairs Rehabilitation Research & Development Service, Baltimore Veterans Affairs Medical Center, Baltimore, MD; and 5University of Maryland School of Nursing, Baltimore, MD

Address for correspondence: Leslie I. Katzel, M.D., Ph.D., Geriatrics Research Education and Clinical Center, Baltimore Veterans Affairs Medical Center, BT/18/GR, 10 N. Greene Street, Baltimore, MD 21201; E-mail:

Submitted for publication January 2011.

Accepted for publication April 2011.


Resting tremor, bradykinesia, rigidity, and postural instability with resulting impairment of gait and balance contribute to progressive impairment of mobility in Parkinson disease (PD). The gait abnormalities in PD have been well described. PD patients tend to walk slowly with shuffling and dragging steps, diminished arm swing, and a flexed forward posture, with higher cadence and smaller stride length than non-PD patients. In more advanced disease, gait patterns include propulsive gait, steppage gait, waddling gait, and scissor gait. Despite significant advances in pharmacotherapy and deep brain stimulation, continuing development of new interventions to improve function remains a high priority for this patient population. We and others have proposed that treadmill-based aerobic exercise training may improve cardiorespiratory fitness, ambulation, and neurological performance in patients with PD (8,9,12,13,18,19). However, proper assessment of rehabilitation interventions including treadmill therapy depends on consistently reliable functional outcome measures. Maximal or peak aerobic capacity (V˙O2peak) during a maximal-effort graded exercise test (GXT) is often used as the “gold standard” outcome for assessing the effect of exercise training on cardiorespiratory fitness. For a measure to be a gold standard, it must be highly reproducible: repeated measurements obtained from a given subject under similar circumstances should show little within-subject variation. A measure that has this characteristic is defined as having high reliability or precision. The reliability of V˙O2peak has not been established in PD, where the degree of motor impairment can vary greatly and is influenced by medications. No prior studies have examined the intraclass correlation coefficient (ICC) of repeated measurements of V˙O2peak in PD.
Hence, establishing repeatability of V˙O2peak in PD is crucial because a large variation in repeated measurements would confound the use of this parameter for assessing changes in cardiorespiratory fitness across rehabilitation interventions. This study examined the reliability of V˙O2peak during maximal-effort GXT in subjects with PD. As a secondary aim, we examined whether demographics or disease severity influenced the reproducibility of the measurement.




Subjects were recruited from the University of Maryland Parkinson’s Disease and Movement Disorders Center, from the Baltimore Veterans Affairs Medical Center Parkinson’s Disease Clinic, and via media advertisements from the community for participation in a randomized clinical trial of exercise training in PD. The University of Maryland Baltimore Institutional Review Board approved the protocol. After written informed consent was obtained at the first study visit, a history and physical examination including the Unified Parkinson’s Disease Rating Scale (UPDRS) (6) and Hoehn and Yahr (HY) staging (10) was performed by a neurologist with expertise in PD (LMS). Individuals underwent three timed 10-m walks to evaluate their ability to ambulate and to assess their self-selected walking velocity. Self-selected velocity was defined as the average velocity during the three tests. To minimize the effects of pharmacological therapy, particularly with subjects on L-dopa, where there are well-described fluctuations of wearing off or end-of-dose deterioration, all study evaluations were performed while the subjects were “on” (i.e., experiencing benefit from their antiparkinsonian medications) at the same time of day with the same timing interval relative to their medication doses. Subjects used an additional dose of medication to maintain the on state for assessment when necessary.

The main study inclusion criteria were 1) diagnosis of PD based on criteria of asymmetrical onset of at least two of the three cardinal features (resting tremor, bradykinesia, rigidity) with no atypical signs or exposure to dopamine-blocking drugs, 2) HY stage (10) 1.5 to 3 (while on for motor fluctuators) and presence of gait impairment or postural instability, as defined by a score of 1 or 2 on UPDRS (6) items No. 29 Gait or No. 30 Postural Stability (mild to moderate gait or balance impairment including slowing, dragging of the affected leg, freezing, or festination), 3) age ≥40 yr, 4) Folstein Mini-Mental State Examination (7) score ≥23, and 5) unlikely to require PD medication adjustment for 4 months. The study exclusion criteria included 1) unstable cardiac, pulmonary, liver, or renal disease; 2) unstable hypertension or diabetes; 3) anemia or orthopedic or chronic pain condition restricting exercise; 4) unstable psychiatric illness; or 5) >20 min of aerobic exercise more than three times per week (to avoid a prior training effect). The individuals selected for participation did not have any of the 15 standard Brain Bank diagnostic criteria that exclude a PD diagnosis (such as rapidly progressive dementia and gaze palsy).



Exercise treadmill testing
Screening exercise treadmill test.

After the initial study evaluation, a screening graded exercise treadmill test to voluntary exhaustion without V˙O2 monitoring was performed using a customized manual protocol as previously described (18). All treadmill testing was performed in the early afternoon while the subjects were on. Stopping criteria were based on the American College of Sports Medicine criteria (21). This screening exercise treadmill test served to 1) acclimate the subjects to walking on a treadmill, 2) evaluate for symptoms of overt coronary disease and detect silent myocardial ischemia, 3) evaluate hemodynamic HR and blood pressure response to exercise, 4) observe gait patterns, and 5) determine whether there were any significant balance problems or other issues that would preclude their ability to safely exercise. All subjects wore a gait belt for safety, and a spotter stood behind subjects during the treadmill evaluations. Subjects were instructed to use the minimum level of handrail support for balance during the test.

Immediately preceding each test, baseline preexercise ECG and blood pressure were assessed with the subjects supine, seated, and standing. After preliminary ECG and vital signs were obtained, the subjects started walking on the treadmill. The initial target speed for treadmill testing was the subject’s self-selected over-ground walking velocity, with the incline set at 0%. In some subjects, the initial treadmill speed was adjusted slightly during the first 30 s by the exercise physiologist and clinician according to the subject’s tolerance with the preselected treadmill speed. The first stage was conducted for 2 min at 0% grade, the next stage was conducted for 2 min at 4% grade, and the grade was subsequently advanced by 2% every minute until voluntary exhaustion. In frailer subjects, the second stage was conducted at 2% instead of 4% to allow a more gradual increase in workload. Once the grade reached 10%, subjects were asked if the speed of the treadmill could be simultaneously advanced with grade (generally by 0.2 mph). The ECG was monitored continuously, and blood pressure was measured during the first three stages of the tests and every 2 min during recovery. This protocol was well tolerated. Only one subject was excluded because of the development of chest pain during the test.
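The grade progression described above can be sketched as a simple lookup by elapsed time. This is only an illustration under stated assumptions: the function name `treadmill_grade` and the `frail` flag are hypothetical, stage changes are assumed to occur exactly on the minute, and the optional speed increase offered once the grade reached 10% is not modeled because it depended on the subject's response.

```python
def treadmill_grade(elapsed_min: float, frail: bool = False) -> float:
    """Return the treadmill grade (%) at a given elapsed time in minutes.

    Hypothetical helper: assumes stage changes happen exactly on the minute.
    """
    if elapsed_min < 0:
        raise ValueError("elapsed time must be non-negative")
    # Second stage: 4% grade, or 2% for frailer subjects.
    second_stage_grade = 2.0 if frail else 4.0
    if elapsed_min < 2:
        return 0.0  # stage 1: 2 min at 0% grade
    if elapsed_min < 4:
        return second_stage_grade  # stage 2: 2 min
    # After stage 2 the grade advances by 2% every minute.
    extra_stages = int(elapsed_min - 4) + 1
    return second_stage_grade + 2.0 * extra_stages


if __name__ == "__main__":
    for t in (0, 2, 4, 5, 6):
        print(t, treadmill_grade(t), treadmill_grade(t, frail=True))
```

Under these assumptions a standard test reaches 10% grade at the start of minute 6, which is the point at which a speed increase could be offered.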

Graded exercise treadmill test with measurement of peak oxygen consumption.

At the next study visit, generally 1 wk later, subjects underwent a progressive GXT to voluntary exhaustion as described above with measurement of peak oxygen consumption (V˙O2peak) using a Quark Cardio Pulmonary Exercise Testing metabolic analyzer (COSMED, Rome, Italy). During the test, O2 consumption, CO2 production, and minute ventilation were measured breath by breath, and values were averaged for 20-s intervals. Subjects were instructed not to talk during the test because this is known to affect the depth of breathing and gas exchange. On the basis of our pilot study (18), we anticipated that many of these deconditioned subjects would not be able to attain a true maximal aerobic capacity, defined as a plateau in oxygen consumption during the final stage, maximal HR >85% of age-adjusted predicted maximal HR, and respiratory quotient (RQ) or RER >1.10. Therefore, the values we obtained in many of the subjects represented peak effort values or V˙O2peak as opposed to V˙O2max. The V˙O2peak was based on the mean of the final two 20-s averages obtained during the final stage of the test.
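As a minimal sketch of the calculation just described, the following computes V˙O2peak as the mean of the final two 20-s averages and checks the HR and RER thresholds. The function names and the example numbers are hypothetical, the predicted maximal HR is passed in rather than computed because the prediction equation is not stated here, and the plateau criterion is not checked.

```python
from statistics import mean


def vo2_peak(interval_averages):
    """Mean of the final two 20-s interval averages (same units as the input)."""
    if len(interval_averages) < 2:
        raise ValueError("need at least two 20-s averages")
    return mean(interval_averages[-2:])


def met_hr_and_rer_criteria(max_hr, predicted_max_hr, peak_rer):
    """True if HR > 85% of predicted maximum and RER > 1.10.

    The plateau in oxygen consumption is NOT evaluated in this sketch.
    """
    return max_hr > 0.85 * predicted_max_hr and peak_rer > 1.10


if __name__ == "__main__":
    # Hypothetical 20-s averages (mL/kg/min) from the final stage of one test.
    averages = [18.0, 20.0, 21.0, 22.0]
    print(vo2_peak(averages))
    print(met_hr_and_rer_criteria(max_hr=150, predicted_max_hr=160, peak_rer=1.12))
```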

One week later, subjects underwent a second evaluation of their V˙O2peak. The exercise physiologist set the initial speed of the test at the speed reached during the final stage of the prior V˙O2peak test or on the basis of comments noted during the prior test. The subjects exercised to voluntary exhaustion. Importantly, both the test supervision and the V˙O2peak determinations were made by study staff who were blinded to the results of the prior treadmill tests. We report the baseline preexercise intervention data on 70 consecutive eligible subjects who performed two maximal-effort treadmill tests with measurement of peak oxygen consumption.

To gain further insight into V˙O2peak test–retest reliability, a third maximal-effort GXT with measurement of oxygen consumption was performed a week later in 21 subjects for whom the V˙O2peak measurements for the two treadmill tests varied by >5%. The exercise physiologist set the initial speed of the test at the speed reached during the final stage of the prior V˙O2peak test or on the basis of comments noted during the prior test. The subjects exercised to voluntary exhaustion. As in the prior tests, the supervising clinician was blinded to the results of the prior treadmill tests including the determination of the V˙O2peak.

Statistical Analyses

Analyses were conducted using SAS software, version 9.2 (SAS Institute, Inc., Cary, NC). Outcome measures obtained during GXT included peak oxygen consumption (L·min−1 and mL·kg−1·min−1), maximal HR, maximal ventilation, RQ, maximum treadmill speed, and maximum treadmill grade. All statistical tests were two sided and performed at a significance level of 0.05.

Pearson product–moment correlations were computed from data obtained from the first and second V˙O2peak tests. Unbiased intraclass correlations (rk′) (23) were computed using all available data, i.e., data obtained from the first two tests and data for three tests in the 21 subjects on whom a third test was performed. The Pearson correlations and ICC, along with their associated 95% confidence intervals, were computed from 10,000 bootstrap samples of the original 70 subjects. Each bootstrap sample containing 70 rows of data was obtained by selecting with replacement from the original 70 subjects. The mean of the 10,000 bootstrap values is reported as the estimate for each statistic; the interval from the 2.5th to the 97.5th percentile of the distribution of the 10,000 values is reported as the statistic’s 95% confidence interval. High reliability was defined as an ICC greater than 0.85.
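The resampling scheme described above can be illustrated with a short sketch. It is written in Python rather than SAS and makes several assumptions: the data are hypothetical, only subjects with two tests are handled, and the ICC shown is a one-way random-effects ICC used as a stand-in rather than the unbiased rk′ estimator cited in the text.

```python
import random
from statistics import mean


def pearson_r(pairs):
    """Pearson product-moment correlation for (test1, test2) pairs."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def icc_oneway(pairs):
    """One-way random-effects ICC for two tests per subject (a stand-in, see lead-in)."""
    n, k = len(pairs), 2
    grand = mean(v for p in pairs for v in p)
    subj_means = [mean(p) for p in pairs]
    ms_between = k * sum((m - grand) ** 2 for m in subj_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for p, m in zip(pairs, subj_means) for v in p
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def bootstrap_percentile_ci(pairs, statistic, n_boot=10_000, seed=0):
    """Resample subjects with replacement; return (bootstrap mean, (2.5th, 97.5th percentile)).

    Percentiles are taken by simple index lookup on the sorted estimates.
    A resample in which every subject is identical would have zero variance;
    that is vanishingly unlikely for realistic n and is not handled here.
    """
    rng = random.Random(seed)
    n = len(pairs)
    estimates = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(sample))
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return mean(estimates), (lower, upper)


if __name__ == "__main__":
    # Hypothetical (test 1, test 2) VO2peak values in mL/kg/min.
    data = [(18.0, 18.5), (21.0, 21.4), (25.2, 24.9), (19.7, 20.6), (23.1, 23.0),
            (16.4, 17.2), (27.3, 27.9), (22.0, 21.5), (20.5, 21.1), (24.4, 24.6)]
    print(bootstrap_percentile_ci(data, pearson_r, n_boot=2000))
    print(bootstrap_percentile_ci(data, icc_oneway, n_boot=2000))
```

Resampling whole subjects, rather than individual measurements, keeps each subject's repeated tests together, which is what allows the within-subject agreement to be estimated in every bootstrap sample.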

We used random-effects ANOVA to determine whether there was a training effect across the time points. Three random-effects models were examined: (a) random intercept, (b) random time, and (c) random intercept and time. We chose the best covariance structure using the corrected Akaike information criterion. Bland–Altman plots were constructed (2), defined as the difference in values between test 2 and test 1 plotted versus the average of the values from tests 1 and 2. We computed the Bland–Altman limits of agreement as the bias ± 1.96 SD of the difference (average difference ± 1.96 SD of the difference). Linear multiple regression analyses were used to determine whether baseline characteristics and severity of PD affected the repeatability of the measurements.
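The Bland–Altman quantities defined above (bias and bias ± 1.96 SD of the differences) can be computed as in the following sketch. The function name and the example values are hypothetical; the sample SD (n − 1 denominator) is assumed.

```python
from statistics import mean, stdev


def bland_altman(test1, test2):
    """Bias and 95% limits of agreement for paired measurements.

    Differences are test 2 minus test 1; plotting points are
    (average of the two tests, difference).
    """
    if len(test1) != len(test2) or len(test1) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    diffs = [b - a for a, b in zip(test1, test2)]
    avgs = [(a + b) / 2 for a, b in zip(test1, test2)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the differences
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        "points": list(zip(avgs, diffs)),
    }


if __name__ == "__main__":
    # Hypothetical VO2peak values (mL/kg/min) for three subjects.
    result = bland_altman([20.0, 22.0, 24.0], [21.0, 22.0, 26.0])
    print(result["bias"], result["lower"], result["upper"])
```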



Physical characteristics and PD severity for the 70 subjects (50 men and 20 women), with a mean ± SD age of 65.3 ± 10.7 yr (range = 42–86 yr), are summarized in Tables 1 and 2. Overall, on the basis of the UPDRS and HY ratings, the subjects had moderately severe PD. However, there was a wide range of disease severity. Nine subjects (13%) had been treated with deep brain stimulation for their PD. Beyond their PD diagnosis, the population was healthy, with only four individuals (6%) having prior history of stable CAD, seven (10%) on medications for diabetes, and only one (1%) being a current smoker. Only seven (10%) were on β-blockers. As expected, a history of depression was common in this population with 22 subjects (31%) currently on antidepressant medication.

The treadmill testing was well tolerated. There were no falls or serious adverse events during the treadmill testing. There were initial concerns based on preliminary pilot studies by our group (18) that some subjects might have symptomatic orthostatic declines in their blood pressure going from supine to standing or during the recovery phase of the test. This did not prove to be a problem as only three subjects had asymptomatic orthostatic declines in blood pressure that did not interfere with the assessment or warrant premature termination of the test.

Comparison of the two peak treadmill tests (V˙O2peak) performed 1 wk apart allowed for the determination of the ICC for many cardiopulmonary parameters (Tables 3 and 4). The overall mean ± SD V˙O2peak measurements for tests 1 and 2 were 21.42 ± 4.30 and 21.93 ± 4.50 mL·kg−1·min−1, respectively (P = 0.03), or 2.4% (0.56 mL·kg−1·min−1) higher on average for the second test. Bland–Altman plots of the within-subject change for V˙O2peak expressed in milliliters per kilogram per minute versus the mean of tests 1 and 2 (Fig. 1) show the small but statistically significant increase of 0.56 mL·kg−1·min−1 (95% limits of agreement of −3.5 to 4.6 mL·kg−1·min−1) between the first and the second tests. The second test had higher V˙O2peak values in 38 (54%) of the subjects, consistent with a mild learning effect and with the use of the final speed attained in the first test to set the initial speed for the second test. During these tests, of the 63 subjects not on β-blockers, only 7 (11%) subjects achieved a true V˙O2max on the basis of an RER greater than 1.1 and achievement of >85% age-predicted maximal HR.

The ICC for the peak exercise tests (test 1 vs test 2) are summarized in Table 4. V˙O2peak expressed both as milliliters per kilogram per minute and liters per minute was highly reliable, with ICC of 0.90 and 0.94, respectively. The maximum HR and final speed achieved during the tests were also highly reliable in these patients. Final grade achieved during the tests was less reliable (ICC of 0.74) and fell below the 0.85 cutoff. The RQ proved to be the least reliable of the parameters measured during the GXT (ICC of 0.65).

We examined time (i.e., test number), age, sex, race, medical comorbidities, UPDRS total, UPDRS motor, and HY stage to see if they predicted the change in V˙O2peak from one test to the next. None of these variables demonstrated significant bivariate correlations (both Spearman and Pearson) with the change in V˙O2peak expressed either in milliliters per kilogram per minute (smallest P value = 0.15) or liters per minute (smallest P value = 0.25). Collectively, the variables did not predict the change in V˙O2peak expressed either as milliliters per kilogram per minute or liters per minute when they were included in a multiple regression.

To gain further insight into the reliability of the testing and factors associated with the variability, a third treadmill test was performed in 21 subjects in whom the first two tests differed by >5%. Fitness as assessed by V˙O2peak, measured either in milliliters per kilogram per minute or liters per minute, improved with repeated testing (Table 5). The increases were small but significant. V˙O2peak increased by 0.56 mL·kg−1·min−1 per test (a total of 1.2 mL·kg−1·min−1 from the first to the third test) or 0.043 L·min−1 per test (a total of 0.087 L·min−1 from the first to the third test). Maximum HR, RQ, and maximum treadmill grade showed no significant increase across the three tests; however, ventilation and maximum treadmill speed increased. Individual subject data revealed two main testing patterns across the three tests. In eight subjects, the third test had higher values than either of the first two tests, with a progressive increase in V˙O2peak across the three tests in five subjects. This pattern is consistent with a learning effect. In 12 subjects, the third test was intermediate in value between the first and the second tests. This pattern is consistent with regression to the mean. In only 1 of 21 subjects did we observe a progressive decline in V˙O2peak across the three tests.



Our results are the first to demonstrate that measurement of V˙O2peak is reliable and repeatable in subjects with mild to moderate PD, thereby validating use of this parameter for assessing the functional adaptation across rehabilitation interventions. The ICCs were 0.90 or higher for several peak treadmill testing measures including V˙O2peak (mL·kg−1·min−1 or L·min−1), maximum HR, and final speed achieved. We observed a small 2.4% mean increase in V˙O2peak and other cardiovascular parameters such as maximal HR during the second test, consistent with subjects pushing themselves slightly harder during the second test. In almost all instances of disparate results (>5%) between the first and the second tests, the difference in measured V˙O2peak was readily attributable at the time of the testing to differences in final grade and/or speed achieved during the test and subjective comments from the patients as to whether they were having a “good” day. Importantly, there were no falls or serious adverse events during these tests. None of the subjects had symptomatic orthostatic declines in blood pressure during the pretest phase going from supine to standing or during recovery.

We performed a PubMed literature search using a variety of key words but were unable to find any other articles validating the reliability of treadmill testing with assessment of V˙O2peak in subjects with PD. Importantly, the reliability and repeatability parameters observed in PD subjects compare favorably with those reported in healthy subjects without PD. Generally, healthy controls have ICCs in the range of 0.85–0.98 for V˙O2peak and maximal HR (3,19,20). Tuner et al. (22) reported high correlations for peak HR (r = 0.94) and peak oxygen uptake (r = 0.98) in subjects with peripheral arterial disease and claudication. Riebe et al. (15) also reported high intraclass correlations for peak oxygen consumption (r = 0.97) in patients with peripheral arterial disease. We previously reported reliability coefficients of 0.87 for HR and 0.92 for V˙O2peak in hemiparetic stroke patients studied using a GXT protocol similar to the one used in the current PD study (5,11). In a large study of patients with congestive heart failure (HF-ACTION), the mean V˙O2peak was not different in tests 1 and 2 (1). However, the HF-ACTION investigators reported a large within-subject variability of V˙O2peak between the two tests of 6.6%, with small but statistically significant increases in time on treadmill and RQ during the second test compared with the first test. The authors attributed the variation to intrinsic subject factors such as daily hemodynamic and volume status fluctuations. In comparing our results with other studies, particularly those done in healthy subjects, it is important to recognize that some of the studies used treadmill tests set at the same speed and grade with the stated goal of attempting to reproduce values. This is in contrast to our approach, which relied on staff discretion and participant feedback to dictate speed and grade progression to volitional exhaustion.
Interpretation and comparison with other studies also must consider differences in the ways data are expressed and analyzed.

Several studies have examined aerobic capacity in subjects with PD. Many studies conducted testing on cycle ergometers making a direct comparison with the present treadmill study difficult. Overall, study results for V˙O2peak in PD have been mixed with some reporting values similar to those of age-matched healthy non-PD controls (4,14,17). Interestingly, Canning et al. (4) reported no correlation between disease severity and V˙O2peak in individuals with mild to moderate PD. Although our subjects had V˙O2peak values that were generally 20% lower than age-matched controls without PD studied in our laboratory (16), there was substantial heterogeneity in the V˙O2peak and walking speeds.

There were several factors and testing methodology limitations that may contribute to increased variability in the measurement of V˙O2peak. First, for safety reasons due to gait problems and increased risk of falling, subjects were allowed to hold on to the side or front railing and in many instances used the front railing of the treadmill for balance support. It is well established that hand support decreases workload and measured V˙O2. Subjects were repeatedly told to minimize handrail support. Nevertheless, their inability to walk without rail support may have contributed to variability in the measurements and the modest upward trend in V˙O2peak. Second, inherent biological variability in PD symptomatology is likely to affect reproducibility. Because PD is unique in its robust motor response to dopaminergic medications, it is particularly vulnerable to increased variability of motor performance. We attempted to minimize this by performing the tests at the same time of day with the same time interval between medication dose and testing. Subjects often commented on the nature of their PD symptoms during the testing visits, but we did not objectively rate this source of variation. Third, there was inherent variability in the effort that these deconditioned subjects were willing to exert from day to day. On the basis of HR and RQ criteria, few of the subjects were able to achieve a true V˙O2max. Fourth, technical limitations related to reproducibility of the metabolic equipment calibration prohibited exact replication of the measurement accuracy between testing visits.

The strengths and weaknesses of our study warrant comment. One strength of the study is that the investigative team has extensive experience in treadmill testing subjects with a variety of chronic medical conditions such as stroke, peripheral arterial disease, congestive heart failure, and human immunodeficiency virus. Importantly, our group had performed preliminary pilot studies in PD before embarking on the current larger study. Lessons learned from these other populations with chronic disease and pilot PD participants were used in this study. Another strength of the study was the relatively large number of subjects studied (N = 70) and the range of disease severity. All participants had well-characterized PD, and all were receiving optimal medical management. Among the weaknesses was the potential for “volunteer bias.” Because volunteers were carefully screened for medical comorbidities that could affect their ability to safely exercise, results from our population subgroup may not be generalizable to the general population of subjects with PD. We excluded patients with HY stages 4 and 5 (when “off”), limiting the applicability of our results to those with less severe disease. Those with more severe PD would likely have less reproducible measurements during treadmill testing on the basis of wider motor fluctuations. Another limitation is that we studied well-controlled medicated PD patients who are on and that these results may not translate to those PD patients not on medications or in whom the symptoms are not well controlled.

In summary, our results demonstrate that GXT with measurement of aerobic exercise capacity is reproducible and can be safely performed in subjects with mild to moderate PD. Despite an observed learning effect across all three tests, the magnitude of the effect was small, averaging approximately 2.4% from the first to the second test. Our current recommendation based on these data is that two measurements of V˙O2peak are warranted for intervention studies. Although a single test will probably suffice for characterizing fitness levels in cross-sectional studies, accounting for learning effects is important in longitudinal studies for filtering out changes that are not associated with the effects of rehabilitation interventions.

The authors thank Terra Hill, Jessica Hammers, Kate Fisk, Brad Hennessie, and other members of the study team for their hard work and efforts.

This work was supported by the Michael J. Fox Foundation for Parkinson’s Research, the National Institute on Aging Claude D. Pepper Older Americans Independence Center National Institutes of Health grant P30-AG02874, Veterans Affairs Rehabilitation Research & Development Maryland Exercise and Robotics Center of Excellence, and the Baltimore Veterans Affairs Medical Center Geriatric Research Education and Clinical Centers.

The authors report no conflict of interest.

The results of the present study do not constitute endorsement by the American College of Sports Medicine.



1. Bensimhon DR, Leifer ES, Ellis SJ, et al. Reproducibility of peak oxygen uptake and other cardiopulmonary exercise testing parameters in patients with heart failure (from the Heart Failure and A Controlled Trial Investigating Outcomes of exercise traiNing). Am J Cardiol. 2008; 102 (6): 712–7.
2. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986; 1 (8476): 307–10.
3. Blessinger J, Sawyer B, Davis C, Irving BA, Weltman A, Gaesser G. Reliability of the VmaxST portable metabolic measurement system. Int J Sports Med. 2009; 30 (1): 22–6.
4. Canning CG, Alison JA, Allen NE, Groeller H. Parkinson’s disease: an investigation of exercise capacity, respiratory function, and gait. Arch Phys Med Rehabil. 1997; 78 (2): 199–207.
5. Dobrovolny CL, Ivey FM, Rogers MA, Sorkin JD, Macko RF. Reliability of treadmill exercise testing in older patients with chronic hemiparetic stroke. Arch Phys Med Rehabil. 2003; 84 (9): 1308–12.
6. Fahn S, Elton RL; Members of the UPDRS Development Committee. Unified Parkinson’s Disease Rating Scale. In: Fahn S, Marsden CD, Calne DB, Goldstein M, editors. Recent Developments in Parkinson’s Disease, Vol 2. Florham Park, NJ: Macmillan Health Care Information; 1987. p. 153–64.
7. Folstein MF, Folstein SE, Mchugh PR. “Mini-mental state.” A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975; 12 (3): 189–98.
8. Herman T, Giladi N, Gruendlinger L, Hausdorff JM. Six weeks of intensive treadmill training improves gait and quality of life in patients with Parkinson’s disease: a pilot study. Arch Phys Med Rehabil. 2007; 88 (9): 1154–8.
9. Herman T, Giladi N, Hausdorff JM. Treadmill training for the treatment of gait disturbances in people with Parkinson’s disease: a mini-review. J Neural Transm. 2009; 116 (3): 307–18.
10. Hoehn MM, Yahr MD. Parkinsonism: onset, progression and mortality. Neurology. 1967; 17 (5): 427–42.
11. Macko RF, Katzel LI, Yataco A, et al. Low-velocity graded treadmill stress testing in hemiparetic stroke patients. Stroke. 1997; 28 (5): 988–92.
12. Miyai I, Fujimoto Y, Yamamoto H, et al. Long-term effect of body weight–supported treadmill training in Parkinson’s disease: a randomized controlled trial. Arch Phys Med Rehabil. 2002; 83 (10): 1370–3.
13. Protas EJ, Mitchell K, Williams A, Qureshy H, Caroline K, Lai EC. Gait and step training to reduce falls in Parkinson’s disease. NeuroRehabilitation. 2005; 20 (3): 183–90.
14. Protas EJ, Stanley RK, Jankovic J, MacNeill B. Cardiovascular and metabolic responses to upper- and lower-extremity exercise in men with idiopathic Parkinson’s disease. Phys Ther. 1996; 76 (1): 34–40.
15. Riebe D, Patterson RB, Braun CM. Comparison of two progressive treadmill tests in patients with peripheral arterial disease. Vasc Med. 2001; 6 (4): 215–21.
16. Rosen MJ, Sorkin JD, Goldberg AP, Hagberg JM, Katzel LI. Predictors of age-associated decline in maximal aerobic capacity: a comparison of four statistical models. J Appl Physiol. 1998; 84 (6): 2163–70.
17. Saltin B, Landin S. Work capacity, muscle strength and SDH activity in both legs of hemiparetic patients and patients with Parkinson’s disease. Scand J Clin Lab Invest. 1975; 35 (6): 531–8.
18. Skidmore FM, Patterson SL, Shulman LM, Sorkin JD, Macko RF. Pilot safety and feasibility study of treadmill aerobic exercise in Parkinson disease with gait impairment. J Rehabil Res Dev. 2008; 45 (1): 117–24.
19. Skinner JS, Wilmore KM, Jaskolska A, et al. Reproducibility of maximal exercise test data in the HERITAGE family study. Med Sci Sports Exerc. 1999; 31 (11): 1623–8.
20. Hawkins MN, Raven PB, Snell PG, Stray-Gundersen J, Levine BD. Maximal oxygen uptake as a parametric measure of cardiorespiratory capacity. Med Sci Sports Exerc. 2007; 39 (1): 103–7.
21. Thompson WR, Gordon NF, Pescatello LS, editors. ACSM’s Guidelines for Exercise Testing and Prescription. 8th ed. Philadelphia (PA): Lippincott, Williams & Wilkins; 2010. Chapter 5. Clinical exercise testing; p. 105–34.
22. Tuner SL, Easton C, Wilson J, et al. Cardiopulmonary responses to treadmill and cycle ergometry exercise in patients with peripheral vascular disease. J Vasc Surg. 2008; 47 (1): 123–30.
23. Winer BJ. Statistical Principles in Experimental Design. 2nd ed. New York (NY): McGraw-Hill; 1971. p. 283–96.


© 2011 The American College of Sports Medicine