A Comparative Effectiveness Review


van Rotterdam, Joan, BSc, MMS; Hensley, Michael, PhD, MBBS; Hazelton, Michael, PhD, MA, RN

Journal of Cardiopulmonary Rehabilitation and Prevention: March 2019 - Volume 39 - Issue 2 - p 73–84
doi: 10.1097/HCR.0000000000000405
Scientific Review

Background: Cardiac and pulmonary rehabilitation have been shown to reduce both the symptoms of disease and health care utilization. To ensure the continuation of these programs, patient outcome measures (POMs) are essential to map treatment effectiveness. This review is a comparative effectiveness literature review of studies with a pre- to post-POM assessment of responsiveness (ie, change in health status over time).

Methods: A quality review of the literature included not only randomized controlled trials but also parallel studies, as well as all observational and retrospective trials. This review included a list of articles and their characteristics; a quality assessment of the literature and a list of POMs utilized in this setting were assessed for responsiveness.

Results: There was inconsistency in the literature with the measurement of responsiveness or effect size. The most commonly used POM was the SF-36; however, it was found to be less responsive to change in health status pre- to post-rehabilitation, particularly in the mental domain of this instrument. The most responsive POM in this setting was the Global Mood Scale.

Conclusion: The surveyed literature found no “gold standard” POM for either cardiac rehabilitation or pulmonary rehabilitation but there was some preference for the disease-specific POMs; however, some of these instruments lose their discriminatory power at the end of the rehabilitation period. This literature review found that a Likert scale is more responsive than a dichotomous scale and that a simple questionnaire is more responsive in a pre- to post-setting than a complex questionnaire.

This comparative effectiveness literature review of cardiac and pulmonary rehabilitation included randomized controlled trials, as well as parallel, observational, and retrospective trials. This review included a list of articles and their characteristics; a quality assessment of the literature and a list of patient outcome measures utilized in this setting were assessed for responsiveness.

School of Medicine and Public Health, Faculty of Health, Callaghan Campus, The University of Newcastle, Newcastle, New South Wales, Australia (Ms van Rotterdam); Director of Medical Services, John Hunter Hospital, Newcastle, New South Wales, Australia (Dr Hensley); and School of Nursing and Midwifery, Faculty of Health, The University of Newcastle, Newcastle, New South Wales, Australia (Dr Hazelton).

Correspondence: Joan van Rotterdam, BSc, MMS, 82 Fennell Crescent, Blackalls Park NSW 2283, Australia (

The authors declare no conflicts of interest.

Supplemental digital content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal's Web site (

Chronic obstructive pulmonary disease is a common condition, with 19% of Australians >40 yr of age estimated to be affected. People with this condition have poor quality of life (QoL) with cardinal symptoms of dyspnea and exercise intolerance. Pulmonary rehabilitation (PR) programs have been shown to help reduce these symptoms and reduce health care utilization.1

Coronary heart disease remains the most frequent cause of death worldwide; however, with improved medical technology and interventions, people are living longer with symptomatic coronary heart disease. Cardiac rehabilitation (CR) programs have been shown to reduce the risk of reinfarction as well as reduce cardiovascular and all-cause mortality.2

Patients with chronic cardiac and chronic respiratory illness have been found to benefit from an integrated self-management plan that is initially put together for them within a rehabilitation program.3,4 The concept of QoL is integral to rehabilitation and its measurement in this setting summarizes the overall value placed on life at any given time. Health-related quality of life measures to what extent disease limits the capacity to lead a normal life.5

In rehabilitation, function of patients is the result of a dynamic interaction between health conditions (disease) and contextual factors (environmental and personal factors) that describe the milieu in which individuals live.6 Disability is defined as a loss of function in a person and CR and PR programs are designed to limit the disabling consequences of disease.

To ensure CR and PR programs continue to provide clinically valuable services, it is essential to evaluate the efficacy of these programs. Therefore, patient outcome measures (POMs) are essential to determine treatment effectiveness.

In response to worldwide endorsement of CR and PR programs, the Cochrane Review committee has undertaken literature reviews of both types of programs. There have been 6 reviews of CR programs and 3 reviews of PR programs. Both sets of reviews have endorsed the success of these rehabilitation programs; however, there are some methodological issues with the initial studies.

To measure the effectiveness of such programs, POMs are used to assess a patient's response and should reflect a direction of perceived health change. Measuring change in POMs in this setting is in part a reflection of the reliability of an instrument and the ongoing process of construct validity (learning more about the construct, making new predictions, and then testing them).7,8

Responsiveness, or the process of measuring change in a patient's response in relation to POMs, is a key issue in this review; it is why validation of these instruments is important and, as such, needs to be fully explained. Responsiveness is related to sensitivity.9 It is the ability of POMs to detect change over time when a patient's condition improves or deteriorates. The concept of responsiveness is context specific; that is, a POM may be responsive in a certain setting with a certain population but that same tool may not perform as well in another population.7

The concept of responsiveness determines the effectiveness of both CR and PR programs. A POM such as a health-related quality of life measure may be sensitive to the disease process and give information in this area but may not necessarily give information as to whether the rehabilitation program brings about an improvement, a deterioration, or no change in the health status of the patient.

Studies of rehabilitation as an intervention that involve a pre- and post-rehabilitation measurement regime are essentially of 2 types. First, the “within” patient trial is a longitudinal study with rehabilitation as an intervention that measures change of health status over time in patients within a group (eg, a prospective trial). Second, the “between” patient trial is a longitudinal study with rehabilitation as an intervention that also compares patient groups (eg, randomized controlled trials [RCTs]).10

Significance tests are conducted (t test or equivalent) to determine whether the change in score of a POM within a patient over time is most likely due to chance or to an intervention. To estimate the magnitude of change over time (or whether that change is meaningful), the effect size index will supplement the statistical testing (Figure 1).10
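The relationship between the significance test and the effect size index can be illustrated with a minimal sketch. It assumes two widely used definitions that the review itself does not spell out: the effect size as mean change divided by the standard deviation of baseline scores, and the standardized response mean as mean change divided by the standard deviation of the change scores. The function name and the patient scores are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import mean, stdev


def responsiveness_indices(pre, post):
    """Return the paired t statistic, effect size, and standardized response mean.

    ASSUMPTION: these are common textbook definitions, not necessarily the
    ones applied by the studies in this review.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need paired scores for at least two patients")
    change = [after - before for before, after in zip(pre, post)]
    mean_change = mean(change)
    sd_change = stdev(change)  # sample SD of the within-patient change scores
    n = len(change)
    return {
        # Paired t statistic: is the mean change distinguishable from chance?
        "t": mean_change / (sd_change / sqrt(n)),
        # Effect size: mean change relative to the spread of baseline scores
        "effect_size": mean_change / stdev(pre),
        # Standardized response mean: mean change relative to its own spread
        "srm": mean_change / sd_change,
    }


# Hypothetical pre- and post-rehabilitation scores for five patients
pre = [40, 50, 60, 45, 55]
post = [50, 55, 70, 50, 65]
result = responsiveness_indices(pre, post)
```

The t statistic grows with the number of patients whereas the two indices do not, which is one reason a magnitude measure is reported alongside the significance test. The sketch does not handle the edge case in which every patient changes by exactly the same amount (the standard deviation of change is then zero).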

Figure 1

The purpose of this comparative effectiveness review was to determine whether community-based CR and PR programs bring about a measurable change in health status as shown by POMs pre- to post-rehabilitation. The 3 main objectives of this review were to (1) list the articles under review and provide a checklist of items included in each study; (2) provide a qualitative analysis of research papers; and (3) provide a qualitative analysis of POMs utilized in these review studies.

A search was undertaken of current databases (Figure 2), including MEDLINE, PsycINFO, Cochrane, CINAHL, Ovid, BMJ, AMED, BioMed Central, and Open Access journals, with articles restricted to published research in English (1995-2015). One thousand five hundred fifty-seven articles were found that satisfied the inclusion criteria. Exclusion criteria were then applied to this group, and 83 studies matched the selection criteria for this literature review. Table 1 provides a comprehensive list of articles in date order with a description of the population, the type of trial under review, the POMs and other tests that were undertaken, a short description of the intervention, and a short summary of conclusions for each study.11

Figure 2

Table 1

Eighty-three articles were found that fit the selection criteria and 75 of those are listed in Supplemental Digital Content 1, with a supplemental reference list in Supplemental Digital Content 2. Of these 75 articles, 14 were RCTs, 7 were nonrandomized parallel group trials or comparative trials, 38 were prospective observational trials, and 16 were retrospective trials.

The PR literature consisted of 32 studies: 9 were RCTs, 1 a comparative study, 14 were prospective longitudinal trials, and 8 were retrospective trials. The CR literature included 43 studies: 5 were RCTs, 6 were comparison trials, 24 were prospective longitudinal trials, and 8 were retrospective trials.
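As a quick consistency check, the study counts reported above can be tallied by design and by program. The sketch below uses only the figures stated in the text; the dictionary keys are shorthand labels chosen for illustration.

```python
# Study counts by design, as reported for each literature
pr = {"rct": 9, "comparative": 1, "prospective": 14, "retrospective": 8}
cr = {"rct": 5, "comparative": 6, "prospective": 24, "retrospective": 8}

# Combine the two literatures design by design
combined = {design: pr[design] + cr[design] for design in pr}

# Totals per program and overall
totals = {
    "pr": sum(pr.values()),
    "cr": sum(cr.values()),
    "all": sum(combined.values()),
}

# The combined counts should reproduce the overall breakdown of the 75 articles
assert combined == {"rct": 14, "comparative": 7, "prospective": 38, "retrospective": 16}
assert totals == {"pr": 32, "cr": 43, "all": 75}
```

The per-program counts agree with the overall breakdown of 14 RCTs, 7 comparative trials, 38 prospective trials, and 16 retrospective trials.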

The bulk of the articles in this review were observational studies, and the main emphasis of this review is an outcome-level assessment of the performance of POMs. Therefore, several arguments can be put forward to include these studies in a quality assessment of the literature. The sample sizes for the existing RCTs are quite low, while there are a number of community-based observational trials with large numbers of patients (eg, Ries et al12 with 1218 patients and Hevey et al13 with 1485 patients). Treatment providers and types of programs available for CR and PR in an outpatient setting can also be inconsistent, with staff including nurses, allied health professionals, and other assisting professionals.

There are also several gaps in the RCT literature. These include the following: (1) none of the RCTs were able to blind for the outcome of interest (ie, health-related quality of life); (2) all the RCTs were efficacy studies14; and (3) not all of the important outcomes were captured in the RCTs (ie, the responsiveness of the POMs used in the studies).15

The quality assessment of the literature included all types of studies: RCTs; comparative studies; observational studies; and retrospective studies.14,15 The quality assessment tool used in this review was adapted from the graded Cochrane Collaboration system, which can be used to assess the quality of parallel group trials and observational trials, although it was specifically developed for RCTs. Articles were graded using 5 areas of bias: (1) random sequence generation; (2) allocation concealment, both sources of selection bias16; (3) the responsiveness of the POMs being assessed in previous studies, a source of performance bias17; (4) incomplete outcome data, a source of attrition bias; and (5) selective reporting which is a source of reporting bias. These areas of bias are important to this group of articles and, as such, affect the quality of the results of each study.

Blinding of the outcome assessments (ie, POMs) did not occur in any of the included studies, so it was eliminated as a source of selection bias due to redundancy. Attrition bias and reporting bias were interpreted using the usual Cochrane-based quality assessment.

Table 1 is the list of included articles. If bias was present in an area, it was designated as “-ve” and if no bias was present, it was marked as “+ve.” Once each article was examined for bias, it was graded on a scale from high, meaning that the likelihood of bias was very small, to very low, meaning that the likelihood of bias was high.
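The marking and grading described above can be sketched in code. The review does not state how the “-ve” marks were converted into a grade, so the mapping below is an assumption made purely for illustration: one grade step per biased area, with a hypothetical intermediate label (“moderately high”) that does not appear in the review. The function names and the example article are also hypothetical.

```python
# The five areas of bias used to grade each article
BIAS_AREAS = (
    "random sequence generation",
    "allocation concealment",
    "responsiveness of POMs",
    "incomplete outcome data",
    "selective reporting",
)

# ASSUMPTION: one grade step per biased area. The label "moderately high"
# is hypothetical; the review names only some of the grades.
GRADES = ("high", "moderately high", "moderate", "moderately low", "low", "very low")


def grade_article(marks):
    """Grade an article from its marks: "+ve" (no bias) or "-ve" (bias present)."""
    for area in BIAS_AREAS:
        if marks.get(area) not in ("+ve", "-ve"):
            raise ValueError(f"area {area!r} must be marked '+ve' or '-ve'")
    biased = sum(1 for area in BIAS_AREAS if marks[area] == "-ve")
    return GRADES[biased]


def assessed_for_responsiveness(grade):
    """True for grades at the moderate level and above."""
    return GRADES.index(grade) <= GRADES.index("moderate")


# Hypothetical observational study: no randomization or concealment,
# but no bias in the remaining three areas
example = {area: "+ve" for area in BIAS_AREAS}
example["random sequence generation"] = "-ve"
example["allocation concealment"] = "-ve"
example_grade = grade_article(example)
```

Under this assumed mapping, an observational study that lacks randomization and allocation concealment but shows no other bias would still reach the moderate level, which is in keeping with the review's argument for including observational studies in the quality assessment.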

Table 2 is a summary of the quality assessment process, with 46 articles falling in the moderately low, low, and very low classes and 29 articles grouped at the moderate level and above. It is this second group of studies from which POMs were assessed more closely for responsiveness.16 Table 3 is a compilation of the most commonly used POMs from the 29 studies listed at the moderate level and above in the quality assessment.

Table 2

Table 3

The 36-Item Short Form Health Survey (SF-36) was the most commonly used generic measurement tool in both CR and PR; in fact, it was the most commonly used POM in the cardiac literature. In the PR literature, the SF-36 ceased to be used after the study by Jones et al.18 Instead, the respiratory literature changed to using disease-specific instruments with the development of improved instruments, such as the COPD Assessment Test and the Clinical COPD Questionnaire. Compared with the disease-specific instruments in the CR and PR setting, the generic instruments and many domains of the generic instruments were less responsive to change in health status.

In the PR literature, the disease-specific Chronic Respiratory Questionnaire (CRQ) was the most frequently used instrument, with the consensus being that in the domain of dyspnea and fatigue, the CRQ (in particular, the self-administered form) is more responsive to change in health status than the St George's Respiratory Questionnaire.19,20 The most commonly used purpose-specific instrument in PR was the Feeling Thermometer.21

The most commonly used disease-specific tool in the CR literature was the MacNew QoL after Myocardial Infarction (MacNew). This instrument is responsive to CR in most domains; however, it shows a greater treatment effect in the short term than in the long term.13 The most commonly used purpose-specific instrument in CR is the Hospital Anxiety and Depression Scale (HADS). For patients with higher measured values in the domains of anxiety and depression, this instrument shows good responsiveness; however, for patients whose measured values are in the lower ranges, it shows little significant change and is therefore less responsive.

Eleven of the review articles specifically assessed and compared POMs in both CR and PR (see Supplemental Digital Content 1). Several studies performed correlations across instruments, but the most consistently significant correlations were in the dyspnea domains between the purpose-specific instruments, such as the Medical Research Council Scale, and the CRQ and St George's Respiratory Questionnaire in PR, and between the MacNew Physical domain and the SF-36 Physical Composite (PCS) domain in CR.

This group of articles found that the Global Mood Scale, a purpose-specific instrument based on a 2-factor model of mood with 10 negative and 10 positive mood terms, was the most responsive instrument in the CR setting.22 The domain construct for this instrument is very different to that of the HADS since the Global Mood Scale provides information on global well-being whereas the HADS assesses clinical states and was developed in a hospital setting. The veteran's version of the SF-36 (SF-36V) was found to be more responsive than the usual version of the SF-36. The SF-36V uses a 5-point ordinal choice instead of the usual dichotomous yes/no answer.23

An overall assessment of the literature reveals that CR and PR in the outpatient setting bring about a change in health status for patients and that there is overall an improvement in the QoL for most patients in this setting. But the question that this review poses is how successfully existing POMs go about measuring this, and it is evident that there are many problems in this area.

Many of the review studies were not consistent in their determination of responsiveness or effect size and this made comparison of POMs across studies difficult. For each new set of circumstances (ie, population), it is important to test the POM for responsiveness. Even if an instrument is reliable, it remains to be shown that differences in response to treatment can be detected before the instrument can be used for the assessment of change.24

The literature revealed a range of recommendations of which instrument (generic, disease-specific, or purpose-specific) is best used in CR and PR.25 Generic instruments often provide a broad picture of health and allow comparison of trial patients to population norms.26 They are often more sensitive to comorbidities, an important consideration when choosing an instrument to use in CR and PR. An assessment of a global concept of QoL may be useful in population studies; however, it may be difficult to interpret at a clinical level. Highly standardized POMs such as the SF-36 may also omit aspects of QoL that are of great importance to the individual.27

Existing generic instruments may have more of a rehabilitation focus. The SF-36, created in the 1980s, is an example of this type of instrument and is a multipurpose health survey with 36 questions. The 8 health concepts for the SF-36 were selected from 40 included in the Medical Outcomes Study published by Stewart and Ware in 1992,28 although most of the items for the SF-36 were taken from concepts already in existence in POMs from the 1970s and the 1980s.

Rehabilitation has had a paradigm shift since the 1980s model, which was originally proposed by Nagi Wood for the World Health Organization. This model was based on a “consequences of disease” classification, which focused on the impact of diseases or other health conditions that may follow as a result.29 This model has since been superseded by the International Classification of Functioning, Disability and Health (ICF-2001), in which the patient is instead seen in terms of his or her function, especially at the person and societal levels.30

Disease-specific measures were more widely used in the pulmonary literature. These instruments are more sensitive to the disorder under consideration and are therefore more likely to reflect clinical changes. They appear to be more useful in RCTs as they detect small but significant change even though these studies often use smaller or more moderate sample sizes. They are also better used for the specific population for which they were created.

Purpose-specific instruments such as the HADS also have a role to play and are often used to supplement and fill the gaps between generic and disease-specific instruments. To cover all the domains in CR or PR that the clinician or researcher requires, often a number of POMs are needed (ie, generic, disease-specific, as well as purpose-specific instruments). An example of this is the MacNew, which loses some of its discriminatory power at the end of the CR period; therefore, it is best to complement the MacNew with other psychosocial assessment instruments.31

The purpose of the POM is the best determinant for which type of instrument to use. Patient outcome measures that are successful in a research setting may not always translate into a clinical setting and vice versa. Instruments that are clinically relevant and have a medical (organ function) focus, which emphasizes signs, symptoms, and diagnosis, may not translate to a rehabilitation focus (such as CR and PR), which emphasizes function at the person and societal level.32

Self-report or individualized POMs were not included in any of the review literature. They were, however, used with some success in 2 articles on generic chronic disease rehabilitation.25,33 A large number of POMs reflect the objective perspective of the outsider rather than the patient's subjective point of view. Individualized methods focus on uniqueness (ie, the QoL is determined by the person who lives it).27

In the search to find a “gold standard” POM for use in CR and PR, the surveyed literature instead revealed a diversity of opinions and several instruments were proposed with a preference for the disease-specific instruments. While most of these are sensitive to the disease process, not all domains or all instruments are responsive to longitudinal change in health status brought about by CR and PR.

Current studies of CR and PR programs utilize generic, disease-specific, and purpose-specific POMs or a combination of these instruments. In CR and PR programs, symptoms and signs of organ dysfunction may reveal very little about a patient's progress. These instruments have some drawbacks, but the main issue for the patient is that, for all of them, an external investigator has determined the domains. Instead, the patient's own perceptions of his or her health status may prove to be more meaningful.25 A tool that utilizes the patient's own perceptions and weights aspects of that life which are particular to the person may prove to be more responsive in this setting.27,32,33

This research was supported by an Australian Government Research Training Program (RTP) Scholarship.

1. Johnston M, Young M, Grimmer K, Antic R, Frith P. Frequency of referral and attendance at a pulmonary rehabilitation programme amongst patients admitted to a tertiary hospital with chronic obstructive pulmonary disease. Respirology. 2013;18:1089–1094.
2. Clark AM, Hartling L, Vandermeer B, Lissel SL, McAlister FA. Secondary prevention programmes for coronary heart disease: a meta-regression showing the merits of shorter, generalist, primary care-based interventions. Eur J Cardiovasc Prev Rehabil. 2007;14:538–546.
3. American Association of Cardiovascular and Pulmonary Rehabilitation. Guidelines for Pulmonary Rehabilitation Programs. 3rd ed. Champaign, IL: Human Kinetics; 2004.
4. American Association of Cardiovascular and Pulmonary Rehabilitation. Guidelines for Cardiac Rehabilitation and Secondary Prevention Programs. 4th ed. Champaign, IL: Human Kinetics; 2004.
5. Stineman M, Lollar D, Ustun T. The International Classification of Functioning, Disability and Health. In: DeLisa JA, et al, eds. Physical Medicine and Rehabilitation: Principles and Practices. 4th ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2005:1096–1138.
6. World Health Organization. International Classification of Functioning, Disability and Health: ICF. Geneva, Switzerland: World Health Organization; 2001. Accessed 2015.
7. Streiner DL, Norman GR. Health Measurement Scales: A Practical Guide to Their Development and Use. 2nd ed. Toronto, Canada: Oxford Press; 1996.
8. Guyatt G, Kirshner B, Jaeschke R. Measuring health status: what are the necessary measurement properties? J Clin Epidemiol. 1992;45(12):1341–1345.
9. Fayers PM, Machin D. Quality of Life: The Assessment, Analysis and Interpretation of Patient-Reported Outcomes. 2nd ed. Chichester, West Sussex, England: John Wiley and Sons; 2007.
10. Middel B, van Sonderen E. Statistical significant change versus relevant or important change in (quasi) experimental design: some conceptual and methodological problems in estimating magnitude of intervention-related change in health services research. Int J Integr Care. 2002;2:e12.
11. Hackett L, Anderson C. Predictors of depression after stroke: a systematic review of observational studies. Stroke. 2005;36:2296–2301.
12. Ries A, Make B, Lee S, et al. The effects of pulmonary rehabilitation in the National Emphysema Treatment Trial. Chest. 2005;128(6):3799–3809.
13. Hevey D, McGee H, Horgan J. Responsiveness of HRQoL outcome measures in cardiac rehabilitation: comparison of cardiac rehabilitation outcome measures. J Consult Clin Psychol. 2004;72(6):1175–1180.
14. Norris S, Atkins D, Bruening W, et al. Selecting observational studies for comparing medical interventions. Methods Guide for Effectiveness and Comparative Effectiveness Review [Internet]. Rockville, MD: Agency for Healthcare Research and Quality (US); 2008-2010. Accessed 2017.
15. Dreyer N, Tunis S, Berger M, Ollendorf D, Mattox P, Gliklich R. Why observational studies should be among the tools used in comparative effectiveness research. Health Aff. 2010;29(10):1818–1825.
16. Higgins JPT, Altman DG, Gotzsche PC, et al. The Cochrane Collaboration's Tool for assessing risk of bias. BMJ. 2011;343(7829):889–893.
17. Guyatt G, Naylor D, Juniper E, Heyland DK, Jaeschke R, Cook DJ. Users' guide to the medical literature. XII. How to use articles about health-related quality of life. JAMA. 1997;277:1232–1237.
18. Jones R, Harding S, Chung M, Campbell J. The prevalence of post-traumatic stress disorder in patients undergoing pulmonary rehabilitation and changes in PTSD symptoms following rehabilitation. J Cardiopulm Rehabil Prev. 2009;29:49–56.
19. Daudey L, Peters J, Molema J, et al. Health status in COPD cannot be measured by the St George's Respiratory Questionnaire alone: an evaluation of the underlying concepts of this questionnaire. Respir Res. 2010;11:98–104.
20. Puhan M, Guyatt G, Goldstein G, et al. Relative responsiveness of the Chronic Respiratory Questionnaire, St Georges Respiratory Questionnaire and four other health-related QoL instruments for patients with chronic lung disease. Respir Med. 2007;101:308–316.
21. Lavrakas PJ, ed. Encyclopedia of Survey Research Methods. Thousand Oaks, CA: Sage Publications; 2008.
22. Vizza J, Neatrour D, Felton P, Ellsworth DL. Improvement in psychosocial functioning during an intensive cardiovascular lifestyle modification program. J Cardiopulm Rehabil Prev. 2007;27:376–383.
23. Belza B, Steele B, Cain K, Coppersmith J, Howard J, Lakshminarayan S. Seattle Obstructive Lung Disease questionnaire. J Cardiopulm Rehabil. 2005;25:107–114.
24. Verrill D, Barton C, Beasley W, Lippard WM. The effects of short-term and long-term pulmonary rehabilitation on functional capacity, perceived dyspnea and QoL. Chest. 2005;128:673–683.
25. Kennedy A, Reeves D, Bower P, et al. The effectiveness and cost-effectiveness of a national lay-led selfcare support programme for patients with long-term conditions: a pragmatic randomised controlled trial. J Epidemiol Community Health. 2007;61:254–261.
26. McKee G. Are there meaningful longitudinal changes in health related quality of life-SF-36, in cardiac rehabilitation patients. Eur J Cardiovasc Nurs. 2009;2009:40–47.
27. Carr A, Thompson P, Kirwan J. Quality of life measures. Br J Rheumatol. 1996;35:275–281.
28. Stewart AL, Ware JE. Measuring Functioning and Well-Being: The Medical Outcomes Study Approach. Durham, NC: Duke University Press; 1992.
29. Granger C. Quality and Outcome Measures for Medical Rehabilitation. Philadelphia, PA: WB Saunders; 2000.
30. Kaplan R. QoL as an outcome measure in pulmonary disease. J Cardiopulm Rehabil. 2005;25(6):321–331.
31. Maes S, de Gucht V, Goud R, Hellemans I, Peek N. Is the MacNew quality of life questionnaire a useful diagnostic and evaluation instrument for cardiac rehabilitation? Eur J Cardiovasc Prev Rehabil. 2008;15(5):516–520.
32. Kalra A. Measuring quality of life-who should measure quality of life? BMJ. 2001;322:1417–1420.
33. Pitkala K. The effectiveness of day hospital care on home care patients. J Am Geriatr Soc. 1998;46(9):1086–1095.

cardiac disease; quality of life; questionnaires; rehabilitation; respiratory tract disorders

Copyright © 2019 Wolters Kluwer Health, Inc. All rights reserved.