Research Article

A Systematic Review of Outcome Measures Assessing Disability Following Upper Extremity Trauma

Jayakumar, Prakash MBBS, BSc(Hons), MRCS(Eng); Williams, Mark PhD; Ring, David MD, PhD; Lamb, Sarah DPhil; Gwilym, Stephen FRCS(Orth), PhD

JAAOS: Global Research and Reviews: July 2017 - Volume 1 - Issue 4 - p e021
doi: 10.5435/JAAOSGlobal-D-17-00021


Outcome measurement in orthopaedics has evolved rapidly over the past 20 years, and there are many patient-reported or clinician-based outcome measures.1,2 The popularity of patient-reported outcome (PRO) measurement, in particular, has grown in response to the perception that clinicians have an incomplete understanding of the true impact of disease on a patient's life and the complexity of the human illness experience.2,3 PRO measures, by definition, focus on quantifying the subjective impact of health from the patient's perspective, commonly referred to as “disability” in contrast to “impairment” (objective pathophysiology). Common orthopaedic outcomes such as range of motion and fracture union represent the biomedical paradigm. PRO measures represent the biopsychosocial paradigm, which includes the influence of thoughts, emotions, behaviors, and circumstances on symptoms and limitation.

The International Classification of Functioning, Disability and Health defines disability as a multidimensional concept related to the dynamic interaction between body functions and structures, activity limitations, and participation restrictions alongside environmental and personal factors.3 These components are influenced by impairment (ie, problems with structure and function of the body leading to significant deviation and loss), psychosocial factors, and symptom experience.3 The alleviation of disability, in this wider context, is the primary aim of most orthopaedic interventions.

Orthopaedic trauma is often associated with a significant impact on the magnitude of disability and the factors influencing it, which can affect an individual's quality of life in several health domains.4,5 There is increasing evidence that disability is less associated with measures of impairment and objective pathophysiology than with the subjective psychosocial aspects of illness.6,7 Factors likely to mediate these interactions include anxiety, depression, ineffective coping, pain catastrophizing, and kinesiophobia, as well as social status, support, financial loss, and secondary gain.4-8 These factors influence recovery following musculoskeletal trauma and show a stronger association with pain intensity and disability than biomedical factors, such as fracture type.8-10 Furthermore, studies such as that conducted by Bhandari et al4 have reported that a significant number of patients experiencing orthopaedic trauma breach thresholds for psychological distress.

Upper limb injuries demonstrate reduced health-related quality of life indices compared with trauma involving other regions.11 The inability to feed, clothe, and care for oneself following injury, particularly involving a dominant arm, can be extremely debilitating.11 A study involving proximal humerus fractures demonstrated that measures of impairment, such as range of motion and arm strength, did not correlate with PRO measures of disability.12 Factors such as social independence appeared to more accurately predict PROs than physician-based assessments and even mortality in these patients.12,13 Similarly, studies involving distal radius fractures identify depression, anxiety, kinesiophobia, and catastrophic thinking as the most important factors influencing disability and rate of recovery.9,14,15 Despite this growing evidence and the rising demand for robust PRO measurement, there remains a lack of clarity regarding the original development, testing, and quality of PRO measures in the context of upper extremity trauma and disability in this region.


The primary objective was to identify outcome measures developed for upper extremity conditions, focusing on traumatic injuries, and to classify them by anatomic region, condition type, instrument type, and the psychometric evaluation used in their original development. Secondarily, we aimed to assess the methodological quality of original studies introducing a PRO measure that incorporated trauma patients in its development. We conclude by highlighting the challenges and solutions encountered in measuring outcomes and disability in this population.


Data Sources

A broad search strategy was applied to PubMed (MEDLINE from 1946 to 2016), OVIDSP (EMBASE from 1974 to 2016), CINAHL (from 2006 to 2016), and PsycINFO (from 1806 to 2016) electronic databases on July 1, 2016. Search terms related to “upper limb anatomy,” “outcome measurement,” and demographic parameters were combined with the operator AND (Supplemental Digital Content 1). No restrictions were set in the search fields, and terms were identified in the title and/or abstract without any limits. Further identification was conducted through an internet search engine (Google) and a contemporary atlas of outcome measures.3 The review is reported according to the PRISMA statement and registered on the PROSPERO system (No. CRD42016046243) (Appendix 1).

Study Selection

Studies involving adult patients experiencing any orthopaedic upper extremity condition involving outcome measurement systems were identified. Abstracts were screened by the lead investigator (P.J.) to (1) generate a comprehensive set of outcome measures and (2) track down the original article introducing the measure, together with any development and psychometric evaluation studies, if available. Psychometric evaluations of PRO measures were taken to include assessments of validity, reliability, responsiveness, interpretability, and acceptability.16 Eligibility assessment selected only the original publications of PRO measures for qualitative and quantitative synthesis. Measures not recognized as multidomain outcome measurement systems, such as those focusing on clinimetric features alone (eg, range of motion, pathoanatomic or radiological grading and classification, and clinical examination tests), single health components (eg, pain, depression, and return to activity), broad diagnostic groups (eg, osteoarthritis and tumor classifications), and health behavior scales, were excluded along with articles not published in English.

Data Extraction and Data Synthesis

Data were extracted, synthesized, and recorded using an electronic database (Microsoft Excel, v15.33). Outcome measures were classified by anatomic region, conditions assessed (ie, broad etiology and specific diagnoses), instrument characteristics (ie, coverage and type), and initial level of psychometric evaluation. Measures combining patient-reported and clinician-based components were classified as the latter by default, unless one or the other was more popularly used in the literature. Initial characterization of psychometric evaluation was based on details of validation (construct validity). If none existed, measures were classified with “no initial empirical psychometric evaluation” or “no initial empirical psychometric evaluation and no validation studies identifiable.”

Quality assessment was conducted using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) criteria and 4-point checklist.17 This is a well-established standard for evaluating methodological quality, design requirements, and preferred statistical analysis of measurement properties of health-related PRO measures. Only original studies involving patients with trauma conditions in the development and psychometric evaluation of instruments were assessed. Contact with authors was made for clarification of conditions when these were nonspecific. Properties were assessed with the lowest rating within a category taken as the score for the section. In addition, data were extracted for generalizability (ie, population characteristics and sampling procedure) and interpretability but were not rated. The items within PRO measures were also categorized as “best fit” into one of the five health domains by three investigators (P.J., D.R., and S.G.) to calculate the proportion (percentage) of each domain as part of the full score, including any instrument weightings. Discordant judgments were resolved through discussions coordinated by the lead author (P.J.) to reach a consensus between the three investigators. Synthesized data were reported using descriptive statistics, and discordant judgments were resolved by discussion among all the authors.
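The two scoring rules described above (taking the lowest rating within a category as the score for that section, and expressing each health domain as a percentage of the full weighted score) can be illustrated with a minimal Python sketch. The function names, the domain labels, and the item weights below are hypothetical and chosen only for illustration; they are not taken from the review or from any specific instrument.

```python
# Ratings on the 4-point scale, ordered from worst to best.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def box_score(item_ratings):
    """Return the lowest rating among the items of one checklist category.

    An empty list means the property was not assessed, so None is returned.
    """
    if not item_ratings:
        return None
    return min(item_ratings, key=RATING_ORDER.index)


def domain_proportions(items):
    """Percentage of the full weighted score contributed by each domain.

    items: list of (domain, weight) pairs, one pair per questionnaire item.
    """
    total = sum(weight for _, weight in items)
    shares = {}
    for domain, weight in items:
        shares[domain] = shares.get(domain, 0.0) + weight
    return {domain: 100.0 * w / total for domain, w in shares.items()}


# Hypothetical ratings for one study and hypothetical items for one instrument.
reliability_items = ["good", "fair", "excellent"]
instrument_items = [
    ("physical function", 2),
    ("physical function", 2),
    ("physical function", 1),
    ("symptoms", 3),
    ("social", 1),
    ("psychological", 1),
]

print(box_score(reliability_items))
print(domain_proportions(instrument_items))
```

In this made-up example the reliability category is scored “fair” because one item is rated “fair”, and physical function accounts for half of the weighted score.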


A total of 144 outcome measures targeting the upper extremity were identified (Figure 1 and Table 1). The majority focused on the shoulder, wrist, and hand (Figure 2). Fifty-eight percent (n = 83/144) included patients with trauma problems either in combination with other conditions or alone (Figure 3). Seven percent (n = 10/144) required corresponding authors to be contacted to determine conditions investigated because these could not be otherwise identified. Conditions included fractures, dislocations, and soft-tissue injuries (Table 1). Instrument classification revealed the majority as completely or partially clinician based (53%; n = 76/144) (Figure 4) and predominantly condition specific (56%; n = 80/144) (Figure 5), with trends in instrument type and coverage mapped over time (Figures 6 and 7).

Figure 1: Search strategy and selection of articles. *July 1, 2016. **CINAHL search includes PsycINFO database. Data Synthesis I (Table 1); Data Synthesis II Quality Assessment (Table 2). CBO = clinician-based outcome, PRO = patient-reported outcome
Table 1: Outcome Measures for Upper Extremity Conditions Classified by Conditions, Instrument, and Initial Psychometric Evaluation
Figure 2: Outcome measures for upper extremity conditions by anatomic region.
Figure 3: Outcome measures for upper extremity conditions by clinical conditions in index evaluation studies. nos = nonspecific or not specified
Figure 4: Instruments classified by clinician-based or patient-reported outcome measurement.
Figure 5: Instruments classified by the level of coverage.
Figure 6: Number of clinician-based and patient-reported outcome measurements, including the level of psychometric validation from 1960 to date. CBO = clinician-based outcome, PRO = patient-reported outcome
Figure 7: Outcome measurement type by original publication year from 1960 to date

Quality assessment was conducted on 29 original studies (20%; n = 29/144) that included some form of psychometric evaluation of PRO measures when they were first published and involved upper limb trauma patients in their study cohort (Table 2). The majority of studies included in quality assessment were prospective cohort studies. “Test-retest” reliability was assessed more frequently than internal consistency and measurement error, and ratings ranged from “poor” to “good” when assessed. Content (face) validity ratings were “good” to “excellent” for almost all measures. Construct validity assessed through testing of hypotheses rated “good to excellent” in two thirds of studies and “poor to fair” in the rest, while structural validity was “poor” in 71% of studies (n = 17/24) when it was assessed. This was primarily because few studies undertook factor analysis or item response theory (IRT) analysis, a requisite for higher ratings. The lack of gold-standard measures in this field meant that criterion validity was rarely assessed. Responsiveness was highly variable; although most studies allowed some interpretability through score distribution and change, few conducted analysis of floor-ceiling effects, minimal clinically important difference, or minimal detectable change. Characterization of health domains revealed that the majority of items were related to physical function and symptoms, whereas the relative proportion represented by social (median 14%; range 3%–35%) and psychological aspects (median 14%; range 3%–16.5%) was low. Generalizability assessment revealed low levels of reporting patient consent, percentage of missing items, and study limitations.

Table 2: Characterization and Quality Assessment of PRO Measures Involving Patients With Upper Extremity Trauma in the Original Study Cohort


The selection of outcome measures is of paramount importance in conducting high-quality orthopaedic research.48 Arriving at this choice, particularly among the current assortment of PRO measures, may benefit from an understanding of the methodological quality of their development and relevance to study populations.49 Over 140 different outcome measures targeted at upper extremity problems were identified, with a substantial number involving trauma conditions in their original cohorts, many of which included distal radius fractures, rotator cuff tears, and shoulder instability. The majority were clinician-based, injury-specific, or procedure-specific instruments and lacked empirical psychometric evaluation in initial development or any identifiable validation studies since introduction. One in five was a PRO measure that involved trauma patients and had undergone initial psychometric evaluation. Methodological quality was deemed acceptable in terms of test-retest reliability, content validity, and construct validity through hypothesis testing, but was variable and/or lacking for other properties such as internal consistency, measurement error, responsiveness, and interpretability. This work also demonstrated a transition from pure clinician-based measures to nonvalidated PRO measures to those with some form of validation before introduction (Figure 6). A further trend toward an increase in the development of region-specific instruments over condition-specific measures is observed (Figure 7). This may reflect the importance placed on PRO measurement in modern orthopaedic practice as well as a possible trend toward more general outcome measurements of disability impact at the regional level. Despite these findings, relatively low rates of patient-reported assessment have been observed in the orthopaedic trauma literature, and the drive to develop these instruments does not appear to correlate with their level of utilization in clinical practice.50


There are some limitations to this work. First, only the original index articles relevant to each outcome measure were within scope. We were interested in understanding the nature of the original developmental work by the investigators and the methodological quality of these measures. Although we located the studies for each instrument, it is recognized that more than one early investigation could lay claim to being part of the initial psychometric evaluation. Furthermore, subsequent studies may have superseded these index evaluations, performing further assessments, including specific trauma populations. Second, although the proportion of psychosocial components within our selection of PRO measurements was judged to be low, it is appreciated that investigators may incorporate other measures to account for psychological and social well-being. Third, the identified outcome measurement set is unlikely to be exhaustive, but intuitively any instruments “missed” are more likely clinician-based than patient-reported. In this regard, there is an issue of publication bias and its impact on the internal validity of this work.51 Unpublished studies reporting negative, unfavorable outcomes following instrument testing may exist. The lack of reporting limitations in many of the studies may also reflect a level of reporting bias where there has been a vested interest in instrument promotion. Fourth, we recognize that “all upper extremity trauma conditions” were included as the target category, and findings around methodological quality of measures may vary if the evaluations were performed around specific injuries. Finally, for the 7% of measures whose authors were contacted, inquiries were limited to diagnostic clarification, and no further information was gathered around methodological quality. Thus, it was unclear whether low ratings were due to lack of reporting or actual lack of quality according to COSMIN.
The findings of this review can be considered in light of some of the challenges and solutions in this field.

Timing and recruitment of orthopaedic trauma patients for instrument development, testing, and outcome assessment may be logistically difficult because of a variety of clinical and environmental stressors. Measurement should occur at a time when patients are “stable” enough to perform evaluations while being close enough to the date of injury to fully capture the health-related impact. This can be challenging when considering the effects of symptoms (eg, fracture-related pain) and clinical circumstances (eg, fracture immobilization). These issues may be unavoidable but are best managed through improved patient and staff education as outcome measurement becomes part of everyday orthopaedic practice.

Responder burden plus inefficient and irrelevant testing are further issues, especially when full and lengthy fixed-length outcome measures are administered in these populations. The risks of incomplete scoring, poor patient experience, “gaming” of the assessment for perceived reward, and difficulties in estimating performance while being “out of action,” alongside the tendency to overestimate one's level of ability in these situations, are apparent.52

Instruments primarily developed for chronic conditions may lack adequate coverage and incorporate a set of items too narrow and limited to assess health impact outside this context.53 This tendency may be reflected by high floor-ceiling effects during applications in trauma.3,53 Furthermore, instruments or groups of instruments should adequately cover all relevant health-related domains, including psychosocial factors, which are shown to have a dominant influence on disability.

Tailored assessment of patients is an ongoing challenge within outcome measurement in general. It is particularly relevant in orthopaedic trauma where population characteristics are wide ranging. One component involves the assessment of a patient's baseline status, a particular challenge in trauma situations. Other aspects include the capture of patient experience and emerging concept of patient activation, an individual's level of involvement in his or her care and the propensity to engage in adaptive health behaviors.28,29,54

One or more of these challenges can be met by the introduction and/or mode of application of established and contemporary outcome measurement systems. First, combinations of region-specific and generic measures with instruments measuring specific factors of relevance, such as depression and pain interference, could be applied to more comprehensively assess patient-focused, health-related outcomes. Instrument choice should depend on the psychometric attributes of the measure, methodological quality in development, and evidence of validation in the target population.49 Generic PRO measures have gained popularity in trauma as they provide a more holistic measurement of health-related outcomes in the multiply injured and medically complex patient, while allowing comparisons between interventions.3 Collaborative efforts are underway to develop standardized outcome sets through a consensus-based selection of generic and specific outcome measures.55

Second, abbreviated versions of well-established scales (eg, QuickDASH) have been developed to improve efficiency and performance while maintaining validity against their full-version counterparts.19 Another contemporary solution involves computerized adaptive tests (CATs). CATs are dynamic tests using computers to administer test items based on the IRT mathematical model.56-58 An IRT-based algorithm allows adaptation to the patient's last response and administration of relevant subsequent items from a large question bank.58 The Patient-Reported Outcome Measurement Information System (PROMIS) developed by the US National Institutes of Health is one of the most commonly used CAT systems.56-58 PROMIS CAT scores range from 0 to 100, with 50 points representing the US general population mean. They enable capture of physical (eg, physical function and pain interference), mental (eg, anxiety and depression), and social (eg, social isolation) health domains through modules that can be tailored to the study and population being assessed.56 Customization, avoiding redundancy, minimizing floor-ceiling effects, and maximizing scoring efficiency and measurement precision are clearly advantageous in the trauma setting.56-58 Studies have demonstrated the correlation of CATs with popular fixed-length scales.59
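As a rough illustration of the adaptive mechanism described above, the following Python sketch selects each next item by maximum information at the current ability estimate and re-estimates ability after every response. It rests on several assumptions that are not drawn from this review: a two-parameter logistic (2PL) IRT model with yes/no items, an expected a posteriori estimate under a standard normal prior, a three-item bank with invented parameters, and a T-score convention of mean 50 and standard deviation 10. PROMIS item banks use polytomous items and calibrated parameters, so this should not be read as the PROMIS algorithm itself.

```python
import math

# Ability grid from -4 to 4 used for numerical integration.
GRID = [i / 10.0 for i in range(-40, 41)]


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_estimate(responses):
    """Expected a posteriori ability under a standard normal prior.

    responses: list of ((a, b), answer) pairs, with answer equal to 1 or 0.
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for (a, b), answer in responses:
            p = p_endorse(theta, a, b)
            w *= p if answer == 1 else 1.0 - p
        weights.append(w)
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


def run_cat(bank, answer_fn, max_items=5):
    """Administer up to max_items, always choosing the most informative item.

    bank: dict mapping item name to (a, b) parameters.
    answer_fn: callable taking an item name and returning 1 or 0.
    Returns (theta, t_score, administered item names in order).
    """
    remaining = dict(bank)  # copy so the caller's bank is left unchanged
    responses, administered = [], []
    theta = 0.0
    while remaining and len(responses) < max_items:
        name = max(remaining, key=lambda n: information(theta, *remaining[n]))
        params = remaining.pop(name)
        responses.append((params, answer_fn(name)))
        administered.append(name)
        theta = eap_estimate(responses)
    return theta, 50.0 + 10.0 * theta, administered


# Hypothetical item bank: (discrimination a, difficulty b).
example_bank = {"easy": (1.5, -1.0), "medium": (1.5, 0.0), "hard": (1.5, 1.0)}
theta, t_score, order = run_cat(example_bank, lambda name: 1, max_items=3)
```

A respondent who endorses every item is routed from the medium item toward the harder one, which is the behavior that lets an adaptive test skip items carrying little information for that patient.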

In general, computer-based outcome assessment represents a positive paradigm shift, with instruments such as PROMIS CATs being incorporated in outcome measurement software by well-established organizations such as the AO Foundation.60 It is important to note that PROMIS CATs were originally developed for chronic conditions, and their development as regional measures (eg, PROMIS Upper Extremity Physical Function CAT) and evaluation in traumatic conditions is ongoing.56,61 Other outcome measures with adaptive capabilities, but delivered through a paper-based format, include the FLEX-SF shoulder instrument and short-form PROMIS measures.56,62

Some fixed-length scales have also been designed to provide a more relevant, patient-specific assessment by factoring in patient-reported “levels of ability” and “levels of necessity” in relation to various activities.42 Other instruments have focused on accurate measurement of functional progress and minimizing the discrepancy between what patients report and what they actually do, by clinician-observed grading of enacted activities of daily life in real time.63 PRO measures such as the PFWO and ULFI are designed to capture recall of preinjury performance and baseline function.21,44,45 The former includes a component that accounts for compensatory mechanisms in performing daily activities.44,45 Another strategy in establishing a baseline involves the use of patient proxies, such as family and friends, to aid in recall of preinjury function during the early postinjury phase.64 In terms of patient satisfaction with various health domains, the MHQ includes a component measuring satisfaction with appearance, physical function, and symptoms.41 Patient experience, including satisfaction with care, is often assessed through separate scales, although instruments such as the SRI and MWQ evaluate this aspect in musculoskeletal trauma and wrist/hand injuries, respectively.43,65 Early work on patient activation measures has demonstrated a direct correlation with satisfaction among upper extremity conditions and musculoskeletal trauma patients, as well as improved pain relief, mental health, and reduced disability.66,67 Further research is necessary to assess correlation with PROs.

This work has systematically reviewed the methodological quality of studies involving PRO measurements in upper extremity trauma on a broad scale. Focused evaluations, using the COSMIN checklist, have been conducted in distal radius fractures; however, the literature in this area is lacking overall.49 Instrument properties should be defined for the population being tested and not for the PRO instrument itself.3,13 In reality, PRO measures have been used throughout orthopaedics in patient groups for which the instrument was not initially developed or psychometrically evaluated.3,13 It is unclear whether commonly used instruments can measure all the health-related aspects surrounding upper limb trauma important to the individual. These measures are commonly selected by intuition, clinical culture, and familiarity, with the belief that they are “fit for purpose” and capable of capturing the substantive components of disability experienced by these patients. Ultimately, PRO measurement selection requires careful consideration of the methodological quality, and further research is required to evaluate their psychometric properties in these populations. Reaching a consensus on outcome measurement sets in trauma that are delivered in a standardized fashion will form a more complete, comparable, and interpretable assessment of disability in these populations.68


1. Suk M, Hanson B, Norvell D, Helfet D: AO Handbook. Musculoskeletal Outcomes Measures and Instruments. Selection and Assessment Upper Extremity, ed 1. Davos, Switzerland, Thieme, AO Publishing, 2009, vol 1, pp 65-387.
2. Ayers DC, Bozic KJ: The importance of outcome measurement in orthopaedics. Clin Orthop Relat Res 2013;471:3409-3411.
3. World Health Organization, ICF: International Classification of Functioning, Disability and Health. Accessed on June 13, 2017.
4. Bhandari M, Busse JW, Hanson BP, Leece P, Ayeni OR, Schemitsch EH: Psychological distress and quality of life after orthopedic trauma: An observational study. Can J Surg 2008;51:15-22.
5. Kaske S, Lefering R, Trentzsch H, et al.: Quality of life two years after severe trauma: A single centre evaluation. Injury 2014;45(suppl 3):S100-S105.
6. Nota SPFT, Bot AGJ, Ring D, Kloen P: Disability and depression after orthopaedic trauma. Injury 2015;46:207-212.
7. Levin PE, MacKenzie EJ, Bosse MJ, Greenhouse PK: Improving outcomes: Understanding the psychosocial aspects of the orthopaedic trauma patient. Instr Course Lect 2014;63:39-48.
8. Vranceanu AM, Bachoura A, Weening A, Vrahas M, Smith RM, Ring D: Psychological factors predict disability and pain intensity after skeletal trauma. J Bone Joint Surg Am 2014;96:e20.
9. Das De S, Vranceanu AM, Ring DC: Contribution of kinesophobia and catastrophic thinking to upper-extremity-specific disability. J Bone Joint Surg Am 2013;95:76-81.
10. Bot AGJ, Bekkers S, Arnstein PM, Smith RM, Ring D: Opioid use after fracture surgery correlates with pain intensity and satisfaction with pain relief. Clin Orthop Relat Res 2014;472:2542-2549.
11. De Putter CE, Selles RW, Haagsma JA, et al.: Health-related quality of life after upper extremity injuries and predictors for suboptimal outcome. Injury 2014;45:1752-1758.
12. Clement ND, Duckworth AD, McQueen MM, Court-Brown CM: The outcome of proximal humeral fractures in the elderly: Predictors of mortality and function. Bone Joint J 2014;96B:970-977.
13. Slobogean GP, Noonan VK, Famuyide A, O'Brien PJ: Does objective shoulder impairment explain patient-reported functional outcome? A study of proximal humerus fractures. J Shoulder Elbow Surg 2011;20:267-272.
14. Ring D, Kadzielski J, Fabian L, Zurakowski D, Malhotra LR, Jupiter JB: Self-reported upper extremity health status correlates with depression. J Bone Joint Surg Am 2006;88:1983-1988.
15. Roh YH, Lee BK, Noh JH, Oh JH, Gong HS, Baek GH: Effect of anxiety and catastrophic pain ideation on early recovery after surgery for distal radius fractures. J Hand Surg Am 2014;39:2258-2264.e2.
16. Fitzpatrick R, Davey C, Buxton MJ, Jones DR: Evaluating patient-based outcome measures for use in clinical trials. Health Technol Assess 1998;2:i-iv, 1-74.
17. Terwee CB, Mokkink LB, Knol DL, Ostelo RW, Bouter LM, de Vet HC: Rating the methodological quality in systematic reviews of studies on measurement properties: A scoring system for the COSMIN checklist. Qual Life Res 2012;21:651-657.
18. Marx RG, Bombardier C, Hogg-Johnson S, Wright JG: Clinimetric and psychometric strategies for development of a health measurement scale. J Clin Epidemiol 1999;52:105-111.
    19. Beaton DE, Wright JG, Katz JN; Upper Extremity Collaborative Group: Development of the QuickDASH: Comparison of three item-reduction approaches. J Bone Joint Surg Am 2005;87:1038-1046.
    20. Carroll D: A quantitative test of upper extremity function. J Chronic Dis 1965;18:479-491.
      21. Gabel C, Michener LA, Burkett B, Neller A: The Upper Limb Functional Index: Development and determination of reliability, validity and responsiveness. J Hand Ther 2006;19:328-349.
      22. Michener LA, McClure PW, Sennett BJ: American Shoulder and Elbow Surgeons standardized shoulder assessment form, patient-self-report section: Reliability, validity and responsiveness. J Shoulder Elbow Surg 2002;11:587-594.
        23. Alberta FG, El Attrache NS, Bissell S, Mohr K, Browdy J, Yocum L: The development and validation of a functional assessment tool for the upper extremity in the overhead athlete. Am J Sports Med 2010;38:903-911.
          24. Watson L, Story I, Dalziel R, Hoy G, Shimmin A, Woods D: A new clinical outcome measure of glenohumeral joint instability: The MISS questionnaire. J Shoulder Elbow Surg 2005;14:22-30.
            25. Schmidutz F, Beirer M, Braunstein V, Bogner V, Wiedemann E, Biberthaler P: The Munich Shoulder Questionnaire (MSQ): Development and validation of an effective patient-reported tool for outcome measurement and patient safety in shoulder surgery. Patient Saf Surg 2012;6:9.
              26. Charles E, Kumar V, Blacknall J, Edwards K, Geoghegan JM, Manning PA: A validation of the Nottingham Clavicle Score: A clavicle, acromio-clavicular joint and sterno-clavicular joint specific patient reported outcome measure. BJJ Orthop Proc 2013;95B(supp 1):50.
                27. Dawson J, Fitzpatrick R, Carr A: The assessment of shoulder instability: The development and validation of a questionnaire. J Bone Joint Surg Br 1999;81:420-426.
                  28. Leggin BG, Michener LA, Shaffer MA, Brenneman SK, Iannotti JP, Williams GR Jr: The Penn shoulder score: Reliability and validity. J Orthop Sports Phys Ther 2006;36:138-151.
                  29. Hollinshead RM, Mohtadi NG, Vande Guchte RA, Wadey VM: Two 6-year follow-up studies of large and massive rotator cuff tears: Comparison of outcome measures. J Shoulder Elbow Surg 2000;9:373-379.
                  30. Brophy RH, Beauvais RL, Jones EC, Cordasco FA, Marx RG: Measurement of shoulder activity level. Clin Orthop Relat Res 2005;439:101-108.
                    31. van der Heijden GJ, Leffers P, Bouter LM: Shoulder disability questionnaire design and responsiveness of a functional status measure. J Clin Epidemiol 2000;53:29-38.
                      32. Williams GN, Gangel TJ, Arciero RA, Uhorchak JM, Taylor DC: Comparison of the Single Assessment Numeric Evaluation Method and Two Shoulder Rating Scales: Outcome measures after shoulder surgery. Am J Sports Med 1999;27:214-221.
                        33. Noorani AM, Roberts DJ, Malone AA, Waters TS, Jaggi A, Lambert SM: Validation of the Stanmore percentage of normal shoulder assessment. Int J Shoulder Surg 2012;6:9-14.
                          34. Kohn D, Geyer M: The subjective shoulder rating system. Arch Orthop Trauma Surg 1997;116:324-328.
                            35. Gilbart MK, Gerber C: Comparison of the subjective shoulder value and the Constant Score. J Shoulder Elbow Surg 2007;16:717-721.
                              36. Kirkley A, Alvarez C, Griffin S: The development and evaluation of a disease-specific quality of life questionnaire for disorders of the rotator cuff: The Western Ontario Rotator Cuff Index. Clin J Sport Med 2003;13:84-92.
                                37. Kirkley A, Griffin S, McLintock H, Ng L: The development and evaluation of a disease-specific quality of life measurement tool for shoulder instability: The Western Ontario Shoulder Instability Index (WOSI). Am J Sports Med 1998;26:764-772.
                                  38. Dawson J, Doll H, Boller I, Fitzpatrick R, Little C, Rees J: The development and validation of a patient-reported questionnaire to assess outcomes of elbow surgery. J Bone Joint Surg Br 2008;90:466-473.
                                    39. MacDermid JC: Outcome evaluation in patients with elbow pathology: Issues in instrument development and evaluation. J Hand Ther 2001;14:105-114.
                                      40. Chen CC, Granger CV, Peimer CA, Moy OJ, Wald S: Manual Ability Measure (MAM-16): A preliminary report on a new patient-centred and task-orientated outcome measure of hand function. J Hand Surg Br 2005;30:207-216.
                                        41. Chung KC, Pillsbury MS, Walters MR, Hayward RA: Reliability and validity testing of the Michigan Hand Outcomes Questionnaire. J Hand Surg Am 1998;23:575-587.
                                        42. Seaton MK, Groth GN, Matheson L, Feely C: Reliability and validity of the Milliken Activities of Daily Living Scale. J Occup Rehabil 2005;15:343-351.
                                        43. Beirer M, Serly J, Vester H, Pförringer D, Crönlein M, Deiler S: The Munich Wrist Questionnaire (MWQ): Development and validation of a new patient-reported outcome measurement tool for wrist disorders. BMC Musculoskelet Disord 2016;17:167.
                                        44. Bialocerkowski AE, Grimmer KA, Bain GI: Development of a patient-focused wrist outcome instrument. Hand Clin 2003;19:437-448.
                                        45. Bialocerkowski AE, Grimmer KA, Bain GI: Validity of the patient focused wrist outcome instrument: Do impairments represent functional ability? Hand Clin 2003;19:449-455.
                                        46. MacDermid JC: Responsiveness of the Disability of the Arm, Shoulder and Hand (DASH) and Patient-Rated Wrist/Hand Evaluation (PRWHE) in evaluating change after hand therapy. J Hand Ther 2004;17:18-23.
                                          47. MacDermid JC: Development of a scale for patient rating of wrist pain and disability. J Hand Ther 1996;9:178-183.
                                            48. Bhandari M, Petrisor B, Schemitsch E: Outcome measurements in orthopaedics. Indian J Orthop 2007;41:32-36.
                                            49. Kleinlugtenbelt YV, Nienhuis RW, Bhandari M, Goslings JC, Poolman RW, Scholtes VAB: Are validated outcome measures used in distal radial fractures truly valid? Bone Joint Res 2016;5:153-161.
                                            50. Horwitz DS, Richard RD, Suk M: The reporting of functional outcome instruments in the Journal of Orthopaedic Trauma over a 5-year period. J Orthop Trauma 2014;28:2-5.
                                            51. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR: Publication bias in clinical research. Lancet 1991;337:867-872.
                                            52. Streiner DL, Norman GR, Cairney J: Reliability (Chapter 8), in Streiner DL, Norman GR, Cairney J, eds: Health Measurement Scales: A Practical Guide to Their Development and Use, ed 5. Oxford, UK, Oxford University Press, 2015, pp 159-196.
                                            53. Hung M, Stuart AR, Higgins TF, Saltzman CL, Kubiak EN: Computerized adaptive testing using the PROMIS physical function item bank reduces test burden with less ceiling effects compared to the short musculoskeletal function assessment in orthopaedic trauma patients. J Orthop Trauma 2014;28:439-443.
                                            54. Hibbard JH, Stockard J, Mahoney ER, Tusler M: Development of the Patient Activation Measure (PAM): Conceptualizing and measuring activation in patients and consumers. Health Serv Res 2004;39(4 Pt 1):1005-1026.
                                            55. International Consortium for Health Outcomes Measurement (ICHOM). Accessed June 13, 2017.
                                            56. PROMIS Health Organization, PROMIS Cooperative Group. PROMIS Instrument Development and Validation Scientific Standards Version 2.0. 2012 (Revised May 2013). Accessed July 20, 2017.
                                            57. Revicki DA, Cella DF: Health status assessment for the twenty-first century: Item response theory, item banking and computer adaptive testing. Qual Life Res 1997;6:595-600.
                                            58. Hung M, Clegg DO, Greene T, Saltzman CL: Evaluation of the PROMIS physical function item bank in orthopaedic patients. J Orthop Res 2011;29:947-953.
                                            59. Döring AC, Nota SP, Hageman MG, Ring DC: Measurement of upper extremity disability using the patient-reported outcomes measurement information system. J Hand Surg Am 2014;39:1160-1165.
                                            60. AO Trauma: Accessed July 20, 2017.
                                            61. Hays RD, Spritzer KL, Amtmann D, Lai JS, Dewitt EM, Rothrock N: Upper-extremity and Mobility Subdomains from the Patient-Reported Outcomes Measurement Information System (PROMIS) adult physical functioning item bank. Arch Phys Med Rehabil 2013;94:2291-2296.
                                            62. Cook KF, Roddey TS, Gartsman GM, Olson SL: Development and psychometric evaluation of the Flexilevel Scale of Shoulder Function. Med Care 2003;41:823-835.
                                            63. van de Water AT, Davidson M, Shields N, Evans MC, Taylor NF: The shoulder function index (SFInX): A clinician-observed outcome measure for people with a proximal humeral fracture. BMC Musculoskelet Disord 2015;16:31.
                                            64. Stuart AR, Higgins TF, Hung M, Weir CR, Kubiak EN, Rothberg DL: Reliability in measuring pre-injury physical function in orthopaedic trauma. J Orthop Trauma 2015;29:527-532.
                                            65. Walton DM, MacDermid JC, Pulickal M, Rollack A, Veitch J: Development and initial validation of the Satisfaction and Recovery Index (SRI) for measurement of recovery from musculoskeletal trauma. Open Orthop J 2014;30:316-325.
                                            66. Knutsen EJ, Paryavi E, Castillo RC, O'Toole RV: Is satisfaction among orthopaedic trauma patients predicted by depression and activation levels? J Orthop Trauma 2015;29:e183-e187.
                                            67. Gruber JS, Hageman M, Neuhaus V, Mudgal CS, Jupiter JB, Ring D: Patient activation and disability in upper extremity illness. J Hand Surg Am 2014;39:1378-1383.
                                            68. Porter ME, Larsson S: Standardizing patient outcomes measurement. N Engl J Med 2016;374:504-506.

                                            Appendix 1 PRISMA 2009 Checklist

                                            Supplemental Digital Content

                                            Copyright © 2017 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the American Academy of Orthopaedic Surgeons.