Patient-reported outcome measures (PROMs) have been increasingly adopted in orthopaedics to capture patient-centered indicators of health status for use in clinical care, research, and cost-effectiveness analysis. Numerous PROMs are administered regularly in clinical practices, and many subspecialty communities have identified instruments relevant to their patient populations, standardizing their use through registries or multicenter research groups.
Despite the meaningful advances in patient-centered care made possible with PROMs, substantial barriers remain to their adoption and standardization. Among the most frequently discussed concerns are questions on how reliably and precisely an instrument captures the outcome of interest, how best to compare outcome scores between different patient populations, and how to minimize the patient and administrative burden associated with administering PROMs. To improve patient care, these surveys must be useful to physicians and patients, ideally contributing to the process of shared decision making. For PROM use in clinical practice to become widespread, the accuracy and ease of use of the survey are paramount.
In 2004, a group of scientists, statisticians, and psychometricians from across the United States received funding from the National Institutes of Health as part of the Roadmap for Medical Research to develop a new system of PROMs for use in clinical care and medical research.1 From this major effort came the Patient-Reported Outcomes Measurement Information System (PROMIS), which was established to improve the reporting of patient symptoms, function, and health-related quality of life in an efficient and precise manner. PROMIS was designed specifically to overcome many of the barriers to adoption faced by legacy measures (eg, narrow scope, administrative burden) through the development of PROMs that are publicly available, efficient, precise, and flexible. Several of these measures have been validated in patient populations with specific orthopaedic conditions. We believe PROMIS represents a new paradigm for measurement of patient-reported outcomes in orthopaedics and in other fields of medicine.
PROMIS was developed with attention to the psychometric characteristics that would make it maximally precise, reliable, and versatile. PROMIS consists of item banks organized into domains of health, such as physical health and mental health (Table 1). These item banks were developed using item response theory (IRT), a contemporary process of test development that ensures that each individual item (or question) is reliable, of value to the whole, and precisely placed along the continuum of the trait being tested.2 IRT helps to make PROMIS maximally unidimensional, that is, more specific for the domain being tested and less influenced by comorbidities in other health domains than many older measures. In IRT, items can be added if coverage of the domain is found to be lacking, while still maintaining overall measurement validity.
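The idea that each item is "precisely placed along the continuum of the trait" can be illustrated with a short sketch. The example below assumes a simplified two-parameter logistic model for yes/no items; PROMIS item banks use multi-category response options, and the item names and parameter values here are hypothetical rather than actual PROMIS calibrations.

```python
import math

def prob_endorse(theta, a, b):
    """Two-parameter logistic item response function.
    theta: respondent's trait level; a: item discrimination; b: item location."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Information the item contributes at trait level theta (2PL form)."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical items (illustrative values only, not PROMIS calibrations)
items = {"easy_item": (1.5, -1.0), "hard_item": (1.5, 1.5)}

for name, (a, b) in items.items():
    for theta in (-1.0, 0.0, 1.5):
        p = prob_endorse(theta, a, b)
        info = item_information(theta, a, b)
        print(f"{name}: theta={theta:+.1f}  P(endorse)={p:.2f}  information={info:.2f}")
```

Under this model, an item is most informative for respondents whose trait level lies near the item's location, which is why a bank that spans the full continuum can cover respondents at both extremes.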
Classical test theory limits older measures to being administered as static surveys, whereas IRT enables PROMIS measures to be more flexible. PROMIS measures can be administered as static “short forms” similar to those available for other measures, but also can be administered dynamically on a computer.2 Administration via computer involves a highly accurate computerized format called computerized adaptive testing (CAT), in which a computer algorithm customizes item delivery to an examinee by selecting each subsequent item based on answers to previous items. By dynamically refining its estimate of the patient’s true condition, CAT software enables a high level of precision using fewer questions.
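The adaptive loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the PROMIS algorithm: it uses yes/no items with a two-parameter logistic model, a standard normal prior, a grid-based posterior mean as the score, maximum-information item selection, and a stopping rule based on a hypothetical standard error target. The item bank and the simulated patient are invented for the example.

```python
import math

GRID = [g / 10.0 for g in range(-40, 41)]  # candidate trait levels, -4.0 to 4.0

def p_endorse(theta, a, b):
    """Two-parameter logistic response probability."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_info(theta, item):
    """Information of an item (a, b) at trait level theta."""
    a, b = item
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

def eap(responses):
    """Posterior mean and SD of theta given [((a, b), answer), ...]."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # standard normal prior (unnormalized)
        for (a, b), answer in responses:
            p = p_endorse(theta, a, b)
            w *= p if answer == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer_fn, se_target=0.4, max_items=12):
    """Administer items one at a time until the estimate is precise enough."""
    remaining = list(bank)
    responses = []
    theta, se = eap(responses)  # before any answers: prior mean and SD
    while remaining and len(responses) < max_items and se > se_target:
        # Choose the unused item that is most informative at the current estimate
        item = max(remaining, key=lambda it: item_info(theta, it))
        remaining.remove(item)
        responses.append((item, answer_fn(item)))
        theta, se = eap(responses)  # refine the estimate after each answer
    return theta, se, len(responses)

# Hypothetical bank: 25 items with equal discrimination, locations from -3 to 3
bank = [(2.0, k * 0.25) for k in range(-12, 13)]

TRUE_THETA = 1.0  # simulated patient's trait level (assumption for the demo)

def simulated_patient(item):
    """Deterministic respondent: endorses items located below the true level."""
    _, b = item
    return 1 if TRUE_THETA > b else 0

theta_hat, se, n_used = run_cat(bank, simulated_patient)
print(f"estimate={theta_hat:.2f}, SE={se:.2f}, items used={n_used} of {len(bank)}")
```

Because each item is chosen where it is most informative for the current estimate, the loop typically stops after a small fraction of the bank, which is the source of the reduced respondent burden.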
CAT is a key advantage of PROMIS that distinguishes it from many conventional measures. Assessment of a respondent’s physical function with PROMIS Physical Function CAT has been shown to be as accurate as or more accurate than conventional measures used to assess function in patient populations with trauma injuries or disorders of the foot and ankle, upper extremity, or spine, and requires less time to administer than conventional measures, such as the shortened version of the Disabilities of the Arm, Shoulder, and Hand (QuickDASH) questionnaire, and the Foot and Ankle Ability Measure (FAAM) Activities of Daily Living Subscale.3-8 In addition to the benefits (in terms of patient experience) of reducing the burden associated with survey administration, CAT may be more accurate than conventional outcome measures because longer questionnaires used to measure outcomes tend to be less reliable as patients tire, lose focus, and consider their answers less thoroughly.9,10
Another strength of PROMIS that surpasses previously available measures is the use of a T-score as the output. The scores of all PROMIS domains are normalized to the general population, with the mean set to a score of 50, and the standard deviation set to 10 points. This scoring system is immediately understandable, even if the clinician is unfamiliar with the measure. Thus, practitioners have little problem interpreting scores from any PROMIS measure.
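The T-score metric is a linear rescaling of a standardized score, so that a value of 40 lies 1 standard deviation below the reference population mean. The sketch below shows the arithmetic; the percentile conversion additionally assumes that scores are normally distributed in the reference population, which is an assumption of this example and not a claim from the studies cited here.

```python
import math

def z_to_t(z):
    """Convert a standardized score to the T-score metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * z

def t_to_z(t):
    """Number of reference-population standard deviations from the mean."""
    return (t - 50.0) / 10.0

def t_to_percentile(t):
    """Approximate percentile, assuming a normal reference distribution."""
    z = t_to_z(t)
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for t in (30, 40, 50, 60):
    print(f"T={t}: {t_to_z(t):+.1f} SD from the mean, "
          f"about the {t_to_percentile(t):.0f}th percentile")
```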
The Use of PROMIS in Specific Populations
PROMIS has been compared against conventional general health and disease-specific PROMs used in orthopaedic practice and regularly has been found to improve coverage of the relevant health domain, increase reliability, and reduce respondent and administrative burden when administered via CAT forms, thereby improving the evaluation of quality-of-life and health status.3-8,11-17 Thus far, this advantage has been studied in patients presenting with orthopaedic disorders of the foot and ankle,6,7,11,13 upper extremity,3,4,15 and spine,8,14 as well as in those with traumatic5,17 and sports-related injuries treated in an outpatient setting.16 Responsiveness to change in these patient populations requires further exploration.
To date, the PROMIS Physical Function domain is the most thoroughly studied health domain in patients with musculoskeletal disorders. The psychometric properties of the complete, 124-item Physical Function item bank have been assessed in patient populations with orthopaedic conditions of the foot and ankle, lower extremity, upper extremity, and spine.12,14 In each case, the whole item bank performed well, with adequate unidimensionality (ie, low unexplained variance), excellent coverage (ie, low ceiling or floor effects), and excellent reliability (ie, reproducible item and person ordering). Following the measurement of these psychometric properties, PROMIS CAT-enabled instruments have been compared with legacy instruments in patient populations with disorders of the foot and ankle,6,7 hand and upper extremity,3,4,15 and spine.8 These validated legacy instruments include the Oswestry Disability Index (ODI), the FAAM, the Foot Function Index (FFI), and the DASH instrument.
Recent literature comparing PROMIS with traditionally used outcome measures in patient populations with various orthopaedic conditions has focused particularly on the degree of correlation among these types of outcome measures as well as on comparisons of respondent and administrative burden, coverage, and unidimensionality. High correlation is taken as evidence that similar underlying traits are being assessed and indicates that PROMIS could effectively replace a corresponding legacy measure.
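For readers less familiar with the statistic, the correlation coefficient used in these comparisons can be computed as sketched below. The paired scores are hypothetical and serve only to show the calculation. The example also illustrates why some reported coefficients are negative: when a legacy instrument is scored so that higher values indicate greater disability, it runs in the opposite direction to a function score, and a strong association appears as a negative coefficient.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired scores for eight patients (not study data)
function_t_scores = [32, 36, 41, 44, 47, 50, 55, 58]  # higher = better function
disability_scores = [70, 66, 48, 52, 35, 30, 18, 12]  # higher = more disability

r = pearson_r(function_t_scores, disability_scores)
print(f"r = {r:.2f}")  # negative sign reflects opposite scoring directions
```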
Foot and Ankle
PROMIS has compared favorably with conventional instruments in two recent studies of patient populations with orthopaedic foot and ankle conditions. Hung et al7 compared PROMIS with the FFI and the FAAM. In this study, 287 patients took the PROMIS Lower Extremity CAT in conjunction with the FFI and FAAM Sports module in random order on a tablet computer. The correlation between the PROMIS instrument and the legacy instruments was strong: 0.61 and 0.60 for the comparison of the Lower Extremity CAT to the FAAM Sports Subscale and the FFI, respectively (P < 0.01 for both correlations). The Lower Extremity CAT demonstrated a reduced time requirement, reduced floor effects, and a slightly reduced ceiling effect compared with the FAAM Sports module and the FFI.
The PROMIS Physical Function CAT then was compared with the FAAM Activities of Daily Living Subscale and the FFI 5-point verbal rating scale (FFI-5pt).6 A total of 311 patients undergoing elective surgery at one of 10 clinical sites in the American Orthopaedic Foot and Ankle Society’s Orthopaedic Foot and Ankle Outcomes Research Network completed these three surveys in an online portal. The Pearson correlations of the Physical Function CAT to the FAAM Activities of Daily Living Subscale and the FFI-5pt were 0.79 and 0.69, respectively. The Physical Function CAT also demonstrated greater or equivalent unidimensionality and reliability compared with the FAAM Activities of Daily Living Subscale and the FFI-5pt in this cohort. Each instrument was responsive to change, but notably, the Physical Function CAT and the FAAM Activities of Daily Living Subscale measured improved physical function with surgery, whereas the FFI-5pt indicated a slight deterioration in physical function with surgery. These results may have stemmed in part from the reduced unidimensionality exhibited by the FFI-5pt, indicating measurement of more than one underlying construct.
Upper Extremity
Three separate studies recently have shown that PROMIS instruments compare favorably with conventional instruments in assessing patients with disorders of the upper extremity. Tyser et al4 conducted a study of 134 patients presenting with disorders of the upper extremity who completed the Physical Function CAT and the full 30-item DASH score on a handheld tablet. The instruments showed good correlation (r = 0.726; P < 0.001). The researchers also examined the psychometric properties in this cohort, and both instruments demonstrated excellent reliability and unidimensionality, but the Physical Function CAT exhibited greater coverage. The DASH instrument exhibited slight ceiling and floor effects of 4% and 1%, respectively, whereas the Physical Function CAT showed no ceiling or floor effects. This result is remarkable, given the lack of specific focus of the physical function domain on the upper extremity.
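Ceiling and floor effects of the kind reported here are conventionally expressed as the percentage of respondents who score at the best or worst value the instrument allows. The sketch below shows that calculation on a hypothetical set of scores from an instrument with a fixed range of 0 to 100; the function name and the data are illustrative and are not drawn from the cited studies.

```python
def floor_ceiling_pct(scores, scale_min, scale_max):
    """Percentage of respondents at the lowest and highest possible scores."""
    n = len(scores)
    at_floor = sum(1 for s in scores if s == scale_min)
    at_ceiling = sum(1 for s in scores if s == scale_max)
    return 100.0 * at_floor / n, 100.0 * at_ceiling / n

# Hypothetical scores on a 0-100 instrument (not study data)
scores = [0, 12, 25, 40, 55, 63, 78, 90, 100, 100]

floor_pct, ceiling_pct = floor_ceiling_pct(scores, 0, 100)
print(f"floor effect: {floor_pct:.0f}%, ceiling effect: {ceiling_pct:.0f}%")
```

Respondents who sit at either limit cannot register further decline or improvement, which is why a smaller percentage at each extreme is described as better coverage.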
Döring et al3 administered the more focused Physical Function Upper Extremity CAT; the shortened QuickDASH measure; and the PROMIS Physical Function CAT, Physical Function Mobility CAT, and Pain Interference CAT to 84 patients presenting with orthopaedic conditions of the hand or upper extremity in an outpatient setting. The PROMIS Physical Function Upper Extremity CAT demonstrated high correlation with the QuickDASH (β = −0.81; P < 0.001). No floor or ceiling effects were observed for the PROMIS instrument; the QuickDASH measure showed no ceiling effect but a slight floor effect (one patient scored 0). Interestingly, the PROMIS Pain Interference scores correlated with the Physical Function Upper Extremity CAT and the QuickDASH scores, suggesting a connection between the underlying constructs that warrants further investigation.
Overbeek et al15 conducted a similarly designed study at the same center, this time administering the more general PROMIS Physical Function CAT, the QuickDASH, the PROMIS Pain Interference CAT, and the PROMIS Depression CAT to 93 patients at a hand surgery clinic. The Physical Function CAT and the QuickDASH measure exhibited moderate correlation with each other (r = −0.55; P < 0.001) as well as moderate to high correlation with the pain interference and depression metrics, again indicating the association of psychosocial distress with physical function in a patient population with hand and upper extremity orthopaedic conditions.
Spine
The enhanced psychometric properties of PROMIS are particularly beneficial for assessing outcomes in patients with spine conditions because suboptimal unidimensionality and large floor effects have been demonstrated in two widely used conventional instruments, the Neck Disability Index and the ODI.18,19 In a recent study, a large cohort of patients who underwent outpatient spine procedures completed the Physical Function CAT measure and two conventional measures, the ODI and the Medical Outcomes Study 36-Item Short Form Physical Functioning Scale.8 The PROMIS Physical Function CAT was found to have considerably better coverage than either legacy scale and required far less time for patients to complete. It also exhibited strong correlation with the ODI and the Physical Functioning Scale of the Medical Outcomes Study 36-Item Short Form questionnaire, resulting in Pearson correlation values of 0.73 and 0.78, respectively (P < 0.0001). These correlations supported the development of equations to link values from the legacy scores to the Physical Function CAT scores, allowing practices to leverage previous work in a PROMIS CAT-enabled future.
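The general idea of a linking equation can be illustrated with a simple linear fit. The sketch below is not the published conversion and does not reproduce its coefficients or method; it assumes ordinary least squares regression as one plausible approach, and the paired legacy and T-score values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired scores from patients who completed both instruments
legacy_disability = [10, 20, 30, 40, 50, 60, 70]  # 0-100, higher = worse
function_t_score = [52, 48, 45, 40, 37, 33, 30]   # T-score, higher = better

slope, intercept = fit_line(legacy_disability, function_t_score)

def legacy_to_t(legacy_score):
    """Estimate a T-score from a legacy score using the fitted line."""
    return intercept + slope * legacy_score

print(f"estimated T-score = {intercept:.1f} + ({slope:.2f}) x legacy score")
print(f"legacy score 35 -> estimated T-score {legacy_to_t(35):.1f}")
```

A conversion of this kind estimates group-level values more reliably than individual scores, because the scatter around the fitted line carries over to every converted score.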
Other Patient Populations
Preliminary investigations have examined the use and efficacy of the PROMIS instruments following anterior cruciate ligament (ACL) reconstruction and orthopaedic trauma. Papuga et al16 compared the Physical Function CAT instrument with the International Knee Documentation Committee (IKDC) scale, a conventional PROM used in patients following ACL reconstruction. In this study, both PROMs were also compared with data on gait pathology produced by an instrumented carpet designed for gait analysis. A total of 106 patients completed these assessments preoperatively and at 3, 10, 20, and 52 weeks after ACL reconstruction; the Physical Function CAT scores correlated highly with the IKDC scores across all time points (combined correlation r = 0.90, P < 0.001). The Physical Function CAT instrument was more responsive to change than was the IKDC, revealing the slight reduction in function measured by the instrumented carpet 3 weeks postoperatively that the IKDC scale was not able to capture. Receiver operating characteristic curve analysis further demonstrated that the Physical Function CAT scores at every time point, including at baseline, were diagnostic in this cohort for predicting poor outcomes, such as latent complications or subsequent interventions after ACL reconstruction surgery. This final result raises the question of whether PROMIS could be used to stratify risk in this population, a question that warrants further investigation.
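The area under a receiver operating characteristic curve has a simple interpretation that can be computed directly: it is the probability that a randomly selected patient with a poor outcome has a lower score than a randomly selected patient with a good outcome. The sketch below uses this pairwise definition on hypothetical scores; the groups, values, and function name are illustrative and are not data from the cited study.

```python
def auc_lower_is_worse(poor_outcome_scores, good_outcome_scores):
    """Area under the ROC curve via pairwise comparison.
    Counts pairs in which the poor-outcome patient scored lower;
    ties count as half."""
    wins = 0.0
    for poor in poor_outcome_scores:
        for good in good_outcome_scores:
            if poor < good:
                wins += 1.0
            elif poor == good:
                wins += 0.5
    return wins / (len(poor_outcome_scores) * len(good_outcome_scores))

# Hypothetical function T-scores (not study data)
poor_outcome = [38, 41, 44, 47]
good_outcome = [43, 48, 50, 52, 55]

print(auc_lower_is_worse(poor_outcome, good_outcome))  # prints 0.9
```

A value of 0.5 indicates no discrimination between the groups, and a value of 1.0 indicates that every poor-outcome patient scored lower than every good-outcome patient.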
In an orthopaedic trauma clinic, the Physical Function CAT instrument was compared with the psychometrically well-designed preferred instrument known as the Short Musculoskeletal Functional Assessment (sMFA) questionnaire.5 Both instruments demonstrated high internal measurement consistency in this population with no floor effects. The Physical Function CAT had no ceiling effect, but the sMFA exhibited a 14.4% ceiling effect. The Rasch Partial Credit model, a one-parameter IRT model, demonstrated good measurement quality and strong support of unidimensionality for the Physical Function CAT in this population, whereas the sMFA was only moderately unidimensional.
In another investigation of a patient population with traumatic orthopaedic injuries, Morgan et al17 evaluated 47 patients with proximal humerus fractures, comparing the PROMIS Physical Function CAT instrument with the DASH, the sMFA, and the Constant Shoulder Score. The Physical Function CAT measure exhibited moderate to high correlation with legacy measures, with absolute values of Spearman rank correlation coefficients from 0.52 to 0.81 (P < 0.001), and had reduced respondent and/or administrative burden compared with all legacy measures. The Physical Function CAT also had a reduced ceiling effect compared with the sMFA and the DASH score in this population.
Taken together, these results demonstrate that PROMIS reliably captures patient-reported health outcomes in patient populations treated for orthopaedic foot and ankle, upper extremity, and spine conditions. Preliminary results also are promising in sports medicine and trauma populations. The PROMIS CAT instruments are well correlated with conventional instruments, indicating a measurement of similar underlying traits, and have improved efficiency and broader coverage. The main limitations of the PROMIS instruments are the limited number of studies comparing and validating these measures and their consequent lack of widespread acceptance. However, these drawbacks likely will be resolved soon, given the accelerating pace of research and the adoption of PROMIS instruments.
Summary
The PROMIS team of developers created a robust and dynamic family of patient-reported outcomes tools that stand to improve orthopaedic care and comparative effectiveness analysis. These tools are based on the underlying concepts of IRT, which allow flexibility, enhance accuracy, and improve efficiency through the use of CAT. The PROMIS Physical Function domain, which can be administered through static short forms or through CAT, is a meaningful innovation in orthopaedic outcomes measurement. The Physical Function CAT has demonstrated improved coverage, reliability, and validity, as well as a reduced respondent and administrative burden compared with conventional or legacy measures in patient populations treated for foot and ankle, upper extremity, and spine conditions. Further investigation will be needed to evaluate all major orthopaedic populations, to determine the responsiveness to change in health status over time, and to ascertain clinically meaningful differences in scores for each population.
Evidence-based Medicine: Levels of evidence are described in the table of contents. In this article, reference 9 is a level I study. Reference 10 is a level II study. References 1-8 and 11-19 are level III studies.
1. Cella D, Riley W, Stone A, et al; PROMIS Cooperative Group: The Patient-Reported Outcomes Measurement Information System (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005-2008. J Clin Epidemiol 2010;63(11):1179-1194. PMID: 20685078
2. Fries JF, Witter J, Rose M, Cella D, Khanna D, Morgan-DeWitt E: Item response theory, computerized adaptive testing, and PROMIS: Assessment of physical function. J Rheumatol 2014;41(1):153-158. PMID: 24241485
3. Döring AC, Nota SP, Hageman MG, Ring DC: Measurement of upper extremity disability using the Patient-Reported Outcomes Measurement Information System. J Hand Surg Am 2014;39(6):1160-1165. PMID: 24799143
4. Tyser AR, Beckmann J, Franklin JD, et al: Evaluation of the PROMIS physical function computer adaptive test in the upper extremity. J Hand Surg Am 2014;39(10):2047-2051.e4. PMID: 25135249
5. Hung M, Stuart AR, Higgins TF, Saltzman CL, Kubiak EN: Computerized adaptive testing using the PROMIS Physical Function item bank reduces test burden with less ceiling effects compared with the Short Musculoskeletal Function Assessment in orthopaedic trauma patients. J Orthop Trauma 2014;28(8):439-443. PMID: 24378399
6. Hung M, Baumhauer JF, Brodsky JW, et al; Orthopaedic Foot and Ankle Outcomes Research (OFAR) of the American Orthopaedic Foot and Ankle Society (AOFAS): Psychometric comparison of the PROMIS Physical Function CAT with the FAAM and FFI for measuring patient-reported outcomes. Foot Ankle Int 2014;35(6):592-599. PMID: 24677217
7. Hung M, Nickisch F, Beals TC, Greene T, Clegg DO, Saltzman CL: New paradigm for patient-reported outcomes assessment in foot & ankle research: Computerized adaptive testing. Foot Ankle Int 2012;33(8):621-626. PMID: 22995227
8. Brodke DS, Lawrence BD, Spiker WR, Neese AM, Hung M: Converting ODI or SF-36 Physical Function domain scores to a PROMIS PF Score. Spine J 2014;14(suppl):S50.
9. Sahlqvist S, Song Y, Bull F, Adams E, Preston J, Ogilvie D; iConnect consortium: Effect of questionnaire length, personalisation and reminder type on response rate to a complex postal survey: Randomised controlled trial. BMC Med Res Methodol 2011;11:62. PMID: 21548947
10. Jepson C, Asch DA, Hershey JC, Ubel PA: In a mailed physician survey, questionnaire length had a threshold effect on response rate. J Clin Epidemiol 2005;58(1):103-105. PMID: 15649678
11. Hung M, Baumhauer JF, Latt LD, Saltzman CL, SooHoo NF, Hunt KJ; National Orthopaedic Foot and Ankle Outcomes Research Network: Validation of PROMIS® Physical Function computerized adaptive tests for orthopaedic foot and ankle outcome research. Clin Orthop Relat Res 2013;471(11):3466-3474. PMID: 23749433
12. Hung M, Clegg DO, Greene T, Saltzman CL: Evaluation of the PROMIS physical function item bank in orthopaedic patients. J Orthop Res 2011;29(6):947-953. PMID: 21437962
13. Hung M, Franklin JD, Hon SD, Cheng C, Conrad J, Saltzman CL: Time for a paradigm shift with computerized adaptive testing of general physical function outcomes measurements. Foot Ankle Int 2014;35(1):1-7. PMID: 24101733
14. Hung M, Hon SD, Franklin JD, et al: Psychometric properties of the PROMIS physical function item bank in patients with spinal disorders. Spine (Phila Pa 1976) 2014;39(2):158-163. PMID: 24173018
15. Overbeek CL, Nota SP, Jayakumar P, Hageman MG, Ring D: The PROMIS physical function correlates with the QuickDASH in patients with upper extremity illness. Clin Orthop Relat Res 2015;473(1):311-317. PMID: 25099262
16. Papuga MO, Beck CA, Kates SL, Schwarz EM, Maloney MD: Validation of GAITRite and PROMIS as high-throughput physical function outcome measures following ACL reconstruction. J Orthop Res 2014;32(6):793-801. PMID: 24532421
17. Morgan JH, Kallen MA, Okike K, Lee OC, Vrahas MS: PROMIS Physical Function computer adaptive test compared with other upper extremity outcome measures in the evaluation of proximal humerus fractures in patients older than 60 years. J Orthop Trauma 2015;29(6):257-263. PMID: 26001348
18. Brodke DS, Annis P, Lawrence BD, Spiker WR, Neese A, Hung M: Oswestry Disability Index: A psychometric analysis with 1610 patients. Spine J 2014;14:S49.
19. Hung M, Cheng C, Hon SD, et al: Challenging the norm: Further psychometric investigation of the neck disability index. Spine J 2015;15(11):2440-2445. PMID: 24662211