Readability of Orthopaedic Patient-reported Outcome Measures: Is There a Fundamental Failure to Communicate? : Clinical Orthopaedics and Related Research®

Clinical Research

Readability of Orthopaedic Patient-reported Outcome Measures: Is There a Fundamental Failure to Communicate?

Perez, Jorge L. MD1; Mosher, Zachary A. BS1; Watson, Shawna L. BA1; Sheppard, Evan D. MD1; Brabston, Eugene W. MD1; McGwin, Gerald Jr. MS, PhD1; Ponce, Brent A. MD1,a

Clinical Orthopaedics and Related Research 475(8):p 1936-1947, August 2017. | DOI: 10.1007/s11999-017-5339-0



The CDC defines “health literacy” as “the degree to which an individual has the capacity to obtain, communicate, process, and understand basic health information” [127]. Although correlation does not equate to causation, previous studies have shown associations between health literacy and healthcare outcomes, noting that poor health literacy correlates with increased healthcare cost, hospitalization rates, and mortality [35, 95, 118, 119]. Accordingly, the CDC encourages physicians to maximize health literacy by adapting patient materials to the level of knowledge of the intended audience [99]. For written materials, the American Medical Association (AMA) suggests writing at or below the 6th grade reading level; similarly, the NIH suggests writing at the 7th or 8th grade reading level [128, 135]. This “readability” can be estimated using various validated formulas that incorporate factors such as sentence length, word count, and word complexity. While studies have shown that patient-directed materials, such as consent forms, educational materials, or discharge summaries, often are written above suggested reading levels [7, 24, 123, 135], few studies have analyzed materials used by patients to self-report their health status [3, 38]. Known as “patient-reported outcome measures” (PROMs), these questionnaires are used clinically and in research to quantify patients’ perceptions of their conditions, functional abilities, baseline health status, treatment success, and physician competency [6, 8, 18, 103, 108].

Validation of PROMs is important to ensure that they measure the endpoints of interest accurately and reproducibly. Elements such as reliability, consistency, content/construct validity, and sensitivity to change are often considered part of this validation process; however, readability is seldom mentioned as a factor in validation [11, 88, 97, 110]. Thus, while a PROM may have the elements of a well-designed survey, low readability could impair its practical value and clinical utility. However, readability is not synonymous with comprehension, as comprehension is a multifaceted concept that depends on both aesthetic presentation and academic content. Thus, while patient materials may be deemed “highly readable,” if they are presented with poor font or color choices, the reader may have difficulty comprehending the content. Furthermore, with diverse knowledge levels among patients, the level of comprehension is unique to every patient, regardless of the readability level of the PROM content. Even so, investigating the readability of patient materials offers practitioners a sense of whether broad, diverse populations of patients are likely to be able to use these tools in real-world practice.

Two studies exist regarding the readability of orthopaedic-specific PROMs [3, 38], and both are limited in their scope of PROM selection and by their heavy reliance on a single readability measurement test, the Flesch Reading Ease [3, 38]. While the Flesch Reading Ease is one of the most commonly used readability scores in health literature, its continued utility with modern syntax has been called into question as newer, more broadly applicable readability measures have been developed [130]. In the absence of a single, accepted, validated readability measure for healthcare materials, the use of a lone readability measure could skew the results of these studies. To account for this, one systematic review supports using multiple readability measures to evaluate a passage [45]. Additionally, prior studies have not assessed whether PROMs with higher numbers of questions are written at higher reading levels. With trends toward increasingly brief surveys [37], the question arises whether poor readability is a particular issue in longer surveys. Finally, if PROMs are written at too high a reading level, an effort must be made to improve their readability to warrant their continued clinical use. While such improvement has been demonstrated for patient education materials [52, 115], it has not been attempted for PROMs. Therefore, questions arise regarding whether the same techniques used to improve patient education materials are also applicable to PROMs, and whether edited versions of PROMs retain their prior reliability and validation.

We therefore asked: (1) What proportion of orthopaedic-related PROMs and orthopaedic-related portions of the NIH Patient Reported Outcomes Measurement Information System (PROMIS®) are written at or below the 6th and 8th grade levels? (2) Is there a correlation between the number of questions in the PROM and reading level? (3) Using systematic edits based on guidelines from Centers for Medicare and Medicaid Services (CMS) [90], what proportion of PROMs achieved NIH-recommended reading levels?

Materials and Methods

Selection of Patient-Reported Outcomes Instrument List

A PubMed search was conducted to identify an inclusive list of orthopaedic-associated PROMs. The most relevant article identified, “Are patient-reported outcome measures in orthopaedics easily read by patients?” included a list of 59 PROMs [38]. We supplemented this list with orthopaedic PROMs from the following resources: (1) “Guide to outcomes instruments for musculoskeletal trauma research,” if they were specified as “patient” reported, and not “combined” or “physician” reported [1]; (2) the American Academy of Orthopaedic Surgeons’ (AAOS) website [5]; and (3) the quick-link Orthopaedic Scores website [74].

Preparation of Patient-Reported Outcomes Documents

A total of 86 independent PROMs were identified for inclusion and obtained in their published form (ie, as original journal publications or via the authors’ respective websites). These PROMs were grouped as follows: general health/musculoskeletal/pain status (15) (Table 1), upper extremity (21) (Table 2), lower extremity (41) (Table 3), and spine (nine) (Table 4). In addition, four PROMIS® Adult Short Forms and one investigator-compiled “PROMIS® Bank” consisting of questions from 11 relevant PROMIS® Adult Item Banks were assessed (Table 5). Individual PROMs and PROMIS® materials were obtained in Portable Document Format (PDF), manually converted to Microsoft Word® format (Microsoft Corporation, Redmond, WA, USA), and reviewed for accuracy by the authors. All advertisements, hyperlinks, pictures, copyright notices, and other text that was not a direct element of the questionnaire were removed. Each PROM or section of the PROMIS® (item bank or short form) then was saved as a text-only file for analysis by the readability software.

Table 1:
Median reading grade levels of 15 common, orthopaedic-related, patient-reported outcome measures for general/musculoskeletal health or pain status, as determined by 19 unique readability algorithms
Table 2:
Median reading grade levels of 21 common, orthopaedic-related, patient-reported outcome measures of the upper extremity, as determined by 19 unique readability algorithms
Table 3:
Median reading grade levels of 41 common, orthopaedic-related, patient-reported outcome measures of the lower extremity, as determined by 19 unique readability algorithms
Table 4:
Median reading grade levels of nine common, orthopaedic-related, patient-reported outcome measures of the spine, as determined by 19 unique readability algorithms
Table 5:
Median reading grade levels of the NIH PROMIS® question sets

Readability Assessment

Readability tests were chosen based on the following inclusion criteria: (1) intended for English text; (2) intended for adult use or used in a previously published study; and (3) score output scale of grade level, with higher grade levels corresponding to text that is more difficult to comprehend. Additionally, we included the Flesch Reading Ease readability index score (scale, 0-100) owing to its simple grade scale convertibility and for comparative relevance with previously published use [38]. In the absence of any single, accepted readability measure for healthcare-related materials, each document was analyzed by 19 unique readability algorithms, each meeting the criteria above (Table 6). Assessment was performed via Readability Studio 2015 (Oleander Software, Ltd, Pune, Maharashtra, India). Descriptions and algorithms for each readability test were adapted from the Readability Studio descriptions (Appendix 1. Supplemental material is available with the online version of CORR®.).

Table 6:
Readability tests with MGL and IQR
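
To make the grade-level calculation concrete, the following is a minimal Python sketch and not the Readability Studio implementation. It computes the Flesch Reading Ease and three grade-level formulas (Flesch-Kincaid, Automated Readability Index, and Coleman-Liau) with their commonly published coefficients, then takes the median of the grade-level scores in the spirit of the MGL. The function names, the crude syllable heuristic, and the choice of three formulas are assumptions made for illustration; the study's 19 tests are those listed in Table 6, and commercial software applies more careful text parsing.

```python
import re
from statistics import median


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count vowel groups, discount a silent final 'e'.

    This heuristic is an assumption for illustration; dedicated tools use
    dictionaries and many more rules.
    """
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> dict:
    """Count sentences, words, letters, and (estimated) syllables."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    return {
        "sentences": len(sentences),
        "words": len(words),
        "letters": sum(len(w.replace("'", "")) for w in words),
        "syllables": sum(count_syllables(w) for w in words),
    }


def flesch_reading_ease(s: dict) -> float:
    # Index score (higher = easier), not a grade level.
    return 206.835 - 1.015 * (s["words"] / s["sentences"]) - 84.6 * (s["syllables"] / s["words"])


def flesch_kincaid_grade(s: dict) -> float:
    return 0.39 * (s["words"] / s["sentences"]) + 11.8 * (s["syllables"] / s["words"]) - 15.59


def automated_readability_index(s: dict) -> float:
    return 4.71 * (s["letters"] / s["words"]) + 0.5 * (s["words"] / s["sentences"]) - 21.43


def coleman_liau_index(s: dict) -> float:
    letters_per_100 = s["letters"] / s["words"] * 100
    sentences_per_100 = s["sentences"] / s["words"] * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


def median_grade_level(text: str) -> float:
    """Median of the grade-level formulas above, each weighted equally."""
    s = text_stats(text)
    grades = [
        flesch_kincaid_grade(s),
        automated_readability_index(s),
        coleman_liau_index(s),
    ]
    return median(grades)


if __name__ == "__main__":
    # Hypothetical questionnaire items, not taken from any published PROM.
    sample = "Can you walk up stairs? Do you have pain at night?"
    print(round(median_grade_level(sample), 1))
```

Very short passages can produce grade levels below zero with these formulas, which is one reason to interpret any single score with caution and to compare several formulas, as this study does.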

Descriptive Statistics

Descriptive statistics were performed on the readability test results, and the median grade level (MGL) and interquartile range (IQR) were reported. Spearman's correlation coefficient was used to determine whether the number of survey items in each PROM correlated with its readability level. All statistical analyses were performed using SPSS Version 22.0 (IBM SPSS Statistics for Macintosh, Armonk, NY, USA).
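
The summary statistics described above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the SPSS procedure used in the study: the function names are hypothetical, the quartiles use the "inclusive" interpolation method from Python's `statistics` module (other software may define quartiles slightly differently), and Spearman's coefficient is computed as the Pearson correlation of average ranks. No p value is computed here.

```python
from statistics import median, quantiles


def median_and_iqr(values):
    """Return the median and the (Q1, Q3) bounds of the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical data: grade levels from several formulas for one document,
    # then item counts and MGLs for a handful of made-up questionnaires.
    print(median_and_iqr([4.1, 4.8, 5.0, 5.6, 6.3]))
    item_counts = [10, 36, 12, 5, 20]
    mgls = [5.2, 4.9, 6.1, 8.4, 4.6]
    print(spearman_rho(item_counts, mgls))
```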

Readability Improvement Editing Process

For PROMs with an MGL above the 8th grade level, the following editing steps, based on the CMS Toolkit for Making Written Material Clear and Effective [90], were instituted. We edited the PROMs by using (1) active voice, (2) simple, short sentences, and (3) a simplified vocabulary [90]. After these three steps, the MGL was reassessed. All PROMs meeting criteria for inclusion underwent each editing step as outlined above (Appendix 2. Supplemental material is available with the online version of CORR®.).

Results


Sixty-four of 86 PROMs (74%) had an MGL at or below the AMA-recommended 6th grade reading level, while 81 of 86 (94%) were at or below the NIH-recommended 8th grade level (Fig. 1). The overall MGL of independent PROMs was 5.0 (IQR, 4.6-6.1), corresponding to approximately the start of the United States’ 5th grade school year. The investigator-compiled PROMIS® Bank had an MGL of 4.1 (IQR, 3.5-4.8). The four selected PROMIS® Adult Short Forms had an MGL of 4.2 (IQR, 4.2-4.3) (Table 5). The Nottingham Health Profile had the lowest MGL of the independent PROMs (MGL, 2.6; IQR, 0.2-3.8) (Table 1), followed by the American Shoulder and Elbow Surgeons Questionnaire (MGL, 3.8; IQR, 0.9-5.2) (Table 2), Marx Activity Rating Scale (MGL, 3.9; IQR, 2.0-4.5) (Table 3), RAND 20-item Short Form (MGL, 3.9; IQR, 2.1-4.5) (Table 1), and Simple Shoulder Test (MGL, 3.9; IQR, 2.3-4.7) (Table 2). The PROMs with the highest MGLs were the UCLA Activity Score (MGL, 12.1; IQR, 7.0-13.9), Modified Cincinnati Rating System (MGL, 9.1; IQR, 6.2-10.5), Lower Extremity Measure (MGL, 8.9; IQR, 5.5-10.9), Lysholm Knee Score (MGL, 8.4; IQR, 6.7-13.2), and the Tegner Activity Level Scale (MGL, 8.4; IQR, 6.1-9.8) (Table 3). All item banks and short forms of the PROMIS® achieved AMA and NIH recommendations (Table 5).

Fig. 1:
The median grade level (MGL) distribution of the included independent patient-reported outcome measures (PROMs) is shown. Sixty-four of 86 met the American Medical Association recommendation of at or below the 6th grade reading level (black line with *); 81 met the NIH recommendation of at or below the 8th grade reading level (black line with #).

No correlation was found between the MGL and the number of questions contained in a PROM (r = −0.081; p = 0.460).

Following edits, all five PROMs (UCLA Activity Score, Modified Cincinnati Rating System, Lysholm Knee Score, Tegner Activity Level Scale, Lower Extremity Measure) achieved the NIH-recommended 8th grade level, while three (Modified Cincinnati Rating System, Tegner Activity Level Scale, Lower Extremity Measure) achieved the AMA recommendation of 6th grade level (Fig. 2). Editing of these PROMs improved readability by 4.3 MGL (before: 8.9 [IQR, 8.4-9.1], after: 4.6 [IQR 4.6-6.4]; difference of median, 4.3; p = 0.008).

Fig. 2:
The median reading grade level (MGL) improvements of low readability PROMs (> 8.0 MGL) after the Centers for Medicare & Medicaid-derived editing process are shown. LEM = Lower Extremity Measure; TALS = Tegner Activity Level Scale; LKS = Lysholm Knee Score; MCRS = Modified Cincinnati Rating System; UCLA = University of California, Los Angeles Activity Score.

Discussion


PROMs have been increasingly implemented in orthopaedic practice to objectively quantify surgical outcomes and assist in guiding surgical decision making [6, 8, 47]. However, their utility was questioned with a recent report suggesting that most PROMs are written at levels too difficult for the average adult to comprehend [38]. That study is limited by the use of only one readability measure, the Flesch Reading Ease. Our study, using multiple readability measures and giving equal weight to each, seeks to assess the true readability of orthopaedic-related PROMs. Therefore, we asked: (1) What proportion of orthopaedic-related PROMs and orthopaedic-related portions of the NIH PROMIS® are written at or below the 6th and 8th grade levels? (2) Is there a correlation between the number of questions in the PROM and reading level? (3) Using systematic edits based on guidelines from the CMS [90], what proportion of PROMs achieved NIH-recommended reading levels?

This study has limitations. First, the readability scores were determined by giving equal weight to each algorithm used. This could be a weakness of the study, as some formulas could be better suited and more reliable for assessing the readability of healthcare documents, and might deserve greater weighting in the determination of MGLs. Additionally, the CMS Toolkit [89] highlights the importance of aesthetics to readability; however, we did not assess such aspects because they could not be analyzed by the software. We also did not evaluate whether the editing process altered the clinical validity and utility of the five edited PROMs. Thus, the possible effects of the editing process on clinical and diagnostic validity merit additional investigation. In addition, this analysis excluded non-English PROMs, as they could not be assessed via the readability algorithms used in MGL calculation. Finally, this readability analysis cannot assess the literacy level of PROMs. Readability equations are a numeric method to evaluate PROMs based solely on quantifiable metrics, while literacy involves numerous qualitative factors that this study was not designed to measure. Although a low MGL does not necessarily translate to higher comprehension and clinical utility, MGLs are the best method currently available to broadly appreciate the level of understanding of healthcare documents among varied patient populations.

The finding that more than 90% of PROMs and all areas of the PROMIS® are written at acceptable reading levels refutes the study by El-Daly et al. [38], which led to fears regarding the widespread failure of PROMs. Based on their assessment, only 12% of PROMs had a reading grade level congruent with the average UK literacy level (reported as 11-year-old students or 6th grade), thus calling into question the accuracy and reliability of data obtained through PROMs, a sentiment further endorsed in a response by Brown [20]. Inconsistencies between the findings in our study and those of El-Daly et al. likely center on their use of a single readability score, the Flesch Reading Ease. While this readability algorithm is mentioned by the CMS, CDC, and NIH as having utility in assessing patient-related documents, neither it nor any other readability algorithm is recognized as a gold standard instrument intended to be used in isolation; each entity encourages the use of multiple readability algorithms rather than a single test [23, 90, 128]. In our analysis, the Flesch Reading Ease algorithm yielded the third highest MGL of the 19 readability tests used (Table 6). Additionally, the grade level extrapolation of this index score (with original outputs on a scale of 0-100) has a 5th grade minimum, which likely inflates the grade levels assigned to easily read text [44]. The Flesch Reading Ease is also a dated measure, and questions have arisen regarding its continued utility in assessing health literature [130]. In short, while the Flesch Reading Ease is a commonly used score, its aggressive grade level conversions and lack of adaptation to modern syntax may make it a poor choice on which to base sweeping conclusions about PROM readability or calls for reform. The alarm raised by El-Daly et al. [38] and endorsed by Brown [20] appears to be overstated and potentially misleading. However, our findings should be met with guarded optimism. Even though most PROMs are readable to the average American, patients in traditionally low-literacy areas, such as the rural southeastern United States, where illiteracy affects more than one in three adults [98], may continue to have issues with PROM comprehension. In these areas of decreased literacy, physicians might better serve their patients by selecting PROMs written at a 3rd to 5th grade level [100].

There was no correlation between the number of questions in a PROM and its reading level. While PROM design is shifting from arduous, multipage surveys with numerous subsections to short, high-impact questionnaires [14, 37], it is interesting that reading level is not associated with PROM length. However, readability algorithms do not account for document length or possible reader fatigue; they instead analyze sentence and paragraph length. Therefore, while readability may not be affected in longer, more detailed PROMs, mental fatigue of patients completing them could play a role. Mental fatigue, studied after traumatic brain injury, has been shown to negatively affect a patient's ability to comprehend new information [62]. Additionally, it has been shown that tired patients are likely to leap to conclusions prematurely [134]. While readability is not affected by PROM length, future research is required to assess the possible effects of reader fatigue on comprehension of the longer PROMs.

Editing according to CMS guidelines improved all PROMs and brought them to or under the 8th grade reading level. These guidelines address many aspects of readability, from text selection to aesthetic appeal; however, in Part 4, Section 3 of the CMS Toolkit, multiple specific suggestions are made, including limiting the number and length of sentences, using the active voice, avoiding acronyms, and using conversational style with nontechnical terms [90]. These were adopted to formulate our editing process (Appendix 2. Supplemental materials are available with the online version of CORR®) which yielded satisfactory results by lowering the MGL of documents with poor readability by 45%, allowing all to score under the 8th grade reading level (Fig. 2). With the emergence of PROMs as clinical and research tools, steps must be taken to ensure improved readability and sustained validity of measures written over recommended reading levels. Although validation of the CMS-based editing process for use with PROMs is necessary, the improvements in MGL after edits are encouraging. The edited PROMs also would need to be revalidated. Research has shown that minor changes may significantly alter the questions being asked, and thus, the nature of the responses [94, 105]. While onerous, this revalidation process for these five high-scoring PROMs may be necessary before the edited PROMs are used for clinical research.

PROMs are increasingly used in patient-centered healthcare and outcomes research. Thus, their readability is vital for accurate, valid responses. We disagree with the previous conclusion that the majority of PROMs used in orthopaedics are “incomprehensible to most patients asked to complete them” [38]. In contrast, our study, the most comprehensive analysis of PROM readability to date, revealed that more than 90% of orthopaedic PROMs are written at or below the 8th grade reading level. Additionally, our study tests a method of editing PROMs to reliably decrease the MGL; validation of this method and of edited PROMs is required. Our analysis contradicts previous concerns and provides confidence for the use of nearly all commonly used PROMs in clinical orthopaedic practice.

References


1. Agel J, Swiontkowski MF. Guide to outcomes instruments for musculoskeletal trauma research. J Orthop Trauma. 2006;20(8 suppl):S1-146.
2. Alberta FG, ElAttrache NS, Bissell S, Mohr K, Browdy J, Yocum L, Jobe F. The development and validation of a functional assessment tool for the upper extremity in the overhead athlete. Am J Sports Med. 2010;38:903-911. doi: 10.1177/0363546509355642.
3. Alvey J, Palmer S, Otter S. A comparison of the readability of two patient-reported outcome measures used to evaluate foot surgery. J Foot Ankle Surg. 2012;51:412-414. doi: 10.1053/j.jfas.2012.03.001.
4. Amadio PC, Berquist TH, Smith DK, Ilstrup DM, Cooney WP 3rd, Linscheid RL. Scaphoid malunion. J Hand Surg Am. 1989;14:679-687. doi: 10.1016/0363-5023(89)90191-3.
5. American Academy of Orthopaedic Surgeons. Patient Reported Outcome Measures. Available at: Accessed November 20, 2016.
6. Ayers DC. Implementation of patient-reported outcome measures in total knee arthroplasty. J Am Acad Orthop Surg. 2017;25(suppl 1):S48-50. doi: 10.5435/JAAOS-D-16-00631.
7. Badarudeen S, Sabharwal S. Assessing readability of patient education materials: current role in orthopaedics. Clin Orthop Relat Res. 2010;468:2572-2580. PMCID: PMC3049622. doi: 10.1007/s11999-010-1380-y.
8. Baumhauer JF, Bozic KJ. Value-based healthcare: patient-reported outcomes in clinical decision making. Clin Orthop Relat Res. 2016;474:1375-1378. PMCID: PMC4868147. doi: 10.1007/s11999-016-4813-4.
9. Beaton DE, Wright JG, Katz JN. Development of the QuickDASH: comparison of three item-reduction approaches. J Bone Joint Surg Am. 2005;87:1038-1046.
10. Behrend H, Giesinger K, Giesinger JM, Kuster MS. The “forgotten joint” as the ultimate goal in joint arthroplasty: validation of a new patient-reported outcome measure. J Arthroplasty. 2012;27:430-436e431.
11. Bellamy N, Buchanan WW, Goldsmith CH, Campbell J, Stitt LW. Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol. 1988;15:1833-1840.
12. Bennett PJ, Patterson C, Wearing S, Baglioni T. Development and validation of a questionnaire designed to measure foot-health status. J Am Podiatr Med Assoc. 1998;88:419-428. doi: 10.7547/87507315-88-9-419.
13. Binkley JM, Stratford PW, Lott SA, Riddle DL. The Lower Extremity Functional Scale (LEFS): scale development, measurement properties, and clinical application. North American Orthopaedic Rehabilitation Research Network. Phys Ther. 1999;79:371-383.
14. Black N. Patient reported outcome measures could help transform healthcare. BMJ. 2013;346:f167. doi: 10.1136/bmj.f167.
15. Bolton JE, Breen AC. The Bournemouth Questionnaire: a short-form comprehensive outcome measure. I. Psychometric properties in back pain patients. J Manipulative Physiol Ther. 1999;22:503-510. doi: 10.1016/S0161-4754(99)70001-1.
16. Bolton JE, Humphreys BK. The Bournemouth Questionnaire: a short-form comprehensive outcome measure. II. Psychometric properties in neck pain patients. J Manipulative Physiol Ther. 2002;25:141-148. doi: 10.1067/mmt.2002.123333.
17. Bormuth JR. Readability: a new approach. Reading Res Q. 1966;1:79-132. doi: 10.2307/747021.
18. Bourne RB. Measuring tools for functional outcomes in total knee arthroplasty. Clin Orthop Relat Res. 2008;466:2634-2638. PMCID: PMC2565042. doi: 10.1007/s11999-008-0468-0.
19. Bremander AB, Petersson IF, Roos EM. Validation of the Rheumatoid and Arthritis Outcome Score (RAOS) for the lower extremity. Health Qual Life Outcomes. 2003;1:55. PMCID: PMC280699. doi: 10.1186/1477-7525-1-55.
20. Brown TD. CORR Insights®: Are patient-reported outcome measures in orthopaedics easily read by patients? Clin Orthop Relat Res. 2016;474:256-257. doi: 10.1007/s11999-015-4612-3.
21. Budiman-Mak E, Conrad K, Stuck R, Matters M. Theoretical model and Rasch analysis to develop a revised Foot Function Index. Foot Ankle Int. 2006;27:519-527. doi: 10.1177/107110070602700707.
22. Caylor J, Sticht T, Fox L, Ford J. Methodologies for determining reading requirements of military occupational specialties. Human Resources Research Organization. Available at: Accessed December 9, 2016.
23. Centers for Disease Control and Prevention, U.S. Department of Health and Human Services. Simply Put: A guide for creating easy-to-understand materials. Available at: Accessed November 25, 2016.
24. Choudhry AJ, Baghdadi YM, Wagie AE, Habermann EB, Heller SF, Jenkins DH, Cullinane DC, Zielinski MD. Readability of discharge summaries: with what level of information are we dismissing our patients? Am J Surg. 2016;211:631-636. doi: 10.1016/j.amjsurg.2015.12.005.
25. Chung KC, Pillsbury MS, Walters MR, Hayward RA. Reliability and validity testing of the Michigan Hand Outcomes Questionnaire. J Hand Surg Am. 1998;23:575-587. doi: 10.1016/S0363-5023(98)80042-7.
26. Coleman M, Liau TL. A computer readability formula designed for machine scoring. J Appl Psychol. 1975;60:283. doi: 10.1037/h0076540.
27. Constant CR, Murley AH. A clinical method of functional assessment of the shoulder. Clin Orthop Relat Res. 1987;214:160-164.
28. Danielson WA, Bryan SD. Computer automation of two readability formulas. Journalism Mass Commun Q. 1963;40:201-206. doi: 10.1177/107769906304000207.
29. Davis AM, Perruccio AV, Canizares M, Tennant A, Hawker GA, Conaghan PG, Roos EM, Jordan JM, Maillefert JF, Dougados M, Lohmander LS. The development of a short measure of physical function for hip OA HOOS-Physical Function Shortform (HOOS-PS): an OARSI/OMERACT initiative. Osteoarthritis Cartilage. 2008;16:551-559. doi: 10.1016/j.joca.2007.12.016.
30. Dawson J, Doll H, Boller I, Fitzpatrick R, Little C, Rees J, Jenkinson C, Carr AJ. The development and validation of a patient-reported questionnaire to assess outcomes of elbow surgery. J Bone Joint Surg Br. 2008;90:466-473. doi: 10.1302/0301-620X.90B4.20290.
31. Dawson J, Fitzpatrick R, Carr A. Questionnaire on the perceptions of patients about shoulder surgery. J Bone Joint Surg Br. 1996;78:593-600.
32. Dawson J, Fitzpatrick R, Carr A. The assessment of shoulder instability: the development and validation of a questionnaire. J Bone Joint Surg Br. 1999;81:420-426. doi: 10.1302/0301-620X.81B3.9044.
33. Dawson J, Fitzpatrick R, Carr A, Murray D. Questionnaire on the perceptions of patients about total hip replacement. J Bone Joint Surg Br. 1996;78:185-190. doi: 10.2106/00004623-199602000-00004.
34. Dawson J, Fitzpatrick R, Murray D, Carr A. Questionnaire on the perceptions of patients about total knee replacement. J Bone Joint Surg Br. 1998;80:63-69. doi: 10.1302/0301-620X.80B1.7859.
35. De Oliveira GS Jr, McCarthy RJ, Wolf MS, Holl J. The impact of health literacy in the care of surgical patients: a qualitative systematic review. BMC Surg. 2015;15:86. PMCID: PMC4504415. doi: 10.1186/s12893-015-0073-6.
36. Domsic RT, Saltzman CL. Ankle osteoarthritis scale. Foot Ankle Int. 1998;19:466-471. doi: 10.1177/107110079801900708.
37. Edwards P, Roberts I, Clarke M, DiGuiseppi C, Pratap S, Wentz R, Kwan I. Increasing response rates to postal questionnaires: systematic review. BMJ. 2002;324:1183. PMCID: PMC111107. doi: 10.1136/bmj.324.7347.1183.
38. El-Daly I, Ibraheim H, Rajakulendran K, Culpan P, Bates P. Are patient-reported outcome measures in orthopaedics easily read by patients? Clin Orthop Relat Res. 2016;474:246-255. doi: 10.1007/s11999-015-4595-0.
39. EuroQol Group. EuroQol: a new facility for the measurement of health-related quality of life. Health Policy. 1990;16:199-208. doi: 10.1016/0168-8510(90)90421-9.
40. EuroQol Research Foundation. How to obtain EQ-5D. Available at: Accessed December 7, 2016.
41. Fairbank JC, Couper J, Davies JB, O'Brien JP. The Oswestry low back pain disability questionnaire. Physiotherapy. 1980;66:271-273.
42. Fairbank JC, Pynsent PB. The Oswestry Disability Index. Spine (Phila Pa 1976). 2000;25:2940-2952; discussion 2952.
43. Flesch RF. A new readability yardstick. J Appl Psychol. 1948;32:221-233. doi: 10.1037/h0057532.
44. Flesch RF. How to Write Plain English: A Book for Lawyers and Consumers. New York, NY: Harper and Row; 1979.
45. Friedman DB, Hoffman-Goetz L. A systematic review of readability and comprehension instruments used for print and web-based cancer information. Health Educ Behav. 2006;33:352-373. doi: 10.1177/1090198105277329.
46. Fries JF, Spitz P, Kraines RG, Holman HR. Measurement of patient outcome in arthritis. Arthritis Rheum. 1980;23:137-145. doi: 10.1002/art.1780230202.
47. Greene ME, Rolfson O, Gordon M, Garellick G, Nemes S. Standard comorbidity measures do not predict patient-reported outcomes 1 year after total hip arthroplasty. Clin Orthop Relat Res. 2015;473:3370-3379. PMCID: PMC4586242. doi: 10.1007/s11999-015-4195-z.
48. Gunning R. The Technique of Clear Writing. New York, NY: McGraw-Hill; 1952.
49. Hale SA, Hertel J. Reliability and sensitivity of the Foot and Ankle Disability Index in subjects with chronic ankle instability. J Athl Train. 2005;40:35-40. PMCID: PMC1088343.
50. Harris AJ, Jacobson MD. Basic Reading Vocabularies. New York, NY: Macmillan; 1982.
51. Hildebrand KA, Buckley RE, Mohtadi NG, Faris P. Functional outcome measures after displaced intra-articular calcaneal fractures. J Bone Joint Surg Br. 1996;78:119-123.
52. Horner SD, Surratt D, Juliusson S. Improving readability of patient education materials. J Community Health Nurs. 2000;17:15-23. doi: 10.1207/S15327655JCHN1701_02.
53. Hudak PL, Amadio PC, Bombardier C. Development of an upper extremity outcome measure: the DASH (disabilities of the arm, shoulder and hand) [corrected]. The Upper Extremity Collaborative Group (UECG). Am J Ind Med. 1996;29:602-608. doi: 10.1002/(SICI)1097-0274(199606)29:6<602::AID-AJIM4>3.0.CO;2-L.
54. Hudson-Cook N, Tomes-Nicholson K, Breen A. Revised Oswestry disability questionnaire. In: Roland M, Jenner J, eds. Back Pain: New Approaches to Rehabilitation and Education. New York, NY: Manchester University Press; 1989:187-204.
55. Hunt SM, McKenna SP, McEwen J, Backett EM, Williams J, Papp E. A quantitative approach to perceived health status: a validation study. J Epidemiol Community Health. 1980;34:281-286. PMCID: PMC1052092. doi: 10.1136/jech.34.4.281.
56. Insall JN, Dorr LD, Scott RD, Scott WN. Rationale of the Knee Society clinical rating system. Clin Orthop Relat Res. 1989;248:13-14.
57. Irrgang JJ, Anderson AF, Boland AL, Harner CD, Kurosaka M, Neyret P, Richmond JC, Shelborne KD. Development and validation of the international knee documentation committee subjective knee form. Am J Sports Med. 2001;29:600-613.
58. Irrgang JJ, Snyder-Mackler L, Wainner RS, Fu FH, Harner CD. Development of a patient-reported measure of function of the knee. J Bone Joint Surg Am. 1998;80:1132-1145. doi: 10.2106/00004623-199808000-00006.
59. Jaglal S, Lakhani Z, Schatzker J. Reliability, validity, and responsiveness of the lower extremity measure for patients with a hip fracture. J Bone Joint Surg Am. 2000;82:955-962. doi: 10.2106/00004623-200007000-00007.
60. Johanson NA, Charlson ME, Szatrowski TP, Ranawat CS. A self-administered hip-rating questionnaire for the assessment of outcome after total hip replacement. J Bone Joint Surg Am. 1992;74:587-597. doi: 10.2106/00004623-199274040-00015.
61. Johanson NA, Liang MH, Daltroy L, Rudicel S, Richmond J. American Academy of Orthopaedic Surgeons lower limb outcomes assessment instruments: reliability, validity, and sensitivity to change. J Bone Joint Surg Am. 2004;86:902-909. doi: 10.2106/00004623-200405000-00003.
62. Johansson B, Berglund P, Ronnback L. Mental fatigue and impaired information processing after mild and moderate traumatic brain injury. Brain Inj. 2009;23:1027-1040. doi: 10.3109/02699050903421099.
63. Jordan A, Manniche C, Mosdal C, Hindsberger C. The Copenhagen Neck Functional Disability Scale: a study of reliability and validity. J Manipulative Physiol Ther. 1998;21:520-527.
                                                                  64. Juul T, Sogaard K, Roos EM, Davis AM. Development of a patient-reported outcome: the Neck OutcOme Score (NOOS): content and construct validity. J Rehabil Med. 2015;47:844-853 10.2340/16501977-2013.
                                                                    65. Kaikkonen A, Kannus P, Jarvinen M. A performance test protocol and scoring scale for the evaluation of ankle injuries. Am J Sports Med. 1994;22:462-469 10.1177/036354659402200405.
                                                                      66. Kincaid JP, Fishburne RP Jr, Rogers RL, Chissom BS. Derivation of new readability formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) for Navy enlisted personnel. Naval Air Station Memphis, Millington, Tennessee. Available at: Accessed December 9, 2016.
                                                                        67. King GJ, Richards RR, Zuckerman JD, Blasier R, Dillman C, Friedman RJ, Gartsman GM, Iannotti JP, Murnahan JP, Mow VC, Woo SL. A standardized method for assessment of elbow function. Research Committee, American Shoulder and Elbow Surgeons. J Shoulder Elbow Surg. 1999;8:351-354.
                                                                        68. Kirkley A, Alvarez C, Griffin S. The development and evaluation of a disease-specific quality-of-life questionnaire for disorders of the rotator cuff: the Western Ontario Rotator Cuff Index. Clin J Sport Med. 2003;13:84-92 10.1097/00042752-200303000-00004.
                                                                          69. Kirkley A, Griffin S, McLintock H, Ng L. The development and evaluation of a disease-specific quality of life measurement tool for shoulder instability: the Western Ontario Shoulder Instability Index (WOSI). Am J Sports Med. 1998;26:764-772.
                                                                          70. Kirkley A, Griffin S, Whelan D. The development and validation of a quality of life-measurement tool for patients with meniscal pathology: the Western Ontario Meniscal Evaluation Tool (WOMET). Clin J Sport Med. 2007;17:349-356 10.1097/JSM.0b013e31814c3e15.
                                                                            71. Kohn D, Geyer M. The subjective shoulder rating system. Arch Orthop Trauma Surg. 1997;116:324-328 10.1007/BF00433982.
                                                                              72. Kopec JA, Esdaile JM, Abrahamowicz M, Abenhaim L, Wood-Dauphinee S, Lamping DL, Williams JI. The Quebec Back Pain Disability Scale: measurement properties. Spine (Phila Pa 1976). 1995;20:341-352.
                                                                                73. Kujala UM, Jaakkola LH, Koskinen SK, Taimela S, Hurme M, Nelimarkka O. Scoring of patellofemoral disorders. Arthroscopy. 1993;9:159-163 10.1016/S0749-8063(05)80366-4.
                                                                                  74. Kurer M, Gooding C. Orthopaedic Scores. Available at: Accessed November 20, 2016.
                                                                                  75. L'Insalata JC, Warren RF, Cohen SB, Altchek DW, Peterson MG. A self-administered questionnaire for assessment of symptoms and function of the shoulder. J Bone Joint Surg Am. 1997;79:738-748 10.2106/00004623-199705000-00014.
                                                                                    76. Lawlis GF, Cuencas R, Selby D, McCoy CE. The development of the Dallas Pain Questionnaire: an assessment of the impact of spinal pain on behavior. Spine (Phila Pa 1976). 1989;14:511-516.
                                                                                      77. Lequesne MG, Mery C, Samson M, Gerard P. Indexes of severity for osteoarthritis of the hip and knee: validation: value in comparison with other assessment tests. Scand J Rheumatol Suppl. 1987;65:85-89 10.3109/03009748709102182.
                                                                                        78. Levine DW, Simmons BP, Koris MJ, Daltroy LH, Hohl GG, Fossel AH, Katz JN. A self-administered questionnaire for the assessment of severity of symptoms and functional status in carpal tunnel syndrome. J Bone Joint Surg Am. 1993;75:1585-1592 10.2106/00004623-199311000-00002.
                                                                                          79. Lo IK, Griffin S, Kirkley A. The development of a disease-specific quality of life measurement tool for osteoarthritis of the shoulder: the Western Ontario Osteoarthritis of the Shoulder (WOOS) index. Osteoarthritis Cartilage. 2001;9:771-778 10.1053/joca.2001.0474.
80. Lyman S, Lee YY, Franklin PD, Li W, Cross MB, Padgett DE. Validation of the KOOS, JR: a short-form knee arthroplasty outcomes survey. Clin Orthop Relat Res. 2016;474:1461-1471 PMC4868168 10.1007/s11999-016-4719-1.
81. Lyman S, Lee YY, Franklin PD, Li W, Mayman DJ, Padgett DE. Validation of the HOOS, JR: a short-form hip replacement survey. Clin Orthop Relat Res. 2016;474:1472-1482 PMC4868170 10.1007/s11999-016-4718-2.
                                                                                                82. Lysholm J, Gillquist J. Evaluation of knee ligament surgery results with special emphasis on use of a scoring scale. Am J Sports Med. 1982;10:150-154 10.1177/036354658201000306.
                                                                                                  83. MacDermid JC, Turgeon T, Richards RS, Beadle M, Roth JH. Patient rating of wrist pain and disability: a reliable and valid measurement tool. J Orthop Trauma. 1998;12:577-586 10.1097/00005131-199811000-00009.
                                                                                                    84. Majeed SA. Grading the outcome of pelvic fractures. J Bone Joint Surg Br. 1989;71:304-306.
                                                                                                    85. Martin RL, Irrgang JJ, Burdett RG, Conti SF, Swearingen JM. Evidence of validity for the Foot and Ankle Ability Measure (FAAM). Foot Ankle Int. 2005;26:968-983 10.1177/107110070502600905.
                                                                                                      86. Marx RG, Stump TJ, Jones EC, Wickiewicz TL, Warren RF. Development and evaluation of an activity rating scale for disorders of the knee. Am J Sports Med. 2001;29:213-218.
                                                                                                      87. Matsen FA 3rd, Ziegler DW, DeBartolo SE. Patient self-assessment of health status and function in glenohumeral degenerative joint disease. J Shoulder Elbow Surg. 1995;4:345-351 10.1016/S1058-2746(95)80018-2.
                                                                                                        88. Matsumoto M, Baba T, Homma Y, Kobayashi H, Ochi H, Yuasa T, Behrend H, Kaneko K. Validation study of the Forgotten Joint Score-12 as a universal patient-reported outcome measure. Eur J Orthop Surg Traumatol. 2015;25:1141-1145 10.1007/s00590-015-1660-z.
                                                                                                        89. McGee J, Centers for Medicare and Medicaid Services, U.S. Department of Health and Human Services. Toolkit for Making Written Material Clear and Effective, Section 4: Special Topics for Writing and Design. Available at: Accessed November 25, 2016.
                                                                                                        90. McGee J, Centers for Medicare and Medicaid Services, U.S. Department of Health and Human Services. Toolkit Part 4: Guidelines for Writing. Available at: Accessed November 20, 2016.
                                                                                                        91. McLaughlin HG. SMOG grading: a new readability formula. J Reading. 1969;12:639-646.
                                                                                                        92. Melzack R. The McGill Pain Questionnaire: major properties and scoring methods. Pain. 1975;1:277-299 10.1016/0304-3959(75)90044-5.
                                                                                                          93. Melzack R. The short-form McGill Pain Questionnaire. Pain. 1987;30:191-197 10.1016/0304-3959(87)91074-8.
94. Miller PR. Tipsheet: Question Wording. Duke Initiative on Survey Methodology. Available at: Accessed February 24, 2017.
                                                                                                            95. Morgan S. Miscommunication between patients and general practitioners: implications for clinical practice. J Prim Health Care. 2013;5:123-128.
96. Morrey B. Functional evaluation of the elbow. In: Morrey B, ed. The Elbow and its Disorders. Philadelphia, PA: WB Saunders Co; 2000:74-83.
                                                                                                              97. Naal FD, Hatzung G, Muller A, Impellizzeri F, Leunig M. Validation of a self-reported Beighton score to assess hypermobility in patients with femoroacetabular impingement. Int Orthop. 2014;38:2245-2250 10.1007/s00264-014-2424-9.
                                                                                                              98. National Center for Education Statistics, U.S. Department of Education. National Assessment of Adult Literacy: State and County Estimates of Low Literacy. Available at: Accessed February 24, 2017.
                                                                                                              99. National Center for Health Marketing, Centers for Disease Control and Prevention. What We Know About Health Literacy. Available at: Accessed November 20, 2016.
                                                                                                              100. National Institutes of Health, U.S. Department of Health and Human Services. Clear & Simple: What is Clear & Simple? Available at: Accessed February 24, 2017.
101. Nilsdotter AK, Lohmander LS, Klassbo M, Roos EM. Hip disability and osteoarthritis outcome score (HOOS): validity and responsiveness in total hip replacement. BMC Musculoskelet Disord. 2003;4:10 PMC161815 10.1186/1471-2474-4-10.
                                                                                                                102. Noyes FR, Barber SD, Mooar LA. A rationale for assessing sports activity levels and limitations in knee disorders. Clin Orthop Relat Res. 1989;246:238-249.
                                                                                                                103. Pace CC, Atcherson SR, Zraick RI. A computer-based readability analysis of patient-reported outcome questionnaires related to oral health quality of life. Patient Educ Couns. 2012;89:76-81 10.1016/j.pec.2012.05.010.
                                                                                                                104. Perruccio AV, Stefan Lohmander L, Canizares M, Tennant A, Hawker GA, Conaghan PG, Roos EM, Jordan JM, Maillefert JF, Dougados M, Davis AM. The development of a short measure of physical function for knee OA KOOS-Physical Function Shortform (KOOS-PS): an OARSI/OMERACT initiative. Osteoarthritis Cartilage. 2008;16:542-550 10.1016/j.joca.2007.12.014.
                                                                                                                  105. Pew Research Center. Questionnaire Design. Available at: Accessed February 24, 2017.
                                                                                                                  106. Pincus T, Summey JA, Soraci SA Jr, Wallston KA, Hummon NP. Assessment of patient satisfaction in activities of daily living using a modified Stanford Health Assessment Questionnaire. Arthritis Rheum. 1983;26:1346-1353 10.1002/art.1780261107.
                                                                                                                    107. Powers RD, Sumner WA, Kearl BE. A recalculation of four adult readability formulas. J Educ Psychol. 1958;49:99 10.1037/h0043254.
108. Ramkumar PN, Harris JD, Noble PC. Patient-reported outcome measures after total knee arthroplasty: a systematic review. Bone Joint Res. 2015;4:120-127 PMC4602194 10.1302/2046-3758.47.2000380.
                                                                                                                      109. Richards RR, An KN, Bigliani LU, Friedman RJ, Gartsman GM, Gristina AG, Iannotti JP, Mow VC, Sidles JA, Zuckerman JD. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg. 1994;3:347-352 10.1016/S1058-2746(09)80019-0.
                                                                                                                        110. Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care Res. 1991;4:143-149 10.1002/art.1790040403.
                                                                                                                        111. Roland M, Morris R. A study of the natural history of back pain: Part I. development of a reliable and sensitive measure of disability in low-back pain. Spine (Phila Pa 1976). 1983;8:141-144.
                                                                                                                          112. Roos EM, Brandsson S, Karlsson J. Validation of the foot and ankle outcome score for ankle ligament reconstruction. Foot Ankle Int. 2001;22:788-794 10.1177/107110070102201004.
                                                                                                                            113. Roos EM, Roos HP, Lohmander LS, Ekdahl C, Beynnon BD. Knee Injury and Osteoarthritis Outcome Score (KOOS): development of a self-administered outcome measure. J Orthop Sports Phys Ther. 1998;28:88-96 10.2519/jospt.1998.28.2.88.
                                                                                                                              114. Saleh KJ, Mulhall KJ, Bershadsky B, Ghomrawi HM, White LE, Buyea CM, Krackow KA. Development and validation of a lower-extremity activity scale: use for patients treated with revision total knee arthroplasty. J Bone Joint Surg Am. 2005;87:1985-1994 10.2106/JBJS.D.02564.
                                                                                                                                115. Sheppard ED, Hyde Z, Florence MN, McGwin G, Kirchner JS, Ponce BA. Improving the readability of online foot and ankle patient education materials. Foot Ankle Int. 2014;35:1282-1286 10.1177/1071100714550650.
                                                                                                                                116. Smith EA, Senter R. Automated readability index. Aerospace Medical Research Laboratories, Aerospace Medical Division, Air Force Systems Command. Available at: Accessed December 9, 2016.
                                                                                                                                  117. Smith LL. Using a modified SMOG in primary and intermediate grades. Reading Horizons. 1984;24:129-132.
118. Smith SG, Curtis LM, Wardle J, Wagner C, Wolf MS. Skill set or mind set? Associations between health literacy, patient activation and health. PLoS One. 2013;8:e74373 PMC3762784 10.1371/journal.pone.0074373.
                                                                                                                                    119. Smith SK, Dixon A, Trevena L, Nutbeam D, McCaffery KJ. Exploring patient involvement in healthcare decision making across different education and functional health literacy groups. Soc Sci Med. 2009;69:1805-1812 10.1016/j.socscimed.2009.09.056.
                                                                                                                                    120. Spache G. A new readability formula for primary-grade reading materials. The Elementary School Journal. 1953;53:410-413 10.1086/458513.
121. Stewart AL, Hays RD, Ware JE Jr. The MOS short-form general health survey. Reliability and validity in a patient population. Med Care. 1988;26:724-735 10.1097/00005650-198807000-00007.
                                                                                                                                        122. Swiontkowski MF, Engelberg R, Martin DP, Agel J. Short musculoskeletal function assessment questionnaire: validity, reliability, and responsiveness. J Bone Joint Surg Am. 1999;81:1245-1260 10.2106/00004623-199909000-00006.
                                                                                                                                          123. Tartaglione JP, Rosenbaum AJ, Abousayed M, Hushmendy SF, DiPreta JA. Evaluating the quality, accuracy, and readability of online resources pertaining to hallux valgus. Foot Ankle Spec. 2016;9:17-23 10.1177/1938640015592840.
                                                                                                                                          124. Tegner Y, Lysholm J. Rating systems in the evaluation of knee ligament injuries. Clin Orthop Relat Res. 1985;198:43-49.
                                                                                                                                          125. Templeman D, Goulet J, Duwelius PJ, Olson S, Davidson M. Internal fixation of displaced fractures of the sacrum. Clin Orthop Relat Res. 1996;329:180-185 10.1097/00003086-199608000-00021.
                                                                                                                                            126. Thorborg K, Holmich P, Christensen R, Petersen J, Roos EM. The Copenhagen Hip and Groin Outcome Score (HAGOS): development and validation according to the COSMIN checklist. Br J Sports Med. 2011;45:478-491 10.1136/bjsm.2010.080937.
                                                                                                                                              127. U.S. Department of Health and Human Services, Office of Disease Prevention and Health Promotion. National Action Plan to Improve Health Literacy. Available at: Accessed November 20, 2016.
                                                                                                                                              128. U.S. National Library of Medicine, National Institutes of Health, U.S. Department of Health and Human Services. How to Write Easy-to-Read Health Materials. Available at: Accessed November 20, 2016.
                                                                                                                                              129. Vernon H, Mior S. The Neck Disability Index: a study of reliability and validity. J Manipulative Physiol Ther. 1991;14:409-415.
                                                                                                                                              130. Wang LW, Miller MJ, Schmitt MR, Wen FK. Assessing readability formula differences with written health information materials: application, results, and recommendations. Res Social Adm Pharm. 2013;9:503-516 10.1016/j.sapharm.2012.05.009.
                                                                                                                                              131. Ware JE Jr, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34:220-233 10.1097/00005650-199603000-00003.
                                                                                                                                                132. Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36): I. conceptual framework and item selection. Med Care. 1992;30:473-483 10.1097/00005650-199206000-00002.
                                                                                                                                                  133. Washburn RA, Smith KW, Jette AM, Janney CA. The Physical Activity Scale for the Elderly (PASE): development and evaluation. J Clin Epidemiol. 1993;46:153-162 10.1016/0895-4356(93)90053-4.
                                                                                                                                                    134. Webster DM, Richter L, Kruglanski AW. On leaping to conclusions when feeling tired: mental fatigue effects on impressional primacy. J Exper Social Psychol. 1996;32:181-195 10.1006/jesp.1996.0009.
                                                                                                                                                    135. Weiss BD. Health Literacy and Patient Safety: Help Patients Understand. Manual for Clinicians. 2nd ed. American Medical Association Foundation and the American Medical Association. Available at: Accessed November 20, 2016.
                                                                                                                                                    136. Zahiri CA, Schmalzried TP, Szuszczewicz ES, Amstutz HC. Assessing activity in joint replacement patients. J Arthroplasty. 1998;13:890-895 10.1016/S0883-5403(98)90195-4.

                                                                                                                                                      Supplementary material 1 (DOCX 17 kb)

                                                                                                                                                      Supplementary material 2 (DOCX 16 kb)

© 2017 Lippincott Williams & Wilkins