Special Communication

Creating an Assessment Tool for Clinical Musculoskeletal Knowledge

Yu, Jonathan MD; Li, Alexander D. MD; Leggit, Jeffrey C. MD, CAQSM

Current Sports Medicine Reports: February 2021 - Volume 20 - Issue 2 - p 124-128
doi: 10.1249/JSR.0000000000000812



Musculoskeletal (MSK) complaints comprise an estimated 20% of outpatient visits, especially in the primary care and emergency room settings, second only to upper respiratory complaints (1). MSK diseases affect 1 in 2 people in the United States older than 18 years and nearly 3 of 4 patients older than 65 years (2). Thus, it is imperative that health care providers have a solid foundation in managing the most typical MSK complaints seen in an outpatient setting.

In 2002, President George W. Bush declared 2002 to 2011 the “United States Bone and Joint” decade, providing needed national attention to the growing morbidity of MSK disease. The amount of required preclinical and clinical MSK instruction increased over that decade (3). However, recent studies have shown that health professional graduates continue to be inadequately prepared to address their patients' MSK complaints (4–7). Graduates across health professions and internationally recognize the breadth and prevalence of MSK conditions and have expressed a lack of confidence in their practice. In a cross-sectional survey of just under 300 primary care physicians, 80% of participants stated that they had a low confidence level in performing an MSK physical examination (8). A UK study noted a gap in medical school education, particularly in physical activity (PA) promotion and prescription, a useful tool in primary care (9). Nurse practitioners (NPs) are not immune to this either, with an online self-reporting survey showing that most NPs undergo less than 10 h of MSK education (10). This gap in skill can result not only in inadequate treatment of MSK complaints but also in adverse consequences in other realms of medicine. MSK disease reduces the level of PA that patients would otherwise maintain if healthy, leading to increased risk or progression of comorbidities, such as diabetes or atherosclerotic disease (11). Understanding fundamental MSK knowledge can assist in providing patients with the broad benefits of an exercise prescription. Prescribed PA, compared with pharmaceutical interventions, often shows similar if not improved efficacy in both primary and secondary prevention of cardiovascular disease, rehabilitation after stroke, and treatment of heart failure, with PA costing far less and carrying no risk of pharmaceutical side effects (12).

In response to the increased call for MSK education, programs have increased the MSK content in their curriculum or developed intensive high-yield training sessions (12,13). Unfortunately, there remains a dearth of curriculum assessment tools. One often-used MSK-specific examination, the Freedman and Bernstein (FB) examination, has limitations in implementation (6). The FB examination's development, validation, and relevance have come into question (6). Multiple institutions have distributed this examination and found that the results are lower than expected for the level of training. One example is a cohort of board-certified primary care providers about to enter a sports medicine fellowship who took the FB examination. Only 76% scored a passing grade greater than 70%, with an average score of 76.8%, much lower than would be expected for a group of board-certified physicians with specific interest in MSK knowledge (6). Additionally, the fill-in-the-blank format of the FB examination introduces subjectivity into examination grading and assessment, making this examination less reliable. Different graders scored the same answers for the same questions differently because of each grader's interpretation of the grading instructions, and some gave credit for answers that were not a part of the original examination answer key (6). Thus, a new 30-question multiple-choice assessment was created to evaluate trainees on management of the most common MSK-related complaints in current clinical practice.

The proposed MSK knowledge assessment is a tool designed to cover the highest-yield MSK topics seen in primary care populations and to identify both individual trainee weaknesses and overall gaps in the MSK curriculum of medical schools and primary care graduate medical education. The results of the tool can be used to target and strengthen those deficiencies, as well as identify individuals who may need additional training in a specific area. Detailed explanations and references for each question were created as well. The detailed explanations can serve as a didactic tool on an individual or group level, depending on how an individual or institution chooses to use them. This study describes the second iteration of the assessment tool, known as the “MSK30 2.0” examination, which was created by Cummings et al. (6), and represents a continuation of the project to assess and improve the reliability of the examination.


The first iteration of this study was aimed at developing a set of assessment questions and testing their validity. The questions were selected using a modified Delphi technique. The selection group consisted of two orthopedic surgeons, seven primary care sports medicine physicians, one family physician, one physical therapist, one rheumatologist, and six primary care sports medicine fellows (board-certified in physical medicine and rehabilitation, family medicine, or emergency medicine). A broad group of subject areas was presented to the panel, and agreement on the most important topics was determined via a numerical scoring system. From there, a 100-item question bank was narrowed down to 45 questions by the same numerical scoring system. This 45-question examination was administered to a cohort of fourth-year medical students (MS4) as a pilot test. The results of this pilot test were used to develop MSK30 1.0, which was then administered to a single Family Medicine postgraduate year 1 (PGY1) class and an entire MS4 class. The results of this iteration were published by Cummings et al. (6) and helped to establish validity.

The main aim of the current study was to assess the internal consistency of the examination in a larger and more diverse cohort. Testing occurred over two academic years. The second academic year's assessment was adjusted based on results from the previous year's analysis. Adjustments included removing, adding, and amending questions, as explained in more detail in the following paragraphs, to create the current version, MSK30 2.0.

Test Year 1 with MSK30 1.0

We administered the examination to postcore clerkship students at three United States medical schools and to primary care and transitional year interns and residents across multiple sites, including 1 transitional year program and 10 military and civilian family medicine programs. In addition to the assessment questions, four demographic questions were used for analysis purposes (Appendix A). All results were anonymous. We administered the examination via a SurveyMonkey® link and collected results on an Excel spreadsheet. We performed statistical analysis via Excel functions and SPSS, which identified overall score average, average by years of training, average of MD and DO examination takers, average of military and civilian examination takers, percent of examination takers who answered correctly for each examination question, and Cronbach's α for each examination question. Cronbach's α is a measure of internal consistency and reliability. It asks the question, “Does this data point correspond consistently with each other data point as well as the overall average values of the entire data set?” A Cronbach's α value of 0 corresponds to no consistency, whereas 1 corresponds to perfect consistency (e.g., if the overall average of an examination is 99% but there is a question that everyone keeps getting wrong, that question has a Cronbach's α near 0 because it is not consistent with the overall examination; therefore, that question is either extremely difficult or unclearly worded). After we gathered and analyzed the results, we reviewed and edited the examination to improve reliability and consistency. We identified examination questions that contributed to low Cronbach's α values and determined, via a modified Delphi technique, whether they needed to be reworded, completely removed, or kept as challenging questions for examination takers. This resulted in a refined examination to be used in year 2. This refined examination is called MSK30 2.0 (Appendices B and C). Six additional questions were added to MSK30 2.0. Five are variations of current questions, intended to develop a question bank that supports multiple variations of the MSK30 tool, and one covers a new question topic.
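As a rough illustration of the internal-consistency analysis described above, the following Python sketch computes Cronbach's α from a matrix of right/wrong item scores, together with the α that would result if a single question were dropped. The study itself used Excel functions and SPSS; the function names and the toy response data below are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of item scores (one row per examinee)."""
    k = len(scores[0])  # number of questions
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each question across examinees.
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the examinees' total scores.
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(scores, item):
    """Alpha recomputed with one question (0-based index) removed."""
    reduced = [row[:item] + row[item + 1:] for row in scores]
    return cronbach_alpha(reduced)


# Hypothetical data: 5 examinees x 4 questions, 1 = correct, 0 = incorrect.
# The last question deliberately runs against the other three.
toy_scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
]

print("alpha, all questions:", round(cronbach_alpha(toy_scores), 3))
for q in range(len(toy_scores[0])):
    print("alpha without question", q + 1, ":",
          round(alpha_if_item_deleted(toy_scores, q), 3))
```

In this kind of analysis, a question whose removal raises α noticeably is a candidate for rewording or replacement.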

Test Year 2 with MSK30 2.0

We sent the MSK30 2.0 to previously participating sites via SurveyMonkey® links to administer to a new set of learners. We collected results on Excel spreadsheets and again performed statistical analysis via Excel functions and SPSS.


Test Year 1

We administered the MSK30 1.0 assessment tool to 135 medical students, 49 PGY1, 10 PGY2, and 10 PGY3. The mean score was 76.8% (SD = 8.19%), with improvement based on training years (MS3/4s = 75.4%, PGY1 = 75.9%, PGY2 = 81%, PGY3 = 81.7%). No statistical differences were found between MD/DO graduates (77.9% [SD = 7.67%] vs. 76.9% [SD = 8.01%]) or military/civilian learners (78.4% [SD = 8.1%] vs. 73.1% [SD = 6.6%]). Overall reliability as determined by Cronbach's α was 0.379, which corresponds to poor reliability. We identified several questions with low percent correct and reexamined them (Fig. 1).

Figure 1:
Percent correct of each question for MSK30 version 1.0.

Test Year 2

The MSK30 2.0 assessment tool was taken by 181 medical students, 35 PGY1, 7 PGY2, 3 PGY3, and 3 faculty across several military and civilian residency programs. The mean score was 75.6% (SD = 7.8%). The revised assessment tool showed improved reliability (Cronbach's α = 0.432). Figure 2 shows the percentage answered correctly for each question in version 2.0. The new examination showed a statistically significant linear association between test scores and higher levels of training (P = 0.048) (Fig. 3). Scores did not differ significantly between MD and DO graduates (79.9% [SD = 6.8%] vs. 78.8% [SD = 7.6%], P = 0.65) or males and females (75.2% [SD = 7.0%] vs. 76.4% [SD = 7.4%], P = 0.22). Percent correct for each MSK subject category is shown in Figure 4.

Figure 2:
Percent correct of each question for MSK30 2.0.
Figure 3:
Scores of MSK30 2.0 given by “percent correct” across the different training levels in test year 2.
Figure 4:
Percent correct for each category, MSK30 2.0.


Building on the results of the previously validated MSK assessment tool, we expanded the testing pool. The results of the MSK30 2.0 showed continued validity and internal consistency. As expected, scores trended upward with years of training. Discriminant validity is supported by the absence of score differences by sex or between MD and DO examinees. Finally, the examination appears to have content validity because the examination questions were agreed upon in a modified Delphi process by a broad group of subject matter experts.

The initial MSK30 1.0 assessment tool, which had poor reliability (Cronbach's α = 0.379), was revised based on an item analysis that identified questions contributing to a lower Cronbach's α and on an inspection of the most common incorrect responses to these questions. Some questions were edited and others were replaced, and the new MSK30 2.0 was administered in 2019. The MSK30 2.0 has improved reliability (Cronbach's α = 0.432). A Cronbach's α of 0.50 or higher is generally considered to indicate moderate reliability. To address the low reliability of MSK30 1.0, we identified low-quality questions using item discrimination analysis and edited questions for clarity. A low-quality item is one that does not discriminate well between varying levels of mastery. A good-quality question is one that “better” students answer correctly and “poorer” students answer incorrectly.
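The idea of item discrimination can be sketched in Python as follows. The text does not state which discrimination statistic was used, so this example assumes the common upper-lower group index (proportion correct among the highest scorers minus proportion correct among the lowest scorers). The function name, the `fraction` parameter, and the response data are hypothetical.

```python
def discrimination_index(scores, item, fraction=0.27):
    """Upper-lower discrimination index for one question (0-based index).

    scores: rows of 0/1 item scores, one row per examinee.
    fraction: share of examinees placed in each comparison group
              (0.27 is a common convention; assumed here, not from the study).
    """
    # Rank examinees from highest to lowest total score.
    ranked = sorted(scores, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(row[item] for row in upper) / n
    p_lower = sum(row[item] for row in lower) / n
    return p_upper - p_lower


# Hypothetical data: 6 examinees x 3 questions, 1 = correct, 0 = incorrect.
demo_scores = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]

for q in range(len(demo_scores[0])):
    d = discrimination_index(demo_scores, q, fraction=0.5)
    print("question", q + 1, "discrimination:", round(d, 2))
```

Values near 1 mean that stronger examinees answer the question correctly far more often than weaker ones; values near 0 or below 0 mark a question that does not separate levels of mastery. Note that this simple version ranks examinees by a total score that includes the question being evaluated.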

An example of our editing process for a low-quality question follows (the correct answer is B).

An 18-year-old football player injured his foot and ankle after it was stepped on during a game. He is able to bear weight on the foot but has significant pain in the midfoot region. Which of the following findings on history and physical examination would be an indication for X-rays?

  • A) Pain with weight-bearing on injured foot/ankle
  • B) Tenderness on palpation of the 1st/2nd metatarsal bases
  • C) Tenderness over the lateral foot distal to the fibula
  • D) Tenderness at the anterior aspect of the medial malleolus

Analysis showed that only 30% of test-takers answered correctly, so the question lowered Cronbach's α. “A” was instead the most commonly selected answer. The goal was for the examinee first to recall the Ottawa Foot Rules and second to recognize that pain in the midfoot region is a criterion. An examinee who did both would interpret the location of the first/second metatarsal bases as essentially the midfoot area; thus, choice “B” is the correct answer. Discussion among the research team concluded that the answer choices may not have been clear enough: examinees may not have chosen “B” because it did not specifically state the midfoot region, leading them to choose “A” even though it is inability to bear weight, not pain with weight-bearing, that meets the criteria of the Ottawa Foot Rules. Answer choice “B” was then changed to state, “Tenderness on palpation of the navicular bone.” With this change, 67% of examinees correctly answered the question in MSK30 2.0. Items that are too difficult generally do not discriminate well between different levels of mastery. All questions with less than 50% correct were re-examined and showed improvement in the second year of testing (Figs. 1 and 2), except for questions 21 and 22. Analysis of question 21 found that learners were not familiar with the term “Sever's disease,” so calcaneal apophysitis was added to the answer choice for future iterations.
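The 50% screening step described above can be sketched as a short Python routine that computes percent correct per question and lists the questions falling below a threshold. The function names and the response data are hypothetical; only the 50% cutoff comes from the text.

```python
def percent_correct(scores):
    """Percent of examinees answering each question correctly."""
    n = len(scores)
    k = len(scores[0])
    return [100.0 * sum(row[i] for row in scores) / n for i in range(k)]


def flag_difficult(scores, threshold=50.0):
    """1-based numbers of questions with percent correct below the threshold."""
    return [i + 1 for i, pct in enumerate(percent_correct(scores))
            if pct < threshold]


# Hypothetical data: 4 examinees x 3 questions, 1 = correct, 0 = incorrect.
sample_scores = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]

print("percent correct:", percent_correct(sample_scores))
print("questions to re-examine:", flag_difficult(sample_scores))
```

A flagged question is only a prompt for review; as in the example above, the fix may be a clearer answer choice and not necessarily removal of the question.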

In conclusion, the MSK30 2.0 is a valid method of assessing clinical MSK knowledge. Development of the examination content is an ongoing process. Although the current version is just below the standard of a moderately reliable examination, the authors plan to continually evaluate low-quality questions and revise them as deemed appropriate. In addition, examination content will be reviewed periodically by the same Delphi process used to create the initial examination. In the future, we hope to create three versions of the assessment tool so that it can be used to assess learners during a 3-year residency. The detailed didactic tool can be used by programs to teach or reinforce the MSK topic areas. During these didactic sessions, hands-on portions of the MSK assessments can be reviewed as well.

The authors declare no conflict of interest and do not have any financial disclosures.


1. DiGiovanni BF, Sundem LT, Southgate RD, Lambert DR. Musculoskeletal medicine is underrepresented in the American medical school clinical curriculum. Clin. Orthop. Relat. Res. 2016; 474:901–7.
2. United States Bone and Joint Initiative. The Burden of Musculoskeletal Diseases in the United States. Rosemont, IL, USA: American Academy of Orthopaedic Surgeons; 2016.
3. Bernstein J, Garcia GH, Guevara JL, Mitchell GW. Progress report: the prevalence of required medical school instruction in musculoskeletal medicine at decade's end. Clin. Orthop. Relat. Res. 2011; 469:895–7.
4. Day CS, Yeh AC, Franko O, et al. Musculoskeletal medicine: an assessment of the attitudes and knowledge of medical students at Harvard Medical School. Acad. Med. 2007; 82:452–7.
5. Comer GC, Liang E, Bishop JA. Lack of proficiency in musculoskeletal medicine among emergency medicine physicians. J. Orthop. Trauma. 2014; 28:e85–7.
6. Cummings DL, Smith M, Merrigan B, et al. MSK30: a validated tool to assess clinical musculoskeletal knowledge. BMJ Open Sport Exerc. Med. 2019; 5:e000495.
7. Radenkovic D, Aswani R, Ahmad I, et al. Lifestyle medicine and physical activity knowledge of final year UK medical students. BMJ Open Sport Exerc. Med. 2019; 5:e000518.
8. Abou-Raya A, Abou-Raya S. The inadequacies of musculoskeletal education. Clin. Rheumatol. 2010; 29:1121–6.
9. Osborne SA, Adams JM, Fawkner S, et al. Tomorrow's doctors want more teaching and training on physical activity for health. Br. J. Sports Med. 2017; 51:624–5.
10. Benham AJ, Geier KA, Salmond S. How well are nurse practitioners prepared to treat common musculoskeletal conditions? Orthop. Nurs. 2016; 35:325–9.
11. Thompson WR, Sallis R, Joy E, et al. Exercise is medicine. Am. J. Lifestyle Med. 2020; 14:511–23.
12. Joyner MJ, Sanchis-Gomar F, Lucia A. Exercise medicine education should be expanded. Br. J. Sports Med. 2017; 51:625–6.
13. Battistone MJ, Barker AM, Grotzke MP, et al. "Mini-residency" in musculoskeletal care: a National Continuing Professional Development Program for primary care providers. J. Gen. Intern. Med. 2016; 31:1301–7.

Written work prepared by employees of the Federal Government as part of their official duties is, under the U.S. Copyright Act, a “work of the United States Government” for which copyright protection under Title 17 of the United States Code is not available. As such, copyright does not extend to the contributions of employees of the Federal Government.