Quality of Reporting of Orthopaedic Diagnostic Accuracy Studies is Suboptimal : Clinical Orthopaedics and Related Research®


SECTION III: REGULAR AND SPECIAL FEATURES

Quality of Reporting of Orthopaedic Diagnostic Accuracy Studies is Suboptimal

Siva Rama, Krishna R. Boddu MRCS*; Poovali, Sharmila MD; Apsingi, Sunil MS

Author Information
Clinical Orthopaedics and Related Research 447():p 237-246, June 2006. | DOI: 10.1097/01.blo.0000205906.44103.a3

Abstract

The orthopaedic surgeon's armamentarium for diagnosing various conditions has expanded because of rapidly evolving technologies and increasingly available clinical evidence. Studies of diagnostic accuracy evaluate the performance of such tests in detecting a target condition that can be a particular disease, a disease stage, or any health condition that should prompt clinical action. Critical assessment of validity of diagnostic accuracy studies and applicability of their results to patients are essential for an orthopaedic surgeon.4 Systematic surveys of diagnostic accuracy studies revealed a poor state of reporting with lacunae in the essential information on their design, conduct, and analysis.17,21 Deficiencies in the methodologic standards of such studies can overestimate the diagnostic performance of a test.17 Complete and accurate reporting of diagnostic accuracy studies are needed to allow surgeons to assess the validity of study results.

To improve the quality of reporting of diagnostic accuracy studies, the STARD steering committee developed a checklist and a generic flow diagram, which were published in January 2003 in eight medical journals (American Journal of Clinical Pathology, Annals of Internal Medicine, British Medical Journal, Clinical Biochemistry, Clinical Chemistry, Clinical Chemistry and Laboratory Medicine, Lancet, and Radiology).5,6 A similar checklist for randomized controlled trials (the CONSORT statement)1 seemed to have improved the quality of reporting of randomized controlled trials published in three medical journals (British Medical Journal, Journal of the American Medical Association, and Lancet).18 Applying the CONSORT criteria, Bhandari et al showed most reports of randomized trials in orthopaedic trauma failed to provide essential information.3 The quality of reporting of diagnostic accuracy studies in the orthopaedic literature has not, to our knowledge, been evaluated.

We questioned whether the standards of reporting of studies of diagnostic accuracy would differ among three leading general orthopaedic journals, and whether these would differ from those of other subspecialties journals. We also questioned whether there would be an association between the reporting standards and the design and level of evidence of the studies.

MATERIALS AND METHODS

We identified and included all diagnostic accuracy studies published during a 3-year period from 2002 to 2004, in three leading general orthopaedic journals (based on the impact factor in 2003), excluding basic science research and subspecialty journals. We examined reporting of STARD checklist items individually for these studies and calculated the STARD scores.

The three journals included were Clinical Orthopaedics and Related Research and the American and British volumes of the Journal of Bone and Joint Surgery (impact factors of 1.40, 1.95, and 1.33 in 2004, respectively13). Because the various Medline search strategies for diagnostic accuracy studies are suboptimal,9,11 two authors (KB and SP) independently conducted a manual search by examining the titles and abstracts of the articles cited in PubMed. Review articles, meta-analyses, case reports, and letters were excluded initially. Studies were included if they involved human subjects and evaluated the diagnostic value of one or more tests (index tests) against a reference standard in the same study population. As defined by the STARD statement, a test here refers to any procedure for obtaining additional information regarding a patient's health status, and therefore includes history, physical examination, function tests, and investigative procedures.5 The exclusion criteria were studies evaluating the outcome predictive value of a test and studies not using a reference standard test. Disagreements between the authors were resolved by discussions and consensus meetings involving the third reviewer as an independent arbiter.

Thirty-seven articles (Appendix) were included from 3661 published studies (Fig 1). Four disagreements regarding inclusion of studies were resolved by the third reviewer's opinion. The articles were grouped by study design into cohort studies (n = 27) and case-control studies (n = 10). Cohort studies are characterized by selection of subjects who had the index test, whereas in case-control studies the subjects are selected on the basis of the results of the reference standard. The distribution among the three journals was 14 articles published in the American volume of the Journal of Bone and Joint Surgery, eight in the British volume of the Journal of Bone and Joint Surgery, and 15 in Clinical Orthopaedics and Related Research. Eleven were published in 2002, seven in 2003, and 19 in 2004. Estimating the diagnostic value of one or more tests was the main purpose of all the studies except two,12,20 which had broader aims and objectives. Levels of evidence for diagnosis were used to rank the validity of the evidence presented in a diagnostic study.7 Cohort studies testing previously developed diagnostic criteria produce Level 1 evidence; cohort studies developing such criteria from the collected information produce Level 2 evidence; nonconsecutive studies or studies with inconsistently applied reference standards produce Level 3 evidence; case-control studies produce Level 4 evidence; and expert opinion produces Level 5 evidence. In the current study, 14 articles had Level 1 evidence, 11 had Level 2 evidence, two had Level 3 evidence, and 10 had Level 4 evidence. Most of the index tests in these articles were imaging methods (n = 21), followed by clinical tests (n = 7) (Table 1). The most common reference standard was intraoperative findings during open surgery or arthroscopy (n = 11).

Fig 1: A flow chart shows the process of selecting the articles on diagnostic accuracy.

TABLE 1: Characteristics of Index Tests and Reference Standards

The STARD statement was used to assess the quality of reporting of the articles included in our study. The statement contains a checklist of 25 items (Table 2), and recommends a flow diagram (Fig 2) to represent the study design and provide the exact number of participants at each stage of the study.5,6 Each item of the checklist was assigned a yes or no response depending on whether it was reported in the article. If an item was not applicable, it was marked as such. For Item 3, if only one of the two components was described (either the inclusion and exclusion criteria or the settings and location of the study), the item was marked as partially fulfilled. For Items 12 and 21, a partially fulfilled response was given if the measures of statistical uncertainty were not mentioned for the reported measures of diagnostic accuracy. The seven items concerning the index test and the reference standard (Items 8, 9, 10, 11, 13, 20, and 24) were scored as separate subitems: a for the index test and b for the reference standard. Two authors (KB and SA) independently assessed all the articles, and disagreements were resolved in a consensus meeting.

Fig 2: A prototype of a flow diagram for a study on diagnostic accuracy is shown. (Reproduced with permission from the BMJ Publishing Group and Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, Lijmer JG, Moher D, Rennie D, de Vet HC; Standards for Reporting of Diagnostic Accuracy. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. BMJ. 2003;326:41-44.)

TABLE 2: Quality of Reporting of Diagnostic Accuracy Studies (n = 37) According to STARD Checklist* (parts A and B)

A total STARD score was calculated as described by Smidt et al,24 but with a modification. Equal weight (1 point) was allocated to each of the 25 items, and the STARD score for each article (0 to 25 points available) was derived by summing them. A yes response scored 1 point, a no response 0 points, and a partially fulfilled response ½ point. One-half point each was assigned to the subitems (a and b), if any. The modification was that, if an item was not applicable, the score was calculated over the applicable items and extrapolated mathematically to 25, to avoid bias in scoring attributable to nonapplicable items. The intraclass correlation coefficient between the scores of the two reviewers was good (r = 0.88). We estimated interrater agreement for the items of the checklist with Cohen's kappa statistic (applying the grading criteria of Landis and Koch16) in three randomly selected articles. The mean STARD scores of subgroups based on journal, year of publication, level of evidence, and study design were compared using ANOVA and Student's t test (independent samples) as appropriate. A p value less than 0.05 was considered significant.
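The modified scoring rule can be sketched as follows; this is our illustration, not code from the study, and the item responses in the example are hypothetical.

```python
def stard_score(responses):
    """Modified STARD score: each of 25 items scores 1 (yes),
    0 (no), 0.5 (partially fulfilled), or None (not applicable).
    Subitems (a/b) are assumed already combined into one value."""
    applicable = [v for v in responses.values() if v is not None]
    raw = sum(applicable)
    # Extrapolate the score over applicable items to the full
    # 25-item scale, so nonapplicable items do not bias the score.
    return raw / len(applicable) * 25

# Hypothetical article: 18 "yes", 3 "no", 2 "partial", 2 not applicable.
example = {i: 1 for i in range(1, 19)}
example.update({19: 0, 20: 0, 21: 0, 22: 0.5, 23: 0.5, 24: None, 25: None})
print(round(stard_score(example), 1))  # → 20.7
```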

RESULTS

Only 38% of the articles (14 of 37) reported more than ⅔ of the STARD items, and no study reported 22 or more of the 25 items (Fig 3). The majority of the studies did not report one or more of nine specific items (Items 1, 3, 10, 11, 12, 13, 21, 23, and 24) (Table 2). The predominantly neglected items (≥ 80% “no” responses) were the identification of an article as a diagnostic accuracy study (Item 1), the methods for calculating test reproducibility (Item 13), and the estimates of test reproducibility (Item 24). Description of participant recruitment (Item 4) and discussion of the clinical applicability of the study findings (Item 25) were the best reported items. The interrater agreement for the reporting of checklist items was good (κ = 0.7).
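A minimal sketch of Cohen's kappa, the interrater agreement statistic used here, for two raters scoring the same items; the item-level responses below are hypothetical, not taken from the study data.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    categories = set(ratings_a) | set(ratings_b)
    # Expected agreement under independence of the two raters.
    expected = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical yes/no responses from two reviewers on 10 checklist items.
a = ["yes", "yes", "no", "yes", "no", "yes", "no", "yes", "yes", "no"]
b = ["yes", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes"]
print(round(cohens_kappa(a, b), 2))  # → 0.58 ("moderate" per Landis and Koch)
```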

Fig 3: This scatter plot shows the percentage of diagnostic accuracy studies against the number of STARD checklist items reported (of a total of 25).

Many of the articles (92%) were not identified as diagnostic accuracy studies by the words diagnostic accuracy in the title, abstract, or key words, which would facilitate electronic retrieval (Item 1). A Medline search using the Medical Subject Headings (MeSH) term “sensitivity and specificity” identified 73% (27 of 37) of the included articles, but with 135 false positives (positive predictive value, 16.7%). Although the research question (Item 2) was understandable in 76% of the articles after reading the abstract and introduction, a clearly specified research question mentioning the index test, the reference standard, and the target condition together was found in only 19% (seven of 37) of the articles.
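The search-performance figures above can be checked directly; the helper name below is ours, for illustration only.

```python
def positive_predictive_value(true_pos, false_pos):
    """Fraction of retrieved articles that are actually relevant."""
    return true_pos / (true_pos + false_pos)

# Figures from the text: the MeSH search retrieved 27 of the 37
# included articles, plus 135 irrelevant articles.
sensitivity = 27 / 37
ppv = positive_predictive_value(27, 135)
print(f"{sensitivity:.0%}, {ppv:.1%}")  # → 73%, 16.7%
```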

Reporting of the methods was inadequate in the majority of the articles, particularly regarding Items 3, 10, 11, 12, and 13 (Table 2). Description of the study population was complete in only 49% (18 of 37) of the articles, because the remaining articles did not report exclusion criteria (Item 3). Only 57% (21 of 37) of the articles reported how the participants were selected, and all of these included consecutive series of patients (Item 5). Information regarding the index tests was better reported than that regarding the reference standards (Items 8-11 and 13). Nearly half (46%) of the articles did not describe the rationale for the reference standard. The training and expertise of the persons performing and reading the tests (Item 10) were not reported in the majority of the studies (65%). Blinding of the readers of one test (index or reference) to the results of the other test was assumed if the article reported that the particular test was done before the other test (Item 11). Only 35% (13 of 37) of the articles reported blinding of the readers of the index test and the reference standard.

Many articles reported the results of the study inadequately, mainly because Items 17 and 21 to 24 were not reported (Table 2). Although 78% (29 of 37) of the articles presented the numbers of included patients who had the tests, only two articles (5.4%) presented these data in a flow diagram (Item 16). Only 46% of the articles reported the time between the index test and the reference standard (Item 17). Many of the articles (84%) reported at least one measure of diagnostic accuracy (Item 21), but only three (8%) reported likelihood ratios. Four articles (11%) reported diagnostic odds ratios, and three (8%) reported receiver operating characteristic (ROC) curves. The majority of the studies (73%) did not quantify the statistical uncertainty of these diagnostic accuracy indices. Estimates of test reproducibility for the index test and the reference standard were reported in only 8% of the studies (Item 24).

The STARD scores of the 37 articles ranged from 6.6 to 21.4, with a mean of 15.0 and a standard deviation of 3.3. We found similar mean STARD scores for cohort studies (15.5 ± 3.6) and case-control studies (13.8 ± 2.0). The mean STARD scores also were similar among articles with different levels of evidence (Table 3), across the three journals (Table 4), and by year of publication (Table 5).

TABLE 3: Quality of Reporting of Diagnostic Accuracy Studies vs Levels of Evidence7

TABLE 4: Quality of Reporting of Diagnostic Accuracy Studies among the Three Major General Orthopaedic Journals

TABLE 5: Quality of Reporting of Diagnostic Accuracy Studies Published during Three Consecutive Years

DISCUSSION

Adequate reporting of diagnostic accuracy studies is essential for proper interpretation of their results. The STARD statement helps authors and readers in this regard by providing a checklist of the essential information that needs to be reported. In this study, the quality of reporting of diagnostic accuracy studies in three orthopaedic journals was evaluated using the STARD parameters and was found to be suboptimal.

Our study has several limitations. Including only three orthopaedic journals (and the resulting small sample size) limits the generalizability of the results. However, two previous studies22,24 evaluating diagnostic accuracy reports in other specialty journals using the STARD parameters found inadequacies similar to those in the current study. Smidt et al24 evaluated 124 articles published in 2000 in 12 high-impact medical journals and found that the quality of reporting was less than optimal. The specific STARD items with poor reporting were nearly the same as in the current study. The mean STARD score was 11.9 ± 3.3 in their study, compared with 15.0 ± 3.3 in ours. The modified scoring method in our study, as described earlier, might account for some of this difference. Siddiqui et al22 found similar flaws in recent ophthalmic publications.

Some in the orthopaedic community may argue that such meticulous reporting using a checklist is relevant to basic science research rather than to clinical studies. However, few items of the STARD checklist proved inapplicable in our study (Table 2), showing the checklist's applicability to orthopaedic diagnostic studies. A strong scientific basis is essential for evaluating the usefulness of any diagnostic test, whether clinical or laboratory based. Our study also revealed the paucity of diagnostic accuracy studies in the orthopaedic literature (only 1% of the published articles). Therefore, more diagnostic research in orthopaedics should be encouraged to ensure that the current best evidence is available for decision making.

Most of the studies did not provide a full description of their methods (Table 2). The methods of a diagnostic accuracy study significantly affect the study outcome, its external validity (generalizability), and its internal validity (potential for bias).17,21 The methods for selecting the study population and the sampling techniques (Items 3 and 5) are essential for assessing the generalizability of the study results. An example of a good description is:

“All 1037 patients with shoulder pain who attended the shoulder clinic of the senior author between August 1999 and August 2002 were asked to map … the point of maximal shoulder pain … One hundred and thirteen patients who localized the pain within the area bounded by the midpart of the clavicle and the deltoid insertion on this diagram were eligible for inclusion … Exclusion criteria included (1) previous distal clavicular or acromioclavicular joint surgery … (7) markings … that extended beyond the area defined.”26

A reference standard or gold standard test must be nearly infallible, if not perfect, in detecting the target condition. Otherwise the imperfect gold standard bias can lead to unreliable or incorrect estimates of diagnostic accuracy.27 However, choosing the perfect reference standard may not be feasible in many of the clinical studies because of practical considerations. Therefore, a good discussion of the justification for the chosen reference standard is essential to interpret the results accordingly (Item 7). A good example is:

“Ultrasound and clinical examination were compared with arthroscopically detected synovitis as the gold standard … Arthroscopy allows direct visualization of the synovial membrane and structures within the joint compartment. It … has been validated against histological findings in both OA and inflammatory arthritis ([8][9]). In addition, recent evidence suggests that synovitis visible on arthroscopy is a predictor of progression of OA ([4]).”15

Verification bias can occur if the reference standard is performed only in subjects who were positive for the index test.27 Reporting the number of patients, of the total included, who had each test, and the sequence of testing, allows readers to assess the potential for such bias. The STARD statement recommends a flow diagram to communicate such information effectively. Information on masking of the readers of each test to the other test results and on the training and expertise of the persons conducting the tests (Items 10 and 11) is vital for analyzing the test results, because it reveals the potential for review bias. A good example of reporting follows:

“one of two radiologists prospectively performed the ultrasonographic examination and one musculoskeletal radiologist retrospectively reinterpreted the original magnetic resonance imaging study. Each radiologist was fellowship-trained and had more than ten years of experience with the test. All were blinded to the other radiologists' interpretations and to the arthroscopic findings … To avoid compromising patient care, the orthopaedic surgeon was aware of the ultrasonographic findings and the prospective interpretations of the magnetic resonance imaging tests.”25

Estimation of test reproducibility, and the methods for calculating it, were neglected in most of the studies (86%). Poor reproducibility of a test, because of instrumental and/or observer variability, can adversely affect the estimates of diagnostic accuracy.

Adequate reporting of the results also was deficient in most of the studies (Table 2). Depending on the quality of the data obtained, diagnostic accuracy can be measured with indices such as sensitivity and specificity, predictive values, likelihood ratios, diagnostic odds ratios, and ROC curves.8 ROC curves are useful for depicting the performance of a test at different diagnostic thresholds, whereas diagnostic odds ratios are more valuable when combining studies in a systematic review.8 Likelihood ratios link estimates of pretest probability to posttest probability and are more useful to a clinician for interpreting tests (Table 6).3,8 Because the specific values of these diagnostic accuracy indices are merely estimates, their precision (such as 95% confidence intervals) must be quantified so that the clinician knows the range within which the true values of the indices are likely to lie (Table 6).10 Few articles reported likelihood ratios (8%) or precision estimates (27%) (Table 2).

TABLE 6: Likelihood Ratios for a Diagnostic Test in a Data Example

Our study revealed inadequate standards of reporting regardless of the level of evidence (Table 3) and the design of the study. The value of a well-designed and well-conducted study is undermined if it is poorly reported. Readers need to know what actually was done rather than assume what was done.

Poor standards of reporting of diagnostic accuracy studies in the recent orthopaedic literature are evident. The success of the CONSORT checklist in improving the reporting of randomized controlled trials18 prompted numerous journals to adopt it in their instructions to authors. Adopting the STARD checklist likewise is desirable for journals seeking to improve the reporting of diagnostic accuracy studies, although evidence of its effect is not yet available. The primary intention of the STARD statement is to improve the reporting of diagnostic accuracy studies, not their methodologic standards. However, authors of studies with methodologic deficiencies may tend to underreport the deficient aspects. Stringent requirements for quality of reporting compel authors to reveal the standards of all aspects of a study, which in turn may improve methodologic quality in the long run. We strongly recommend the STARD statement to authors, readers, reviewers, and editors working with articles of diagnostic accuracy, to evaluate and improve the quality of reporting.

References

1. Altman DG, Schulz KF, Moher D, Egger M, Davidoff F, Elbourne D, Gotzsche PC, Lang T. CONSORT GROUP (Consolidated Standards of Reporting Trials). The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med. 2001;134:663-694.
2. Armitage P, Berry G. Statistical Methods in Medical Research. 3rd ed. London, UK: Blackwell; 1994: 131.
3. Bhandari M, Guyatt GH, Lochner H, Sprague S, Tornetta P III. Application of the Consolidated Standards of Reporting Trials (CONSORT) in the fracture care literature. J Bone Joint Surg Am. 2002;84:485-489.
4. Bhandari M, Montori VM, Swiontkowski MF, Guyatt GH. User's guide to the surgical literature: how to use an article about a diagnostic test. J Bone Joint Surg Am. 2003;85:1133-1140.
5. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, Lijmer JG, Moher D, Rennie D, de Vet HC. Standards for Reporting of Diagnostic Accuracy. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. BMJ. 2003;326:41-44.
6. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, Moher D, Rennie D, de Vet HC, Lijmer JG. Standards for Reporting of Diagnostic Accuracy. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Clin Chem. 2003;49:7-18.
7. Centre for Evidence-Based Medicine, Institute of Health Sciences, Oxford. Levels of evidence and grades of recommendations. Available at http://www.cebm.net/levels_of_evidence.asp. Accessed September 21, 2005.
8. Deeks JJ. Systematic reviews in health care: systematic reviews of evaluations of diagnostic and screening tests. BMJ. 2001;323:157-162.
9. Deville WL, Bezemer PD, Bouter LM. Publications on diagnostic test evaluation in family medicine journals: an optimal search strategy. J Clin Epidemiol. 2000;53:65-69.
10. Harper R, Reeves B. Reporting of precision of estimates for diagnostic accuracy: a review. BMJ. 1999;318:1322-1323.
11. Haynes RB, Wilczynski NL. Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey. BMJ. 2004;328:1040-1042.
12. Hoffman EB, Allin J, Campbell JA, Leisegang FM. Tuberculosis of the knee. Clin Orthop Relat Res. 2002;398:100-106.
13. ISI web of knowledge. Journal citation reports® JCR science edition 2004. Available at http://wok.mimas.ac.uk/. Accessed January 25, 2006.
14. Jaeschke R, Guyatt G, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1994;271:389-391.
15. Karim Z, Wakefield RJ, Quinn M, Conaghan PG, Brown AK, Veale DJ, O'Connor P, Reece R, Emery P. Validation and reproducibility of ultrasonography in the detection of synovitis in the knee: a comparison with arthroscopy and clinical examination. Arthritis Rheum. 2004;50:387-394.
16. Landis JR, Koch G. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.
17. Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, Bossuyt PM. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA. 1999;282:1061-1066.
18. Moher D, Jones A, Lepage L. CONSORT Group (Consolidated Standards for Reporting of Trials). Use of the CONSORT statement and quality of reports of randomized trials: a comparative before-and-after evaluation. JAMA. 2001;285:1992-1995.
19. Newcombe RG. Two-sided confidence intervals for the single proportion: comparison of seven methods. Stat Med. 1998;17:857-872.
20. Nogueira MP, Paley D, Bhave A, Herbert A, Nocente C, Herzenberg JE. Nerve lesions associated with limb-lengthening. J Bone Joint Surg Am. 2003;85:1502-1510.
21. Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in diagnostic test research: getting better but still not good. JAMA. 1995;274:645-651.
22. Siddiqui MA, Azuara-Blanco A, Burr J. The quality of reporting of diagnostic accuracy studies published in ophthalmic journals. Br J Ophthalmol. 2005;89:261-265.
23. Simel DL, Samsa GP, Matchar DB. Likelihood ratios with confidence: sample size estimation for diagnostic test studies. J Clin Epidemiol. 1991;44:763-770.
24. Smidt N, Rutjes AW, van der Windt DA. Quality of reporting of diagnostic accuracy studies. Radiology. 2005;235:347-353.
25. Teefey SA, Rubin DA, Middleton WD, Hildebolt CF, Leibold RA, Yamaguchi K. Detection and quantification of rotator cuff tears: comparison of ultrasonographic, magnetic resonance imaging, and arthroscopic findings in seventy-one consecutive cases. J Bone Joint Surg Am. 2004;86:708-716.
26. Walton J, Mahajan S, Paxinos A, Marshall J, Bryant C, Shnier R, Quinn R, Murrell GA. Diagnostic values of tests for acromioclavicular joint pain. J Bone Joint Surg Am. 2004;86:807-812.
27. Weinstein S, Obuchowski NA, Lieber ML. Clinical evaluation of diagnostic tests. AJR Am J Roentgenol. 2005;184:14-19.

APPENDIX

Articles of diagnostic accuracy included in our study

1. Aoki Y, Yasuda K, Tohyama H, Ito H, Minami A. Magnetic resonance imaging in stress fractures and shin splints. Clin Orthop Relat Res. 2004;421:260-267.
2. Banit DM, Kaufer H, Hartford JM. Intraoperative frozen section analysis in revision total joint arthroplasty. Clin Orthop Relat Res. 2002;401:230-238.
3. Bhat M, McCarthy M, Davis TR, Oni JA, Dawson S. MRI and plain radiography in the assessment of displaced fractures of the waist of the carpal scaphoid. J Bone Joint Surg Br. 2004;86:705-713.
4. Brown MD, Gomez-Marin O, Brookfield KF, Li PS. Differential diagnosis of hip disease versus spine disease. Clin Orthop Relat Res. 2004;419:280-284.
5. Dao KD, Solomon DJ, Shin AY, Puckett ML. The efficacy of ultrasound in the evaluation of dynamic scapholunate ligamentous instability. J Bone Joint Surg Am. 2004;86:1473-1478.
6. Davies AP, Vince AS, Shepstone L, Donell ST, Glasgow MM. The radiologic prevalence of patellofemoral osteoarthritis. Clin Orthop Relat Res. 2002;402:206-212.
7. Egol KA, Amirtharajah M, Tejwani NC, Capla EL, Koval KJ. Ankle stress test for predicting the need for surgical fixation of isolated fibular fractures. J Bone Joint Surg Am. 2004;86:2393-2398.
8. Hilibrand AS, Schwartz DM, Sethuraman V, Vaccaro AR, Albert TJ. Comparison of transcranial electric motor and somatosensory evoked potential monitoring during cervical spine surgery. J Bone Joint Surg Am. 2004;86:1248-1253.
9. Hoffman EB, Allin J, Campbell JA, Leisegang FM. Tuberculosis of the knee. Clin Orthop Relat Res. 2002;398:100-106.
10. Jari S, Paton RW, Srinivasan MS. Unilateral limitation of abduction of the hip: a valuable clinical sign for DDH? J Bone Joint Surg Br. 2002;84:104-107.
11. Juliao SF, Rand N, Schwartz HS. Galectin-3: a biologic marker and diagnostic aid for chordoma. Clin Orthop Relat Res. 2002;397:70-75.
12. Kanamiya T, Hara M, Naito M. Magnetic resonance evaluation of remodeling process in patellar tendon graft. Clin Orthop Relat Res. 2004;419:202-206.
13. Kaste SC, Hill A, Conley L, Shidler TJ, Rao BN, Neel MM. Magnetic resonance imaging after incomplete resection of soft tissue sarcoma. Clin Orthop Relat Res. 2002;397:204-211.
14. Keeney JA, Peelle MW, Jackson J, Rubin D, Maloney WJ, Clohisy JC. Magnetic resonance arthrography versus arthroscopy in the evaluation of articular hip pathology. Clin Orthop Relat Res. 2004;429:163-169.
15. Keret D, Ezra E, Lokiec F, Hayek S, Segev E, Wientroub S. Efficacy of prenatal ultrasonography in confirmed club foot. J Bone Joint Surg Br. 2002;84:1015-1019.
16. Khachatourians AG, Patzakis MJ, Roidis N, Holtom PD. Laboratory monitoring in pediatric acute osteomyelitis and septic arthritis. Clin Orthop Relat Res. 2003;409:186-194.
17. Kocher MS, Mandiga R, Zurakowski D, Barnewolt C, Kasser JR. Validation of a clinical prediction rule for the differentiation between septic arthritis and transient synovitis of the hip in children. J Bone Joint Surg Am. 2004;86:1629-1635.
18. Koslowsky TC, Mader K, Gausepohl T, Heidemann J, Pennig D, Koebke J. Ultrasonographic stress test of the metacarpophalangeal joint of the thumb. Clin Orthop Relat Res. 2004;427:115-119.
19. Lee DH, Lee KH, Lopez-Ben R, Bradley EL. The double-density sign: a radiographic finding suggestive of an os acromiale. J Bone Joint Surg Am. 2004;86:2666-2670.
20. Lee FY, Yu J, Chang SS, Fawwaz R, Parisien MV. Diagnostic value and limitations of fluorine-18 fluorodeoxyglucose positron emission tomography for cartilaginous tumors of bone. J Bone Joint Surg Am. 2004;86:2677-2685.
21. Luhmann SJ, Jones A, Schootman M, Gordon JE, Schoenecker PL, Luhmann JD. Differentiation between septic arthritis and transient synovitis of the hip in children with clinical prediction algorithms. J Bone Joint Surg Am. 2004;86:956-962.
22. McConnell T, Creevy W, Tornetta P III. Stress examination of supination external rotation-type fibular fractures. J Bone Joint Surg Am. 2004;86:2171-2178.
23. Mehin R, Yuan X, Haydon C, Rorabeck CH, Bourne RB, McCalden RW, MacDonald SJ. Retroacetabular osteolysis: when to operate? Clin Orthop Relat Res. 2004;428:247-255.
24. Molloy S, Solan MC, Bendall SP. Synovial impingement in the ankle: a new physical sign. J Bone Joint Surg Br. 2003;85:330-333.
25. Nogueira MP, Paley D, Bhave A, Herbert A, Nocente C, Herzenberg JE. Nerve lesions associated with limb-lengthening. J Bone Joint Surg Am. 2003;85:1502-1510.
26. Prickett WD, Teefey SA, Galatz LM, Calfee RP, Middleton WD, Yamaguchi K. Accuracy of ultrasound imaging of the rotator cuff in shoulders that are painful postoperatively. J Bone Joint Surg Am. 2003;85:1084-1089.
27. Roder C, Eggli S, Aebi M, Busato A. The validity of clinical examination in the diagnosis of loosening of components in total hip arthroplasty. J Bone Joint Surg Br. 2003;85:37-44.
28. Saifuddin A, Heffernan G, Birch R. Ultrasound diagnosis of shoulder congruity in chronic obstetric brachial plexus palsy. J Bone Joint Surg Br. 2002;84:100-103.
29. Saltzman CL, Rashid R, Hayes A, Fellner C, Fitzpatrick D, Klapach A, Frantz R, Hillis SL. 4.5-gram monofilament sensation beneath both first metatarsal heads indicates protective foot sensation in diabetic patients. J Bone Joint Surg Am. 2004;86:717-723.
30. Schaefer O, Winterer J, Lohrmann C, Laubenberger J, Reichelt A, Langer M. Magnetic resonance imaging for supraspinatus muscle atrophy after cuff repair. Clin Orthop Relat Res. 2002;403:93-99.
31. Shapeero LG, Vanel D, Verstraete KL, Bloem JL. Fast magnetic resonance imaging with contrast for soft tissue sarcoma viability. Clin Orthop Relat Res. 2002;397:212-227.
32. Sharp RJ, Wade CM, Hennessy MS, Saxby TS. The role of MRI and ultrasound imaging in Morton's neuroma and the effect of size of lesion on symptoms. J Bone Joint Surg Br. 2003;85:999-1005.
                                                                          33. Sugimoto K, Takakura Y, Samoto N, Nakayama S, Tanaka Y. Subtalar arthrography in recurrent instability of the ankle. ClinOrthop Relat Res. 2002;394:169-176.
                                                                            34. Takao M, Ochi M, Oae K, Naito K, Uchio Y. Diagnosis of a tear of the tibiofibular syndesmosis: the role of arthroscopy of the ankle. J Bone Joint Surg Br. 2003;85:324-329.
                                                                              35. Teefey SA, Rubin DA, Middleton WD, Hildebolt CF, Leibold RA, Yamaguchi K. Detection and quantification of rotator cuff tears: comparison of ultrasonographic, magnetic resonance imaging, and arthroscopic findings in seventy-one consecutive cases. J Bone Joint Surg Am. 2004;86:708-716.
                                                                                36. Temmerman OP, Raijmakers PG, David EF, Pijpers R, Molenaar MA, Hoekstra OS, Berkhof J, Manoliu RA, Teule GJ, Heyligers IC. A comparison of radiographic and scintigraphic techniques to assess aseptic loosening of the acetabular component in a total hip replacement. J Bone Joint Surg Am. 2004;86:2456-2463.
                                                                                  37. Walton J, Mahajan S, Paxinos A, Marshall J, Bryant C, Shnier R, Quinn R, Murrell GA. Diagnostic values of tests for acromioclavicular joint pain. J Bone Joint Surg Am. 2004;86:807-812.
                                                                                    © 2006 Lippincott Williams & Wilkins, Inc.