Some classification systems for fracture of the radial head are based on the degree of displacement and the size of the fracture fragments [2, 5, 10, 14, 15]. The classification by Mason, the most widely used, is based on fragment characteristics but does not quantify displacement. According to the modified classification by Broberg and Morrey, displacement of more than 2 mm and/or involvement of at least 30% of the articular surface indicates a displaced radial head fracture. A key distinction in fractures of the radial head is between stable fractures, where the articular step-off is the issue (usually because it blocks motion, not because of arthrosis, which seems rarely to be a problem after these injuries), and unstable fractures, where the issue is stability (eg, the terrible triad, where the radial head plays a critical role in elbow stability).
Fracture instability at operative exposure (meaning that the fracture is loose and mobile) seems important, because unstable fractures are typically associated with other injuries to the elbow or forearm [6, 11, 17]. Many partial fractures are depressed 2 mm or more but are impacted and stable (it takes force to move them on operative exposure) with an intact periosteum. We are curious whether radiographic factors such as a gap between fracture fragments or loss of contact indicate which fractures are mobile on operative exposure and, more importantly, which are associated with other fractures or ligament injuries. Rineer and colleagues reported that partial articular (Mason 2) radial head fractures with at least one fracture fragment without bony contact (ie, no area where the fracture fragments are immediately adjacent, which would make it possible that the fracture is stable and difficult to move on operative exposure) are 21 times more likely to be associated with a complex, unstable injury pattern. For this radiographic finding to be useful, its interobserver reliability needs to be demonstrated.
This study addresses the following study questions: (1) Is there agreement between observers on radiographic gap, loss of contact between radial head fracture fragments, anticipated fracture instability/mobility on operative exposure, anticipated associated ligamentous injury, and decision for surgery? (2) Are there factors associated with the observer such as location of practice or subspecialization that increase interobserver reliability?
Patients and Methods
Study Design and Setting
The institutional research board at the principal investigator's (DR) hospital approved this study. Twenty-seven sets of AP and lateral elbow radiographs from patients treated for a radial head fracture were selected based on image quality to provide a spectrum of displacement and injury patterns. Seven of 23 fractures (30%) were whole head fractures (Figs. 1, 2) and 14 of 23 fractures (61%) were displaced (Figs. 3, 4) according to the criteria by Broberg and Morrey. We invited the members of the Science of Variation Group (fully trained, practicing orthopaedic and trauma surgeons from around the world) to evaluate the radiographs on a web-based study platform (SurveyMonkey, Palo Alto, CA, USA).
Three hundred thirty-three invitations were sent, 179 responses were received (54%; not all of the members treat elbow fractures), and 168 surgeons (92% men, 8% women) completed the online survey (94% of the initial responders). The majority practiced in the United States (58%), had more than 5 years of experience (79%), were specialized in either the hand and wrist (41%) or orthopaedic traumatology (45%), had trainees in the operating room (81%), and treated more than 10 elbow fractures a year (53%) (Table 1).
After logon, the observers were asked general information about their practices. Subsequently, they were asked to answer five multiple-choice questions about fracture characteristics and treatment options for each of the 27 patients: (1) Is there a gap of more than 2 mm between one of the radial head fragments and the intact radius? (2) Is there complete loss of contact between a fracture fragment and the rest of the proximal radius? (3) Is the fracture unstable? (4) Are there likely to be associated ligament injuries or fractures? (5) Would you recommend operative treatment?
Variables, Outcome Measures, Data Sources, and Bias
Independent variables were observer characteristics and fracture characteristics. Dependent variables were binomial (yes/no answers to the questions). Agreement among observers was determined using the multirater kappa measure described by Siegel and Castellan. The multirater kappa is a frequently used statistical measure of chance-corrected agreement between ratings made by multiple observers (interobserver reliability) or between ratings made by one observer on multiple occasions (intraobserver reliability). The generated kappa values were interpreted according to the guidelines by Landis and Koch: values of 0.01 to 0.20 indicate slight agreement; 0.21 to 0.40, fair agreement; 0.41 to 0.60, moderate agreement; 0.61 to 0.80, substantial agreement; and 0.81 or more, almost perfect agreement. Zero indicates no agreement beyond that expected from chance alone, -1.00 means total disagreement, and +1.00 represents perfect agreement.
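As an illustration of the statistic described above, the following is a minimal Python sketch of a Fleiss-style multirater kappa for yes/no answers, together with the Landis and Koch interpretation bands. It assumes every radiograph is rated by the same number of observers and that this fixed-marginal form corresponds to the measure used in the study; the function names `multirater_kappa` and `landis_koch` are hypothetical, and this is not the SPSS procedure the authors ran.

```python
from typing import Sequence


def multirater_kappa(yes_counts: Sequence[int], n_raters: int) -> float:
    """Fleiss-style chance-corrected agreement for yes/no ratings.

    yes_counts[i] is the number of raters who answered "yes" for subject i.
    Assumes every subject was rated by the same n_raters observers.
    """
    n_subjects = len(yes_counts)
    pairs = n_raters * (n_raters - 1)
    # Observed agreement: share of agreeing rater pairs, averaged over subjects.
    p_obs = sum(
        (y * (y - 1) + (n_raters - y) * (n_raters - y - 1)) / pairs
        for y in yes_counts
    ) / n_subjects
    # Agreement expected by chance, from the pooled proportion of "yes" answers.
    p_yes = sum(yes_counts) / (n_subjects * n_raters)
    p_exp = p_yes ** 2 + (1 - p_yes) ** 2
    if p_exp == 1.0:
        # Every rating is identical, so kappa is undefined (0/0).
        raise ValueError("kappa is undefined when all ratings are identical")
    return (p_obs - p_exp) / (1 - p_exp)


def landis_koch(kappa: float) -> str:
    """Map a kappa value to the agreement label used in the text."""
    if kappa <= 0.0:
        return "no agreement beyond chance"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Made-up example: 4 raters, 4 radiographs, unanimous on each one.
    print(multirater_kappa([4, 0, 4, 0], n_raters=4))
    print(landis_koch(0.55))
```

Working from per-radiograph counts of "yes" answers keeps the input small: with 168 observers and 27 radiographs, one question reduces to a list of 27 integers.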
The only incentive for observers to participate was group authorship.
As a measure of power, the precision was determined for estimating the true value of kappa in the population based on a 95% confidence interval for a kappa of 0.30 in 27 subjects (ie, 27 radiographs) of radial head fracture rated by 168 observers. According to the Fleiss-Cuzick estimator of kappa in the case of equal numbers of ratings for each subject, the 168 observers and 27 subjects provide a 95% precision of the true kappa value of ± 0.18 around the observed value of kappa. Multirater kappas were calculated with use of SPSS for Windows, 18.104.22.168 2012 (SPSS Inc, Chicago, IL, USA).
Results
The interobserver agreement of the surgeons who participated in the survey was moderate (reference value, 0.41-0.60) for diagnosing a gap between radial head fracture fragments of more than 2 mm (κ = 0.55) and also for diagnosing a complete loss of bony contact between fragments (κ = 0.43), classifying a fracture of the radial head as unstable (κ = 0.49), and for recommending operative treatment (κ = 0.52). There was fair (reference value, 0.21-0.40) interobserver reliability for diagnosing anticipated ligament injuries associated with the radial head fracture (κ = 0.33) (Table 2).
Factors Associated With Increased Interobserver Agreement
Shoulder and elbow surgeons were the only subset of surgeons that agreed substantially (range, 0.51-0.61) in diagnosing a gap of more than 2 mm, anticipating fracture instability, and recommending operative treatment. General orthopaedic surgeons had only fair or slight agreement on all five questions. Surgeons specialized in either orthopaedic traumatology or hand and wrist had moderate agreement on all five questions except for anticipated ligament injuries (Table 3).
The interobserver variability of complete loss of cortical contact, anticipated fracture instability, and anticipated associated ligament injuries was also affected somewhat by years in practice, supervision of trainees in the operating room, and number of elbow fractures treated in a year (Tables 4, 5, 6).
Discussion
Loss of contact between radial head fracture fragments is strongly associated with other elbow or forearm injuries. A reliable radiographic sign of radial head fracture instability could help examiners anticipate and properly treat associated fractures and ligament injuries. We aimed to assess interobserver agreement on radial head fracture characteristics based on radiographs and to try to identify factors associated with increased interobserver reliability.
This study should be interpreted in light of several limitations. The quality of the radiographs was determined by what was obtained in the emergency department and was not standardized. No CT scans were available in the survey. Observers had no information about the patient or the injury. Furthermore, we did not give observers any training or reference values on, for example, what defined “loss of contact” or “gap.” This study was also limited to interobserver agreement, because intraobserver agreement is less relevant to clinical practice and because it is more practical to have the members of the collaborative volunteer one 20- to 30-minute session. Intraobserver agreement tends to be much less of a problem for most types of radiological diagnosis, and studies consistently show that more sophisticated imaging improves intraobserver variation more than interobserver variation [1, 13, 19]. We also asked many secondary study questions of the data, including variations based on training and experience, all of which should be interpreted with caution and considered primarily hypothesis-generating. The members of our collaborative and those who chose to participate in this study may not be representative of the average surgeon. Our collaborative includes many types of surgeons, and a decision not to participate generally means they are either not familiar with the topic or are too busy at the time.
Because this is a reliability study, we did not have an intraoperative evaluation of the fracture as a reference standard and cannot comment on the validity of the answers of the surgeons. The next step would be intraoperative verification of loss of bony contact as a radiographic predictor of radial head fracture instability and associated fractures or ligament injuries and determining to what extent these findings changed the prognosis and treatment. Finally, one might intuitively expect higher kappas (for example, for loss of contact) than our study results depict. It is possible that an imbalance in marginal totals (eg, the way in which the fractures with and without loss of contact were distributed in our study) has resulted in the “kappa paradox,” a result with lower kappas than one might expect [3, 4, 9].
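The effect of imbalanced marginal totals on kappa can be illustrated with a small Python sketch. The ratings below are invented for illustration and are not data from this study; the helper `agreement_and_kappa` is a hypothetical name, and it uses a Fleiss-style multirater kappa with an equal number of raters per subject. Both datasets have the same observed pairwise agreement, but the one dominated by "yes" answers yields a much lower kappa.

```python
def agreement_and_kappa(yes_counts, n_raters):
    """Return (observed agreement, Fleiss-style kappa) for yes/no ratings.

    yes_counts[i] is the number of "yes" answers for subject i; every
    subject is assumed to be rated by the same n_raters observers.
    """
    n_subjects = len(yes_counts)
    pairs = n_raters * (n_raters - 1)
    # Share of agreeing rater pairs, averaged over subjects.
    p_obs = sum(
        (y * (y - 1) + (n_raters - y) * (n_raters - y - 1)) / pairs
        for y in yes_counts
    ) / n_subjects
    # Chance agreement from the pooled proportion of "yes" answers.
    p_yes = sum(yes_counts) / (n_subjects * n_raters)
    p_exp = p_yes ** 2 + (1 - p_yes) ** 2
    return p_obs, (p_obs - p_exp) / (1 - p_exp)


# Hypothetical data: 10 raters, 20 subjects, 9-to-1 splits on every subject.
# Balanced: half the subjects are mostly "yes", half are mostly "no".
balanced = [9] * 10 + [1] * 10
# Imbalanced: 18 subjects are mostly "yes", only 2 are mostly "no".
imbalanced = [9] * 18 + [1] * 2

obs_bal, kappa_bal = agreement_and_kappa(balanced, 10)
obs_imb, kappa_imb = agreement_and_kappa(imbalanced, 10)

print(f"balanced:   agreement={obs_bal:.2f}, kappa={kappa_bal:.2f}")
print(f"imbalanced: agreement={obs_imb:.2f}, kappa={kappa_imb:.2f}")
```

In this made-up example the observed agreement is 0.80 in both cases, while kappa falls from about 0.60 to about 0.32, because chance agreement is higher when most answers fall in one category.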
The interobserver agreement on the radiographic diagnosis of loss of bony contact was moderate overall, suggesting that this is a useful radiographic finding that can be used to guide management and counsel patients. When a surgeon evaluating a radiograph of a radial head fracture diagnoses a gap between fracture fragments and loss of bony contact, he or she should understand that this diagnosis is moderately reliable and strongly associated with a risk of associated ligament injuries or fractures of the forearm or elbow. Our findings are consistent with those of Doornberg et al., who measured substantial intraobserver and moderate interobserver reliability of classification of radial head fractures according to the modified Mason classification by Broberg and Morrey.
The observations based on training and experience were inconsistent, with only the subgroup of shoulder and elbow surgeons having somewhat better agreement, so these should not be overinterpreted. Our interpretation of the findings of these secondary analyses is that there is no single factor (such as experience) that accounts for interobserver variation. Interobserver variation is greater than one would expect even for simple diagnoses such as radial head contact, and a large amount of the variation among observers remains unexplained. Interobserver variation therefore merits additional study to determine ways to reduce observer variation other than more sophisticated imaging and simplified classifications (which have only helped to a limited degree).
Loss of contact of radial head fracture fragments on radiographs is associated with intraoperative fragment mobility/instability and usually indicates other injuries to the elbow or forearm, which can be important but subtle (eg, interosseous ligament injury of the forearm). This study documents moderate reliability in the diagnosis of a gap between fracture fragments on radiographs. We recommend that clinicians inspect radiographs for loss of contact and a gap between radial head fracture fragments and scrutinize patients with this finding carefully for possible interosseous ligament injury, self-reduced elbow dislocation, or associated fractures. Future studies can address whether other tests (eg, CT) are even more reliable and accurate.
1. Bernstein J, Adler LM, Blank JE, Dalsey RM, Williams GR, Iannotti JP. Evaluation of the Neer system of classification of proximal humeral fractures with computerized tomographic scans and plain radiographs. J Bone Joint Surg Am.
2. Broberg MA, Morrey BF. Results of treatment of fracture-dislocations of the elbow. Clin Orthop Relat Res.
3. Bruinsma WE, Guitton TG, Warner JJ, Ring D. Interobserver reliability of classification and characterization of proximal humeral fractures: a comparison of two and three-dimensional CT. J Bone Joint Surg Am.
4. Cicchetti DV, Feinstein AR. High agreement but low kappa: II. Resolving the paradoxes. J Clin Epidemiol.
5. Cutler CW. Fractures of the head and neck of the radius. Ann Surg.
6. Davidson PA, Moseley JB Jr, Tullos HS. Radial head fracture. A potentially complex injury. Clin Orthop Relat Res.
7. Doornberg J, Elsner A, Kloen P, Marti RK, Dijk CN, Ring D. Apparently isolated partial articular fractures of the radial head: prevalence and reliability of radiographically diagnosed displacement. J Shoulder Elbow Surg.
8. Duckworth AD, McQueen MM, Ring D. Fractures of the radial head. Bone Joint J.
9. Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol.
10. Hotchkiss RN. Displaced fractures of the radial head: internal fixation or excision? J Am Acad Orthop Surg.
11. Itamura J, Roidis N, Mirzayan R, Vaishnav S, Learch T, Shean C. Radial head fractures: MRI evaluation of associated injuries. J Shoulder Elbow Surg.
12. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics.
13. Lozano-Calderon S, Blazar P, Zurakowski D, Lee SG, Ring D. Diagnosis of scaphoid fracture displacement with radiography and computed tomography. J Bone Joint Surg Am.
14. Mason ML. Some observations on fractures of the head of the radius with a review of one hundred cases. Br J Surg.
15. Müller M. The Comprehensive Classification of Fractures in Long Bones
16. Posner KL, Sampson PD, Caplan RA, Ward RJ, Cheney FW. Measuring interrater reliability among multiple raters: an example of methods for nominal data. Stat Med.
17. Rineer CA, Guitton TG, Ring D. Radial head fractures: loss of cortical contact is associated with concomitant fracture or dislocation. J Shoulder Elbow Surg.
18. Siegel S, Castellan N. Nonparametric Statistics for the Behavioral Sciences. New York, NY, USA: McGraw-Hill; 1988.
19. Stieber J, Quirno M, Cunningham M, Errico TJ, Bendo JA. The reliability of computed tomography and magnetic resonance imaging grading of lumbar facet arthropathy in total disc replacement patients. Spine (Phila Pa 1976).
20. Zou G, Donner A. Confidence interval estimation of the intraclass correlation coefficient for binary outcome data. Biometrics.