Interobserver agreement for the Chest Wall Injury Society taxonomy of rib fractures using computed tomography images

To better evaluate and communicate about rib fractures, the CWIS recently proposed a new taxonomy. However, the interobserver agreement for rib fracture type was moderate and agreement was only fair for rib fracture displacement.

In patients who sustained blunt chest trauma, rib fractures are the most common injury. 1,2 Most rib fractures heal with nonoperative therapy. 3 Nevertheless, there is a significant burden of morbidity and mortality in many patients with rib fractures, with worse outcomes for those with increasing age or number of rib fractures and presence of flail chest or associated injuries such as pulmonary contusions. [4][5][6][7][8][9] A vital step for improving care for thoracic trauma patients is providing a universal nomenclature for assessment and communication of chest wall injuries. The utility of such scoring systems for both internal organs and other orthopedic injuries has been described. 10 Current scoring systems for rib fractures are the Organ Injury Scale Chest Wall grade, Chest Trauma Score, and chest Abbreviated Injury Scale. [11][12][13] They include the number of fractures, presence of flail chest, and presence of bilateral rib fractures but do not specify rib fracture characteristics. The RibScore, which is a radiographic rib fracture scoring system based on chest computed tomography (CT), is more specific with additional components that consider fracture displacement and the anatomic area of the rib fracture location. 14 The first classification that focuses solely on rib fracture characteristics is based on the Müller AO classification. 15,16 This classification accounts for rib number, location, fracture type, and subtype and demonstrated substantial agreement among four reviewers. 16 Most recently, the Chest Wall Injury Society (CWIS) published a taxonomy of rib fractures resulting from an international consensus using a Delphi group of 113 respondents. 17 It proposes universal definitions for fracture displacement (undisplaced, offset, or displaced) and fracture type classification (simple, wedge, or complex).
The location of the fracture on the chest wall is provided in three anatomical sectors (anterior, lateral, or posterior), although no consensus was reached on a universal definition of these anatomic boundaries. The complexity of the rib fracture type and displacement as defined by this taxonomy demonstrated a clinical correlation with pulmonary complications and adverse outcomes. 18 The capability of the taxonomy users to agree on definitions is paramount for the successful use of a universal rib fracture classification. We thus aimed to determine the interobserver agreement on the CWIS taxonomy for rib fractures on images from CT scans of patients with chest wall injury and to assess if it is influenced by the background of the observers. We hypothesized that there would be at least moderate agreement, regardless of the observers' background.

Study Design
Independent observers who were associated with CWIS, either as a member or as a colleague of a member who is involved in the care of patients with chest wall injury, were invited by email to evaluate axial, coronal, and sagittal images from 11 CT scans of rib fractures. An online platform (SurveyMonkey) was used to execute this survey. 19 Multiple reminders to complete the survey were sent by email. The Medical Research Ethics Committee (MEC-2020-0883) exempted the study. No sample size calculations were made. The Guidelines for Reporting Reliability and Agreement Studies were used to ensure proper reporting of methods, results, and discussion (Supplemental Digital Content, Supplementary Data 1, http://links.lww.com/TA/C658). 20

Observers
A total of 2,306 invitations, which included reminder emails, were sent to health care professionals associated with CWIS. The invitation contained a link that provided a single opportunity to fill out the survey to avoid duplicates. The observers were asked to classify type and displacement of rib fractures using the CWIS taxonomy definitions, which were provided in the survey. The CWIS categories for fracture type are as follows: Simple, Wedge, and Complex. A simple fracture is defined as one fracture line across the rib, a wedge fracture has a second fracture line that does not span the whole width of the rib, and a complex fracture has at least two fracture lines with one or more fragments spanning the width of the rib. The CWIS categories for fracture displacement are as follows: Undisplaced, Offset, and Displaced. Undisplaced fractures are defined as those with at least 90% contact between the cortical surfaces, displaced fractures as those with no cortical contact, and offset fractures as those in between, with some cortical contact but less than 90%. The introduction of the questionnaire provided the definitions and explanatory images with excerpts from the original taxonomy paper for the fracture and displacement types. 17 Observers were asked to indicate the anatomical sectors where the rib fracture was located, based on their own definitions, since no consensus was reached for the definition of the anatomic sectors for the localization of rib fractures in the CWIS taxonomy. At the end of the survey, observers were asked to explain how they distinguished between the anterior, lateral, and posterior rib sectors. Links to illustrations that demonstrated the definitions were provided for reference throughout the survey.
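The category definitions quoted above can be written down as simple decision rules. The following is a minimal sketch, not part of the CWIS taxonomy or of the survey: the function names and the inputs (a single percentage of cortical contact, and counts of rib-width and partial fracture lines) are hypothetical simplifications chosen for illustration. In particular, treating one rib-width line plus any number of partial lines as a wedge fracture is an assumption.

```python
def classify_displacement(cortical_contact_pct: float) -> str:
    """Map a percentage of cortical contact to a CWIS displacement category."""
    if not 0 <= cortical_contact_pct <= 100:
        raise ValueError("cortical contact must be between 0 and 100 percent")
    if cortical_contact_pct >= 90:
        return "undisplaced"   # at least 90% cortical contact
    if cortical_contact_pct == 0:
        return "displaced"     # no cortical contact
    return "offset"            # some contact, but less than 90%


def classify_type(rib_width_lines: int, partial_lines: int) -> str:
    """Map counts of fracture lines to a CWIS fracture type.

    rib_width_lines: fracture lines spanning the whole width of the rib.
    partial_lines:   fracture lines that do not span the whole width.
    Assumption: two or more rib-width lines imply a fragment that spans
    the width of the rib, which is how "complex" is encoded here.
    """
    if rib_width_lines < 1 or partial_lines < 0:
        raise ValueError("need at least one rib-width fracture line")
    if rib_width_lines >= 2:
        return "complex"
    if partial_lines >= 1:
        return "wedge"
    return "simple"
```

In practice an observer estimates these inputs from CT images, so the rules only formalize the wording of the definitions; they do not remove the judgment involved in reading the scan.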

Fractures
The rib fractures included in the survey were identified from an institutional database of adult patients who were admitted and treated for thoracic injury at a level 1 trauma center. The research team selected the final images during a consensus meeting. No clinical information was provided. The selected 11 fractures were from 11 different patients. A set of three images from the chest CT in the axial, sagittal, and coronal plane was uploaded to the internet platform on a single page for every fracture (Fig. 1). At least one fracture for each permutation of type, displacement, and location was included.

Evaluation
Upon login, the observers were asked about their demographics, professional background, number of patients with rib fractures treated by them and their institution, and the details of their clinical practice location. Observers were asked to classify the type of rib fracture, anatomical sector, and displacement pattern for all cases using the CWIS taxonomy (Fig. 1D). The option to leave comments was provided at the end of the survey. The observers could complete the study in their own time and at their own pace.

Statistical Analysis
All analyses were done using SAS version 9.4 (SAS Institute, Cary, NC). All responses were analyzed, including those left incomplete. Categorical variables are presented as frequencies and percentages. Missing values were not imputed. Fleiss' multirater (Cohen's) κ and Gwet's first-order agreement coefficient (AC1) were used to calculate agreement among the surgeons concerning rib fracture type, anatomic sector, and displacement. κ values and Gwet's statistics were interpreted as follows: 0.01 to 0.20 indicate slight agreement; 0.21 to 0.40, fair agreement; 0.41 to 0.60, moderate agreement; 0.61 to 0.80, substantial agreement; 0.81 to 0.90, strong agreement; and >0.90, almost perfect agreement. [21][22][23][24] Both κ and AC1 values are provided to ensure the internal validity of the findings, since κ values can be affected by marginal probability. 21 To evaluate potential influencing factors, a stratified analysis was conducted by the following factors: years of training, surgical specialty, years of work experience, caseload, current continent of practice, number of fellow surgeons at the institution who perform surgical stabilization of rib fractures (SSRF), and whether the surgeon supervised residents. In addition, for rib fracture location specifically, the boundaries that observers used to differentiate between the anatomic sectors were added to the stratified analyses.
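To illustrate why both coefficients are reported, the following minimal sketch computes Fleiss' κ and Gwet's AC1 from a table of rating counts and applies the interpretation bands listed above. It is not the SAS code used in the study. It assumes every case was rated by the same number of observers, the function names are hypothetical, the handling of values between the stated bands (for example 0.205) is an assumption, and the label for values of 0 or below is not given in the text.

```python
from typing import List, Sequence, Tuple


def _observed_and_shares(counts: Sequence[Sequence[int]]) -> Tuple[float, List[float]]:
    """Return mean pairwise observed agreement and overall category shares.

    counts[i][j] is the number of raters who assigned case i to category j.
    Assumes every case was rated by the same number of raters (>= 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    pairs = n_raters * (n_raters - 1)
    p_obs = sum((sum(c * c for c in row) - n_raters) / pairs for row in counts) / n_items
    n_cat = len(counts[0])
    shares = [sum(row[j] for row in counts) / (n_items * n_raters) for j in range(n_cat)]
    return p_obs, shares


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa: chance agreement is the sum of squared category shares."""
    p_obs, shares = _observed_and_shares(counts)
    p_exp = sum(p * p for p in shares)
    return (p_obs - p_exp) / (1 - p_exp)  # undefined if only one category is ever used


def gwet_ac1(counts: Sequence[Sequence[int]]) -> float:
    """Gwet's AC1: chance agreement is sum(p * (1 - p)) / (k - 1)."""
    p_obs, shares = _observed_and_shares(counts)
    p_exp = sum(p * (1 - p) for p in shares) / (len(shares) - 1)
    return (p_obs - p_exp) / (1 - p_exp)


def interpret(value: float) -> str:
    """Label a coefficient using the bands quoted in the text."""
    if value > 0.90:
        return "almost perfect"
    if value > 0.80:
        return "strong"
    if value > 0.60:
        return "substantial"
    if value > 0.40:
        return "moderate"
    if value > 0.20:
        return "fair"
    if value > 0.0:
        return "slight"
    return "no agreement"  # assumption: this band is not defined in the text
```

With strongly skewed ratings, for example `[[10, 0, 0], [10, 0, 0], [10, 0, 0], [9, 1, 0]]` (made-up counts), raw agreement is 95%, yet κ is slightly negative while AC1 stays close to 0.95. This is the sensitivity of κ to marginal probabilities that motivates reporting both statistics.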
Additional analysis was conducted to assess agreement when dichotomizing fracture type into simple versus not simple, the latter comprising wedge and complex fractures. Similarly, the fractures were dichotomized into displaced versus not displaced, the latter comprising offset and undisplaced fractures. To further investigate agreement on fracture type and displacement pattern, the cases with less than 80% agreement were identified and qualitatively assessed to determine possible causes of disagreement. Statistical significance was declared where 95% confidence intervals (CIs) did not overlap.
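The dichotomization and the significance rule described above can be sketched as follows. The mappings restate the groupings given in the text; the function names are hypothetical, and the CI values in the usage note are arbitrary numbers for illustration, not study results.

```python
from typing import Dict, List, Sequence, Tuple

# Groupings as described in the text.
SIMPLE_VS_NOT: Dict[str, str] = {
    "simple": "simple",
    "wedge": "not simple",
    "complex": "not simple",
}
DISPLACED_VS_NOT: Dict[str, str] = {
    "displaced": "displaced",
    "offset": "not displaced",
    "undisplaced": "not displaced",
}


def dichotomize(ratings: Sequence[str], mapping: Dict[str, str]) -> List[str]:
    """Collapse three-category ratings into two categories."""
    return [mapping[r] for r in ratings]


def significantly_different(ci_a: Tuple[float, float], ci_b: Tuple[float, float]) -> bool:
    """Apply the rule used in this study: significant if the 95% CIs do not overlap."""
    (lo_a, hi_a), (lo_b, hi_b) = ci_a, ci_b
    overlap = lo_a <= hi_b and lo_b <= hi_a
    return not overlap
```

For example, `significantly_different((0.50, 0.60), (0.62, 0.70))` is true, whereas intervals of (0.50, 0.63) and (0.62, 0.70) would not be declared different. Non-overlap of CIs is a conservative criterion: two estimates can differ significantly in a formal test even when their intervals overlap slightly.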

Observers
In total, 90 health care professionals responded to the invitation (4% of total invitations sent, including reminders). Of these 90 participants, 76 (84%) finished the complete survey. Most observers identified as male, North American trauma surgeons with more than 10 years of practice experience. The majority reported that they supervised residents (Table 1).
For agreement in the stratified analysis (Supplemental Digital Content, Supplementary Data 2, http://links.lww.com/TA/C659), the results for fracture displacement were comparable with those of fracture type. No statistically significant differences were found. In the dichotomized analysis, statistically significant differences were found for continent of practice, with the most agreement in Europe, less in North America, and the least in other continents. Bordering on statistical significance were higher agreement among residents than among surgeons with more than 20 years of experience in practice and lower agreement with fewer surgeons practicing SSRF in the institution (Supplemental Digital Content, Supplementary Data 3, http://links.lww.com/TA/C660).

Qualitative Analysis
Qualitative analysis was conducted for the cases with lower than expected (<80%) agreement on fracture type and fracture displacement. Three cases had low agreement on fracture type when using three categories but higher than 80% agreement when fracture type was dichotomized into simple versus not simple. In one case, there was less than 80% agreement on fracture type, regardless of the classification in two or three categories (Table 3).
Four cases had lower than expected agreement for fracture displacement when using three categories but higher than 80% agreement when fracture displacement was dichotomized into displaced versus not displaced. In four other cases, agreement was lower than 80% regardless of the classification in two or three categories for displacement (Table 3). These low agreement cases were further evaluated to understand why agreement was lower than expected, which is described in the Discussion.
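The grouping of cases described above can be sketched as follows. This assumes that "agreement" for a case means the share of observers who chose the most common category, which is an interpretation and is not stated explicitly in the text; the function names and group labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence

# Dichotomization for displacement, as described in the Statistical Analysis.
DISPLACED_VS_NOT: Dict[str, str] = {
    "displaced": "displaced",
    "offset": "not displaced",
    "undisplaced": "not displaced",
}


def modal_agreement(ratings: Sequence[str]) -> float:
    """Share of observers who chose the most common category for one case."""
    if not ratings:
        raise ValueError("no ratings for this case")
    return max(Counter(ratings).values()) / len(ratings)


def agreement_group(ratings: Sequence[str], mapping: Dict[str, str],
                    threshold: float = 0.80) -> str:
    """Sort a case by agreement in three categories and after dichotomizing."""
    if modal_agreement(ratings) >= threshold:
        return "agreement in three categories"
    if modal_agreement([mapping[r] for r in ratings]) >= threshold:
        return "low in three categories, resolved by dichotomizing"
    return "low in both"
```

As a usage example, a case rated "displaced" by 49% and "offset" by 51% of observers (the split reported in the Discussion, here applied to a hypothetical 100 observers) falls in the "low in both" group, because offset and displaced sit on opposite sides of the displaced versus not displaced split.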

DISCUSSION
This study aimed to establish the interobserver agreement on fracture location, type, and displacement as defined by the CWIS taxonomy, 17 among a large and diverse group of surgeons in multiple continents who are involved in the care of patients with rib fractures. Strong agreement was found for the classification of the fracture location, and moderate agreement was found for rib fracture type. Fair agreement among observers was found for rib fracture displacement, which is lower than was hypothesized. The interobserver agreement on the classification by Bemelman et al. 16 by four observers was substantial (κ = 0.62 [95% CI, 0.59-0.65]), which is higher than what was established for the CWIS taxonomy. However, the presented results for the CWIS taxonomy may be more reflective of clinical practice, with more than 70 observers from diverse backgrounds. Also, no former experience in the classification of rib fractures was required for participation, nor was formal training in the CWIS taxonomy provided, besides provision of the definitions of the CWIS taxonomy in the introduction and a link to the article.
Agreement on the anatomic location of a rib fracture was strongest of the investigated fracture characteristics, even though multiple methods were used to define anterior, lateral, or posterior fractures. There appears to be a clear understanding of the location of a fracture regardless of the definition used. The possible exception is the surgical approach as the definition, because the least agreement was found among observers who determined location by the surgical approach that would be used for stabilizing the rib fracture. The axillary lines were most often used for determining the location and were associated with high agreement, although they are surface landmarks, which can be difficult to establish on single CT images. In the original taxonomy, there was no consensus on the sector boundaries, although the use of axillary lines was also the method that received the most votes. Therefore, based on these combined results, an estimation of the axillary lines might be the recommended method for future studies.
Further qualitative analysis on cases with low agreement on fracture type raised the suspicion that in several cases some of the observers did not see all the fracture lines in complex fractures. This probably resulted in a spread between wedge and complex type fractures. In addition, in two cases with low agreement on type, the fracture did not completely fit with the CWIS taxonomy definitions. One fracture consisted of one rib-width fracture line and multiple incomplete fracture lines, creating two butterfly fragments (Fig. 2). Most observers (54%) responded that this was a wedge type fracture, suggesting that a fracture with multiple butterfly fragments should probably be classified as such. In another case, there were two rib-width fracture lines with a considerable space in between. No specific CWIS taxonomy definition exists for the maximum distance between multiple fracture lines to be considered either one wedge or complex fracture or two separate fractures. This resulted in low agreement on type for this case. We suggest estimating whether the fragment could be part of a flail pattern, which in practice would mean a maximum of around 2 cm between the fracture lines for them to still be considered the same fracture, based on our expert opinion. Furthermore, the qualitative analysis demonstrated that agreement on rib fracture displacement was lowest in complex type fractures. With multiple fracture lines, it seemed to vary whether observers graded the fracture line with the least or the most cortical bone contact, or whether they made an overall estimate of bone contact. Moreover, in some cases, it was suspected that a portion of the observers accounted for alignment rather than cortical bone contact of the fracture. In one case, a butterfly segment was completely displaced, with the contralateral cortex still in place. In this scenario, 49% of observers considered this a displaced fracture, whereas 51% evaluated this as offset. This suggests that agreement on displacement is typically lower for fractures that consist of more than one fracture line. It seems insufficiently clear in the CWIS taxonomy definition how to establish displacement in fractures with multiple fragments. Considering the cortical bone contact of both ends of the fracture, rather than alignment, should probably be the preferred method, because this likely reflects the instability of the fracture better, as displacement can worsen over time. 24 In addition, the stratified analysis demonstrated that health care providers from Europe agreed more than health care providers from other continents, without a possible explanation that could be supported by the collected data. Theoretically, the difference might reflect a more homogeneous group of European participants compared with health care providers from other parts of the world, although, again, this theory is not supported by the collected data.

Table 3 footnotes: Data are shown as number of observers (% agreement). *n is the number of complete responses. **Cases with <80% agreement in three categories and when combined in two categories for fracture displacement. †Cases with <80% agreement in three categories, with ≥80% agreement when combined in two categories for fracture type. ‡Cases with <80% agreement in three categories, with ≥80% agreement when combined in two categories for fracture displacement. §Cases with <80% agreement in three categories and when combined in two categories for fracture type.
The CWIS taxonomy consists of three categories for rib fracture type and displacement. In theory, agreement improves by reducing the number of categories, based on chance alone. However, when the categories for fracture type and displacement were dichotomized into "simple vs. not simple" and "displaced vs. not displaced," the agreement improved only somewhat, and this was statistically significant only in the AC1 for fracture displacement. Moreover, the clinical significance of categorizing in three rib fracture types is not clear, since only complex fractures are associated with worse outcomes. 18 Potentially, it is sufficient to categorize rib fracture type into only "simple" and "complex," not discerning wedge type fractures as a separate category from the simple type, since agreement and clinical relevance in this category are low. However, for fracture displacement, three categories are probably justified. This is because the three fracture displacement categories are associated with clinical outcomes and displacement is commonly used to set the indication for surgery. 18 However, the offset category had the least agreement in the displacement category, warranting a further specification of the definition. For example, it is currently unclear whether offset is defined as between 10% and 90% bone contact in one image or whether it should be visible in multiple images, and whether in just one plane (for example, the transverse plane) or in at least one other plane as well. This should be better defined and evaluated in future studies accounting for rib fracture characteristics.
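The statement that fewer categories raise agreement by chance alone can be made concrete with a short worked calculation. This is an illustrative sketch under the simplifying assumption that observers pick each of the k categories with equal probability; the symbols P_e and p_j are introduced here and do not appear in the original text.

```latex
\begin{align*}
P_e &= \sum_{j=1}^{k} p_j^{2}, \qquad p_j = \frac{1}{k}
      \;\Longrightarrow\; P_e = \frac{1}{k} \\
k = 3 &: \quad P_e = \tfrac{1}{3} \approx 0.33 \\
k = 2 &: \quad P_e = \tfrac{1}{2} = 0.50
\end{align*}
```

Under this assumption, two observers guessing at random would agree in about 33% of cases with three categories and in 50% with two. Chance-corrected coefficients such as κ and AC1 subtract an expected chance term, so they need not rise by the same amount when categories are merged.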
Over the past few years, three-dimensional and unfolded CT reconstructions of the chest wall have become more widely available. 25,26 It is yet to be determined what the interobserver agreement will be for the classification of rib fractures aided with these imaging modalities. To evaluate their contribution, future interobserver studies will be necessary.
This study has some limitations. Most importantly, the provided imaging was limited in comparison with daily practice for practical and confidentiality reasons. Providing the whole CT scan with the possibility to change the settings, use zoom options, and enhance contrast might have resulted in more heterogeneity in the responses, lowering agreement. Second, a learning curve for applying the taxonomy definitions could have been present and was not accounted for, although none was observed. Moreover, it is unknown how often observers had previously participated in this type of study or in the Delphi group developing the CWIS taxonomy. Third, it is unclear whether the absence of clinical information influenced the results. Possibly, a history of high-energy thoracic trauma could stimulate the observer to select a more severe injury type and displacement.
Despite these limitations, this study suggests that the CWIS taxonomy has strong agreement on fracture location, moderate agreement on fracture type, and fair agreement on fracture displacement. Revisiting the definitions of the CWIS taxonomy on rib fracture type and displacement may be warranted. For rib fracture type, the "wedge" category could potentially be omitted. For rib fracture displacement, the percentage of bone contact rather than alignment should be clearer in the definition, especially for rib fractures with multiple fragments. Nevertheless, these changes will have to be evaluated. The role of additional or enhanced imaging from the chest CT scans by three-dimensional reconstructions will have to be assessed for its ability to increase agreement on the classification of fracture type and displacement as defined by the CWIS taxonomy.

AUTHORSHIP
S.F.M.V.W. contributed to the study design, data collection, statistical analysis, and manuscript writing. C.C. contributed to the selection of clinical cases, data collection, and critical revision of the manuscript. A.S. contributed to the statistical analysis, manuscript writing, and critical revision of the manuscript. E.M.M.V.L. contributed to the study design and critical revision of the manuscript. S.A.S.W. contributed to the study design, data collection, and critical revision of the manuscript. J.G.E. contributed to the study design and critical revision of the manuscript. F.M.P. contributed to the study design, selection of clinical cases, and critical revision of the manuscript. M.M.E.W. contributed to the study design, selection of clinical cases, and critical revision of the manuscript.

DISCLOSURE
The authors declare no conflicts of interest.