Introduction
Orthopaedic surgery emphasizes providing quality health care through evidence-based medicine. Patients, government agencies, payers, and healthcare providers increasingly demand a transparent and objective assessment of quality [4]. Objective and reliable outcome reporting is imperative to assess the quality, effectiveness, and comparative value of different surgical interventions. Clinical outcome studies pertaining to hip preservation surgery (arthroscopy, surgical dislocation, osteotomies) report data that commonly include hip function, overall health, quality of life, and activity scores [1, 2, 13, 20, 21]. Although complication reporting is recognized as a critical component of evidence-based outcome studies, a review of randomized controlled studies concluded that complications lacked clear definitions and classification [15]. Further, no standardized method exists for complication grading and reporting in hip surgery outcome studies [7]. Conclusions regarding outcomes are incomplete without a standardized, objective complication grading scheme applied concurrently. In turn, the lack of an established basis for grading and reporting complications renders comparisons among studies ineffective.
Although reported data from short- to medium-term single-institution retrospective analyses indicate improvement in quality of life, pain, and function for most patients after hip preservation surgery [1 , 16 , 21 ], a review of the literature on outcomes of surgery for acetabular dysplasia and femoroacetabular impingement (FAI) suggests the need for validated standardized outcome measures and a standardized measure of complications associated with surgical treatment [6 , 7 ]. A validated complication grading scheme for hip preservation surgery that can be used universally will allow easier interpretation of the literature and true comparisons of the risks of different hip procedures. For the purposes of reporting in a standardized manner, a complication grading scheme should be simple, objective, and reproducible, allowing orthopaedic surgeons to use it nationally and internationally. A well-defined grading scheme reduces the amount of interpretation by individual surgeons, ie, by eliminating classification terms such as minor, moderate, and severe, which often appear in the literature and are used inconsistently among studies owing to the subjective nature of the terminology. Moreover, it is essential to account for unplanned treatments necessary for management of a complication and the association with future morbidity or persistent disability.
The classification system of Clavien et al. was originally tested using 650 cholecystectomy cases [5] and contained four grades of increasing severity. In 2004, Dindo et al. proposed a modification that included a new five-grade classification system with the possibility of combining grades; “disability” was added to the system, “hospital stay” was eliminated as a complication, and “organ failure” was changed to carry more weight in grading [12]. The so-called “Clavien-Dindo” classification has been adapted for use in general surgery [10] and for other surgical subspecialties such as gastroenterology/hepatology [3, 11, 17, 22-25, 28], urology [19], and nephrology [18, 27]. Because this system appears useful for general surgery and other specialties, we explored its use in orthopaedic surgery.
We therefore determined the interreader and intrareader reliabilities of the adapted classification scheme as applied to orthopaedic surgery, specifically hip preservation surgery.
Patients and Methods
The classification system of Dindo et al. [12 ] (Table 1 ) was adapted for application to orthopaedic surgery (Table 2 ). Our adapted scheme consists of five grades based on (1) the treatment required to manage the complication and (2) the associated long-term morbidity. As delineated in Table 2 , complications requiring no change in the routine postoperative course are classified as Grade I, whereas those necessitating a change in outpatient management fall under Grade II. Complications requiring invasive surgical or radiographic management without any long-term morbidity are defined as Grade III. Grade IV has long-term morbidity or a life-threatening complication, and Grade V is death. For the purposes of illustration, a Grade IV complication after abdominal surgery indicates a complication that is life threatening, requires intensive care unit admission, or is associated with organ dysfunction. Our orthopaedic adaptation for a Grade IV surgical complication is defined as any intensive care unit admission, osteonecrosis of the femoral head or acetabulum, permanent nerve injury, major vascular injury, or pulmonary embolus, each representing the orthopaedic equivalent of organ dysfunction.
Table 1: Original classification of surgical complications [12]
Table 2: Adapted complication classification chart
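The grading logic described above can be sketched as a short decision function. This is only an illustration of the ordering of the five grades as summarized in the text, not a substitute for Table 2; the class name, field names, and function name are hypothetical, and the assignment of a real complication to each field remains a clinical judgment.

```python
from dataclasses import dataclass


@dataclass
class Complication:
    """Hypothetical summary of one complication (field names are illustrative)."""
    death: bool = False
    life_threatening: bool = False          # e.g. intensive care unit admission, pulmonary embolus
    long_term_morbidity: bool = False       # e.g. osteonecrosis, permanent nerve injury
    invasive_management: bool = False       # surgical or radiographic intervention
    changed_outpatient_management: bool = False


def adapted_grade(c: Complication) -> str:
    """Return the grade (I to V) implied by the text's summary of the adapted scheme.

    The checks run from most to least severe, so the highest applicable
    grade is returned.
    """
    if c.death:
        return "V"
    if c.life_threatening or c.long_term_morbidity:
        return "IV"
    if c.invasive_management:
        return "III"
    if c.changed_outpatient_management:
        return "II"
    return "I"  # no change in the routine postoperative course
```

For instance, under these assumptions a hematoma evacuation with no lasting consequence would be passed as `Complication(invasive_management=True)` and graded III, whereas the same reoperation followed by osteonecrosis would be graded IV because long-term morbidity takes precedence.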
To evaluate the reliability of our adapted system we identified 10 readers to evaluate 44 clinical scenarios related to hip preservation surgery. The 10 surgeons were from a multicenter study group with diverse backgrounds (fellowship training in sports medicine, joint arthroplasty, and pediatric orthopaedics), all of whom have an interest in hip disorders in the adolescent and young adult population. Training readers in the classification system was done at an in-person meeting, during which the adapted scheme was described to eight of the 10 readers. Two international readers were not present for the training meeting; however, a brief description of the study had been provided to them by e-mail. The training session consisted of a description of the classification scheme, followed by 10 sample scenarios (not included in the 44 study scenarios), which were graded and discussed; all questions were answered. Subsequently, an e-mail was sent to all 10 readers containing the classification scheme and the previously graded sample scenarios. In addition, the e-mail message contained a recapitulation of the guidelines described during the training session to focus on grading the clinical scenario using the definition, concentrating on the treatment required to manage the complication and the potential for long-term morbidity. This correspondence acted as the only formal training for the two international readers who did not attend the in-person meeting.
Clinical scenarios were derived from a prospective multicenter outcomes database currently used for open and arthroscopic hip preservation surgery performed to treat hip dysplasia or impingement syndromes. The database contained approximately 500 surgical procedures at the time of scenario development, and we included all cases (n = 40) with recorded complications. Among these 40 cases, Grade IV complications were underrepresented. Therefore, an additional four scenarios were randomly selected from case reports in the literature to provide an adequate number of Grade IV complications for this analysis that more accurately represents the distribution of surgical cases in the hip preservation registry. These cases were similar to those in the multicenter database: hip preservation surgery (arthroscopy, surgical dislocation, osteotomies) associated with postoperative complications. Thus, each grade of complication was equally represented (11 scenarios for each of four grades; death [Grade V] was excluded as there were no cases in the database). Two of us (ELS, JC) summarized the 44 study cases to include age, sex, type of hip preservation surgery, complication presentation and course, type of management, and long-term outcome (Appendix 1; supplemental materials are available with the online version of CORR).
The scenarios were sent to the readers electronically, with a designated space for grading at the end of each scenario; all readers had access to the classification scheme while grading. The readers sent their graded scenarios to the coordinating center for evaluation by the first and senior authors (ELS, JC), who used a grading key that was created by the senior authors and recorded errors on a spreadsheet. Two weeks later, the same scenarios were sent again to the readers in a different, randomly generated order and were graded in the same manner. The graded data were collected and analyzed for reliability across the different readers for the same scenario and for individual readers’ accuracy across scenarios. The interobserver agreements were assessed for each grade of complication.
We summarized patient characteristics using means and SDs for continuous variables and frequencies and percentages for categorical variables. We then calculated interrater and intrarater agreement using Fleiss’ and Cohen’s κ statistics, respectively. To assess the precision of the κ statistics, we calculated a 95% CI for each estimate. Statistical analyses were conducted using SAS® 9.2 (SAS Institute Inc, Cary, NC, USA).
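For readers unfamiliar with the two statistics, the following is a minimal sketch of how Fleiss’ κ (several readers grading the same scenarios) and Cohen’s κ (one reader grading the same scenarios twice) can be computed. It uses the standard textbook definitions and is not the SAS code used in this study; the function names and the toy data in the usage note are illustrative assumptions.

```python
from collections import Counter


def _count_table(ratings, categories):
    """One row per scenario: how many readers chose each category."""
    table = []
    for row in ratings:
        tally = Counter(row)
        table.append([tally[c] for c in categories])
    return table


def fleiss_kappa(ratings, categories):
    """Overall Fleiss' kappa; every scenario must have the same number of readers."""
    counts = _count_table(ratings, categories)
    n_subjects, n_raters = len(counts), len(ratings[0])
    # Mean proportion of agreeing reader pairs per scenario.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects
    # Chance agreement from the pooled category proportions.
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)  # undefined if only one category is ever used


def fleiss_kappa_by_category(ratings, categories):
    """Per-category kappa (each category versus all others combined)."""
    counts = _count_table(ratings, categories)
    n_subjects, n_raters = len(counts), len(ratings[0])
    result = {}
    for j, cat in enumerate(categories):
        p = sum(row[j] for row in counts) / (n_subjects * n_raters)
        disagree = sum(row[j] * (n_raters - row[j]) for row in counts)
        # Assumes the category was used at least once but not for every rating.
        result[cat] = 1 - disagree / (n_subjects * n_raters * (n_raters - 1) * p * (1 - p))
    return result


def cohen_kappa(first, second):
    """Cohen's kappa between two sets of grades for the same scenarios."""
    n = len(first)
    p_o = sum(a == b for a, b in zip(first, second)) / n
    m1, m2 = Counter(first), Counter(second)
    p_e = sum(m1[c] * m2[c] for c in set(first) | set(second)) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

As a toy example, three readers grading four scenarios as `[["A","A","A"], ["A","A","B"], ["B","B","B"], ["A","B","B"]]` agree on two thirds of reader pairs while chance agreement is one half, so Fleiss’ κ is one third. In the study design, `ratings` would hold 44 rows of 10 grades each, and `cohen_kappa` would compare a reader's first and second readings.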
We determined sample size to provide adequate variability to assess discrimination among complication grades and acceptably precise reliability estimates. Based on a simulation study of 5000 random samples, when the sample consisted of 44 subjects, each being assessed by nine raters on a four-category classification system, there would be a greater than 95% chance to reject the null hypothesis that Fleiss’ κ is less than 0.7, if the true Fleiss’ κ is 0.9. Chance-adjusted Fleiss’ and Cohen’s κ statistics with their 95% CIs were used to determine interobserver and intraobserver reliabilities, respectively [8 , 14 ].
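A simulation of this kind can be sketched as follows. The sketch rests on several assumptions that are not stated in the text: each scenario has a true grade drawn uniformly from four grades; each reader reports the true grade with probability a and otherwise picks a grade at random, which gives a population κ of a²; and the null hypothesis is rejected when the estimate minus 1.645 times the simulated SD of the estimates exceeds 0.7. All function and parameter names are illustrative.

```python
import math
import random


def fleiss_kappa_from_counts(counts):
    """Fleiss' kappa from a table of per-scenario category counts."""
    n_subjects, n_raters = len(counts), sum(counts[0])
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(len(counts[0]))]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


def simulate_counts(rng, n_subjects, n_raters, n_categories, true_kappa):
    """Assumed data model: report the true grade with probability sqrt(kappa), else guess."""
    a = math.sqrt(true_kappa)
    counts = []
    for _ in range(n_subjects):
        truth = rng.randrange(n_categories)
        row = [0] * n_categories
        for _ in range(n_raters):
            grade = truth if rng.random() < a else rng.randrange(n_categories)
            row[grade] += 1
        counts.append(row)
    return counts


def estimate_power(n_sims=5000, n_subjects=44, n_raters=9, n_categories=4,
                   true_kappa=0.9, null_kappa=0.7, seed=0):
    """Return (estimated power, mean simulated kappa) under the assumed model."""
    rng = random.Random(seed)
    kappas = [
        fleiss_kappa_from_counts(
            simulate_counts(rng, n_subjects, n_raters, n_categories, true_kappa))
        for _ in range(n_sims)
    ]
    mean = sum(kappas) / n_sims
    sd = math.sqrt(sum((k - mean) ** 2 for k in kappas) / (n_sims - 1))
    # Assumed one-sided rule at alpha = 0.05, using the simulated SD as the standard error.
    rejected = sum(k - 1.645 * sd > null_kappa for k in kappas)
    return rejected / n_sims, mean
```

Calling `estimate_power()` with the defaults mirrors the stated design (5000 samples, 44 subjects, nine raters, four categories). A different data-generating model or rejection rule could give a different power estimate, so the result should be read only as a plausibility check on the sample size reasoning.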
Results
Differences in grading the scenarios were evenly distributed among the grades. The reliability of the two international graders was no different from that of the graders who attended the training meeting in which the classification system was first introduced.
The Fleiss’ κ values for the first reading were 0.909 (95% CI, 0.865-0.953) for Grade I, 0.85 (95% CI, 0.806-0.894) for Grade II, 0.87 (95% CI, 0.826-0.914) for Grade III, and 0.918 (95% CI, 0.874-0.962) for Grade IV (Table 3 ). Therefore, Grade II was the least well defined. The overall mean κ for the first reading was 0.886 (95% CI, 0.861-0.912). For the second reading, Fleiss’ κ values were 0.941 (95% CI, 0.897-0.985) for Grade I, 0.827 (95% CI, 0.783-0.871) for Grade II, 0.801 (95% CI, 0.757-0.845) for Grade III, and 0.876 (95% CI, 0.832-0.920) for Grade IV (Table 3 ). The overall mean κ for the second reading was 0.859 (95% CI, 0.833-0.884).
Table 3: Fleiss’ κ for interrater reliability
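The widths of the reported intervals can be checked with a short calculation. The sketch below assumes the CIs were formed as κ ± 1.96 × SE using a widely used large-sample standard error for Fleiss’ κ under the null hypothesis of chance agreement, with 44 scenarios, 10 readers, and equal proportions (0.25) of ratings in each grade. The text does not state which variance formula was used, so this is a consistency check under those assumptions, and the function names are illustrative.

```python
import math


def se_category_kappa(n_subjects, n_raters):
    """Large-sample SE of a per-category Fleiss' kappa under the null hypothesis."""
    return math.sqrt(2.0 / (n_subjects * n_raters * (n_raters - 1)))


def se_overall_kappa(n_subjects, n_raters, proportions):
    """Large-sample SE of the overall Fleiss' kappa under the null hypothesis."""
    pq_sum = sum(p * (1 - p) for p in proportions)
    skew = sum(p * (1 - p) * ((1 - p) - p) for p in proportions)
    base = se_category_kappa(n_subjects, n_raters)
    return base * math.sqrt(pq_sum ** 2 - skew) / pq_sum


def confidence_interval(kappa, se, z=1.96):
    """Symmetric normal-approximation interval around kappa."""
    return kappa - z * se, kappa + z * se


# Assumed design values: 44 scenarios, 10 readers, four equally used grades.
per_grade_se = se_category_kappa(44, 10)
overall_se = se_overall_kappa(44, 10, [0.25, 0.25, 0.25, 0.25])
grade_one_ci = confidence_interval(0.909, per_grade_se)
overall_ci = confidence_interval(0.886, overall_se)
```

Under these assumptions the per-grade half-width is about 0.044 and the overall half-width about 0.025, which agrees with the intervals reported for both readings to within rounding. It would also explain why every per-grade interval has the same width, because this per-category standard error depends only on the number of scenarios and readers.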
The results of the second reading were similar to those of the first in terms of the range of observed κ values and means. In the first reading, Grades II and III had slightly lower chance-adjusted agreement (κ expressed as a percentage), at 85% and 87%, respectively, versus 90.9% for Grade I and 91.8% for Grade IV. In the second reading, Grades II and III had chance-adjusted agreements of 82.7% and 80.1%, respectively, versus 94.1% and 87.6% for Grades I and IV. Intraobserver testing yielded an overall Cohen’s κ value of 0.891 (95% CI, 0.857-0.925) (Table 4), indicating excellent repeatability.
Table 4: Cohen’s κ for intrarater reliability
Discussion
A standardized and reliable classification-grading scheme will allow complications to be assessed in an objective manner and allow complications to be compared among different outcome studies. The classification should emphasize the impact of the complication to the patient, the healthcare system, and the potential long-term morbidity. We adapted a validated general surgery grading system developed by Clavien et al. [5 ] for abdominal surgical complications, which relies on the magnitude of treatment needed to manage a complication and the potential for associated long-term morbidity [4 , 12 ]. In the original publication of this classification, Clavien et al. stated “lack of uniform reporting of negative outcomes makes interpretation of surgical literature difficult” [5 ], a statement particularly relevant to current orthopaedic outcome studies. Hip preservation surgery is a growing field in orthopaedics, yet surgeons have no universal method of measuring clinical outcome and reporting complications. Recent studies assert the reporting of complications is not homogeneous, well defined, or standardized in many orthopaedic studies [6 , 7 , 15 ]. Terms such as minor, moderate, and severe have been used, but they are unreliable, subjective, and often are defined separately by each author [7 , 9 ]. Standardized complication reporting is critical for future preservation surgery outcome analysis so that data from different centers may be clearly compared, and risks involved with specific surgical preservation techniques may be objectively evaluated. We therefore determined the interreader and intrareader reliabilities of the adapted classification scheme as applied to orthopaedic surgery, specifically hip preservation surgery. Our adapted classification system had high interobserver and intraobserver reliabilities for grading of complications.
There are certain limitations to this study. First, inexperience using the adapted classification scheme likely accounts for agreement being less than 100%. Nevertheless, we found high reliability, and we presume continued use of this grading system would be associated with improved reliability with time. Second, certain complications change with time, as in the case of osteonecrosis, which may take months to develop; thus, its grade will change to accommodate the appropriate level of management. A potential change in grade with time is not unique to this classification system. Additionally, complication management for procedures such as hematoma evacuation or intensive care unit admission may vary among institutions, resulting in a discrepancy in grading a specific complication. In addition, a given grade encompasses various complications, not all of which are similar. For example, a surgeon may consider a procedure to evacuate a hematoma less severe than a procedure to treat an infection. Yet, if the end result of both procedures is that the problem was treated and there is no long-term morbidity, both are classified as Grade III. The subjective argument regarding which complication is more severe has no bearing on the final outcome of the patient or the eventual grade in this classification. Despite these possible variances in grading, the adapted grading scheme is objective at its base and is a major improvement for standardized reporting of complications associated with hip preservation surgery. Third, because the readers are from the same study group, there may be bias in the data. However, we had 10 observers representing three countries (Canada, United States, and Switzerland), indicating a strong reliability in slightly different medical cultures. Many of the complications (infection, nonunion, deep venous thrombosis) were not necessarily specific to hip preservation surgery. This may be an indication for the use and testing of the adapted classification scheme in future studies of complications in other orthopaedic procedures. The reliability obtained with this investigation should be directly applicable to hip preservation surgery, as the scenarios represented actual surgical cases, but we cannot say whether the reliability would apply to other scenarios, specialties, or surgeons. Finally, there may be differences in the definition of a complication as described in the next paragraph.
In their original classification, Clavien et al. [5] first defined the term complication. When a complication classification is used in conjunction with other outcome parameters such as function, we believe there should be a distinction between what is defined as a complication and what is defined as a failure of treatment. The definition of a complication for this classification is any deviation from the normal postoperative course not inherent in the procedure. As espoused by Dindo et al. [12], a complication is different from a sequela, which is inherent to the procedure and inevitable (such as scar tissue), and from failure to cure, which occurs when the original goal of the surgery has not been met (eg, continued hip pain after arthroscopy). Further, they argue sequelae and failure to cure are not part of a surgical complication classification [12]. We agree with the concept that sequelae and failure to cure are better assessed using outcome measures rather than reported as complications. Outcome measures other than this complication classification will better define which indications and techniques are successful for hip preservation surgery to decrease failures of treatment. Poor patient selection for surgery, such as hip preservation in a patient with advanced osteoarthritis, would be evaluated by outcomes rather than by this complication classification. When outcome measures are used in conjunction with this complication classification, the risk of the procedure can be weighed against the outcome. If a complication of the surgical procedure (such as avascular necrosis) results in long-term morbidity, it would be included in the complication classification. Avascular necrosis would be classified as Grade IV, with long-term morbidity resulting from the surgery.
In the original validation study, the Clavien-Dindo classification was tested in a cohort of 6336 patients who underwent elective general surgery [12 ]. They found strong correlations between length of hospital stay, surgical complexity, and classification system grades. Reproducibility of the classification scheme also was assessed by using clinical scenarios that were mailed to international surgeons. Analysis of the clinical scenarios showed the classification system to be reproducible, simple, and useful [12 ]. In a review of the 5-year experience with this classification, there were 161 citations that used the classification system to grade complications between 2004 and March 2009. These were all in general surgical subspecialties, including gynecology [4 ]. DeOliveira et al. [10 ] adapted the Clavien-Dindo classification system for complications associated with pancreatic surgery; they concluded the adaptation provided an objective and reproducible assessment that could enable comparisons among centers. An adapted scheme also was used in a recent orthopaedic study evaluating complications after surgical hip dislocation procedures [26 ].
Based on our study results, we conclude the adapted Clavien-Dindo classification scheme for reported complications associated with hip preservation surgery has high interobserver and intraobserver reliabilities. Our adapted classification scheme endeavors to facilitate standardization of complication reporting and may be useful for all orthopaedic procedures. Additionally, it is applicable to future outcomes analysis of hip preservation surgery because of (1) any long-term morbidity from the complication, which is critical to the risk/benefit analysis of the procedure; and (2) the magnitude of treatment (inpatient, outpatient, reoperation) required to manage the complication, which is important to the overall cost to the patient, third-party insurer, and hospital.
Acknowledgments
We thank Amy Monreal BS, Joseph T. Nguyen MPH, and Pan Zhaoxing PhD for assistance with preparation of this manuscript.
References
1. Beaulé, PE., Duff, MJ. and Zaragoza, E. Quality of life following femoral head-neck osteochondroplasty for femoroacetabular impingement.
J Bone Joint Surg Am. 2007; 89: 773-779. 10.2106/JBJS.F.00681
2. Beck, M., Leunig, M., Parvizi, J., Boutier, V., Wyss, D. and Ganz, R. Anterior femoroacetabular impingement: part II. Midterm results of surgical treatment.
Clin Orthop Relat Res. 2004; 418: 67-73. 10.1097/00003086-200401000-00012
3. Chun, YS., Vauthey, JN., Ribero, D., Donadon, M., Mullen, JT., Eng, C., Madoff, DC., Chang, DZ., Ho, L., Kopetz, S., Wei, SH., Curley, SA. and Abdalla, EK. Systemic chemotherapy and two-stage hepatectomy for extensive bilateral colorectal liver metastases: perioperative safety and survival.
J Gastrointest Surg. 2007; 11: 1498-1504. 10.1007/s11605-007-0272-2
4. Clavien, PA., Barkun, J., Oliveira, ML., Vauthey, JN., Dindo, D., Schulick, RD., Santibañes, E., Pekolj, J., Slankamenac, K., Bassi, C., Graf, R., Vonlanthen, R., Padbury, R., Cameron, JL. and Makuuchi, M. The Clavien-Dindo classification of surgical complications: five-year experience.
Ann Surg. 2009; 250: 187-196. 10.1097/SLA.0b013e3181b13ca2
5. Clavien, PA., Sanabria, JR. and Strasberg, SM. Proposed classification of complications of surgery with examples of utility in cholecystectomy.
Surgery. 1992; 111: 518-526.
6. Clohisy, JC., Schutz, AL., St John, L., Schoenecker, PL. and Wright, RW. Periacetabular osteotomy: a systematic literature review.
Clin Orthop Relat Res. 2009; 467: 2041-2052. 10.1007/s11999-009-0842-6
7. Clohisy, JC., St John, LC. and Schutz, AL. Surgical treatment of femoroacetabular impingement: a systematic review of the literature.
Clin Orthop Relat Res. 2010; 468: 555-564. 10.1007/s11999-009-1138-6
8. Cohen, J. A coefficient of agreement for nominal scales.
Educ Psychol Measurement. 1960; 20: 37-46. 10.1177/001316446002000104
9. Davey, JP. and Santore, RF. Complications of periacetabular osteotomy.
Clin Orthop Relat Res. 1999; 363: 33-37. 10.1097/00003086-199906000-00005
10. DeOliveira, ML., Winter, JM., Schafer, M., Cunningham, SC., Cameron, JL., Yeo, CJ. and Clavien, PA. Assessment of complications after pancreatic surgery: a novel grading system applied to 633 patients undergoing pancreaticoduodenectomy.
Ann Surg. 2006; 244: 931-937. 10.1097/01.sla.0000246856.03918.9a
11. Santibanes, E., Ardiles, V., Gadano, A., Palavecino, M., Pekolj, J. and Ciardullo, M. Liver transplantation: the last measure in the treatment of bile duct injuries.
World J Surg. 2008; 32: 1714-1721. 10.1007/s00268-008-9650-5
12. Dindo, D., Demartines, N. and Clavien, PA. Classification of surgical complications: a new proposal with evaluation in a cohort of 6336 patients and results of a survey.
Ann Surg. 2004; 240: 205-213. 10.1097/01.sla.0000133083.54934.ae
13. Espinosa, N., Beck, M., Rothenfluh, DA., Ganz, R. and Leunig, M. Treatment of femoro-acetabular impingement: preliminary results of labral refixation. Surgical technique.
J Bone Joint Surg Am. 2007; 89 (suppl 2): 36-53. 10.2106/JBJS.F.01123
14. Fleiss, JL. Measuring nominal scale agreement among many raters.
Psychol Bull. 1971; 76: 378-382. 10.1037/h0031619
15. Goldhahn, S., Sawaguchi, T., Audigé, L., Mundi, R., Hanson, B., Bhandari, M. and Goldhahn, J. Complication reporting in orthopaedic trials: a systematic review of randomized controlled trials.
J Bone Joint Surg Am. 2009; 91: 1847-1853. 10.2106/JBJS.H.01455
16. Leunig, M., Beaulé, PE. and Ganz, R. The concept of femoroacetabular impingement: current status and future perspectives.
Clin Orthop Relat Res. 2009; 467: 616-622. 10.1007/s11999-008-0646-0
17. McKay, A., Sutherland, FR., Bathe, OF. and Dixon, E. Morbidity and mortality following multivisceral resections in complex hepatic and pancreatic surgery.
J Gastrointest Surg. 2008; 12: 86-90. 10.1007/s11605-007-0273-1
18. Patel, S., Cassuto, J., Orloff, M., Tsoulfas, G., Zand, M., Kashyap, R., Jain, A., Bozorgzadeh, A. and Abt, P. Minimizing morbidity of organ donation: analysis of factors for perioperative complications after living-donor nephrectomy in the United States.
Transplantation. 2008; 85: 561-565. 10.1097/TP.0b013e3181643ce8
19. Permapongkosol, S., Link, RE., Su, LM., Romero, FR., Bagga, HS., Pavlovich, CP., Jarrett, TW. and Kavoussi, LR. Complications of 2,775 urological laparoscopic procedures: 1993 to 2005.
J Urol. 2007; 177: 580-585. 10.1016/j.juro.2006.09.031
20. Peters, CL., Schabel, K., Anderson, L. and Erickson, J. Open treatment of femoroacetabular impingement is associated with clinical improvement and low complication rate at short-term followup.
Clin Orthop Relat Res. 2010; 468: 504-510. 10.1007/s11999-009-1152-8
21. Philippon, MJ., Briggs, KK., Yen, YM. and Kuppersmith, DA. Outcomes following hip arthroscopy for femoroacetabular impingement with associated chondrolabral dysfunction: minimum two-year follow-up.
J Bone Joint Surg Br. 2009; 91: 16-23. 10.1302/0301-620X.91B1.21329
22. Reddy, SK., Morse, MA., Hurwitz, HI., Bendell, JC., Gan, TJ., Hill, SE. and Clary, BM. Addition of bevacizumab to irinotecan-and oxaliplatin-based preoperative chemotherapy regimens does not increase morbidity after resection of colorectal liver metastases.
J Am Coll Surg. 2008; 206: 96-106. 10.1016/j.jamcollsurg.2007.06.290
23. Reddy, SK., Pawlik, TM., Zorzi, D., Gleisner, AL., Ribero, D., Assumpcao, L., Barbas, AS., Abdalla, EK., Choti, MA., Vauthey, JN., Ludwig, KA., Mantyh, CR., Morse, MA. and Clary, BM. Simultaneous resections of colorectal cancer and synchronous liver metastases: a multi-institutional analysis.
Ann Surg Oncol. 2007; 14: 3481-3491. 10.1245/s10434-007-9522-5
24. Ribero, D., Abdalla, EK., Madoff, DC., Donadon, M., Loyer, EM. and Vauthey, JN. Portal vein embolization before major hepatectomy and its effects on regeneration, resectability and outcome.
Br J Surg. 2007; 94: 1386-1394. 10.1002/bjs.5836
25. Seda-Neto, J., Godoy, AL., Carone, E., Pugliese, V., Fonseca, EA., Porta, G., Pugliese, R., Miura, IK., Baggio, V., Kondo, M. and Chapchap, P. Left lateral segmentectomy for pediatric live-donor liver transplantation: special attention to segment IV complications.
Transplantation. 2008; 86: 697-701. 10.1097/TP.0b013e318183ed22
26. Sink, EL., Beaulé, P., Sucato, D., Kim, YJ., Millis, MB., Dayton, M., Trousdale, RT., Sierra, RJ., Zaltz, I., Schoenecker, P., Monreal, A. and Clohisy, J. Multicenter study of complications following surgical dislocation of the hip.
J Bone Joint Surg Am. 2011; 93: 1132-1136. 10.2106/JBJS.HS.K.00142
27. Sundaram, CP., Martin, GL., Guise, A., Bernie, J., Bargman, V., Milgrom, M., Shalhav, A., Govani, M. and Goggins, W. Complications after a 5-year experience with laparoscopic donor nephrectomy: the Indiana University experience.
Surg Endosc. 2007; 21: 724-728. 10.1007/s00464-006-9176-6
28. Tamura, S., Sugawara, Y., Kaneko, J., Yamashiki, N., Kishi, Y., Matsui, Y., Kokudo, N. and Makuuchi, M. Systematic grading of surgical complications in live liver donors according to Clavien’s system.
Transpl Int. 2006; 19: 982-987. 10.1111/j.1432-2277.2006.00375.x