Validation of Classification
Patients with a diagnosis of CFM were identified from our prospective craniofacial patient registry, following study review and approval by the institutional review board at the Children’s Hospital of Philadelphia. Radiographic records were reviewed, and CT scan data retrieved. CT scans were included if they were performed before any mandibular surgical treatment and were of sufficient fidelity to enable fine-resolution 3-dimensional (3d) reconstruction. For patients with multiple CT scans meeting inclusion criteria, the scan at the oldest age was used to enable the broadest distribution of patient ages in our sample. The 3dCT reconstructions of the bilateral mandible and skull base in anteroposterior, lateral, and submental views were assembled for each patient and deidentified. The image sets were then loaded onto an online survey platform (SurveyMonkey, Palo Alto, Calif.).
Evaluators were identified from membership in the International Society of Craniofacial Surgery and were selected to represent a range of experience and practice locations. Demographic information for each evaluator was collected, including type of training, location, experience, number of patients with CFM seen per year, and CFM classification system used in clinical practice. An attempt was made to capture evaluators from our previous investigation on interrater reliability of the Kaban modification of the Pruzansky classification22 to control for intrarater variability and allow for potential comparison of these results with those of our previous study. Each evaluator then reviewed each CFM patient’s 3dCT radiographs in 4 views and was asked to classify 1 hemimandible (type 0 through type 4) based on the new classification. In hemifacial cases, the reviewer assessed the more severe side, except in 3 patients where the unaffected side was selected to include the type 0 (normal) classification in the evaluation. The order of 3dCT radiographs was independently randomized for each evaluator.
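The per-evaluator randomization described above can be illustrated with a minimal sketch. It assumes hypothetical patient and evaluator identifiers and a seeded generator for reproducibility; the survey platform's own randomization mechanism is not described in this report, so this is an illustration of the idea rather than the procedure actually used.

```python
import random

def randomized_orders(image_ids, evaluator_ids, seed=0):
    """Return an independently shuffled viewing order for each evaluator.

    Each evaluator gets a separate generator, so one evaluator's order
    carries no information about another's. The seed scheme is hypothetical.
    """
    orders = {}
    for evaluator in evaluator_ids:
        rng = random.Random(f"{seed}-{evaluator}")  # independent stream per evaluator
        order = list(image_ids)                      # copy; leave the input untouched
        rng.shuffle(order)
        orders[evaluator] = order
    return orders

# Hypothetical identifiers: 43 image sets, 15 evaluators.
image_ids = [f"patient_{i:02d}" for i in range(1, 44)]
evaluator_ids = [f"evaluator_{j:02d}" for j in range(1, 16)]
orders = randomized_orders(image_ids, evaluator_ids, seed=2015)
```

Every evaluator still sees all image sets exactly once; only the sequence differs.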
Several of the same 3dCT scans used in the previous assessment of the Pruzansky–Kaban classification22 were used in this study, to enable comparison of how the 2 classification systems evaluated the same deformity.
Statistical analysis was performed using Stata 12.0 (StataCorp, College Station, Tex.). Percentage agreement was calculated among evaluators. One-way analysis of variance was used to assess for differences in percentage agreement between individual evaluators and individual scoring levels. Interrater reliability was assessed with Fleiss’ κ, calculated using ReCal3 (available online at http://dfreelon.org/utils/recalfront/recal3/; accessed February–May, 2015). Evaluator percentage agreement with the Pruzansky–Kaban classification22 and with the proposed classification was compared using a two-tailed t-test with equal variances.
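For readers unfamiliar with these agreement statistics, the following is a minimal sketch of average pairwise percentage agreement and Fleiss’ κ for a complete ratings table (every evaluator rates every patient). It is not the ReCal3 implementation; the function names and the small ratings table are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter
from itertools import combinations

def average_pairwise_agreement(ratings):
    """ratings[i][j] is the type assigned to patient i by evaluator j."""
    n_raters = len(ratings[0])
    pair_scores = []
    for a, b in combinations(range(n_raters), 2):
        matches = sum(1 for row in ratings if row[a] == row[b])
        pair_scores.append(matches / len(ratings))
    return sum(pair_scores) / len(pair_scores)

def fleiss_kappa(ratings):
    """Fleiss' kappa for a complete table with the same raters per patient."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(row) for row in ratings]
    categories = sorted({c for row in ratings for c in row})
    # Observed agreement: share of agreeing rater pairs, averaged over patients.
    p_obs = sum(
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_subjects
    # Chance agreement from the overall proportion of each category.
    total = n_subjects * n_raters
    p_exp = sum((sum(c[k] for c in counts) / total) ** 2 for k in categories)
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical toy data: 4 patients rated by 3 evaluators on types 0-4.
toy = [
    [0, 0, 0],
    [1, 1, 2],
    [2, 2, 2],
    [3, 4, 3],
]
agreement = average_pairwise_agreement(toy)
kappa = fleiss_kappa(toy)
```

With a complete table, the average pairwise agreement equals the observed-agreement term of Fleiss’ κ, so κ can be read as that agreement corrected for the agreement expected by chance.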
Institutional Review Board
This study was approved by the Institutional Review Board of the Children’s Hospital of Philadelphia.
The 3dCT images of 43 patients who met inclusion criteria were assembled, representing a patient age range of 0.1–15.8 years. CFM had been diagnosed as right-sided in 20 patients (47%), left-sided in 19 patients (44%), and bilateral in 4 patients (9%). The 3dCT images were evaluated by 15 craniofacial surgeons with a mean of 15.2 years of experience (range, 1–40 years), each of whom estimated seeing an average of 27 patients (range, 10–80) with CFM annually. Twelve surgeons (80%) were in academic practice; all 15 (100%) had received fellowship training in craniofacial surgery, and they reported primarily using the Pruzansky (5, 33%), Kaban (3, 20%), OMENS (5, 33%), or a combination of the 3 (2, 14%) classifications (Table 2).
On average, evaluators classified 7.0% of patients as normal (T0), 19.2% mild (T1), 34.9% moderate (T2), 26.4% severe (T3), and 12.4% severe (T4) (Table 3). Evaluators demonstrated fair interrater reliability, with average pairwise agreement of 50.4 ± 9.9% (Fleiss’ κ = 0.34). Average pairwise Cohen’s κ was 0.340. When distinguishing deformities requiring graft or flap reconstruction (T3 and T4) from others (T0–T2), reviewers demonstrated substantial interrater reliability, with average pairwise agreement of 83.0 ± 7.6% (Fleiss’ κ = 0.64). Average pairwise Cohen’s κ was 0.647. Interrater agreement was slightly lower among evaluators with more experience (more than 13 years in practice, κ = 0.31) compared with those with less experience (fewer than 13 years in practice, κ = 0.37). Interrater agreement was similar between cohorts of surgeons who see a high volume of patients with CFM (25 or more patients per year, κ = 0.34) and those who see a smaller volume (fewer than 25 patients per year, κ = 0.33).
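The two-level analysis and the descriptors "fair" and "substantial" can be made concrete with a short sketch. It assumes that the verbal labels follow the commonly used Landis and Koch benchmarks, which is consistent with the values reported here but is not stated explicitly in this report; the function names are hypothetical.

```python
def collapse_type(deformity_type):
    """Collapse types T0-T4 (given as integers 0-4) into the two-level scale."""
    if deformity_type not in range(5):
        raise ValueError("deformity type must be an integer from 0 to 4")
    return "graft or flap" if deformity_type >= 3 else "other"

def landis_koch_label(kappa):
    """Verbal descriptor for kappa (assumed Landis and Koch benchmarks)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

# Reported values from this study.
five_level_label = landis_koch_label(0.34)   # all five types
two_level_label = landis_koch_label(0.64)    # T3/T4 versus T0-T2
```

Collapsing the scale before computing κ removes disagreements that stay within one treatment group (for example, T3 versus T4), which is why the two-level agreement is higher than the five-level agreement.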
Evaluator agreement using this classification (50.4%; 95% confidence interval, 44.7–56.1) was significantly greater than using the Pruzansky–Kaban classification22 (39.2%, 95% confidence interval, 34.3–44.1; P = 0.003). To further compare the results from this study to those of the previous study, we reviewed the findings from several 3dCT scans that were used in both studies. These were individually illustrative of improved interrater concordance using a new classification system designed for 3dCT. For example, when the 16 evaluators in the previous study were asked to apply the Pruzansky–Kaban classification to the deformity in Figure 2, 6.3% rated it “0,” 31.3% rated it “1,” 18.8% rated it “2A,” 12.5% rated it “2B,” and 31.3% rated it “3.” When asked to evaluate the same patient’s deformity using the same 3d reconstructions in this study according to the new classification, no evaluators rated it “0,” “1,” or “2”; 66.7% rated it “3”; and 33.3% rated it “4.”
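As a rough plausibility check on this comparison, the pooled-variance t statistic can be approximated from the summary figures alone. The sketch below rests on several assumptions that are not confirmed in this report: that the unit of analysis is the evaluator (15 in this study, 16 in the previous one), that the 95% confidence intervals are t-based intervals around the mean, and that the standard deviations can therefore be back-calculated from the interval widths. The critical values are taken from standard t tables.

```python
import math

def sd_from_ci(lower, upper, n, t_crit):
    """Back-calculate a sample SD from a t-based 95% CI of the mean (assumed)."""
    half_width = (upper - lower) / 2
    return half_width / t_crit * math.sqrt(n)

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled (equal) variances; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df

# Tabulated two-sided 95% critical values of Student's t.
T_CRIT_DF14 = 2.145  # n = 15 evaluators (assumed unit of analysis)
T_CRIT_DF15 = 2.131  # n = 16 evaluators (assumed unit of analysis)

sd_new = sd_from_ci(44.7, 56.1, 15, T_CRIT_DF14)  # proposed classification
sd_pk = sd_from_ci(34.3, 44.1, 16, T_CRIT_DF15)   # Pruzansky-Kaban classification
t_stat, df = pooled_t(50.4, sd_new, 15, 39.2, sd_pk, 16)
```

Under these assumptions the statistic comes out near 3.2 on 29 degrees of freedom, which lies between the tabulated two-tailed critical values for P = 0.01 (2.756) and P = 0.001 (3.659) and is therefore in line with the reported P = 0.003.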
The widespread dissemination of 3dCT has revolutionized the evaluation of many craniofacial conditions for diagnosis, preoperative planning, and postoperative assessment. The variability and involvement of multiple anatomic components of CFM render 3dCT particularly valuable; the 3d dentoskeletal and soft tissue views provide fine resolution for analysis.23 The 3dCT facilitates evaluation of deformity characteristics that enable decision making involving more recently popularized treatment techniques; for example, assessing mandibular body inadequacy in the setting of an absent ramus/condyle unit to support choice of a free fibular flap. Several studies attest to interest in a renewed classification of the CFM mandibular deformity in the era of 3dCT22,24; however, none have fully considered the evidence base or taken steps to validate a classification across a large series of patients.
An optimal diagnostic classification system should categorize a disease entity, facilitate communication among physicians and patients, and potentially guide treatment and prognosis. The different classes should be mutually exclusive and collectively exhaustive. It is challenging to balance a classification’s specificity to distinguish nuances of the disease, its ability to unify sufficient patients into groups to speak meaningfully about them, and its simplicity to be integrated into routine clinical use by a variety of surgeons. The challenge is compounded for a disease process that affects different parts of the head and neck heterogeneously and for which surgical treatment techniques vary among surgeons and generally lack an evidence base. The historical record of evolving classifications for CFM reflects this tension, and different classification systems prioritize some of these aims over others.6,8–17
The proposed classification for the CFM mandibular deformity is based on 3dCT diagnosis and incorporates a common treatment modality for each type. We found that expert evaluators demonstrated fair interrater reproducibility with average pairwise agreement of 50.4% using this classification. This was a significantly higher degree of interrater agreement than found when similar evaluators used the Kaban modification of the Pruzansky classification (average pairwise agreement, 39.2%).22
We further found that this classification lent itself to substantial agreement (κ = 0.64) among evaluators distinguishing deformities requiring graft or flap reconstruction (T3 and T4) from others (T0–T2). This is an important clinical threshold, because it differentiates between deformities that mandate treatment with tissue transfer and those that can be suitably treated with distraction or orthognathics. When evaluators used the Pruzansky–Kaban classification for the mandible in Figure 2, for example, 56% labeled the deformity as class 0, 1, or 2a and 44% labeled it class 2b or 3; in other words, they were almost evenly split on whether tissue transfer would typically be indicated for reconstruction. Using the new classification, 100% of the evaluators considered it to be T3 or T4, either of which typically indicates tissue transfer for reconstruction. Furthermore, the breakpoint between T3 and T4 deformities may represent the most subtle clinical distinction, given the subset of surgeons who do not use microvascular free tissue transfer and would perform costochondral grafting for both deformity types.
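The grouped percentages quoted for Figure 2 follow directly from the per-class percentages reported in the results. The short sketch below reproduces that arithmetic; the dictionary and function names are hypothetical, and the grouping of Pruzansky–Kaban classes 2B and 3 as the classes that typically indicate tissue transfer follows the interpretation given in the paragraph above.

```python
# Percentage of evaluators assigning each class to the deformity in Figure 2.
pruzansky_kaban = {"0": 6.3, "1": 31.3, "2A": 18.8, "2B": 12.5, "3": 31.3}
new_classification = {"T0": 0.0, "T1": 0.0, "T2": 0.0, "T3": 66.7, "T4": 33.3}

def share(ratings, classes):
    """Total percentage of evaluators who chose any of the listed classes."""
    return round(sum(ratings[c] for c in classes), 1)

pk_without_transfer = share(pruzansky_kaban, ["0", "1", "2A"])  # reported as 56%
pk_with_transfer = share(pruzansky_kaban, ["2B", "3"])          # reported as 44%
new_with_transfer = share(new_classification, ["T3", "T4"])     # reported as 100%
```

The first two totals sum to slightly more than 100 because the source percentages are each rounded to one decimal place.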
Kaban et al25 were critical of the previous study,22 which found low interrater reliability among 16 craniofacial surgeons who used the Pruzansky–Kaban classification, because “their average of 15 years’ of experience does not ensure that the Pruzansky and Kaban system was used correctly.” We would disagree that an effective classification system should be predicated on “correct” usage, that is, that it should be limited to those who have received specific training, been certified in its use, or some such arrangement. We believe that a classification is most effective when it is understandable, inclusive, and facilitates communication among professionals regardless of their depth of experience. Our findings that the interrater agreement was consistent between surgeons with higher and lower volumes of CFM patients, and slightly higher among surgeons with relatively less rather than more seniority, suggest that this classification is accessible and applicable across a range of levels of experience. That the interrater agreement of the Pruzansky–Kaban classification was limited, even among surgeons who all used the Pruzansky or full OMENS system to classify patients with CFM in their respective practices, speaks to its limited reproducibility in a period where 3dCT use has been widely adopted. But this may not discount its utility, say, within a single institution where tradition or formal training may better reinforce its correct use.
Classification of a surgical disease is also most useful when it guides surgical decision making. We learned from discussing our work on this classification with other craniofacial surgeons that the anatomical variations of greatest interest were the ones that influenced branch points in their own surgical algorithms for CFM management. Two questions, thus, shaped our refinement of the classification: what are the most common features of algorithms for managing the mandible in CFM, and what is the evidence base for these practices? For instance, the zygomatic arch in CFM demonstrates variability that may not correspond with the mandible or other OMENS features,26 yet if deformed may suggest to the surgeon the need to construct a neo-temporal fossa.27 Given reports of high ankylosis rates associated with temporomandibular joint reconstruction as opposed to apposition of the rib/fibular construct with the skull base in type 3 or 4 deformities, we did not subclassify based on zygomatic arch deformity or degree of medialization of the condylar remnant. As another example, the degree of soft-tissue deficiency in a patient with CFM could influence the decision making for mandibular bony reconstruction. In a patient with borderline mandibular body bone stock who could otherwise be a candidate for either procedure, a free fibular flap with a large myofascial cuff would augment the soft tissue to a greater degree than a costochondral graft. Nonetheless, a patient with sufficient mandibular body bone stock (type 3) could just as easily undergo costochondral grafting, with structural lipoaspirated fat grafting to treat the soft-tissue deficiency. Furthermore, Lauritzen et al13 point out that following adequate skeletal reconstruction in CFM, there may not be need for subsequent soft-tissue augmentation.
Thus, we aimed to streamline this classification to the fewest essential anatomic components to drive decision making, acknowledging that borderline cases could and should be decided by factors such as soft-tissue deficiency. We note that emphasizing the role of bony 3dCT does not dismiss the importance of soft-tissue evaluation and physical examination in CFM. The latter are important, but in our opinion are downstream modifiers of decision making after the 3dCT rather than the other way around.
The obvious advantage of a system that incorporates not only diagnostic classifications but also a corresponding treatment modality is that it more directly guides surgical decision making. Lauritzen et al13 used this approach to CFM in 1985 by proposing a treatment-guidance scheme dividing skeletal deformities based on management considerations. There are 2 potential disadvantages, however. First, as treatment modalities evolve, a classification that prescribes certain treatments risks becoming obsolete. Second, in a situation where surgeons agree on a type of deformity but disagree on the appropriate treatment for it, a classification that prescribes certain treatments risks objection from those who disagree with therapy only. While acknowledging these 2 limitations, we discovered that what many surgeons consider the most relevant anatomy from a diagnostic standpoint was that which distinguished among the appropriateness of different surgical interventions. In other words, what was diagnostically most relevant was intrinsically tied to the treatment options. As new treatment paradigms appear in the future, so likely will new aspects of the deformity become important from a diagnostic standpoint. Hence, even if the classification did not set out prescribed treatment for each type of deformity, new treatment innovations are still likely to render its diagnostic value obsolete. Given this, we prioritized designing a classification to be most fully useful to the current paradigm of treatment, rather than trying to limit its usefulness now to somehow extend its lifetime in the future.
Several limitations of the study warrant discussion. The distribution of our evaluating surgeons is concentrated in North America and Europe, and future studies would benefit from representation of other regions. The classification scheme is tied to the 3dCT, and as the role of diagnostic modalities using ionizing radiation evolves, alternative diagnostic tests may supplant 3dCT, which could make this classification scheme less useful or even obsolete. Furthermore, because the spectrum of mandibular hypoplasia in CFM falls on a continuum, and intrinsic breakpoints do not exist to distinguish inherent deformity types to which responses could be compared, we used interrater agreement as the benchmark of validity. Next, the pool of raters overlapped with the previous study of the Pruzansky–Kaban classification, and their improved agreement could be due in part to repeat testing bias. Although this is conceivable, the 2-year interval between studies is likely to have eliminated much retention. Furthermore, given that this new classification was shared with evaluators only at the time they completed the web-based survey, their familiarity with it was limited; this should have biased our results disadvantageously, if anything. Finally, we attempted to reduce design bias by using 5 classification options to mirror the methodology of the previous study.
In conclusion, the proposed classification shows significantly improved agreement among surgeons in stratifying the CFM mandibular deformity compared with existing classifications. It shows substantial agreement among evaluators distinguishing deformities requiring tissue transfer-based reconstruction from those that do not. The improved, but only fair, agreement across all deformity types likely reflects the challenges inherent to a disease with heterogeneity, a continuum of severity, and many differing treatment modalities. Nonetheless, improved diagnostic agreement will hopefully prove to be an enabling step toward establishing a firmer evidence base in treatment and prognosis.
We wish to thank Priya Reddy, MD, for her assistance with manuscript preparation.
1. Gorlin RJ, Jue KL, Jacobsen U, et al. Oculoauriculovertebral dysplasia. J Pediatr. 1963;63:991–999
2. Gougoutas AJ, Singh DJ, Low DW, et al. Hemifacial microsomia: clinical features and pictographic representations of the OMENS classification system. Plast Reconstr Surg. 2007;120:112e–120e
3. Moulin-Romsée C, Verdonck A, Schoenaers J, et al. Treatment of hemifacial microsomia in a growing child: the importance of co-operation between the orthodontist and the maxillofacial surgeon. J Orthod. 2004;31:190–200
4. Dhillon M, Mohan RP, Suma GN, et al. Hemifacial microsomia: a clinicoradiological report of three cases. J Oral Sci. 2010;52:319–324
5. Ross RB. Lateral facial dysplasia (first and second branchial arch syndrome, hemifacial microsomia). Birth Defects Orig Artic Ser. 1975;11:51–59
6. Vento AR, LaBrie RA, Mulliken JB. The O.M.E.N.S. classification of hemifacial microsomia. Cleft Palate Craniofac J. 1991;28:68–76; discussion 77
7. Sakamoto Y, Nakajima H, Ogata H, et al. The use of mandibular body distraction in hemifacial microsomia. Ann Maxillofac Surg. 2013;3:178–181
8. Longacre JJ, deStephano GA, Holmstrand KE. The surgical management of first and second branchial arch syndromes. Plast Reconstr Surg. 1963;31:507–520
9. Grabb WC. The first and second branchial arch syndrome. Plast Reconstr Surg. 1965;36:485–508
10. Converse JM, Coccaro PJ, Becker M, et al. On hemifacial microsomia. The first and second branchial arch syndrome. Plast Reconstr Surg. 1973;51:268–279
11. Edgerton MT, Marsh JL. Surgical treatment of hemifacial microsomia. (First and second branchial arch syndrome). Plast Reconstr Surg. 1977;59:653–666
12. Tenconi R, Hall BD. Hemifacial microsomias: phenotypic classification, clinical implications and genetic aspects. In: Harvold EP, Vargervik K, Chierici G, eds. Treatment of Hemifacial Microsomia. New York: Alan R. Liss; 1983:39–49
13. Lauritzen C, Munro IR, Ross RB. Classification and treatment of hemifacial microsomia. Scand J Plast Reconstr Surg. 1985;19:33–39
14. David DJ, Mahatumarat C, Cooter RD. Hemifacial microsomia: a multisystem classification. Plast Reconstr Surg. 1987;80:525–535
15. Pruzansky S. Not all dwarfed mandibles are alike. Birth Defects. 1969;5:120
16. Kaban LB, Padwa BL, Mulliken JB. Surgical correction of mandibular hypoplasia in hemifacial microsomia: the case for treatment in early childhood. J Oral Maxillofac Surg. 1998;56:628–638
17. Santos DT, Miyazaki O, Cavalcanti MG. Clinical-embryological and radiological correlations of oculo-auriculo-vertebral spectrum using 3D-CT. Dentomaxillofac Radiol. 2003;32:8–14
18. Shah N, Bansal N, Logani A. Recent advances in imaging technologies in dentistry. World J Radiol. 2014;6:794–807
19. Kane AA, Lo LJ, Christensen GE, et al. Relationship between bone and muscles of mastication in hemifacial microsomia. Plast Reconstr Surg. 1997;99:990–997; discussion 998
20. Meazzini MC, Mazzoleni F, Canzi G, et al. Mandibular distraction osteogenesis in hemifacial microsomia: long-term follow-up. J Craniomaxillofac Surg. 2005;33:370–376
21. Takahashi-Ichikawa N, Susami T, Nagahama K, et al. Evaluation of mandibular hypoplasia in patients with hemifacial microsomia: a comparison between panoramic radiography and three-dimensional computed tomography. Cleft Palate Craniofac J. 2013;50:381–387
22. Wink JD, Goldstein JA, Paliga JT, et al. The mandibular deformity in hemifacial microsomia: a reassessment of the Pruzansky and Kaban classification. Plast Reconstr Surg. 2014;133:174e–181e
23. Mielnik-Błaszczak M, Olszewska K. Hemifacial microsomia: review of the literature. Dent Med Probl. 2011;48:80–85
24. Madrid JRP, Montealegre G, Gomez V. A new classification based on the Kaban’s modification for surgical management of craniofacial microsomia. Craniomaxillofac Trauma Reconstr. 2010;3:1–7
25. Kaban LB, Padwa B, Mulliken JB. Mandibular deformity in hemifacial microsomia: a reassessment of the Pruzansky and Kaban classification. Plast Reconstr Surg. 2014;134:657e–658e
26. Tuin AJ, Tahiri Y, Paine KM, et al. Clarifying the relationships among the different features of the OMENS+ classification in craniofacial microsomia. Plast Reconstr Surg. 2015;135:149e–156e
27. Quinn PD. Color Atlas of Temporomandibular Joint Surgery. St. Louis, Mo.: Mosby; 1998
Copyright © 2016 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of the American Society of Plastic Surgeons. All rights reserved.