Craniofacial microsomia (CFM) is a complex congenital condition characterized by the underdevelopment of structures arising from the first and second pharyngeal arches.1,2 The phenotypic spectrum of CFM includes malformations of the orbits, ear, mandible, facial soft tissue, and facial nerve. Prevalence is approximately 1 in 3,500 live births3 and higher among boys and children from Hispanic and Native American families.4 Children with CFM have myriad health care needs that require multiple, cross-disciplinary evaluations and interventions, including surgeries that help restore facial form and function.5
Facial nerve palsy is one of the many anomalies associated with CFM6 and the extent of facial nerve involvement can be difficult to ascertain in infancy due to the patient’s inability to cooperate with a structured facial nerve examination. In older children, facial palsy is evaluated by asking children to imitate specific facial expressions that indicate the responsiveness of different branches of the facial nerve. However, with infants, clinicians must rely on informal observations of spontaneous facial expressions during clinic visits, a process that is potentially unreliable and may under-identify palsy, delaying the recognition of need for relevant treatments (eg, feeding/eating and speech interventions; reanimation surgery). Moreover, facial palsy in early childhood, either alone or in combination with scarring from surgical interventions, may undermine infants’ facial communications of emotions to caregivers. The caregiver’s recognition of and responsiveness to infant facial expressions is a critical developmental process that has been linked to children’s later attainment of emotion regulation and socialization skills.7–9 If the facial anomalies associated with CFM disrupt this process, psychosocial interventions could be used to minimize or prevent such problems.10–12 However, such an approach would require a novel, reliable method of assessing facial expressions in very young patients.
We are developing such a method, using standardized emotion induction tasks to elicit positive and negative facial expressions in 12- to 14-month-old infants with CFM (“cases”) and demographically similar infants without CFM (“controls”). In a previous study,13 we used human coders and a manual coding system (Facial Action Coding System for Infants and Young Children; Oster, 2016) to compare the facial responses of cases versus controls. Positive and negative emotion induction tasks successfully elicited the intended facial expressions, but overall expressiveness did not differ between the 2 groups of infants, and only modestly distinguished the phenotypic variants associated with CFM (eg, microtia with or without mandibular hypoplasia). These null findings may reflect 2 limitations of manual coding. First, manual coding does not capture the dynamics of expression: the displacement, velocity, and acceleration of facial movements and of the head. Head movement is potentially important because face and head movement are inseparable in emotion communication and are strongly coupled, together influencing how facial expressions of emotion are communicated and perceived.14–16 Second, small and subtle, but important, movements may have been missed by the coders.
In recent years, machine learning17 has enabled the use of automatic facial analysis (AFA), which tracks and quantifies facial and head movements directly from digital video data.18 AFA has made possible the reliable and efficient measurement of displacement, velocity, and acceleration of facial and head movement. In previous research, AFA has revealed strong associations between affect and the dynamics of facial and head movement.19–21
In the current study, we used AFA to investigate whether infants with and without CFM differ in the dynamics of facial and head movement during tasks designed to elicit positive and negative affect. Face and head movements were tracked from 2D video using a well-validated generic approach made possible by training the algorithm with high-resolution 3D face-scans.22–24 Expressiveness was operationalized as the displacement, velocity, and acceleration of automatically tracked facial landmarks and head pitch and yaw. We addressed 2 specific questions: Do infants with CFM differ from controls in terms of head and facial expressiveness? Are these group differences specific to phenotypic anomalies associated with CFM? Secondary analyses involving all participants examined the potential moderating influence of infant sex, ethnicity, and type of emotion induction (positive versus negative).
Participants were 113 ethnically diverse 13-month-old infants. Cases (n = 63) were recruited from hospital-based craniofacial centers at Children’s Hospital of Los Angeles, Children’s Hospital of Philadelphia, Seattle Children’s Hospital, University of Illinois-Chicago, and University of North Carolina-Chapel Hill. Inclusion criteria were (1) Have at least one of the CFM indicators developed by the Facial Asymmetry Collaborative for Interdisciplinary Analysis and Learning (FACIAL) network25 (microtia, anotia, facial asymmetry, preauricular or facial tag(s), epibulbar dermoid, macrostomia); (2) Be diagnosed by a regional craniofacial team; (3) Be between the ages of 12 and 24 months (or corrected age if born between 34 and 36 weeks’ gestation); (4) Have a legal guardian able to provide informed written consent and willing to comply with all study procedures; and (5) Be available for the duration of the study. Exclusion criteria were (1) Diagnosis of a known syndrome (eg, Townes-Brocks or Nager syndromes); (2) Presence of an abnormal karyotype or major medical or neurological condition (eg, cancer, cerebral palsy); (3) Premature birth (less than 34 weeks’ gestation); (4) Any circumstance that would preclude the family’s ability to participate fully in the research; (5) A sibling already participating in the Craniofacial Microsomia: Longitudinal Outcomes for Children Pre-Kindergarten (CLOCK) study; and (6) Consenting parent unable to speak English or Spanish. Because the dynamics of facial movement can vary as children age, participants more than 15 months of age were excluded.
Controls (n = 50) were recruited through pediatric practices located near the hospitals from which cases were recruited. These sources were supplemented by flyers posted in pediatric practices and from available infant studies participant pools. Inclusion criteria were demographic characteristics that met frequency-matching criteria for the case cohort with respect to infant age and sex, socioeconomic status, and language spoken in the home (English or Spanish).
Exclusion criteria included (1) meeting one or more of the exclusionary criteria for cases; (2) diagnosis or history of any disorder, condition, or injury that would affect facial features; and (3) age older than 15 months.
Phenotypic Classification of Cases
We classified the participant’s phenotype with a case-by-case integration of standardized photographic ratings of facial features and data taken from a medical history interview and medical charts.26 The photographic protocol and classification method27 generated 3 phenotypic subgroups: Microtia only (absence of other CFM-related features; n = 16); Microtia plus mandibular hypoplasia (n = 38); and Other combinations of CFM-associated malformations (n = 9).
Infants’ expressiveness was observed in response to 2 standardized emotion inductions, one intended to elicit positive affect and the other negative affect.28 For each task, infants were seated in a highchair in front of an experimenter and their mother to the side (see figure, Supplemental Digital Content 1, which displays observational procedure, https://links.lww.com/PRSGO/A954).
Positive Emotion Task (PosET)
The experimenter engaged the infant by blowing soap bubbles toward them and using her voice to build suspense and elicit surprise, amusement, or interest (see figure, Supplemental Digital Content 2, which displays examples of Negative Emotion Task (left) and Positive Emotion Task (right), https://links.lww.com/PRSGO/A955).
Negative Emotion Task (NegET)
The examiner first allowed the infant to play with an attractive toy and then covered the toy with a clear plastic bin just out of the infant’s reach, which typically elicited frustration, anger, or distress (Supplemental Digital Content 2). The NegET was terminated if the infant became too upset or at the mother’s request.
The 2 tasks (ie, PosET and NegET) were each repeated 1–3 times unless the infant became too distressed to continue. Both tasks were recorded using a Sony DXC190 compact camera.
Automatic Face Analysis (AFA)
A person-independent 3D face tracker (Zface; Jeni et al, 2016) was used to track the 3D coordinates of 49 fiducial points and head pitch (ie, head nods) and yaw (ie, head turns) in each video frame [see figure, Supplemental Digital Content 3, which displays examples of AFA tracking results (head orientation pitch (green), yaw (blue), and roll (red)) and the 49 fiducial points, https://links.lww.com/PRSGO/A956]. Tracking results were overlaid on the source videos and manually reviewed. Frames that could not be tracked or failed visual review were not analyzed.
Expressiveness Using Facial Movement Dynamics
The movement of the 49 tracked 3D fiducial points corresponds to the movement (excluding rigid head movements) of the underlying facial features (ie, eyes, eyebrows, and mouth) and was used to measure expressiveness from facial movement. The displacement, velocity, and acceleration of each of the 49 fiducial points were computed. The root mean square (RMS) was then used to measure the magnitude of variation in each point’s displacement, velocity, and acceleration. The RMSs of the fiducial points’ displacement, velocity, and acceleration are referred to as facial displacement, facial velocity, and facial acceleration, respectively. Because the movements of individual points are highly correlated, principal components analysis was used to reduce the number of parameters. The first principal components of displacement, velocity, and acceleration accounted for 63%, 76%, and 75% of the respective variance and were used as measurements of facial expressiveness.
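The per-point computation described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' exact pipeline: the frame rate, the use of the segment mean as the displacement reference, and all function names are assumptions.

```python
import numpy as np

def facial_dynamics(landmarks, fps=30.0):
    """RMS displacement, velocity, and acceleration per fiducial point.

    landmarks: array (n_frames, n_points, 3) of tracked 3D coordinates,
    with rigid head motion already removed (as the tracker provides).
    Returns three arrays of length n_points.
    """
    dt = 1.0 / fps
    # Displacement: per-frame distance of each point from its segment mean
    disp = np.linalg.norm(landmarks - landmarks.mean(axis=0), axis=2)
    vel = np.gradient(disp, dt, axis=0)   # frame-to-frame velocity
    acc = np.gradient(vel, dt, axis=0)    # frame-to-frame acceleration
    rms = lambda x: np.sqrt(np.mean(x ** 2, axis=0))
    return rms(disp), rms(vel), rms(acc)

def first_principal_component(features):
    """PC1 scores for a (participants x points) matrix of per-point RMS
    values, collapsing 49 correlated measures into one expressiveness score."""
    centered = features - features.mean(axis=0)
    # SVD-based PCA: project onto the first right singular vector
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]
```

In this sketch, each participant-task segment would yield one 49-element RMS vector per dynamic measure, and PCA would then be fit across participants.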
Expressiveness Using Head Movement Dynamics
Head angles in the vertical (ie, pitch) and horizontal (ie, yaw) directions were used. Head angles were converted into angular displacement by subtracting the overall mean head angle from each observed head angle within each valid segment (ie, consecutive valid frames). Angular velocity and angular acceleration for pitch and yaw were computed as the derivatives of angular displacement and angular velocity, respectively. As with facial movement, the RMS was used to measure the magnitude of variation of the angular displacement, angular velocity, and angular acceleration for pitch and yaw. The RMSs of the angular displacement, angular velocity, and angular acceleration for pitch and yaw were used as measurements of head expressiveness.
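The head-angle computation above can be sketched for a single angle channel (pitch or yaw). Again this is an illustrative sketch under assumed conventions (degrees, an assumed frame rate, hypothetical function name), not the authors' implementation.

```python
import numpy as np

def head_angle_dynamics(angles, fps=30.0):
    """RMS angular displacement, velocity, and acceleration for one head
    angle (pitch or yaw) over a segment of consecutive valid frames.

    angles: 1D array of head angles (degrees), one value per frame.
    """
    dt = 1.0 / fps
    disp = angles - angles.mean()   # angular displacement about the segment mean
    vel = np.gradient(disp, dt)     # angular velocity (deg/s)
    acc = np.gradient(vel, dt)      # angular acceleration (deg/s^2)
    rms = lambda x: float(np.sqrt(np.mean(x ** 2)))
    return rms(disp), rms(vel), rms(acc)
```

A perfectly still head yields zero for all three measures, so larger RMS values index greater head expressiveness.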
Expressiveness was the primary outcome and was operationalized using the dynamics of facial and head movement during both the PosET and NegET. Facial expressiveness was operationalized as the displacement, velocity, and acceleration of the 49 automatically tracked landmarks. Head expressiveness was operationalized as the angular displacement, angular velocity, and angular acceleration of pitch and yaw.
To confirm that the induction tasks elicited the targeted affects, generalized estimating equations with an independence working correlation structure were used to compare expressiveness between PosET and NegET after adjustment for case status, sex, and ethnicity (Hispanic or Latino versus non-Hispanic/non-Latino). Linear regression with robust standard error estimates was used to estimate differences in expressiveness between cases and controls, as well as differences across phenotypes, after adjustment for sex and ethnicity.
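As a rough illustration of the case-control model, robust (heteroskedasticity-consistent, HC0 "sandwich") standard errors for a linear regression can be computed directly. This is a generic sketch with simulated data, not the authors' statistical software or dataset; all variable names are hypothetical.

```python
import numpy as np

def ols_robust(X, y):
    """OLS coefficients with HC0 (sandwich) robust standard errors.

    X: (n, p) design matrix including an intercept column; y: (n,) outcome.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 [X' diag(e_i^2) X] (X'X)^-1
    meat = X.T @ (X * (resid ** 2)[:, None])
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Hypothetical usage: expressiveness ~ intercept + case status + sex + ethnicity
rng = np.random.default_rng(1)
n = 200
case = rng.integers(0, 2, n)
sex = rng.integers(0, 2, n)
eth = rng.integers(0, 2, n)
X = np.column_stack([np.ones(n), case, sex, eth])
y = 1.0 - 0.4 * case + rng.normal(scale=1.0, size=n)  # simulated outcome
beta, se = ols_robust(X, y)
```

The coefficient on case status estimates the adjusted case-control difference in expressiveness, with the sandwich standard error guarding against heteroskedastic residuals.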
Due to the exploratory nature of this research, P values were unadjusted for multiple comparisons and they did not serve as the sole basis for estimating the strength of the findings. Instead, we assessed the magnitude of observed effect sizes, their precision, and the consistency of these estimates across multiple measures and the 2 emotion induction tasks.29 Standardized mean difference effect sizes were calculated using a modification of Cohen’s d30 that divides the estimated mean difference by the RMS error of the model.
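The modified Cohen's d described above divides the estimated mean difference by the model's RMS error. A minimal sketch follows; the degrees-of-freedom adjustment shown is an assumption about the cited modification, and the function name is hypothetical.

```python
import numpy as np

def cohens_d_from_model(mean_diff, residuals, n_params):
    """Standardized mean difference: the adjusted group difference divided
    by the model's root-mean-square error (residual standard deviation).

    residuals: model residuals; n_params: number of fitted parameters.
    """
    residuals = np.asarray(residuals, dtype=float)
    rms_error = np.sqrt(np.sum(residuals ** 2) / (len(residuals) - n_params))
    return mean_diff / rms_error
```

Dividing by the model's RMS error (rather than a pooled raw SD) expresses the adjusted difference in units of residual variation left after covariate adjustment.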
Average age at the time of the assessment was 13.2 months (SD = 0.71; Table 1). Relative to controls, cases were more likely to be male, to be Hispanic or Latino, to receive Medicaid insurance, and to have received speech, language, or hearing services.
We observed strong differences in expressiveness between the positive and negative affect tasks across 19 of 21 measures (P < 0.01). Head and face displacement, velocity, and acceleration were consistently greater during the NegET than the PosET (Table 2).
After adjustment for sex and ethnicity, we observed little difference between cases and controls in holistic facial expressiveness and head movement (Table 3). The one significant finding (pitch displacement during the positive emotion task) was in the expected direction: less expressiveness in cases than controls.
Analyses by Phenotype
Significant differences emerged between 2 of the 3 phenotypes and controls. For microtia with mandibular hypoplasia, face and head dynamics were significantly lower for cases than controls for 3 of 18 comparisons and 2 others were marginally significant (Table 4).
For other CFM-associated phenotypes, face and head dynamics were significantly lower for cases than controls for 2 of 18 comparisons and marginally significant for 2 others (Table 5). For microtia only, no differences were found between cases and controls.
After adjustment for case status and sex, Hispanic/Latino infants had lower levels of expressiveness in facial velocity and acceleration, pitch velocity and acceleration, and yaw velocity and acceleration. Effect sizes ranged from −0.7 to −0.1 with the strongest effects observed for the NegET (P values ranged from <0.001 to 0.71).
The facial features associated with CFM increase risk for alterations in facial expressiveness and may lead to impairments in social and emotional development. We explored whether such effects emerge by the end of the first year of life and whether they vary among children with different phenotypic features in the broad category of CFM. To elicit individual differences in expressiveness, we used 2 well-validated emotion induction tasks that are known to provoke individual differences in affective reactivity. Expressiveness was objectively quantified using automated face analysis. Consistent with previous literature in both infants and adults, face and head movement displacement, velocity, and acceleration strongly differed between negative and positive tasks. Highly significant differences were found on 7 of 9 measures.
When we included all children with CFM, little evidence emerged for differences between infants with and without CFM. Of 18 comparisons between cases and controls, only one was statistically significant, which is about what one would expect by chance. However, when phenotype was considered, we observed more discernible differences between controls and 2 subgroups of cases. Cases with microtia plus mandibular hypoplasia and those with other CFM-associated features were less facially expressive than controls. During both positive and negative emotion tasks, the microtia plus mandibular hypoplasia subgroup had lower facial movement displacement and lower velocity and acceleration of head pitch (ie, head nods). The latter subgroup had lower head pitch displacement during both positive and negative tasks, lower head yaw (ie, head turns) displacement during the negative task, and marginally lower head yaw displacement in the positive task. No differences were found for the less severe, microtia-only group. Nonetheless, we must acknowledge that adjustment for multiple comparisons (eg, Bonferroni correction) would require a P value of <0.001 to be considered statistically significant. However, as our analyses were exploratory in nature, using novel techniques to assess facial and head expressiveness, statistical significance is of less concern at this early stage of investigation. Furthermore, we are currently assessing facial expressiveness in the same cohort of children at approximately 3 years of age to see whether we can replicate these patterns.
In comparison with the findings of Hammal et al,13 the current findings more strongly suggest that CFM, in particular the microtia plus mandibular hypoplasia and other CFM-associated phenotypes, is associated with a reduction in expressiveness as early as 13 months of age. Several factors may account for the increased sensitivity of our approach. First, the holistic measures were continuous, whereas Hammal et al13 used summed, binary measures (occurrence versus nonoccurrence), which emphasized the density of actions (ie, how many were occurring) rather than their intensity. Measurement of intensity was central to our approach, and we sampled a far larger number of facial movements: the prior study attended to 9 facial action units, whereas the present study sampled 49 points across the face plus head pitch and yaw. Second, the temporal envelope of expressiveness was explicitly quantified in the current study (eg, velocity), whereas the action unit approach was insensitive to intensity change over time. Previous work in both infants and adults suggests that variation in displacement, velocity, and acceleration over time is strongly related to affect and interpersonal communication. Third, Hammal et al13 used manual coding, whereas our approach was fully automatic. Recent breakthroughs in computer vision and machine learning have made possible reliable automatic coding of action units that is consistent with experts’ manual coding.31 Although human experts and algorithms can now code action units comparably, face and head movement dynamics can be measured reliably only by automatic algorithms. Reliable, automated measurement of face and head movement dynamics was necessary to further investigate the expressiveness of infants with CFM.
An unexpected finding was that Hispanic infants were less expressive than non-Hispanic infants, regardless of case status and phenotype. Previous research has found that Chinese infants are less expressive than Euro-American infants, with Japanese infants either comparable with Euro-American infants or between them and Chinese infants.32 Cross-cultural differences in expressiveness between Hispanic and non-Hispanic American infants have not previously been documented to our knowledge. For clinical purposes in evaluating expressiveness in Hispanic infants, it would be important to have separate norms for Hispanic and non-Hispanic Euro-American infants, as well as for East Asian infants.
Clinically, the present findings suggest that infants with only microtia have minimal risk for alterations in facial expressiveness. Elevated risk is suggested for infants with more severe phenotypic presentations of CFM. Increased monitoring and surgical or behavioral intervention may be indicated for these subgroups of patients.
Two limitations of the current study may be noted. One is that Hispanic infants were over-represented among cases relative to controls. As a consequence, ethnicity was included as a covariate in the analyses. Because ethnicity was related to expressiveness, controlling for ethnicity may have reduced sensitivity to detect CFM-related differences in expressiveness; that is, we may have underestimated CFM effects. The other limitation is the relatively small number of facial landmarks quantified relative to what is possible. We sampled 49 landmarks; to provide denser sampling of facial movement, future work should consider using a larger number of landmarks.
In this study, we have demonstrated the initial application of a novel, machine learning approach to the measurement of facial expressiveness in infants with a congenital condition that carries an elevated risk of facial palsy. Infants with CFM phenotypes beyond isolated microtia were less expressive than control infants. This finding, which requires replication, suggests that infants with more severe CFM begin to diverge in expressiveness from controls by 13 months of age. Longitudinal studies will be needed to learn whether these differences are stable or increase through early childhood, whether similar effects emerge for the less severe phenotype of microtia only and whether individual variation in facial expressiveness among infants with CFM predicts their psychosocial status in the preschool years.
The authors thank our colleagues Drs. Kathy Kapp-Simon (University of Illinois-Chicago), Amelia Drake (University of North Carolina-Chapel Hill), Alexis Johns (Children’s Hospital of Los Angeles), and Leanne Magee (Children’s Hospital of Philadelphia) at the participating craniofacial centers and the families who so generously volunteered their time to participate in this research.
1. Gorlin RJ, Cohen MM, Hennekam RCM. Syndromes of the Head and Neck. New York, N.Y.: Oxford University Press; 2001.
2. Heike CL, Hing AV. Craniofacial microsomia overview. In: Pagon RA, Dolan CR, Stephens K, et al, eds. GeneReviews. Seattle, Wash.: University of Washington; 2009.
3. Poswillo D. The aetiology and pathogenesis of craniofacial deformity. Development. 1988;103:207–212.
4. Harris J, Källén B, Robert E. The epidemiology of anotia and microtia. J Med Genet. 1996;33:809–813.
5. Birgfeld CB, Heike CL, Saltzman BS, et al. Reliable classification of facial phenotypic variation in craniofacial microsomia: a comparison of physical exam and photographs. Head Face Med. 2016;12:14.
6. Heike CL, Luquetti DV, Hing AV. Craniofacial microsomia overview. 2009 Mar 19 [Updated 2014 Oct 9]. In: Adam MP, Ardinger HH, Pagon RA, et al, eds. GeneReviews® [Internet]. Seattle, Wash.: University of Washington, Seattle; 1993–2018. Available at https://www.ncbi.nlm.nih.gov/books/NBK5199/
7. Bowlby J. Attachment. New York, N.Y.: Basic Books; 1969.
8. Pederson DR, Moran G. Expressions of the attachment relationship outside of the strange situation. Child Dev. 1996;67:915–927. doi: 10.2307/1131870
9. Tronick EZ. Emotions and emotional communication in infants. Am Psychol. 1989;44:112–119.
10. Butterfield PM, Martin CA, Prairie AP. Emotional Connections: How Relationships Guide Early Learning. Instructor’s Manual. Washington, D.C.: National Center for Infants, Toddlers and Families; 2003.
11. Lee G, McCreary L, Breitmayer B, et al. Promoting mother-infant interaction and infant mental health in low-income Korean families: attachment-based cognitive behavioral approach. J Spec Pediatr Nurs. 2013;18:265–276.
12. Bogart KR, Tickle-Degnen L, Ambady N. Compensatory expressive behavior for facial paralysis: adaptation to congenital or acquired disability. Rehabil Psychol. 2012;57:43–51.
13. Hammal Z, Cohn JF, Wallace ER, et al. Facial expressiveness in infants with and without craniofacial microsomia: preliminary findings. Cleft Palate Craniofac J. 2018;55:711–720.
14. Keltner D. Signs of appeasement: evidence for the distinct displays of embarrassment, amusement and shame. J Personal Soc Psychol. 1995;68:441–454.
15. Ambadar Z, Cohn JF, Reed LI. All smiles are not created equal: morphology and timing of smiles perceived as amused, polite, and embarrassed/nervous. J Nonverbal Behav. 2009;33:17–34.
16. Busso C, Deng Z, Grimm M, et al. Rigid head motion in expressive speech animation: analysis and synthesis. IEEE Transactions on Audio, Speech and Language Processing. 2007;15:1075–1086. doi: 10.1109/TASL.2006.885910
17. Bishop CM. Pattern Recognition and Machine Learning. New York, N.Y.: Springer; 2006.
18. Corneanu C, Oliu M, Cohn JF, et al. Survey on RGB, thermal, and multimodal approaches for facial expression analysis: history, trends, and affect-related applications. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2016;38:1548–1568. doi: 10.1109/TPAMI.2016.2515606
19. Kacem A, Hammal Z, Cohn JF, et al. Detecting depression severity by interpretable representations of motion dynamics. IEEE International Conference on Automatic Face and Gesture Recognition. Xi’an, China: IEEE; 2018.
20. Hammal Z, Cohn JF, Messinger DS. Head movement dynamics during play and perturbed mother-infant interaction. IEEE Trans Affect Comput. 2015;6:361–370.
21. Hammal Z, Cohn JF, Heike CL, et al. Automatic measurement of head and facial movement for analysis and detection of infant positive and negative affect. Frontiers in Human-Media Interaction. 2015;2:1–11.
22. Jeni LA, Cohn JF, Kanade T. Dense 3D face alignment from 2D video for real-time use. Image Vis Comput. 2017;58:13–24.
23. Jeni LA, Tulyakov S, Yin L, et al. The first 3D face alignment in the wild (3DFAW) challenge. European Conference on Computer Vision. Amsterdam, the Netherlands; 2016.
24. Jeni LA, Cohn JF. Person-independent 3D gaze estimation using face frontalization. ChaLearn Looking at People and Faces of the World: Face Analysis CVPR Workshop and Challenge. Las Vegas, Nev.; 2016.
25. Birgfeld CB, Heike C. Craniofacial microsomia. Semin Plast Surg. 2012;26:91–104.
26. Heike CL, Wallace E, Speltz ML, et al. Characterizing facial features in individuals with craniofacial microsomia: a systematic approach for clinical research. Birth Defects Res A Clin Mol Teratol. 2016;106:915–926.
27. Speltz ML, Kapp-Simon KA, Johns AL, et al. Neurodevelopment of infants with and without craniofacial microsomia. J Pediatr. 2018;198:226–233.e3.
28. Goldsmith HH, Rothbart MK. Laboratory Temperament Assessment Battery (Lab-TAB): Locomotor Version 3.1. Madison, Wis.: Personality Development Laboratory, Department of Psychology, University of Wisconsin; 1999. Available from H. Hill Goldsmith.
29. Rothman KJ. Six persistent research misconceptions. J Gen Intern Med. 2014;29:1060–1064.
30. Cohen J. Statistical Power Analysis for the Behavioral Sciences. Hillsdale, N.J.: Lawrence Erlbaum Associates; 1988.
31. Hammal Z, Chu WS, Cohn JF, et al. Automatic action unit detection in infants using convolutional neural network. International Conference on Affective Computing and Intelligent Interaction. San Antonio, Tex.; 2017.
32. Camras LA, Oster H, Campos J, et al. Production of emotional facial expressions in European American, Japanese, and Chinese infants. Dev Psychol. 1998;34:616–628.