Communicating During COVID-19: The Effect of Transparent Masks for Speech Recognition in Noise : Ear and Hearing

Research Article

Communicating During COVID-19: The Effect of Transparent Masks for Speech Recognition in Noise

Thibodeau, Linda M.1; Thibodeau-Nielsen, Rachel B.2; Tran, Chi Mai Quynh1; Jacob, Regina Tangerino de Souza3

Ear and Hearing 42(4):p 772-781, July/August 2021. | DOI: 10.1097/AUD.0000000000001065



The COVID-19 pandemic has resulted in the widespread use of face masks to minimize the spread of the virus (CDC 2020). However, the increased prevalence of opaque masks has led to notable communication difficulties among individuals with normal and impaired hearing, because the masks block facial cues for lipreading and expression and dampen acoustic information. These effects have a significant impact on communication in numerous settings, including employment, education, and judicial settings, but perhaps one of the greatest challenges has been communication in medical settings. For example, a neurologist working on an inpatient ward wrote, “Although I am an ‘expert’ in hearing, the COVID-19 ward stretched my understanding of the fear and isolation that accompanies an inability to hear” (Cosetti 2020). Considering the importance of visual cues for people with hearing impairment, the National Association of the Deaf has recommended use of transparent masks, also called window cloth masks (National Association of the Deaf 2020). These masks are designed with a window, which allows the observation of the lips, tongue, and jaw movements.

Masks are known to change the acoustics of speech. Several researchers have compared the degradation of the acoustic signal that occurs when a face mask is worn. As shown in Figure 1, the degradation of the acoustic signal while using face masks varies according to their type and material (Saeidi et al. 2016; Atcherson et al. 2020; Bottalico et al. 2020; Corey et al. 2020; Goldin et al. 2020; Pörschmann et al. 2020). The mask type with the least acoustic reduction relative to the signal produced with no mask is the surgical mask [peak reduction ~ 2.3 dB (Bottalico et al. 2020)], whereas the transparent cloth mask produces a far larger reduction [peak reduction ~ 21.2 dB in the high frequencies (Atcherson et al. 2020)].

Fig. 1.:
Maximum sound pressure level reduction caused by opaque (left) or transparent (right) masks compared to no mask.

The decrease in the intensity of high frequencies, where many cues for consonant perception are located, further compromises speech intelligibility. When a frequency band above 3.5 kHz is removed from the speech signal in a speech-in-noise test, significant difficulties in speech understanding are observed (Apoux & Bacon 2004).

Recent studies of the impact of these acoustic signal degradations on speech recognition are shown in Table 1. All researchers found that speech recognition was significantly reduced when speakers wore a face mask (Wittum 2013; Atcherson et al. 2017; Bandaru et al. 2020; Bottalico et al. 2020; Hampton et al. 2020), but only one used auditory-visual presentation in addition to auditory-only presentation (Atcherson et al. 2017). As expected, persons with severe hearing loss performed significantly better in conditions with auditory-visual cues compared to auditory-only cues. However, comparisons of the transparent mask to the opaque mask were made in the auditory-only condition, which precludes evaluation of the contribution of facial cues.

TABLE 1. Results of studies on the effects of face masks on adult speech recognition in quiet and in noise

| Author(s) (yr) | Adult Participants | Groups | Tests/SNR (dB) | Masks | Results | Significance (ρ) |
|---|---|---|---|---|---|---|
| Wittum (2013) | 21 | NH | SPIN; Multi-talker babble noise; SNR = 0 | SM without and with blood shield | NM > SM > SM with blood shield | <0.01 |
| Atcherson et al. (2017) | 30 | 10 NH; 10 HIM; 10 HIS | CST; BKB Multi-talker babble noise; SNR = +10 | Paper mask (SM) and transparent mask | Transparent mask > SM for HIM and HIS | <0.001 |
| Bandaru et al. (2020) | 20 | NH | W-22 lists in quiet (SRT, SDS) | N95 + Face Shield | No N95 + Face Shield > N95 + Face Shield | <0.0001 |
| Bottalico et al. (2020) | 40 | NH | CNC, Speech-Shaped-Noise; SNR = +3 | FM; SM | NM > SM > N95 > FM | <0.001 |
| Hampton et al. (2020) | 5 | NH | BKB Simulated babble noise; SNR = −7.4 Adaptive | Facial PPE* | No PPE > PPE in simulated operating theater setting | <0.05 |

*Facial PPE suitable for aerosol-generating procedures.
BKB, Bamford–Kowal–Bench; CNC, consonant-nucleus-consonant; CST, connected speech test; FM, fabric mask; HI, hearing impairment; HIM, hearing impairment moderate-to-moderately severe; HIS, hearing impairment severe to profound; NH, normal hearing; NM, no mask; PPE, personal protective equipment; SDS, speech discrimination score; SM, surgical mask; SNR, signal-to-noise ratio; SPIN, speech perception in noise test; SRT, Speech Reception Threshold; W-22, W-22 word list.

With the onset of the pandemic, researchers have attempted to document the effect of face masks on communication, but protocols such as those used in the studies in Table 1 have not been possible because of quarantine and social distancing restrictions. Two recent studies of the effects of face masks employed subjective responses to questionnaires regarding communication challenges (Saunders et al. 2020; Trecca et al. 2020). When completing an online survey regarding the impact of face coverings, many of the 460 respondents with normal and impaired hearing reported being unaware of the extent to which they relied on the lips and facial expressions until face coverings had become ubiquitous (Saunders et al. 2020). Likewise, in a group of 59 patients who were interviewed about use of face masks by health personnel, over 50% reported moderate to severe difficulties as a result of the reduced acoustic transmission (44.1%) or the impossibility of lipreading (55.9%) (Trecca et al. 2020).

The self-report questionnaires document the challenges in communication posed by use of face masks; however, the magnitude of the potential benefits of using a transparent mask is unknown. Therefore, a novel online paradigm was created to measure speech recognition in a quiet location that would be accessible to anyone with internet service. The purpose of this study was to investigate the effect of face masks with and without a transparent window on speech recognition in noise among adults with normal and impaired hearing. Specifically, we aimed to answer four questions:

  1. Does the use of a transparent mask provide better auditory-visual speech recognition than an opaque mask?
  2. Does the use of a transparent mask provide similar auditory-visual speech recognition to that which occurs with the use of no mask?
  3. Is auditory-visual speech recognition across mask conditions commensurate with ratings of confidence and concentration?
  4. If there are differences in auditory-visual speech recognition in noise between mask types, can they be attributable to acoustic variations between the opaque mask and transparent mask stimuli as measured in an auditory-only condition?

The first three questions were addressed in a large sample of participants with either normal or impaired hearing who received video-recorded stimuli (i.e., auditory-visual) for three conditions: no mask, transparent mask, and opaque mask. To answer the last question, a smaller follow-up study was conducted with persons with normal hearing who received only the audio files (i.e., auditory-only) of the same stimuli used in the auditory-visual study.



Announcements regarding the online study were distributed through university training programs, hearing loss support groups, listserv posts, and social media sites. Responses were received from 162 adults, ages 18–64 years: 136 with normal hearing (NH), 14 with confirmed or suspected hearing loss who used an assistive listening device (including hearing aids and/or cochlear implants) (HL+ALD), and 12 with confirmed or suspected hearing loss who did not use assistive technology (HL-ALD). Data from eight participants (3 NH, 4 HL+ALD, 1 HL-ALD) were excluded because they responded at the conclusion that they “always” experienced interruptions as a result of internet failure, leaving a total of 154 participants (Mage = 43 years, 11 males and 143 females). Applying the same exclusion criteria, the auditory-only follow-up study included 29 new adult volunteers with normal hearing (Mage = 25 years, 3 males and 26 females). Informed consent approved by the Institutional Review Board was obtained online for the study. All respondents reported English as their first language.

Study Design

Auditory-visual stimuli were presented in an online format accessible via a Qualtrics URL link. Participants were asked to complete the study alone in one sitting, while in a quiet room, on a computer, laptop, smartphone, or tablet, and to wear their assistive technology, if applicable. If they did not use hearing technology, earphones were recommended for better sound quality. To guard against random volume adjustments throughout the study, participants were asked to adjust their volume while listening to a sample noise file that was equivalent to the noise level presented in the study and to leave it at this level for the remainder of the study. Following multiple-choice demographic questions, participants received video recordings of the sentences in randomized order across the face mask conditions. These instructions were similar for the smaller follow-up study, which was a separate Qualtrics online assessment. For this follow-up study, participants received only the audio recordings of the sentences recorded with the opaque and transparent mask to assess speech recognition when no visual cues were provided.


Speech recognition in noise was evaluated using seven 10-sentence lists from the hearing in noise test (HINT) (Nilsson et al. 1994). The HINT consists of 25 lists of 10 sentences. Although originally designed as an adaptive speech-recognition-in-noise test to determine the signal-to-noise ratio (SNR) necessary for 50% correct recognition, the presentation levels of the speech and noise were fixed at a −5 dB SNR for this online study. One list was assigned as practice and two lists were used for each of the three conditions: no mask, transparent mask, and opaque mask (see Video in Supplemental Digital Content 1, which demonstrates stimuli for each of the three conditions). Following five practice sentences, participants were presented one of two randomized sets of 30 sentences (set A or B) including 10 sentences for each of the three mask conditions. Within each randomized set of sentences, the three mask conditions were completely randomized. For the auditory-only follow-up study, the same randomizations were presented only in the transparent and opaque conditions.

The HINT sentences were recorded by a woman in a sound booth on a smartphone placed at three feet, 0° azimuth, to capture head and shoulders. Two loudspeakers were placed at 45° azimuth from the talker, five feet away to present the multi-talker noise downloaded from the internet (Your Questions Answered Brand 2016) and played from a Lenovo (model P70) laptop routed through an audiometer (GSI 61). The stimuli levels were measured with a Quest (Model 1800) sound level meter that was placed at the location of the smartphone. The monitored-live voice stimuli presented with no mask were 65 dB SPL and the multi-talker noise was 70 dB SPL. The resulting SNR of −5 dB was comparable to other behavioral studies of face mask effects on speech recognition in noise which had SNR values that ranged from +10 to −7 dB (see Table 1). The same recordings were exported to video files for auditory-visual presentation and to audio files for auditory-only presentation.
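The level arithmetic behind the −5 dB SNR can be sketched as follows. This is only an illustration of how two levels in dB SPL combine into an SNR and what that SNR means as a linear sound-pressure ratio; the function names are hypothetical and are not part of the study's procedure.

```python
def snr_db(speech_level_db_spl: float, noise_level_db_spl: float) -> float:
    """SNR in dB is the difference between the speech and noise levels in dB SPL."""
    return speech_level_db_spl - noise_level_db_spl


def pressure_ratio(snr: float) -> float:
    """Convert an SNR in dB to a linear sound-pressure ratio (speech / noise)."""
    return 10 ** (snr / 20)


# Levels reported in the text: speech at 65 dB SPL, multi-talker noise at 70 dB SPL.
snr = snr_db(65.0, 70.0)
print(snr)                  # -5.0
print(pressure_ratio(snr))  # a little over half the noise pressure
```

At −5 dB SNR the speech pressure is a little more than half the noise pressure, which helps explain why recognition is well below ceiling even with no mask.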

The face mask was custom-made from double-layered cotton fabric and the window was 5 mil premium crystal-clear, multipurpose vinyl. The mask met the CDC published guidelines for effective masks which have tightly woven, breathable cotton of two or three layers (CDC 2020). The same mask was used for the opaque and transparent conditions; however, during the presentation of sentences with the opaque mask, one layer of the same cotton cloth was inserted to cover the transparent plastic portion as shown in Figure 2.

Fig. 2.:
Face mask conditions no mask (left), transparent mask (center), and opaque mask (right).

To explore the possible acoustic differences between the mask conditions when the single layer of cotton material was added to obstruct the view of the lips, speech noise from the GSI 61 Audiometer was presented through KEMAR with a mouth simulator. The intensity of the speech noise at the Shure microphone placed at the recommended social distance of six feet from KEMAR was 70 dB SPL. The overall attenuation relative to no mask was 1.09 dB greater for the opaque mask than for the transparent mask condition. Similar to the findings of Bottalico et al. (2020), the greatest attenuation was in the higher frequencies between 4 and 5 kHz, where the intensity was reduced by 11.32 and 13.64 dB for the transparent and opaque masks, respectively (see Figure in Supplemental Digital Content 2, which shows the output for the transparent and opaque face mask compared to the no mask condition).


Each video was presented to participants who were asked to play it only one time to replicate communication in the real world. The videos were set to automatically advance after 10 seconds to eliminate participants’ ability to watch a video more than once. All participants completed five sentences in the no-mask condition for practice. Immediately after a video was presented, participants were redirected to a new page asking them to type in what they heard or guess if they were uncertain. After responding to all 30 stimuli, participants were asked to use a Likert five-point scale to rate their levels of confidence (1, Extremely confident to 5, Not confident at all) and concentration (1, Very little concentration to 5, Extreme concentration) for each condition. The procedures were analogous for the follow-up auditory-only study with the exception of the confidence and concentration ratings.

Scoring Reliability

Percentage of words typed in correctly was determined for each condition by one researcher. Scoring reliability for the auditory-visual study was evaluated by comparing the results with those obtained by two additional independent scorers. An intraclass correlation coefficient was calculated based on 37 sentences scored by two raters and yielded high reliability for both reliability scorers (0.986 and 0.999) (Hallgren 2012).
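A word-level scoring pass of this kind could be sketched as below. The study does not describe its scoring rules in detail, so everything here is an assumption: the helper names are hypothetical, matching is case-insensitive and ignores punctuation and word order, and the example sentences are made up rather than taken from the HINT lists.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def words_correct(target: str, response: str) -> int:
    """Count target words that also appear in the response (order ignored)."""
    target_counts = Counter(tokenize(target))
    response_counts = Counter(tokenize(response))
    return sum(min(n, response_counts[w]) for w, n in target_counts.items())


def percent_correct(pairs: list[tuple[str, str]]) -> float:
    """Percentage of target words typed correctly across (target, response) pairs."""
    total = sum(len(tokenize(target)) for target, _ in pairs)
    correct = sum(words_correct(target, response) for target, response in pairs)
    return 100.0 * correct / total


# Made-up example sentences (not actual test items).
pairs = [
    ("The dog ran to the gate.", "the dog ran to a gate"),
    ("She is washing her dress.", "She is watching her dress"),
]
print(round(percent_correct(pairs), 1))
```

A human scorer may well be more lenient than this sketch, for example by accepting obvious typos, which is one reason the inter-rater reliability check described above matters.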

Data Analysis

All percent correct data were arcsine transformed to account for unequal variances that occur in proportional data (Studebaker 1985) prior to statistical analysis. Because the convenience sampling resulted in age variability across hearing status groups (mean age in years: NH, 43.1; HL-ALD, 46.1; and HL+ALD, 38.2), a one-way ANOVA was conducted to compare ages across these groups; it showed no significant difference (F(2, 149) = 1.06, ρ = 0.350). Therefore, age was not included in any main analyses. Repeated-measures ANOVAs were performed to evaluate the main effects of mask condition, hearing loss status, and randomization set. Post hoc t-tests were performed for significant main effects with Bonferroni corrections applied, resulting in an alpha level of ρ < 0.008.
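The two preprocessing steps in this paragraph can be illustrated with a short sketch. Both parts rest on assumptions: the text does not state which arcsine variant was applied, so the plain arcsine square-root transform is shown here rather than the rationalized version, and the count of six comparisons is inferred from the reported threshold rather than stated by the authors. The function names are hypothetical.

```python
import math


def arcsine_transform(percent_correct: float) -> float:
    """Arcsine square-root transform of a percentage, returned in radians (0 to pi)."""
    proportion = percent_correct / 100.0
    return 2.0 * math.asin(math.sqrt(proportion))


def bonferroni_alpha(family_alpha: float, n_comparisons: int) -> float:
    """Per-comparison alpha level after a Bonferroni correction."""
    return family_alpha / n_comparisons


# The transform stretches scores near 0% and 100%, where raw variance is compressed.
for score in (10.0, 50.0, 90.0, 99.0):
    print(score, round(arcsine_transform(score), 3))

# Assumed: a family-wise alpha of 0.05 split across six post hoc comparisons.
print(round(bonferroni_alpha(0.05, 6), 4))
```

Under these assumptions the per-comparison alpha comes out at roughly 0.008, consistent with the threshold reported above.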


The average percentage of words correctly typed for each condition in the auditory-visual study (no mask, transparent, opaque) by hearing status (NH, HL-ALD, HL+ALD) is shown in Table 2 and Figure 3. To assess whether style of mask affected sentence recognition, a 3 × 3 × 2 repeated-measures ANOVA with mask type (no mask, transparent, and opaque) as the within-subjects factor was performed. Between-subject factors were hearing status (NH, HL-ALD, and HL+ALD) and sentence randomization (set A or B). The significant main effects of mask type and hearing status are summarized in Table 2. There was no main effect of sentence randomization set (F(1, 148) = 2.55, ρ = 0.112, η2 = 0.02), nor were there any significant interactions (all ρ’s > 0.058). Although the difference was not significant after correction, there was a trend for participants with hearing loss who used hearing assistive technology to score lower than those with normal hearing and those with hearing loss who did not use technology across all mask conditions.

TABLE 2. Analyses of sentence recognition scores for each mask type by hearing status for auditory-visual presentation

| Mask Type | NH, Mean % (SD) | HL-ALD, Mean % (SD) | HL+ALD, Mean % (SD) | Grand Mean (SD) |
|---|---|---|---|---|
| No mask | 85.5 (13.5) | 84.5 (11.2) | 60.8 (30.9) | 83.8 (16.1) |
| Transparent mask | 70.8 (19.5) | 70.0 (15.9) | 42.0 (25.8) | 68.9 (20.9) |
| Opaque mask | 60.6 (20.1) | 60.8 (18.3) | 35.1 (30.7) | 58.9 (21.6) |
| Grand mean (SD) | 72.3 (20.6) | 71.8 (17.9) | 46.1 (30.3) | |

| Analysis | Hearing status comparison | Result | Mask type comparison | Result |
|---|---|---|---|---|
| ANOVA main effect | NH vs HL-ALD vs HL+ALD | F(2, 148) = 12.71, ρ < 0.001, η2 = 0.15 | No mask vs transparent vs opaque | F(2, 296) = 64.65, ρ < 0.001, η2 = 0.30 |
| Post hoc t-test | NH vs HL+ALD | t(141) = 2.99, ρ = 0.015, Cohen’s d = 1.18 | Transparent vs opaque | t(153) = 8.51, ρ < 0.001*, Cohen’s d = 0.69 |
| Post hoc t-test | NH vs HL-ALD | t(142) = 0.12, ρ = 0.906, Cohen’s d = 0.04 | No mask vs transparent | t(153) = 14.37, ρ < 0.001*, Cohen’s d = 1.16 |
| Post hoc t-test | HL+ALD vs HL-ALD | t(19) = 2.66, ρ = 0.020, Cohen’s d = 1.18 | No mask vs opaque | t(153) = 19.47, ρ < 0.001*, Cohen’s d = 1.57 |

HL-ALD, confirmed or suspected hearing loss but no use of assistive listening device; HL+ALD, confirmed or suspected hearing loss and use of assistive listening device; NH, normal hearing.

Fig. 3.:
Sentence recognition for three mask types: no mask, transparent mask, and opaque mask when presented with auditory-visual cues.

Transparent Versus Opaque Mask

To answer the first research question regarding the benefits of a transparent mask, the post hoc t-tests on the significant main effect of mask type showed participants performed significantly better when listening to and watching a speaker wearing a transparent mask (M = 68.9%) over an opaque mask (M = 58.9%). The Cohen’s d effect size (0.69) is considered medium (Cohen 1988) which is noteworthy, given the typical variability of percent-correct speech recognition scores. This finding suggests the importance of transparent masks not only for those with hearing loss but also those with normal hearing.

Transparent/Opaque Versus No Mask

To answer the question of whether performance with the transparent mask was commensurate with that obtained with no mask, the post hoc t-test collapsed across subject groups showed participants were significantly less accurate in sentence recognition with the transparent mask (M = 68.9%) than when listening and watching with no mask (M = 83.8%). The Cohen’s d effect size of 1.16 was considered extremely large and highlights the importance of these findings. As expected, the performance with the opaque mask (M = 58.9%) was significantly lower compared to the no-mask condition.

Confidence and Concentration Ratings

Participants’ subjective ratings of the challenges during the speech recognition tasks mirrored their accuracy scores as shown in Figures 4 and 5. Two 3 × 3 × 2 repeated-measures ANOVAs controlling the same factors as before (i.e., mask type, hearing status, and sentence randomization set) were performed for the confidence and the concentration ratings.

Fig. 4.:
Confidence ratings for three mask types: no mask, transparent mask, and opaque mask when presented with auditory-visual cues.
Fig. 5.:
Concentration ratings for three mask types: no mask, transparent mask, and opaque mask when presented with auditory-visual cues.

As shown in Figure 4 for the confidence ratings, there was a significant main effect of mask type (F(2, 292) = 162.77, ρ < 0.001, η2 = 0.53) and hearing status (F(2, 146) = 3.38, ρ = 0.037, η2 = 0.04), but no significant interactions (all ρ’s > 0.178). Post hoc analyses of mask type demonstrated that participants were more confident when listening to and watching a speaker without a mask (M = 2.05, SD = 0.90) than when listening to and watching a speaker wearing a transparent mask (M = 3.31, SD = 0.86; t(152) = 21.89, ρ < 0.001, Cohen’s d = 1.79) or an opaque mask (M = 4.27, SD = 0.82; t(151) = 32.97, ρ < 0.001, Cohen’s d = 2.68). Furthermore, as expected, confidence was rated significantly higher for the transparent mask compared to the opaque mask conditions (t(151) = 15.70, ρ < 0.001, Cohen’s d = 1.63).

Post hoc analyses of the main effect of hearing status for the confidence ratings showed that the HL+ALD group reported significantly less confidence overall compared to the NH group (t(140) = 2.77, ρ < 0.006, Cohen’s d = 0.93). The HL+ALD group also reported less confidence than the HL-ALD group, but this difference was not significant (ρ > 0.008). The ratings were similar for the two groups who had normal hearing or suspected hearing loss but did not use technology.

The results of the second repeated-measures ANOVA for the subjective ratings of concentration in each condition were consistent with those obtained for confidence. Of particular note, when comparing the two styles of face masks, participants reported having to concentrate less when listening to and watching a speaker wearing a transparent mask (M = 3.93, SD = 0.85; t(153) = 9.86, ρ < 0.001, Cohen’s d = 0.79) than when listening to and watching a speaker wearing an opaque mask (M = 4.55, SD = 0.78), as would be expected. The concentration ratings by hearing status mirrored the pattern of the speech recognition scores, as shown in Figure 5, with a trend for the HL+ALD group to report greater concentration than the other two groups (NH and HL-ALD), who had similar ratings.

Auditory-visual Versus Auditory-only Presentation Follow-up Study

Finally, a follow-up study was conducted to confirm that the benefits afforded by the transparent mask over the opaque mask were not the result of acoustic intensity and/or quality differences in the custom-made mask when used for recording the opaque and transparent conditions. Although the measurements of the attenuation effects of the opaque and transparent mask showed only minimal differences (1.09 dB) between the masks, there was still a concern for possible distortion effects that would not be captured in a simple attenuation comparison. In an attempt to capture the effects of any quality or intensity differences in the recordings made with the opaque and transparent masks that might affect speech recognition, a more homogeneous sample of young, potentially more sensitive listeners was recruited (N = 29; Mage = 25).

The average percent correct recognition scores for the auditory-only stimuli recorded with the opaque and transparent masks are shown in the right side of Figure 6. A paired-samples t-test showed significantly better performance in the opaque mask condition over the transparent mask condition (t(28) = 7.11, ρ < 0.001, Cohen’s d = 1.32). This reduced performance in the auditory-only condition with the transparent mask (M = 40.7%, SD = 18.48) compared to the opaque mask (M = 58.2%, SD = 21.73) supports the conclusion that the improvement in speech recognition with the transparent mask was attributable to the addition of the visual cues from the lips, tongue, and jaw rather than acoustic effects.

Fig. 6.:
Sentence recognition for two mask types: transparent mask and opaque mask when presented with auditory-visual cues (left) and with auditory-only cues (right).

It is interesting to compare the performance of normal-hearing listeners who received auditory-visual cues to the performance of those who received the same auditory cues but with no visual cues. This would be analogous to conversing with someone wearing an opaque or transparent mask when he/she is turned around, assuming the SNR is constant. This comparison would provide insight into the potential additional benefits of seeing cues on the face other than those provided by the lips and mouth.

Two independent-samples t-tests were conducted to compare performance of normal-hearing individuals when given auditory-visual cues versus auditory-only cues. For the opaque mask conditions, there were no significant differences in sentence recognition scores (t(160) = 0.423, ρ = 0.673, Cohen’s d = 0.09). In other words, participants scored equally poorly regardless of whether they were able to see and hear the speaker with an opaque mask (M = 60.6%, SD = 20.09) or only hear the speaker wearing an opaque mask (M = 58.2%, SD = 21.73). By contrast, there were significant differences when comparing performance for the transparent mask conditions, auditory-visual versus auditory-only. Specifically, participants who received auditory and visual input performed significantly better (M = 70.8%, SD = 19.51) than participants who received only auditory input (M = 40.7%, SD = 18.48) (t(160) = 7.21, ρ < 0.001, Cohen’s d = 1.48). This 30 percentage point improvement resulted in an extremely large effect size, as represented by Cohen’s d of 1.48, and suggests that the transparent mask allowed persons with normal hearing to receive more facial cues to identify the sentences more accurately despite the reduction in the acoustic cues caused by the transparent mask.
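For readers who want to trace how such a comparison is built, the sketch below computes a pooled-variance independent-samples t statistic and Cohen’s d from summary statistics. The group sizes are assumptions inferred from the text (136 normal-hearing respondents minus 3 exclusions gives 133, plus 29 follow-up listeners, which is consistent with 160 degrees of freedom). It uses the raw percent-correct means and SDs, whereas the reported analyses used arcsine-transformed scores, so the output should approximate rather than reproduce the published values. The function names are hypothetical.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)


def t_and_d(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int):
    """Return (t statistic, degrees of freedom, Cohen's d) for two groups."""
    sp = pooled_sd(sd1, n1, sd2, n2)
    standard_error = sp * math.sqrt(1.0 / n1 + 1.0 / n2)
    t_stat = (m1 - m2) / standard_error
    cohens_d = (m1 - m2) / sp
    return t_stat, n1 + n2 - 2, cohens_d


# Transparent mask, normal-hearing listeners: auditory-visual vs auditory-only.
# Means and SDs are from the text; n = 133 and n = 29 are inferred (see lead-in).
t_stat, df, d = t_and_d(70.8, 19.51, 133, 40.7, 18.48, 29)
print(f"t({df}) = {t_stat:.2f}, Cohen's d = {d:.2f}")
```

On raw percentages this yields a t of about 7.6 and a d of about 1.56, somewhat above the reported 7.21 and 1.48, a gap that is plausibly explained by the arcsine transform applied before the published analysis.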


Speech recognition in noise was significantly improved for persons with normal hearing and with confirmed or suspected hearing loss when able to see the speaker’s face through a transparent mask compared to an opaque mask, which blocked facial cues. As expected, there was a trend for those who self-reported use of hearing technology to have greater difficulty across all the conditions than those with normal hearing or those with hearing loss who did not use technology. The large Cohen’s d when comparing the performance of those who self-reported using assistive listening devices to those who did not suggests that this trend is noteworthy. Given the nonsignificant interaction, the conclusions regarding the benefits of using a transparent mask apply across all subject groups.

Also noteworthy are the results of the two groups with normal hearing who received either the auditory-visual or auditory-only stimuli. One might not expect persons with normal hearing to experience over a 20% decline in speech recognition in noise when a talker uses an opaque mask compared to no mask. Furthermore, normal-hearing persons had significant difficulty when listening to the auditory-only stimuli recorded with the transparent compared to the opaque mask. The addition of the facial cues in the transparent mask condition allowed normal-hearing listeners to overcome the signal degradations confirmed in the auditory-only study. Consequently, they performed significantly better with the transparent mask over the opaque mask when receiving auditory-visual cues. Finally, the benefits of the transparent mask over the opaque mask extended beyond improved speech recognition, as listeners reported significantly increased confidence and reduced effort to concentrate. This supports the findings of Saunders et al. (2020), who reported from a survey of 460 persons in the United Kingdom that face masks affected anxiety, fatigue, and emotions both when speaking through a mask and when listening to someone who is wearing a mask.

It is important to note that there was a 30% improvement for the transparent mask condition by persons with normal hearing when listening to and watching the auditory-visual stimuli compared to auditory-only stimuli (see Fig. 6). This finding agrees with previous studies which have shown that the addition of visual cues in a speech recognition task results in improvements from 30% to 50% over hearing auditory cues alone (Walden et al. 1975, p. 277, Table 3; Tye-Murray et al. 2016, p. 384, Fig 3). In contrast, for the opaque mask condition, the auditory-visual presentation was only 3% better than the auditory-only presentation which reflects the limited facial cues available for speech recognition with the opaque mask. These results highlight the potential benefits of wearing the transparent face mask even for persons with normal hearing.

The attenuation effects of various mask types have been of concern in numerous studies, which is why the same mask was used in the current study to minimize acoustic differences between mask types. The insertion of the thin, single layer of cotton material in the transparent mask to make it opaque resulted in minimal attenuation differences (1.09 dB). When translating the findings of mask attenuation effects to speech recognition difficulties, it should be remembered that the studies included in Figure 1 involved measurements typically made with steady-state signals such as white or pink noise at distances of three or six feet from the source. When communicating in the real world with face masks, the signals are likely impacted in more ways because of fluctuating background noise and plexiglass shields between the talker/listener dyad. Furthermore, measurements of acoustic effects do not account for the cognitive and psychosocial stress that may occur when a listener is unable to follow conversation. Perhaps as important as the considerations regarding acoustic attenuation caused by face masks are the results in the present study regarding confidence and concentration shown in Figures 4 and 5. The use of transparent masks may result in increased confidence and reduced concentration, leading to greater overall success in the communication experience.

The ability to communicate effectively is paramount to one’s well-being and quality of life especially during times of stress (Saunders et al. 2020; McKee et al. 2020). The use of transparent over the opaque masks could not only enhance comprehension of medical recommendations, but also productivity in the workplace by reducing concentration efforts when communicating with others wearing face masks. Furthermore, during these stressful times, the use of transparent masks allows visibility of affect and potentially an emotional connection to relate to persons even when social distancing is required. This may be especially relevant for elderly individuals who are isolated in nursing homes and hospitals. This is supported by recent studies that have documented the detrimental effects of face masks on reading emotions. Carbon (2020) reported that face masks reduced the accuracy of identifying 12 different facial expressions such that disgusted faces were often judged as angry and other faces showing happiness or sadness were judged as neutral. Mheidly et al. (2020) provide a comprehensive overview of how various types of emotions are perceived in various regions of the face. They suggested that the opaque face masks block the middle and lower face and may result in missing cues that convey empathy or positive regard. Specifically, in a medical setting, opaque masks may result in increased patient anxiety during the interaction as well as reduced perception by the physician of patient reactions when receiving medical information. Specific suggestions are offered in both studies to reduce the negative effects on the processing of emotional information such as using more eyebrow movements, including frequent gestures or body language, and using transparent face masks.

Equally important are infants and young children who are establishing relationships with persons in their daycare and educational environments. They need the visual confirmation of a positive affirmation as they are developing social competency and trying to learn under compromised stimulus input. Children with hearing loss are particularly vulnerable as many are still developing language and need optimal access to communication models and to visual cues (Knowland et al. 2016; Saad et al. 2020).

The use of FDA-approved transparent masks such as the Communicator Surgical with Clear Window and the ClearMask could facilitate communication, which may lead to confidence in healthcare providers, greater cooperation with medical procedures, and positive social interactions to improve overall disposition. Some may argue that the transparent masks limit visibility because of the “fog effect” caused by the moisture released while speaking. However, solutions range from simply applying a thin film of a liquid soap or bubbles to more expensive options of a product designed for antifog (McNally & Childress 2020). The use of the face shield would likely afford similar visual benefits; however, the protective effect may be compromised compared to the face mask with a window (CDC 2020). Alternative solutions to address communication barriers caused by face masks include using captioning apps on a smartphone or a communication board (McKee et al. 2020). Although more costly, another solution would be the use of a remote microphone on the talker and a wireless receiver on the listener. Research has shown significant benefits of remote microphone technology for speech recognition in noise (Thibodeau 2020), but the effect of combining this technology with auditory-visual cues over auditory alone has not been investigated.

The limitations of this study include the use of a single, custom-made face mask. Other designs are available which may yield differences in the degree of benefit provided by a transparent mask depending on the visibility of the mouth, lips, tongue, and jaw and the type of material. Ideally, the study design could include comparisons of performance with FDA-approved transparent and opaque masks in auditory-only and auditory-visual conditions in the same group of listeners. Because of the online protocol used in this study, it was not possible to control the frequency response of the participant’s playback device. The listener was instructed to adjust the volume to a comfortable level and not change it. Therefore, any acoustic effects related to the playback devices should have been constant across conditions. Another limitation was the lack of analyses regarding visual acuity and personal hearing technology. The use of broad categories of hearing ability precluded more in-depth analysis regarding performance as a function of technology type. Future research could be designed to address potential influencing factors that were not controlled via the open-invitation nature of the time-sensitive research. Of particular value would be a priori research questions to explore factors such as visual acuity, type of technology used, years of experience with technology, onset of hearing loss, age, and gender. Furthermore, the HINT sentences may not represent typical communication that may include more contextual cues, complex language structures, and/or distracting environmental signals.


In summary, these results highlight the need to consider face masks not only as COVID-19 protection and prevention but also in terms of accessibility to communication. Even persons with normal hearing recognized speech in background noise significantly better when the talker used a transparent compared to an opaque mask. The use of a transparent mask is a cost-effective solution to enhance communication that can be implemented by everyone and could have a significant impact on several aspects of well-being across the lifespan during these challenging times.


The authors are grateful to the doctoral students in the Hearing Health Lab for their contributions to this work and to Ms. Camila Medina for her graphic arts assistance.


Apoux F., Bacon S. P. Relative importance of temporal information in various frequency regions for consonant identification in quiet and in noise. J Acoust Soc Am, (2004). 116, 1671–1680.
Atcherson S. R., Mendel L. L., Baltimore W. J., Patro C., Lee S., Pousson M., Spann M. J. The effect of conventional and transparent surgical masks on speech understanding in individuals with and without hearing loss. J Am Acad Audiol, (2017). 28, 58–67.
Atcherson S. R., Finley E. T., McDowell B. T., Watson C. More speech degradations and considerations in the search for transparent face coverings during the COVID-19 pandemic. Audiol Today, (2020). 32, 20–27.
Bandaru S. V., Augustine A. M., Lepcha A., Sebastian S., Gowri M., Philip A., Mammen M. D. The effects of N95 mask and face shield on speech perception among healthcare workers in the coronavirus disease 2019 pandemic scenario. J Laryngol Otol, (2020). 134, 895–898.
Bottalico P., Murgia S., Puglisi G. E., Astolfi A., Kirk K. I. Effect of masks on speech intelligibility in auralized classrooms. J Acoust Soc Am, (2020). 148, 2878.
Carbon C. C. Wearing face masks strongly confuses counterparts in reading emotions. Front Psychol, (2020). 11, 566886.
Centers for Disease Control and Prevention (CDC) Considerations for wearing masks. (2020).
Cohen J. Statistical Power Analysis for the Behavioral Sciences. (1988). 2nd ed. Lawrence Erlbaum Associates.
Corey R. M., Jones U., Singer A. C. Acoustic effects of medical, cloth, and transparent face masks on speech signals. J Acoust Soc Am, (2020). 148, 2371–2375.
Cosetti M. K. Hearing from the COVID-19 epicenter—A neurotologist’s reflection from the front lines. JAMA Otolaryngol Head Neck Surg, (2020). 146, 889–890.
Goldin A., Weinstein B. E., Shiman N. How do medical masks degrade speech perception? Hear Rev, (2020). 27, 8–9.
Hallgren K. A. Computing inter-rater reliability for observational data: An overview and tutorial. Tutor Quant Methods Psychol, (2012). 8, 23–34.
Hampton T., Crunkhorn R., Lowe N., Bhat J., Hogg E., Afifi W., De S., Street I., Sharma R., Krishnan M., Clarke R., Dasgupta S., Ratnayake S., Sharma S. The negative impact of wearing personal protective equipment on communication during coronavirus disease 2019. J Laryngol Otol, (2020). 134, 577–581.
Knowland V. C., Evans S., Snell C., Rosen S. Visual speech perception in children with language learning impairments. J Speech Lang Hear Res, (2016). 59, 1–14.
McKee M., Moran C., Zazove P. Overcoming additional barriers to care for deaf and hard of hearing patients during COVID-19. JAMA Otolaryngol Head Neck Surg, (2020). 146, 781–782.
McNally C., Childress T. Strategies to keep clear windows from fogging up. (2020).
Mheidly N., Fares M. Y., Zalzale H., Fares J. Effect of face masks on interpersonal communication during the COVID-19 pandemic. Front Public Health, (2020). 8, 582191.
National Association of the Deaf COVID-19: Deaf and hard of hearing communication access recommendations for the hospital. (2020).
Nilsson M., Soli S. D., Sullivan J. A. Development of the hearing in noise test for the measurement of speech reception thresholds in quiet and in noise. J Acoust Soc Am, (1994). 95, 1085–1099.
Pörschmann C., Lübeck T., Arend J. M. Impact of face masks on voice radiation. J Acoust Soc Am, (2020). 148, 3663–3670.
Saad A. M., Hegazi M. A., Khodeir M. S. Comparison between lip-reading ability in normal and hearing-impaired children. QJM Int J Med, (2020). 113, i79.
Saeidi R., Huhtakallio I., Alku P. Analysis of face mask effect on speaker recognition. (2016). 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016): Vol 2.
Saunders G. H., Jackson I. R., Visram A. S. Impacts of face coverings on communication: An indirect impact of COVID-19. Int J Audiol, (2020). 1–12.
Studebaker G. A. A “rationalized” arcsine transform. J Speech Hear Res, (1985). 28, 455–462.
Thibodeau L. M. Benefits in speech recognition in noise with remote wireless microphones in group settings. J Am Acad Audiol, (2020). 31, 404–411.
Trecca E. M. C., Gelardi M., Cassano M. COVID-19 and hearing difficulties. Am J Otolaryngol, (2020). 41, 102496.
Tye-Murray N., Spehar B., Myerson J., Hale S., Sommers M. Lipreading and audiovisual speech recognition across the adult lifespan: Implications for audiovisual integration. Psychol Aging, (2016). 31, 380–389.
Walden B. E., Prosek R. A., Worthington D. W. Auditory and audiovisual feature transmission in hearing-impaired adults. J Speech Hear Res, (1975). 18, 272–280.
Wittum K. J. The Effects of Surgical Masks on Speech Perception in Noise. (2013). [Research Graduation Thesis, The Ohio State University]. OSU Campus Repository.
Your Questions Answered Brand Ten hours of people talking [Video]. YouTube. (2016). [no longer available].

Auditory-visual speech recognition; Aural rehabilitation; Face masks; Pandemic

Copyright © 2021 Wolters Kluwer Health, Inc. All rights reserved