A common everyday listening situation is attending to a single target sound in the presence of multiple interfering sounds, the so-called “cocktail-party problem” (Cherry 1953). In daily life, both target and interferers are likely to be speech. The intelligibility of speech in competing speech depends on a number of factors, for example, target–interferer voice similarity (Festen & Plomp 1990) and the temporal characteristics of the interfering speech (e.g., Hygge et al. 1992; Peters et al. 1998). One prominent feature of a typical cocktail-party setting is the spatial separation between target and interferers. This spatial separation results in release from masking and, as a consequence, improved speech recognition performance (for an overview, see Bronkhorst 2000).
This spatial advantage is commonly referred to as spatial release from masking (SRM). SRM is calculated as the difference in performance threshold between conditions where interfering sounds are colocated with a target sound and spatially separated from a target sound. Monaural and binaural processes contribute to SRM (Glyde et al. 2013a; Hawley et al. 2004). The monaural contribution to SRM is based on the listener’s opportunity to attend to the ear with the highest signal to noise ratio (SNR) (e.g., when the interfering sound is on one side of the listener). This “better-ear effect” is, however, not straightforward in many typical listening conditions where interfering sounds are distributed around the listener and the target sound is located between these interferers. In such conditions, binaural processing contributes to SRM (Hawley et al. 2004). The binaural processing is thought to stem in part from the disparity in interaural time differences and interaural level differences between target and interferers when they are spatially separated, resulting in binaural unmasking (Culling et al. 2004; Glyde et al. 2013b; Kidd et al. 2010). The interaural differences are subcortically computed binaural cues (Grothe et al. 2010) and thus reliant on separate auditory streams from the left and the right ear. Interaural difference cues are thought to facilitate auditory stream segregation and selective attention to the target (Bregman 1990). In addition, in multisource environments, subjects with normal hearing (NH) seem able to attend to the information from the ear with the better SNR at each specific point in time, either by a binaural (Brungart & Iyer 2012; Glyde et al. 2013a) or a monaural (Edmonds & Culling 2006) process. Recently, such “better-ear glimpsing” was shown to substantially contribute to performance in speech-in-speech tasks (Schoenmaker et al. 2017).
SRM can vary substantially (see, e.g., Table 1 in Bronkhorst 2000, showing SRM values between 0 and 11 dB). The magnitude of SRM in individuals with NH, and the relative contribution of the monaural and binaural processes to SRM, depends on several factors. The spatial configuration of the interfering sounds, the proximity of the interfering sounds to the target, and the number of interferers all influence SRM (Bronkhorst 2000; Hawley et al. 1999; Hawley et al. 2004; Yost 2017). For example, Yost (2017) demonstrated that for male speech (single words) interfering with female target speech (single words presented in the front azimuth), SRM decreased from about 6 dB for two interferers to about 1 dB for six interferers. Hawley et al. (2004) reported SRMs about a factor of two larger for speech interferers and speech-shaped noise interferers than for modulated and unmodulated noise interferers. SRM was also substantially higher for interferer arrangements on one side of the subject’s head than for arrangements with interferers on both sides of the head. Furthermore, when interferers are intelligible and similar to the target (e.g., same sex of voices), the masking effect of the interferers is not only energetic, but also informational. Such informational masking increases the potential for, and the magnitude of, SRM because speech recognition thresholds (SRTs) are elevated in the colocated condition. As an example, separating two intelligible interfering sentences (female voices) that were similar to the target (also a female voice) by 15° to the left and right of the target sentences resulted in about 12 dB of spatial release (Swaminathan et al. 2015). When the interferers were unintelligible (reversed), SRM was substantially reduced to a few dB. Furthermore, Swaminathan et al. (2015) also demonstrated that trained musicians achieved larger SRM than nonmusicians in conditions high in informational masking, suggesting that training and experience may have a significant effect on SRM.
Experiments using simulated spatial locations, achieved by headphone presentation of stimuli filtered through head-related impulse responses, show that the monaural contribution to SRM decreases as interferers are distributed around the listener in the front azimuth with the target frontally positioned (e.g., Hawley et al. 2004). One interpretation of those data is that the better-ear effect was significantly reduced. While assessing monaural processing by headphone presentation offers experimental control, for example, to simulate a monaural condition, the head-related impulse responses used usually do not take individual differences into account. Rather, generic head-related impulse responses (Gardner & Martin 1995) are used. Because there is substantial intersubject variability in the transfer functions from sound field to the ear canal (Middlebrooks 1999; Moller et al. 1996; Wightman & Kistler 2005), evaluation of spatial hearing in sound field might reflect real-life conditions to a greater extent.
One of the objectives of the present study was to quantify the binaural and monaural contributions to SRM in sound field, using a setup that has been used clinically for 2 decades (Berninger & Karlsson 1999). The setup has been shown to be sensitive to deficits associated with simulated, as well as congenital, unilateral hearing loss (UHL) (Asp et al. 2018; Johansson et al., Reference Note 1), and includes interfering speech signals not only in the front but also in the rear azimuth (see Subjects and Methods), which should be a common condition in daily life.
Effect of UHL on Recognition of Speech in Competing Speech
In adults (20 to 69 years old), the prevalence of UHL (≥25 dB HL at 0.5, 1, 2, and 4 kHz) is 7.9%, similar to the prevalence of bilateral hearing loss (7.8%), according to the National Health and Nutrition Examination Survey in the United States 1999 to 2004 (n = 5742) (Agrawal et al. 2008). This corresponds to approximately 14 million adult Americans with UHL at important speech frequencies. Individuals with UHL report a perceived handicap in real-life situations, despite NH in the unimpaired ear (Chiossoine-Kerdel et al. 2000; Dwyer et al. 2014; Gatehouse & Noble 2004; Newman et al. 1997). Part of this perception may stem from having little or no access to binaural processing due to poor audibility in the impaired ear. Notably, individuals with UHL typically perform more poorly than individuals with NH in laboratory tests reflecting real-life conditions, that is, conditions with high demands on spatial hearing (Firszt et al. 2017; Rothpletz et al. 2012). Several studies have also demonstrated significant decreases in sound localization accuracy and speech recognition in multisource spatially separated noise for simulated UHL (simUHL) in adults and children (Asp et al. 2018; Asp et al. 2012; Corbin et al. 2017; Firszt et al. 2017). However, the variability in UHL performance (for both simulated and long-standing UHL) is large, with some individuals performing similarly to individuals with normal bilateral hearing (Agterberg et al. 2012; Slattery & Middlebrooks 1994). It is possible that some of the observed variability is related to the various degrees of UHL in previous studies (e.g., Rothpletz et al. 2012). In keeping with the goals of the present study, the inclusion criteria for subjects with profound UHL were therefore such that there should be no audibility of target and interferer signals in the UHL ear (see Subjects section in Subjects and Methods).
Three main research questions motivated the present study. First, a comparison of speech recognition performance between subjects with NH and profound sensorineural UHL in a complex listening environment was desired. By arranging the interferers on both sides of the subject, binaural difference cues should help those listeners with audibility in both ears to more efficiently attend to the ear with the highest SNR. Second, a goal was to estimate the binaural and monaural contribution to SRM in the setup used, extending the research performed using simulated spatial conditions presented using headphones (e.g., Glyde et al. 2013a; Hawley et al. 2004). The relative contribution of monaural and binaural SRM was particularly interesting to study because sound field presentation was used. To the authors’ best knowledge, the monaural and binaural contribution to SRM in sound field with symmetrical speech interferers has not been studied, although attempts to achieve a “monaural” condition by means of earplugs and hearing protectors have been made (Marrone et al. 2008a). A third objective was to study the effect of a simulated mild-to-moderate UHL on SRM to test the possibility that binaural processing could contribute to a spatial advantage despite reduced audibility in one ear.
Aim of Study
The aim of this study was to evaluate binaural and monaural contributions to recognition of speech in competing speech and SRM in a demanding multisource listening environment.
SUBJECTS AND METHODS
Recognition of speech in competing speech was assessed in sound field in subjects with NH (n = 13) and profound sensorineural UHL (n = 9). Performance for subjects with NH was assessed for normal and simUHL conditions to study the effect of acute changes in unilaterally reduced audibility on SRM. SimUHL was achieved by an earplug (EAR Classic foam earplug; 3M, Minneapolis, MN) in the right ear of the subjects with NH. The right ear was chosen as the UHL ear for all the subjects to minimize the number of variables.
Four competing speech interferers were either colocated with or spatially and symmetrically separated (±30° and ±150°) from the target speech (0°). Subjects with NH were thus tested in four conditions: normal binaural listening with spatially separated (Normalsep) and colocated (Normalcoloc) interferers, and simUHL conditions with the same spatial configurations (simUHLsep and simUHLcoloc). Subjects with profound sensorineural UHL were tested with spatially separated (UHLsep) and colocated (UHLcoloc) interferers.
The test order for separated and colocated conditions and normal and simUHL listening conditions was counterbalanced for the subjects with NH. UHL subjects were randomized to start with either spatial condition.
The main outcome measures were the absolute SRT and SRM. SRM was computed as the difference (in dB) in SRT between colocated and spatially separated conditions. To estimate the monaural and binaural contributions to SRM, the difference in SRM between subjects with NH and profound UHL was computed.
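The SRM computation described above is simple subtraction of thresholds; the following sketch (function and variable names are ours, and the example values are illustrative only) makes the sign convention explicit:

```python
def srm(srt_colocated_db, srt_separated_db):
    """Spatial release from masking (dB): colocated minus separated SRT.

    Lower (more negative) SRTs indicate better performance, so a positive
    SRM means spatial separation improved speech recognition."""
    return srt_colocated_db - srt_separated_db

# Illustrative SRTs (dB SNR) for one hypothetical listener:
print(round(srm(-11.6, -15.3), 1))  # SRM = 3.7 dB
```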
The study was approved by the regional ethical committee in Gothenburg, Sweden. The subjects received oral and written information about the study before enrollment, and written informed consent was obtained for all the participants.
Subjects With NH
Thirteen healthy adult volunteers [mean (SD) age = 40.0 (12.8) years; range = 19 to 60 years] who were native Swedish speakers participated. Immediately before assessment of speech recognition, pure-tone thresholds were recorded and otomicroscopy was performed. The mean (SD) pure-tone averages (PTAs) across 0.5, 1, 2, and 4 kHz were 4.1 dB HL (5.2) and 5.5 dB HL (6.0) in the left and right ear, respectively, as measured according to ISO 8253-1 (2010). While older subjects tended to have higher pure-tone thresholds at individual frequencies, simple linear regressions showed that the left and right PTAs were not related to age (left: PTA = 0.19 × age – 3.7, r = 0.48, p = 0.10; right: PTA = 0.22 × age – 3.3, r = 0.47, p = 0.11). All subjects had PTAs ≤20 dB HL. There were no differences in pure-tone thresholds between the left and right ear across frequencies (ps > 0.05, Bonferroni-corrected t test, dependent samples).
Subjects With Profound Unilateral Sensorineural Hearing Loss
Because one of the aims of the study was to estimate the monaural and binaural contributions to SRM, formal inclusion criteria for the subjects with profound UHL stated that (1) the target signal and interferers should be entirely inaudible in the UHL ear at SRT; (2) the contralateral hearing thresholds should be ≤25 dB HL; and (3) the hearing loss should be of sensorineural origin. To recruit eligible subjects according to the first inclusion criterion, the pure-tone thresholds in the UHL ear were related to the function describing the hearing level of speech (see Table 5 in Pavlovic, 1987). Based also on the remaining inclusion criteria, 9 subjects with profound sensorineural UHL, aged 25 to 61 years (mean = 48.4 years), were recruited (Table 1). All of the subjects were native Swedish speakers and had pure-tone hearing thresholds ≤20 dB HL at 0.125, 0.25, 0.5, 1, 2, 3, 4, 6, and 8 kHz in one ear, except subject 3, who had a threshold of 25 dB HL at 8 kHz. The thresholds in the unimpaired ear were comparable to the thresholds in the left and right ears of the subjects with NH across frequencies (ps > 0.05). Two of the subjects had childhood UHL, one of which was self-reported as congenital. One subject had idiopathic progressive sensorineural UHL. Four subjects had idiopathic sudden unilateral sensorineural hearing loss. One of these subjects reported tinnitus lateralized to the ear with UHL and had 30 years of experience with the profound UHL condition, while the other 3 subjects had relatively recent (≤2 years) sudden sensorineural UHL. One subject had had profound sensorineural UHL since adolescence secondary to a schwannoma. One subject had profound sensorineural UHL from 20 years of age after cholesteatoma surgery.
Quantification of SimUHL in Subjects With NH
The standard for obtaining the attenuation provided by hearing protectors (“real ear attenuation at threshold”) states that measurements should be performed in sound field and assumes bilateral hearing protectors (Berger & Kerivan 1983; ISO 4869-1, 1990). This standard was not followed when estimating the simUHL in the present study, because poor fitting of either the right or left hearing protector may result in an inaccurate estimate of a UHL. The effect of the deeply inserted earplug in the right ear was quantified in a double-walled sound booth by measuring pure-tone thresholds with and without the earplug according to ISO 8253-1 (2010). Telephonics TDH-39 earphones were used, which should result in a valid estimate of attenuation when using foam earplugs (Tufts et al. 2012).
Recognition of Speech in Competing Speech
An adaptive psychoacoustic task was used to assess the SRT for recognition of speech in competing speech. The setup and procedure for this task when speech interferers are spatially separated from the target speech are described in Asp et al. (2018) and are elaborated later to also describe the colocated condition.
Setup, Target Speech, and Competing Speech
Recognition of speech in competing speech was measured in sound field in a double-walled sound booth (4.0 × 2.6 × 2.1 m) with a mean ambient sound level = 20 dB (A) obtained during 15-sec measurement and reverberation time T30 = 0.09 sec at 0.5 kHz, as recorded with a B&K 2238 Mediator and a B&K 2260 Investigator (Brüel & Kjær, Nærum, Denmark). Subjects were seated in the center of the room, 1.8 m from a loudspeaker at 0° azimuth, from which the target signal was presented. Four loudspeakers were placed in the corners of the room, corresponding to ±30° azimuth (frontal horizontal plane) and ±150° azimuth (in the rear horizontal plane), thus surrounding the subject (Berninger & Karlsson 1999).
The target speech (female voice) was the Hagerman sentences (Hagerman 1982). Each sentence consisted of five words that formed a grammatically correct sentence with low semantic predictability in a fixed syntax (e.g., “Peter höll nio nya lådor,” in translation: “Peter held nine new boxes”). Twelve lists (and one training list), each containing 10 sentences, were used. The interferers comprised four noncorrelated recordings of a single male talker reading a novel. The interferers were presented either from the four corner-placed loudspeakers, or colocated with the target signal (0° azimuth), at a fixed overall level of 63 dB SPL Ceq (12 min recording time), as measured at the position of the subjects’ head (Berninger & Karlsson 1999). Natural pauses occurred in each of the four interfering signals, likely contributing to an interaural asymmetry in the amount of masking (and, consequently, the amount of head shadow) that the interferers contributed over time in the separated condition. In other words, moment-by-moment better-ear glimpsing was theoretically possible in the separated condition.
Subjects were instructed to face the frontal loudspeaker during the entire test. They were not informed that the target signal originated from 0° or about the different spatial configurations (separated and colocated) because this may influence SRM (Ihlefeld et al. 2006). They were asked to repeat the words of one training list (always the same list) and two target lists, and their oral responses were recorded by an experimenter outside the test room. The experimenter listened to the target signal and the subject’s responses through a feedback system and scored the responses after each sentence. Guessing was encouraged, and no feedback was provided. Words had to be repeated grammatically correctly to be scored as correct. The training started at an SNR of +10 dB. For the following training sentences, the target speech level decreased up to three times in 5 dB steps, then up to three times in 3 dB steps, and then in 2 dB steps until the number of correct words in a sentence was ≤2. After training, the scheme for level adjustment of the target speech was +2 dB for zero correctly identified words, +1 dB for one correctly identified word, 0 dB for two correctly identified words, −1 dB for three correctly identified words, −2 dB for four correctly identified words, and −3 dB for five correctly identified words, aiming at a threshold of 40% words correct. That threshold and the adaptive scheme for level adjustment were based on computer simulations and analysis of the maximum steepness of the psychometric function (Hagerman 1979, 1982; Hagerman & Kinnefors 1995). The SRT was defined as the mean of the SNRs for the last 10 presented sentences (Hagerman & Kinnefors 1995; Plomp & Mimpen 1979).
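The post-training adaptive rule and SRT definition above can be sketched as follows. The function names and the scoring callback are our own illustrative constructs; only the dB step sizes and the last-10-sentences rule come from the procedure described in the text:

```python
# Target-level change (dB) as a function of correctly repeated words (0-5),
# per the adaptive scheme described above (aiming at 40% words correct).
LEVEL_STEP_DB = {0: +2, 1: +1, 2: 0, 3: -1, 4: -2, 5: -3}

def run_adaptive_track(score_sentence, start_snr_db, n_sentences=20):
    """Present sentences adaptively and return the SRT, defined as the
    mean SNR (dB) over the last 10 presented sentences.

    score_sentence(snr_db) -> number of correctly repeated words (0-5).
    """
    snr_db = start_snr_db
    presented_snrs = []
    for _ in range(n_sentences):
        presented_snrs.append(snr_db)
        n_correct = score_sentence(snr_db)
        snr_db += LEVEL_STEP_DB[n_correct]
    return sum(presented_snrs[-10:]) / 10.0
```

For instance, a hypothetical listener who always repeats exactly two words correctly leaves the SNR unchanged, so the returned SRT equals the starting SNR.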
SRT and SRM values were normally distributed across all listening conditions. A repeated-measures analysis of variance (ANOVA) with two within-subject factors (listening condition and spatial condition) was used to study the effect of simUHL and spatial cues on the SRT for subjects with NH. A repeated-measures ANOVA with the between-subject variable age entered as a covariate was used to analyze the within-subject effect of simUHL on SRM.
Within-subject statistical analyses of colocated versus separated SRT were performed in subjects with profound UHL (Student’s paired t test), and a between-subject comparison (normal versus profound UHL and simUHL versus profound UHL) of the SRT was performed using a Student’s unpaired t test.
Student’s t test was used to test if SRM was significantly different from zero. All statistical analyses were performed using Statistica version 13 (Statsoft, Inc., Tulsa, OK).
RESULTS

Subjects With NH
Recognition of Speech in Competing Speech
The mean (SD) SRT was −15.3 dB (2.2 dB) in the Normalsep condition and −11.6 dB (1.6 dB) in the Normalcoloc condition (Fig. 1A). For simUHL conditions, the mean (SD) SRT was −12.3 dB (2.2 dB) and −10.0 dB (1.4 dB) in the separated and colocated conditions, respectively. A two-way (Listening Condition × Spatial Condition) repeated-measures ANOVA showed significant main effects of listening condition (normal versus simUHL) [F(1,12) = 17.6; p < 0.01] and spatial condition (colocated versus separated) [F(1,12) = 135.8; p < 0.001], but no interaction.
Spatial Release From Masking
The mean (SD) SRM was 3.7 dB (1.6 dB) in the normal condition and 2.3 dB (1.8 dB) in the simUHL condition (Fig. 1B). The mean SRM values were statistically significantly different from zero for normal (t = 8.4; p < 0.001) and simUHL (t = 4.6; p < 0.001) (Student’s t test). SRM occurred for all subjects with NH in the normal condition and for 10 of 13 subjects (77%) in the simUHL condition (see individual colocated as a function of separated SRTs in Fig. 1C).
Effect of SimUHL on SRM
The mean pure-tone threshold (across 0.5, 1, 2, and 4 kHz) in the right ear after insertion of the earplug was 38.6 dB HL. The mean thresholds and corresponding SDs per audiometric frequency are summarized in Table 2. Neither the individual PTAs for simUHL (range = 23.8 to 48.0 dB HL) nor the individual attenuation achieved by the simUHL (range = 21.3 to 42.5 dB) was related to the age of the subjects (PTA simUHL: r = 0.038, p = 0.90; PTA attenuation: r = −0.38, p = 0.20). The relatively large age span (19 to 60 years) and previous studies showing conflicting results regarding the effect of age on SRM (Füllgrabe et al. 2014; Gallun et al. 2013; Srinivasan et al. 2016) warranted the inclusion of age as a covariate in a within-subject ANOVA of the effect of simUHL on SRM. Before analysis, the covariate age was centered around the mean (40 years) to avoid type I errors and decrease the risk of loss of statistical power (Schneider et al. 2015). The analysis showed a statistically significant main effect of the simUHL on SRM that accounted for 22% of the variance [F(1,22) = 6.19; mean squares = 12.6; p = 0.02]. The between-subject factor age was not significant but formed a statistically significant two-way interaction with the within-subject factor simUHL that accounted for 33% of the variance [F(1,22) = 10.7; mean squares = 21.7; p < 0.01].
To illustrate the interaction between age and simUHL, Figure 2 shows the relationships between SRTs and age, and SRM and age, for normal and simUHL listening conditions. Visual inspection of Figure 2 suggests relatively similar SRTs across age for normal listening conditions, as well as for simUHL for colocated target and competing speech. However, separated SRTs seem to increase linearly with increasing age for simUHL. As such, the interaction effect of age and simUHL on SRM seemed to be driven by decreasing performance in the separated condition. A post hoc simple linear regression analysis showed a significant relationship between the separated SRT and age for the simUHL condition (SRT = 0.14 × age – 17.7; r = 0.80; p = 0.001).
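The size of this age effect can be illustrated by evaluating the reported regression line at the ends of the age range. The function name below is ours; the coefficients are the ones reported above:

```python
def predicted_simuhl_sep_srt(age_years):
    """Separated SRT (dB SNR) for simUHL predicted from the post hoc
    regression reported above: SRT = 0.14 * age - 17.7."""
    return 0.14 * age_years - 17.7

# Predicted SRT is roughly 5.6 dB poorer (higher) at 60 than at 20 years:
print(predicted_simuhl_sep_srt(20))
print(predicted_simuhl_sep_srt(60))
```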
Subjects With Profound Sensorineural UHL
Recognition of Speech in Competing Speech and SRM
The mean (SD) SRT for the UHLsep condition was −11.8 dB (1.6 dB). A significant increase in SRT (1.8 dB; p = 0.02; t = 2.8) occurred for UHLcoloc, for which the mean (SD) SRT was −10.0 dB (1.0 dB) (Fig. 1A), corresponding to a statistically significant mean SRM of 1.8 dB (SD = 1.9 dB). The mean SRM was significantly different from zero (t = 2.8; p = 0.02). In contrast with normal conditions in NH subjects, SRM did not occur in all subjects with profound UHL (Fig. 1D). For subjects 3 and 8, SRM was negative (i.e., SRTs were lower in the UHLcoloc condition). There was no effect of age on SRM in subjects with profound UHL, as revealed by simple linear regression analysis (r = −0.064; p = 0.87; n = 9).
Comparison of SRTs for Subjects With NH and Profound Sensorineural UHL
Figure 1A illustrates mean SRTs for subjects with NH in normal and simUHL conditions, and for profound UHL. Visual inspection of Figure 1A suggests an increase in the separated-condition SRT as hearing thresholds in one ear increase. By contrast, in the colocated condition, performance for simUHL and profound UHL appears comparable despite the large difference in hearing thresholds (Tables 1 and 2).
A statistical comparison confirmed that the SRT for profound UHL was higher than for subjects with NH when the interferers were spatially separated from (p < 0.001; t = 4.1) and colocated with (p = 0.02; t = 2.6) the target speech. The SRT for profound UHL was comparable to simUHL (separated: p = 0.51, t = 0.68; colocated: p = 0.98, t = 0.03).
Estimate of the Binaural and Monaural Contribution to SRM
One of the aims of the present study was to quantify the magnitude of the binaural and monaural contribution to SRM in the setup used. As noted earlier, the SRM for subjects with NH and profound UHL was 3.7 and 1.8 dB, respectively. The binaural and monaural contributions to SRM were thus estimated to 1.9 and 1.8 dB, respectively.
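This decomposition amounts to simple subtraction, under the assumption argued in the text that the SRM of the profound UHL group reflects monaural processing only. A minimal sketch with the reported group values (the function name is ours):

```python
def decompose_srm(srm_nh_db, srm_uhl_db):
    """Split the NH group's SRM into monaural and binaural parts, assuming
    that SRM with profound UHL is purely monaural."""
    monaural_db = srm_uhl_db
    binaural_db = round(srm_nh_db - srm_uhl_db, 1)
    return monaural_db, binaural_db

print(decompose_srm(3.7, 1.8))  # (1.8, 1.9), the estimates reported above
```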
DISCUSSION

This study aimed to characterize deficits in spatial hearing in subjects with profound sensorineural UHL and to estimate the binaural and monaural contributions to the spatial advantage that exists in a demanding multisource listening environment with speech interferers arranged on both sides of the listener’s head. To increase our understanding of the effect of audibility in subjects with UHL on SRM, the spatial advantage for a simulated mild-to-moderate UHL was also assessed. SRTs were lower (better) for subjects with NH than for profound UHL, for both colocated and spatially separated conditions. A simUHL increased normal thresholds significantly, resulting in simUHL thresholds comparable to profound UHL for both spatial conditions. The increase in separated threshold from normal to simUHL was modulated by age, with a more pronounced effect of a simUHL in older subjects. These data suggest that small amounts of residual hearing for a simUHL may be beneficial for young individuals in separated interfering conditions. Furthermore, the data demonstrate the importance of bilateral NH for recognition of speech in spatially separated competing speech. The pattern of SRTs in the two spatial configurations and the associated SRM across subject groups is discussed below.
Recognition of Speech in Colocated Competing Speech
In the colocated condition, where no spatial cues existed and the interaural time difference and interaural level difference for both the target and competing speech signals were zero, the significant SRT difference between NH subjects and profound UHL was 1.5 dB. While the basic mechanism for this binaural summation is unclear, it exists for diotic versus monaural speech perception in subjects with NH (Bronkhorst & Plomp 1988) and is suggested to relate to an advantage of having two independent observations of the stimuli (Schooneveldt & Moore 1989). Furthermore, the difference was previously demonstrated between subjects with NH and mild-to-profound UHL for a single colocated speech interferer (4.5 dB; Rothpletz et al. 2012). Rothpletz et al. (2012) used a design where the competing speech was qualitatively similar to the target speech, that is, the masking of the target signal was informational rather than energetic (Durlach et al. 2003). The informational masking in combination with the single competing talker design in the study by Rothpletz et al. (2012), as well as the task-related differences (closed-set forced choice task), may explain the larger difference between subjects with NH and UHL that they found.
The increase in SRT from normal to simUHL (1.5 dB) in the colocated condition was equal to the statistically significant difference in SRT between NH and profound UHL. The amount of informational masking may influence the effect of a simUHL on binaural summation, as demonstrated by previous data on speech recognition in colocated two-talker interfering speech high in informational masking, where essentially no difference between NH and simUHL conditions was found (Marrone et al. 2008a).
Recognition of Speech in Spatially Separated Competing Speech and SRM
SRTs decreased (improved) for subjects with NH and profound UHL when the competing speech was spatially separated from the target speech compared with when it was colocated. The SRM found for NH (3.7 dB) is similar to, for example, that found for four speech interferers in symmetrical (Bronkhorst & Plomp 1992) and three in asymmetrical (Hawley et al. 2004) spatial configurations, although these previous experiments were performed using headphone presentation simulating free-field conditions.
The SRM was smaller for profound UHL than for normal conditions (49% of the magnitude), but, crucially, it was significantly different from zero. It appears that the majority of the subjects with profound UHL were able to capitalize on cues that did not exist or were minimized in the colocated condition. One such cue is the head shadow, that is, the attenuation of two of the four interferers when they were arranged ipsilateral to the profound UHL, creating a higher SNR in the normal ear compared with the colocated condition. In addition, in the spatially separated condition, the four competing speech signals were distributed in the front and rear azimuth creating moment-by-moment variations in the SNR at each ear. “Glimpsing” of the target signal at temporarily more favorable SNRs was, therefore, more likely to occur in the separated than in the colocated condition, where the four interferers resembled more of a continuous babble. Such auditory processing is consistent with data from subjects with NH in symmetrical speech interferer configurations (Brungart & Iyer 2012; Glyde et al. 2013a) and models of better-ear processing based on conditions with one interfering sound (Zurek 1993). The SRM values from subjects with profound UHL presented here suggest that glimpsing is possible with only one functioning ear, despite symmetrical arrangement of interferers, and give some support to the concept of moment-by-moment better-ear glimpsing (Brungart & Iyer 2012; Glyde et al. 2013a) because SRM was twice as large with two ears (3.7 dB) compared with one ear (1.8 dB).
The decrease in SRM for simUHL conditions seemed modulated by the age of the subjects (cf. Fig. 2), which was statistically confirmed by entering the between-subject variable age as a covariate in an ANOVA where the simUHL was a within-subject factor. Post hoc analysis showed that the separated SRT for simUHL increased as a function of age, thereby reducing SRM. Because there was no association between age and the PTA for the simUHL ear, the audibility in the plugged ear should not be a confounding factor. Age has previously been identified as an important factor for SRM (Jakien & Gallun 2018; Srinivasan et al. 2016), but we are not aware of previously published findings suggesting an interactive effect of age and UHL on SRM. In the present study, an adaptation to the earplug may have occurred more rapidly for younger than for older subjects, allowing binaural glimpsing. A rapid adaptation to a unilateral earplug for spatial tasks exists in both humans (Kumpik et al. 2010) and animals (Kacelnik et al. 2006). Alternatively, age-related changes in temporal processing in the brainstem, as shown in animal models (Walton 2010), may be more sensitive to acute unilateral plugging. The data presented here suggest that small amounts of audibility in the ear with UHL could provide young listeners with binaural difference cues that facilitate glimpsing of the target signal and motivate future clinical studies in which age is considered as an important factor in studying the effect of UHL on SRM.
While the SRM was reduced for simUHL, it was still statistically significantly different from zero. This finding contrasts with previous SRM results in simUHL in a speech-on-speech task (Marrone et al. 2008b), where the SRM for simUHL was close to zero. Several differences between the present study and Marrone et al. (2008b) may explain this difference. Informational masking was higher in Marrone et al. (2008b) because of the use of same-sex talkers, which should be an important distinction from the present study. Moreover, the simUHL in listeners with NH in Marrone et al. was achieved by a combination of an earplug and an earmuff so that audibility in the plugged ear was likely reduced to a greater extent than here. Hearing thresholds for the simUHL in Marrone et al. were not reported, complicating comparison with the present study.
Conclusions and Clinical Implications
The increasing availability of treatment options for individuals with profound UHL, including implantable bone conduction hearing devices and cochlear implants, underpins the need for clear-cut preoperative assessment of any deficit in spatial hearing. Postoperative follow-up of treatment benefit is likewise important and could possibly guide the fitting of the implanted device. Recent data show substantial and relatively rapid improvements in speech-on-speech and sound localization tasks after cochlear implantation across subjects with acquired severe to profound UHL (<10 years of UHL experience) (Buss et al. 2018). However, a long duration of profound UHL and spectral mismatch between the ear with NH and the implanted ear may degrade performance (Bernstein et al. 2016).
The results from the present study show that recognition of speech in competing speech (with and without spatial cues) is distinctly impaired in individuals with profound unilateral sensorineural hearing loss. Spatially separating the competing speech signals so that they are presented symmetrically from the left and right, in the front and rear azimuth, seems to improve speech recognition in profound UHL, but the improvement is only half the magnitude of that for subjects with NH. We conclude that SRM assessed in demanding multisource conditions may be achieved by both monaural and binaural processes. While this finding contrasts somewhat with SRM research using interferers arranged both to the left and to the right of the subject in the front azimuth (Hawley et al. 2004), the nature of the competing speech signals in the present study likely contributed substantially to that result. Presumably, the interfering speech material allowed auditory “glimpsing” of the target signal in the ear with the most favorable SNR throughout the test. Furthermore, solving the cocktail-party problem by a monaural process may be much more complex in reverberant conditions (Bronkhorst & Plomp 1990).
The SRM data obtained in the simUHL condition merit some discussion from a clinical point of view. The simUHL was mild-to-moderate and slightly sloping, which in fact resembles a rather realistic “hearing loss” (see mean thresholds in Table 2). SRTs were elevated in the simUHL condition, consistent with recent findings (Asp et al. 2018), with a larger effect in the separated (3.0 dB) than in the colocated (1.5 dB) condition. Even though the acute nature of a simUHL is not entirely applicable to individuals with long-standing UHL, these results indicate that SRTs should be assessed clinically in sound field with separated rather than colocated interferers to detect deficits in subjects with various mild-to-moderate UHL profiles.
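The arithmetic behind these threshold comparisons can be made explicit. As defined earlier, SRM is the difference between the colocated and separated SRTs (lower SRT = better performance). A minimal sketch, using hypothetical SRT values chosen only to reproduce the pattern reported above (separated SRT elevated by 3.0 dB and colocated SRT by 1.5 dB under simUHL, shrinking SRM by 1.5 dB):

```python
def srm(srt_colocated_db, srt_separated_db):
    """Spatial release from masking (dB): the improvement in speech
    reception threshold (SRT) when interferers are spatially separated
    from the target rather than colocated with it. Because a lower SRT
    means better performance, a positive SRM indicates a spatial benefit."""
    return srt_colocated_db - srt_separated_db

# Hypothetical SRTs in dB SNR (illustrative values, not study data):
srt_nh = {"colocated": -2.0, "separated": -8.0}      # normal hearing
srt_simuhl = {"colocated": -0.5, "separated": -5.0}  # plugged ear (simUHL)

srm_nh = srm(srt_nh["colocated"], srt_nh["separated"])              # 6.0 dB
srm_simuhl = srm(srt_simuhl["colocated"], srt_simuhl["separated"])  # 4.5 dB

# The larger SRT elevation in the separated condition (+3.0 dB) than in
# the colocated condition (+1.5 dB) reduces SRM by 1.5 dB.
srm_reduction = srm_nh - srm_simuhl  # 1.5 dB
```

This illustrates why an asymmetric threshold shift across conditions, rather than the shift itself, is what erodes the spatial benefit.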
Finally, we note that the wide range of effect sizes of spatial separation on speech recognition in noise (depending, e.g., on target–interferer similarity and on the spatial arrangement and number of interferers, as noted in the Introduction) suggests that clinical tests should be used to evaluate relative performance among groups (e.g., NH compared with UHL) rather than to predict real-world behavior.
The authors are grateful to Maria Drott, Malin Apler, Jenny Andersson, Linda Persson, and Ann-Charlotte Persson for assistance in measurements; Per-Olof Larsson for technical assistance; and the subjects for participating.
Agrawal Y., Platz E. A., Niparko J. K.. Prevalence of hearing loss and differences by demographic characteristics among US adults: Data from the National Health and Nutrition Examination Survey, 1999-2004. Arch Intern Med, (2008). 168, 1522–1530.
Agterberg M. J., Snik A. F., Hol M. K., et al. Contribution of monaural and binaural cues to sound localization in listeners with acquired unilateral conductive hearing loss: Improved directional hearing with a bone-conduction device. Hear Res, (2012). 286, 9–18.
Asp F., Jakobsson A. M., Berninger E. The effect of simulated unilateral hearing loss on horizontal sound localization accuracy and recognition of speech in spatially separate competing speech. Hear Res, (2018). 357, 54–63.
Asp F., Mäki-Torkko E., Karltorp E., et al. Bilateral versus unilateral cochlear implants in children: Speech recognition, sound localization, and parental reports. Int J Audiol, (2012). 51, 817–832.
Berger E. H., Kerivan J. E.. Influence of physiological noise and the occlusion effect on the measurement of real-ear attenuation at threshold. J Acoust Soc Am, (1983). 74, 81–94.
Berninger E., Karlsson K. K.. Clinical study of Widex Senso on first-time hearing aid users. Scand Audiol, (1999). 28, 117–125.
Bernstein J. G., Goupell M. J., Schuchman G. I., et al. Having two ears facilitates the perceptual separation of concurrent talkers for bilateral and single-sided deaf cochlear implantees. Ear Hear, (2016). 37, 289–302.
Bregman A. S.. Auditory Scene Analysis: The Perceptual Organization of Sound. (1990). Cambridge, MA: MIT Press.
Bronkhorst A. The cocktail party phenomenon: A review of research on speech intelligibility in multiple-talker conditions. Acta Acustica, (2000). 86, 117–128.
Bronkhorst A. W., Plomp R. The effect of head-induced interaural time and level differences on speech intelligibility in noise. J Acoust Soc Am, (1988). 83, 1508–1516.
Bronkhorst A. W., Plomp R. A clinical test for the assessment of binaural speech perception in noise. Audiology, (1990). 29, 275–285.
Bronkhorst A. W., Plomp R. Effect of multiple speechlike maskers on binaural speech recognition in normal and impaired hearing. J Acoust Soc Am, (1992). 92, 3132–3139.
Brungart D. S., Iyer N. Better-ear glimpsing efficiency with symmetrically-placed interfering talkers. J Acoust Soc Am, (2012). 132, 2545–2556.
Buss E., Dillon M. T., Rooth M. A., et al. Effects of cochlear implantation on binaural hearing in adults with unilateral hearing loss. Trends Hear, (2018). 22.
Cherry E. Some experiments on the recognition of speech, with one and with two ears. J Acoust Soc Am, (1953). 25, 975–979.
Chiossoine-Kerdel J. A., Baguley D. M., Stoddart R. L., et al. An investigation of the audiologic handicap associated with unilateral sudden sensorineural hearing loss. Am J Otol, (2000). 21, 645–651.
Corbin N. E., Buss E., Leibold L. J.. Spatial release from masking in children: Effects of simulated unilateral hearing loss. Ear Hear, (2017). 38, 223–235.
Culling J. F., Hawley M. L., Litovsky R. Y.. The role of head-induced interaural time and level differences in the speech reception threshold for multiple interfering sound sources. J Acoust Soc Am, (2004). 116, 1057–1065.
Durlach N. I., Mason C. R., Kidd G. Jr, et al. Note on informational masking. J Acoust Soc Am, (2003). 113, 2984–2987.
Dwyer N. Y., Firszt J. B., Reeder R. M.. Effects of unilateral input and mode of hearing in the better ear: Self-reported performance using the speech, spatial and qualities of hearing scale. Ear Hear, (2014). 35, 126–136.
Edmonds B. A., Culling J. F.. The spatial unmasking of speech: Evidence for better-ear listening. J Acoust Soc Am, (2006). 120, 1539–1545.
Festen J. M., Plomp R. Effects of fluctuating noise and interfering speech on the speech-reception threshold for impaired and normal hearing. J Acoust Soc Am, (1990). 88, 1725–1736.
Firszt J. B., Reeder R. M., Holden L. K.. Unilateral hearing loss: Understanding speech recognition and localization variability-implications for cochlear implant candidacy. Ear Hear, (2017). 38, 159–173.
Füllgrabe C., Moore B. C., Stone M. A.. Age-group differences in speech identification despite matched audiometrically normal hearing: Contributions from auditory temporal processing and cognition. Front Aging Neurosci, (2014). 6, 347.
Gallun F. J., Diedesch A. C., Kampel S. D., et al. Independent impacts of age and hearing loss on spatial release in a complex auditory environment. Front Neurosci, (2013). 7, 252.
Gardner W. G., Martin K. D.. HRTF measurements of a KEMAR. J Acoust Soc Am, (1995). 97, 3907–3908.
Gatehouse S., Noble W. The speech, spatial and qualities of hearing scale (SSQ). Int J Audiol, (2004). 43, 85–99.
Glyde H., Buchholz J., Dillon H., et al. The effect of better-ear glimpsing on spatial release from masking. J Acoust Soc Am, (2013a). 134, 2937–2945.
Glyde H., Buchholz J. M., Dillon H., et al. The importance of interaural time differences and level differences in spatial release from masking. J Acoust Soc Am, (2013b). 134, EL147–EL152.
Grothe B., Pecka M., McAlpine D. Mechanisms of sound localization in mammals. Physiol Rev, (2010). 90, 983–1012.
Hagerman B. Reliability in the determination of speech reception threshold (SRT). Scand Audiol, (1979). 8, 195–202.
Hagerman B. Sentences for testing speech intelligibility in noise. Scand Audiol, (1982). 11, 79–87.
Hagerman B., Kinnefors C. Efficient adaptive methods for measuring speech reception threshold in quiet and in noise. Scand Audiol, (1995). 24, 71–77.
Hawley M. L., Litovsky R. Y., Colburn H. S.. Speech intelligibility and localization in a multi-source environment. J Acoust Soc Am, (1999). 105, 3436–3448.
Hawley M. L., Litovsky R. Y., Culling J. F.. The benefit of binaural hearing in a cocktail party: Effect of location and type of interferer. J Acoust Soc Am, (2004). 115, 833–843.
Hygge S., Rönnberg J., Larsby B., et al. Normal-hearing and hearing-impaired subjects’ ability to just follow conversation in competing speech, reversed speech, and noise backgrounds. J Speech Hear Res, (1992). 35, 208–215.
Ihlefeld A., Sarwar S. J., Shinn-Cunningham B. G.. Spatial uncertainty reduces the benefit of spatial separation in selective and divided listening. J Acoust Soc Am, (2006). 119, 3417–3417.
ISO 4869-1. Acoustics - Hearing Protectors - Part 1: Subjective Method for the Measurement of Sound Attenuation. (1990). Geneva, Switzerland: International Organization for Standardization.
ISO 8253-1. Acoustics - Audiometric Test Methods - Part 1: Pure-Tone Air and Bone Conduction Audiometry. (2010). Geneva, Switzerland: International Organization for Standardization.
Jakien K. M., Gallun F. J.. Normative data for a rapid, automated test of spatial release from masking. Am J Audiol, (2018). 27, 529–538.
Kacelnik O., Nodal F. R., Parsons C. H., et al. Training-induced plasticity of auditory localization in adult mammals. PLoS Biol, (2006). 4, e71.
Kidd G. Jr, Mason C. R., Best V., et al. Stimulus factors influencing spatial release from speech-on-speech masking. J Acoust Soc Am, (2010). 128, 1965–1978.
Kumpik D. P., Kacelnik O., King A. J.. Adaptive reweighting of auditory localization cues in response to chronic unilateral earplugging in humans. J Neurosci, (2010). 30, 4883–4894.
Marrone N., Mason C. R., Kidd G. Tuning in the spatial dimension: Evidence from a masked speech identification task. J Acoust Soc Am, (2008a). 124, 1146–1158.
Marrone N., Mason C. R., Kidd G. Jr. The effects of hearing loss and age on the benefit of spatial separation between multiple talkers in reverberant rooms. J Acoust Soc Am, (2008b). 124, 3064–3075.
Middlebrooks J. C.. Individual differences in external-ear transfer functions reduced by scaling in frequency. J Acoust Soc Am, (1999). 106(3 Pt 1), 1480–1492.
Møller H., Sørensen M. F., Jensen C. B., et al. Binaural technique: Do we need individual recordings? J Audio Eng Soc, (1996). 44, 451–469.
Newman C. W., Jacobson G. P., Hug G. A., et al. Perceived hearing handicap of patients with unilateral or mild hearing loss. Ann Otol Rhinol Laryngol, (1997). 106, 210–214.
Pavlovic C. V.. Derivation of primary parameters and procedures for use in speech intelligibility predictions. J Acoust Soc Am, (1987). 82, 413–422.
Peters R. W., Moore B. C., Baer T. Speech reception thresholds in noise with and without spectral and temporal dips for hearing-impaired and normally hearing people. J Acoust Soc Am, (1998). 103, 577–587.
Plomp R., Mimpen A. M.. Improving the reliability of testing the speech reception threshold for sentences. Audiology, (1979). 18, 43–52.
Rothpletz A. M., Wightman F. L., Kistler D. J.. Informational masking and spatial hearing in listeners with and without unilateral hearing loss. J Speech Lang Hear Res, (2012). 55, 511–531.
Schneider B. A., Avivi-Reich M., Mozuraitis M. A cautionary note on the use of the Analysis of Covariance (ANCOVA) in classification designs with and without within-subject factors. Front Psychol, (2015). 6, 474.
Schoenmaker E., Sutojo S., van de Par S. Better-ear rating based on glimpsing. J Acoust Soc Am, (2017). 142, 1466.
Schooneveldt G. P., Moore B. C.. Comodulation masking release for various monaural and binaural combinations of the signal, on-frequency, and flanking bands. J Acoust Soc Am, (1989). 85, 262–272.
Slattery W. H. III, Middlebrooks J. C.. Monaural sound localization: Acute versus chronic unilateral impairment. Hear Res, (1994). 75, 38–46.
Srinivasan N. K., Jakien K. M., Gallun F. J.. Release from masking for small spatial separations: Effects of age and hearing loss. J Acoust Soc Am, (2016). 140, EL73.
Swaminathan J., Mason C. R., Streeter T. M., et al. Erratum: Musical training, individual differences and the cocktail party problem. Sci Rep, (2015). 5, 14401.
Tufts J. B., Palmer J. V., Marshall L. Measurements of earplug attenuation under supra-aural and circumaural headphones. Int J Audiol, (2012). 51, 730–738.
Walton J. P.. Timing is everything: Temporal processing deficits in the aged auditory brainstem. Hear Res, (2010). 264, 63–69.
Wightman F., Kistler D. Measurement and validation of human HRTFs for use in hearing research. Acta Acust United Ac, (2005). 91, 429–439.
Yost W. A.. Erratum: Spatial release from masking based on binaural processing for up to six maskers [J. Acoust. Soc. Am. 141, 2093-2106 (2017)]. J Acoust Soc Am, (2017). 141, 2473.
Zurek P. M.. Binaural advantages and directional effects in speech intelligibility. In Studebaker G. A., Hochberg I. (Eds.), Acoustical Factors Affecting Hearing Aid Performance (1993). 2nd ed. Boston, MA: Allyn and Bacon, 255–276.
Johansson M., Asp F., Berninger E. Children with congenital unilateral sensorineural hearing loss: Effects of late hearing aid amplification-A pilot study. Ear Hear, (2019). Published ahead of print. doi: 10.1097/AUD.0000000000000730