An almost universal complaint among people with hearing loss is difficulty understanding speech in background noise (Abrams & Kihm 2015). This problem, however, is not limited to those with hearing loss. A proportion of listeners with normal or “near-normal” hearing also seek medical or audiological advice for this issue (Stephens et al. 2003; Ruggles et al. 2012). Indeed, it has been estimated that approximately 5 to 15% of people who seek a hearing assessment because of difficulties in challenging situations have normal audiometric thresholds (≤20 dB HL, 0.5 to 4 kHz; Saunders & Haggard 1989; Hind et al. 2011; Tremblay et al. 2015; ≤25 dB HL, 0.5 to 4 kHz; Spankovich et al. 2018). Such clients are usually assured that their hearing is fine, and there is little that clinicians can suggest in the way of causation, diagnosis, or rehabilitation to address these functional listening difficulties (Zhao et al. 2008). This often results in an unsatisfactory clinical experience, leaving the patient to feel that their hearing concerns have not been listened to or taken seriously. When the client receives the finding of a clinically normal audiogram, this can “invalidate” their problem because it focuses on the absence of observable pathology (Pryce 2006; Pryce & Wainwright 2008), which in turn can lead to feelings of dismissal, confusion, and increased anxiety (Pryce & Hall 2014).
One possible cause of hearing-in-noise difficulties is thought to be cochlear damage from excessive noise exposure. On the basis of animal studies, Kujawa and Liberman (2009) suggested that noise-induced cochlear synaptopathy underlies impaired encoding of sound leading to speech-in-noise difficulties. However, a link between noise exposure, synaptic loss, and deficits in suprathreshold sound processing in individuals with clinically normal audiograms has not yet been conclusively demonstrated. Some studies have indicated a relationship between noise exposure and reduced auditory brainstem response (ABR) wave I amplitude (Stamper & Johnson 2015; Bramhall et al. 2017) and increased ABR wave I/wave V ratios (Liberman et al. 2016; Grose et al. 2017). An association has also been demonstrated between ABR wave I amplitude and frequent or constant tinnitus in young military veterans (Bramhall et al. 2018). Most studies, however, mainly those testing younger adults ≤35 years, have found no evidence of suprathreshold deficits or the anticipated reduction in ABR wave I amplitude as a function of increasing noise exposure (Prendergast et al. 2016, 2017; Fulbright et al. 2017; Grinn et al. 2017; Guest et al. 2017).
In our study, 122 adults (30 to 57 years) with normal or near-normal hearing completed behavioral testing, and a subset (n = 68) also completed electrophysiological testing. While the electrophysiology results demonstrated a significant negative correlation between lifetime noise exposure and the amplitude of ABR wave I consistent with noise-induced cochlear synaptopathy, the behavioral testing did not show a link between noise exposure and performance on any suprathreshold auditory processing or speech-in-noise tasks (Yeend et al. 2017; Valderrama et al. 2018). Rather, the behavioral results identified several cognitive skills (sentence closure, working memory, and attention) and hearing factors (extended high-frequency [EHF] thresholds and medial olivocochlear suppression strength) that were significantly related to the ability to process speech in noise (see Yeend et al. 2017). This set of results, while providing some evidence of noise-induced cochlear synaptopathy in humans, suggests that any effects of synaptopathy may be confined to the auditory periphery, and other cognitive and auditory factors play a more important role in determining speech-in-noise outcomes.
The focus of this study is on these “other cognitive and auditory factors” that affect speech-in-noise outcomes. In particular, several studies have now shown that elevated EHF threshold levels (above 8 kHz) are associated with increased noise exposure (Liberman et al. 2016; Prendergast et al. 2017) or poorer speech-in-noise perception (Badri et al. 2011; Yeend et al. 2017). These results suggest that the basal area of the human cochlea, which is responsive to high-frequency (HF) sound, may be more susceptible to noise-related damage or damage from other causes such as ototoxicity and aging. In this case, EHF audiometry could provide an early indicator of cochlear injury in cases where people report problems understanding speech in noise (Mehrparvar et al. 2011; Rodríguez Valiente et al. 2016).
Additionally, a number of studies have shown that cognitive processes play an important role in speech-in-noise processing (Kujala et al. 2004; Rudner & Lunner 2014; Stenbäck et al. 2016; Bressler et al. 2017; Dryden et al. 2017). For example, it is well established that performance of older listeners with hearing loss on speech-in-noise tasks is affected by their working memory capacity (Lunner 2003; Rudner et al. 2011; Classon et al. 2013; Keidser et al. 2015; Heinrich et al. 2016), which is commonly assessed using the Reading Span Test (RST; Daneman & Carpenter 1980). Yet, for normal-hearing listeners, the impact of working memory on speech-in-noise performance has not been as reliably demonstrated (for review see Füllgrabe & Rosen 2016). Schoof and Rosen (2014) assessed younger (19 to 29 years) and older (60 to 72 years) adults with normal hearing on a comprehensive test battery, including the RST, and concluded that age-related declines in cognitive processing (working memory and processing speed) do not always lead to difficulties understanding speech in noise. In contrast, Gordon-Salant and Cole (2016) included both the Listening Span Test, which is a working memory span test involving listening to and recalling verbal materials (Daneman & Carpenter 1980), and the RST. They showed that younger (18 to 25 years) and older (61 to 75 years) listeners with normal hearing with low working memory capacity are at a disadvantage when recognizing speech in noise. Similarly, our study of noise-exposed normal hearers showed that working memory was one of the key factors associated with speech-in-noise performance (Yeend et al. 2017).
Given that cognitive processes and EHF hearing are emerging as important factors in speech-in-noise perception in our work and that of other researchers in the field, in this study, we focused on the potential clinical application of these factors. Our aim was to devise a diagnostic criterion for speech-in-noise difficulty that clinicians could use to help explain the possible source of clients’ speech-in-noise difficulties. The data set described in Yeend et al. (2017) included a wide range of auditory and cognitive factors, as well as speech-in-noise measures. From this data set, we identified those participants with the poorest speech-in-noise performance (n = 30) and compared them to those who performed best (n = 30). We then identified the main factors that separated the two groups and used these factors to build a regression model to predict speech-in-noise difficulties. The resulting regression formula defined the “diagnostic criterion,” which was then assessed to determine how well it predicted speech-in-noise difficulties in our complete data set of 122 participants with normal hearing.
MATERIALS AND METHODS
Treatment of participants was approved by the Australian Hearing and Macquarie University Human Research Ethics Committees and complied with the National Statement on Ethical Conduct in Human Research.
One hundred and twenty-two adults, aged 30 to 57 years, participated. All had normal hearing (less than or equal to 20 dB HL at 0.25 to 6 kHz) or near-normal hearing as defined by Moore et al. (2012), that is, less than or equal to 25 dB HL up to 2 kHz; less than or equal to 30 dB HL at 3 kHz; less than or equal to 35 dB HL at 4 kHz; and less than or equal to 40 dB HL at 6 kHz. It was necessary to have inclusion criteria broad enough to accommodate participants with a wide range of lifetime noise exposures, some of whom were likely to have at least some hearing thresholds outside the generally accepted definition of normal (≤20 dB HL). All participants completed an online survey followed by a comprehensive laboratory test session. The laboratory session included audiometry, auditory processing tasks, and cognitive measures. Full details of the test protocol can be found in Yeend et al. (2017), but a brief description of relevant tests is provided here.
Speech, Spatial, and Qualities of Hearing Scale
The average score for the speech items only (questions 1 to 5) of the Speech, Spatial and Qualities of Hearing scale (SSQ12; Noble et al. 2013) was used to estimate self-reported ability to understand speech in noise. For each of the five questions, participants rated their ability to follow speech when there is competing background noise, for example, in a busy restaurant or a room with many people talking.
Listening in Spatialized Noise Sentences
The high-cue condition of the Australian version (2.202) of the Listening in Spatialized Noise Sentences (LiSN-S) test was used to measure ability to understand speech in noise (Cameron et al. 2011). This condition presents target speech and background speech spoken in a different voice and presented at ±90°. It was selected because it is the most realistic listening scenario of the four LiSN-S conditions. The prescribed gain amplifier option was selected for all participants, and an additional 6 dB of overall amplification was provided.
National Acoustic Laboratories Dynamic Conversations Test
Six monologues (one practice and five test passages) from the National Acoustic Laboratories Dynamic Conversations Test (NAL-DCT), presented at −7 dB signal-to-noise ratio, were used to assess real-world “on-the-go” speech comprehension in background noise (Best et al. 2018).
Composite Speech-in-Noise Score
This score included the self-report measure and two speech-in-noise test results. Although it is true that self-report measures of speech-in-noise ability can be influenced by factors such as personality or misjudgment (Saunders & Haggard 1989, 1992), it is also the case that laboratory test results do not always reflect a person’s performance in real-world conditions. By combining both subjective and behavioral measures, we have created a composite score, which aims to represent both perceived and actual ability to understand speech in noise and allow us to identify symptomatic versus nonsymptomatic participants within the sample. For each participant, scores from the SSQ12 speech items, LiSN-S high-cue condition, and NAL-DCT were transformed into standardized z scores by subtracting the sample mean and dividing by the SD. These were then averaged to obtain an overall measure: the composite speech-in-noise score (CSS). Participants were ranked according to CSS (a lower CSS indicates a poorer overall performance), and two subgroups each comprising 30 participants were identified: those with the lowest scores (low CSS) and those with the highest scores (high CSS).
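The composite score calculation described above can be illustrated with a minimal Python sketch. It assumes that each measure has already been oriented so that a higher value means better performance, and that "SD" refers to the sample standard deviation; neither detail is stated in the text, and the function names are hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize scores: subtract the sample mean, divide by the sample SD."""
    m, s = mean(values), stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - m) / s for v in values]

def composite_scores(ssq, lisn, dct):
    """Average the three standardized measures for each participant (the CSS).

    Assumption: all three inputs are oriented so that higher = better.
    """
    standardized = [z_scores(ssq), z_scores(lisn), z_scores(dct)]
    return [mean(triple) for triple in zip(*standardized)]

def split_groups(css, k):
    """Return participant indices of the k lowest and k highest composite scores."""
    order = sorted(range(len(css)), key=lambda i: css[i])
    return order[:k], order[-k:]

# Illustrative (made-up) scores for six participants.
css = composite_scores([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6])
low_group, high_group = split_groups(css, 2)
```

In the study itself, k would be 30 and the inputs would be the SSQ12 speech, LiSN-S high-cue, and NAL-DCT results for all 122 participants.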
Other Test Measures
Participants were asked to indicate their age (years), gender (male, female, indeterminate/intersex/unspecified), and highest level of education achieved (primary school through to postgraduate university degree).
Participants were asked whether they had contact with chemicals (e.g., solvents, paints, degreasers, jet fuels, gasoline, or cleaning fluids) in current or past employment. They were also asked whether they had ever taken potentially ototoxic medications (e.g., aspirin, nonsteroidal anti-inflammatories, antibiotics, loop diuretics, anticancer drugs, or other medications that affected their hearing) in high doses.
Total lifetime noise exposure (Pa²h) was calculated for each participant based on their answers to an online survey about leisure noise activities, workplace noise exposure, and use of hearing protection during each decade of life. This value was then transformed (log Pa²h) such that a 1-unit difference corresponds to a change in noise exposure by a factor of 10.
An index of music training score was calculated to indicate each participant’s highest level of musical training (formal and informal) using responses from the Music Use questionnaire (Chin & Rickard 2012).
Hearing thresholds were tested in a sound-treated room using a modified Hughson Westlake procedure with a 2 dB step size (Le Prell et al. 2013). Average hearing threshold level was calculated for three frequency regions: low-frequency (LF) 0.25 to 2 kHz, HF 3 to 6 kHz, and EHF 9 to 12.5 kHz.
The Mimosa Acoustics HearID Auditory Diagnostics System (software version 5.1.9) was used to measure distortion product otoacoustic emissions (DPOAEs). A DPgram (2f1 − f2, f2/f1 ratio = 1.25, f1 = 65 dB SPL and f2 = 55 dB SPL) was recorded with 8 points/octave between 1 and 12 kHz (for f2). An average HF DPOAE at 3 to 6 kHz was calculated for each participant.
An automated research module (TE50_B2000_N60; Mimosa Acoustics 2014) was used to record and calculate each participant’s medial olivocochlear reflex (MOCR) strength statistic (0.5 to 2.5 kHz band) following the method described in Marshall et al. (2014).
A modified version of the threshold-equalizing noise (TEN) test with ER-3A insert earphones (Moore et al. 2012) was used to assess ability to detect a 3 and 4 kHz tone presented in TEN, at an elevated level.
Temporal Fine Structure
The temporal fine structure (TFS) task described in Moore and Sek (2009) was used to evaluate sensitivity to TFS. A complex tone, with a fundamental frequency of 400 Hz and centre frequency of 4400 Hz, was presented at 75 dB SPL in a TEN masker (60 dB SPL/ERBN at 1 kHz), and participants were required to choose the item in which the sound appeared to “fluctuate.” Participants completed a practice task followed by one adaptive test run and a TFS threshold was recorded for each participant.
Amplitude modulation detection thresholds were assessed using a 3-alternative forced choice adaptive procedure. A 3.5 kHz carrier tone modulated at 4 (AM4) and 90 Hz (AM90) was presented at 75 dB SPL with a TEN masker (55 dB SPL/ERBN at 1 kHz).
Subtest 3 “elevator counting with distraction” and subtest 5 “elevator counting with reversal” of the Test of Everyday Attention (TEA Version A; Robertson et al. 1994) were used to assess selective attention and attention switching, respectively. The results were averaged to give a combined attention score.
The matrices subtest of the multiple-choice Kaufman Brief Intelligence Test (Kaufman & Kaufman 2004) was used to test nonverbal intelligence. Participants were required to identify a logical pattern within an incomplete picture matrix and then choose one of four alternatives to complete the matrix. A raw score was noted for each participant.
The Australian-English version of the RST (Daneman & Carpenter 1980) was used to assess working memory. Participants read aloud sentences presented in blocks (3 to 6 items), indicated whether each sentence made sense, and, when asked, recalled either the first or last word from each sentence. The percentage of correct words each participant recalled was calculated.
The Text Reception Threshold (TRT) test was used to assess generalized language skills (controlled sentence completion and lexical access; Zekveld et al. 2007; Zekveld 2017). Fifty sentences (10 practice and 40 test trials) from the Speech Perception in Noise Test (Kalikow et al. 1977) were presented; each word appeared at 500 msec intervals, and vertical masking bars were varied adaptively. An unmasked threshold was calculated for each participant.
Presentation Mode and Level
For LiSN-S, NAL-DCT, TFS1, AM4, AM90, and TEA, stimuli were presented at suprathreshold levels (range 68 to 76 dB SPL) and the TFS1, AM4, AM90 stimuli were focused on the 3 to 6 kHz frequency region. For DPOAEs and MOCR, stimuli levels were lower (50 to 65 dB SPL), while the TEN test was administered at a “loud but OK” level set by the participant. The LiSN-S, NAL-DCT, and TEA were administered binaurally, whereas the TFS, AM, TEN and MOCR strength tests were administered monaurally to the test ear. The right ear was designated as the test ear for all except seven participants who either had slightly better left ear thresholds (n = 5), a rounded right-ear tympanometric peak (n = 1), or a narrow but normal right external ear canal (n = 1). Pure tone audiometry and DPOAEs were administered to both the test and nontest ears.
Complete data was available for 77 participants. The NAL-DCT and TRT tests were added to the test battery after the study began; hence, not every participant completed these tasks. The MOCR data set was also incomplete because results of sufficient quality were obtained in only 82% of participants. Multiple imputation was used to estimate the missing data for the NAL-DCT (n = 29), TRT unmasked threshold (n = 28), and MOCR strength (n = 22). This process generated 10 imputed data sets by filling in missing data with values predicted from other variables in the data set. Each of the completed data sets is different because the predicted values include a random component, which allows the uncertainty in the missing values to be taken into account when doing statistical tests. Five imputed data sets are usually adequate (Rubin 1987), but to be conservative, we generated 10 sets using the following variables: age, gender, education level, ototoxicity, music training, SSQ12 (speech), average LF, HF, and EHF hearing level, DPOAE test ear, MOCR strength, LiSN-S, NAL-DCT passage score, TFS1, AM-4 and 90 Hz, TEN, TEA, RST, TRT, nonverbal intelligence, and noise exposure. For all analyses, the relevant statistical methods were applied to each imputed data set and then combined to obtain overall results.
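As an illustration of how results from several imputed data sets can be combined, the Python sketch below applies the standard pooling rules for multiple imputation (often called Rubin's rules). That these exact rules were used in the present analysis is an assumption; the text states only that results were combined across imputed data sets. The function name and example values are hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one parameter estimate across m imputed data sets.

    estimates: the estimate (e.g., a regression coefficient) from each data set
    variances: the squared standard error of that estimate in each data set
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = mean(estimates)            # average of the m estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total

# Made-up values for three imputed data sets.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty about the missing values into the final standard error.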
Data analysis was performed in Statistica (Statsoft, version 10) and R (R Core Team 2016; version 3.3.1), with the additional R packages pROC (Robin et al. 2011; version 1.8) and Amelia (Honaker et al. 2011; version 1.74). Differences between the low CSS group (n = 30) and the high CSS group (n = 30) were tested using independent samples t tests or Mann–Whitney U tests where appropriate. The variables on which the two groups differed significantly were then included in a multiple linear regression model to assess the relative contribution of each to the CSS. The significant predictors from this regression were then used to fit a second regression model.
The resulting formula from the second regression model was then assessed to determine how well it predicted speech-in-noise difficulties in the complete data set of 122 participants. The performance of the model on new data was estimated using Monte Carlo cross-validation (Hastie 2009; Kuhn & Johnson 2013). We used root mean square error (RMSE) to evaluate the model’s effectiveness in predicting CSS, and area under the receiver operating characteristic curve (AUC) to evaluate its effectiveness in predicting “low” CSS. The data sample was randomly split into a “test” set (comprising 10% of participants, n = 12) and a “fitting” set (comprising 90% of the participants, n = 110). All of the statistical methods described above, including the variable selection method (i.e., the t tests, Mann–Whitney U tests, and two regressions), were then repeated using the fitting data set only. By not using the participants in the “test” set to fit the model, overestimation of predictive accuracy was avoided. The resulting regression formula was used to calculate predicted CSS values for the 12 participants in the “test” set, and these predicted values were used to calculate RMSE and AUC. This cross-validation procedure was repeated 1000 times, and the RMSE and AUC values were averaged to produce a stable estimate of model performance.
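The repeated random-split procedure can be sketched in Python as follows. This is a deliberately simplified illustration, not the analysis code: it fits a one-predictor least-squares line rather than the two-predictor model, omits the variable selection step and the AUC calculation, and uses hypothetical function names and made-up data.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def monte_carlo_cv(xs, ys, n_test, n_repeats, seed=0):
    """Average held-out RMSE over repeated random fitting/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    rmses = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test, fitting = indices[:n_test], indices[n_test:]
        # Fit on the fitting set only, so test participants never inform the model.
        slope, intercept = fit_line([xs[i] for i in fitting], [ys[i] for i in fitting])
        squared = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in test]
        rmses.append(math.sqrt(sum(squared) / len(squared)))
    return sum(rmses) / len(rmses)

# Made-up, perfectly linear data: the held-out error should be essentially zero.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
average_rmse = monte_carlo_cv(xs, ys, n_test=2, n_repeats=50)
```

With the study's settings, `n_test` would be 12 and `n_repeats` would be 1000. Repeating the whole fitting pipeline inside each split is what keeps the estimate honest.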
The accuracy of the formula for predicting CSS was measured by the RMSE: the square root of the average squared prediction error [prediction error = actual CSS minus predicted CSS]. It is an absolute measure of predictive accuracy, that is, how close the values predicted by the regression model are to the actual values. For each of the 12 participants in the test set, we compared the participant’s actual CSS with the CSS predicted by the regression formula and then calculated the RMSE to determine the accuracy of the prediction. RMSE is expressed in the same units as the response variable (which in this case is the CSS), and a lower RMSE value indicates a better fit.
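Written out, the verbal definition above corresponds to the following expression, where the symbols are introduced here for illustration only: y_i is the actual CSS of test participant i, ŷ_i is the CSS predicted by the regression formula, and n is the number of participants in the test set (12 in each split).

```latex
\[
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}
\]
```

For example, if every prediction were off by exactly 0.6 CSS units in either direction, each squared error would be 0.36 and the RMSE would be 0.6.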
Predicting “Low” CSS
The accuracy of the formula for predicting “low” CSS was measured by the AUC, the curve being a plot of test sensitivity versus the false positive rate (Park et al. 2004). The AUC can be interpreted as the probability that, given two randomly chosen people, one having “low” CSS and the other not having “low” CSS, the diagnostic criterion will correctly identify which is which (Hanley & McNeil 1982; Swets 1988). The AUC value can be between 0 and 1, and a higher value indicates a higher probability of success (with 1 indicating perfect accuracy and 0.5 equivalent to a random prediction). It is important that the regression formula is able to accurately identify people with low CSS, since this group is representative of the individuals we would expect to present to a hearing clinic reporting problems hearing speech in noise.
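The pairwise interpretation of the AUC can be made concrete with a short Python sketch. It counts, over every possible pairing of one “low” and one not “low” person, how often the “low” person receives the lower predicted CSS. Counting ties as half a success is a common convention and is an assumption here; the function name and values are hypothetical.

```python
def pairwise_auc(predicted_low, predicted_not_low):
    """AUC as the fraction of (low, not-low) pairs that are ranked correctly.

    A pair is ranked correctly when the person who truly has "low" CSS
    receives the lower predicted CSS. Ties count as half (assumption).
    """
    correct = 0.0
    for a in predicted_low:
        for b in predicted_not_low:
            if a < b:
                correct += 1.0
            elif a == b:
                correct += 0.5
    return correct / (len(predicted_low) * len(predicted_not_low))

# Made-up predicted CSS values: 8 of the 9 possible pairs are ranked correctly.
auc = pairwise_auc([-1.0, -0.5, 0.2], [0.0, 0.5, 1.0])
```

Because this form depends only on rank order, it equals the area under the curve obtained by sweeping a cutoff across the predicted CSS values.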
Composite Speech Scores
Figure 1 shows the distribution of the CSSs for the entire sample (n = 122). Scores ranged from −1.93 to −0.55 for the low CSS group (Mlow CSS = −0.9, SD = 0.3) and from 0.51 to 1.76 for the high CSS group (Mhigh CSS = 0.9, SD = 0.3). Scores for the remaining participants (n = 62) ranged from −0.53 to 0.51 (Mmid CSS = 0, SD = 0.3).
The SSQ speech results confirmed that 77% of the low CSS group returned scores below the whole sample mean. The average SSQ speech score for the low CSS group was 5.5, and all scored 5.0 or less on at least two of the speech questions, indicating they were experiencing real-world speech-in-noise hearing difficulties. In contrast, the average SSQ speech score for the high CSS group was 8.6, and the majority (77%) of participants in this group had an average score greater than 8.0. Only one participant in this group scored 5.0 or less on at least two of the individual speech questions.
Differences Between Groups
As shown in Table 1, the low CSS group and high CSS group differed significantly on nine of the 17 variables examined. The low-performing group was older than the high-performing group, with an average difference of 6.1 years between them (Mlow CSS = 48.1 years; Mhigh CSS = 42.0 years).
The groups also differed on hearing thresholds. As shown in Figure 2, there were significant differences between the low CSS and high CSS groups across all three frequency regions, which increased as frequency increased. There was a significant 2.2 dB difference in the LF region, a 5.5 dB difference in the HF region, and a 23.3 dB difference in the EHF region.
The two groups also differed in their sensitivity to TFS (Mlow CSS = 71.8 Hz; Mhigh CSS = 36.4 Hz) and in the detection of amplitude modulation (AM90) (Mlow CSS = −22.7; Mhigh CSS = −25.1), with the low-performing group attaining significantly poorer thresholds than the high-performing group on both tasks.
There were also differences in performance on the tests of attention (Mlow CSS = 7.2; Mhigh CSS = 8.4) and working memory (Mlow CSS = 45.1; Mhigh CSS = 56.1), with the low-performing group scoring more poorly on both these tasks. Additionally, the groups differed on the TRT threshold (Mlow CSS = 61.0; Mhigh CSS = 58.6) with the low CSS group performing more poorly than the high CSS group.
Multiple Linear Regression
Next, we examined the relative effects of these nine significant variables on the CSS by fitting a multiple linear regression model using data from the entire sample (n = 122). The results showed that when all other variables were held constant, EHF thresholds and working memory scores were the only two significant predictors of the CSS, indicating that poorer EHF hearing and poorer working memory capacity were associated with reduced ability to understand speech in noise (see Table 2). This regression model accounted for 41% of the total variance [R² = 0.41, F(9,112) = 7.57, p < 0.001].
We then fitted a second regression model using only EHF and RST to determine the relative effects of these two variables, and obtained a regression formula:
This formula was then assessed for its usefulness as a diagnostic criterion for predicting CSS.
First, we used the cross-validation results to assess whether our formula was able to accurately predict CSS using the RMSE method. The results yielded an RMSE of 0.60, which can be interpreted by noting that CSS, being the average of z scores, is expected to have a mean and SD of approximately 0 and 1, respectively. The RMSE value of 0.60 suggests that although this simple formula is likely inadequate for highly accurate prediction of CSS, it may be useful as a first-order approximation.
Next, we tested how well the regression formula predicted which participants were on the “low” end of the CSS performance scale using the AUC method. As shown in Figure 3, the AUC was equal to 0.76, meaning that for every 100 pairs of people, one “low” and one not “low”, the diagnostic criterion would correctly identify which person in each pair is “low” for 76 pairs.
Finally, we tested whether using one of the four EHF thresholds (9, 10, 11.2, or 12.5 kHz) instead of the average of the four EHF thresholds would yield similar results. We reasoned that since threshold testing at one frequency, rather than four, is more clinically expedient, it would be useful to know if any of the single frequencies was equivalent to the four-frequency average. We repeated the cross-validation procedures for the four alternative models and found that results were slightly poorer for the 9 kHz and 11.2 kHz models and, to a lesser extent, the 10 kHz model. These three frequencies were often not selected in the variable selection procedure, indicating that they had less predictive value. However, using 12.5 kHz yielded RMSE and AUC results that were as good as, if not slightly better than, those obtained using the averaged EHF thresholds (see Table 3).
The purpose of this study was to devise a diagnostic criterion that could be used clinically for predicting or confirming “low” speech-in-noise performance in young and middle-aged adult listeners with normal hearing. The criterion we developed was a regression formula, based on EHF thresholds and RST results, and our results show that its ability to predict the CSS and identify “low” CSS performance was reasonable. Monte Carlo cross-validation results showed that the AUC was 0.76, indicating that the diagnostic criterion would correctly identify “low” CSS in approximately 76 out of every 100 pairs of people, where one was low and one was not, but it would also incorrectly identify some clients as “low” CSS when they were not. However, this situation would occur rarely, if at all, because people who do not self-perceive listening difficulties would be unlikely to seek hearing assessment in the first place. The RMSE of 0.60 shows that there was some variation between the predicted and observed values, suggesting that the formula would not yield a highly accurate prediction of CSS. Nevertheless, the error was not so large as to prevent the predicted CSS from being clinically useful as an approximate prediction. When we replaced the four-frequency average EHF with each of the stand-alone frequencies (9, 10, and 11.2 kHz) separately, results were slightly poorer, while the formula that included the 12.5 kHz threshold was as good as, if not slightly better than, the formula that used the four-frequency average EHF.
If used in clinical practice, our proposed diagnostic criterion would correctly identify or confirm “low” CSS in the majority of clients presenting with speech-in-noise problems. While it is acknowledged that not every client with normal hearing presenting with difficulty understanding speech in background noise will have elevated EHF thresholds or lower than average RST scores, our results show that using this diagnostic criterion (which is based on these two factors) would provide an evidence-based clinical explanation that would help a substantial proportion of clients to feel understood and likely result in a better clinical encounter than a standard hearing assessment currently provides (Pryce & Wainwright 2008; Pryce 2015).
Although EHF thresholds are not currently measured routinely, their diagnostic value is becoming increasingly recognized. Our results provide additional support for the link between elevated EHF thresholds and poorer speech-in-noise performance shown in several other recent studies of normal-hearing listeners (Badri et al. 2011; Liberman et al. 2016). Related to this are recent research results linking poorer EHF thresholds to increased levels of noise exposure in young adults (da Rocha et al. 2010; Liberman et al. 2016; Kumar et al. 2017; Prendergast et al. 2017) and the suggestion that noise damage may first appear in the EHF region (Somma et al. 2008; Le Prell et al. 2013; Sulaiman et al. 2014). Considered collectively, there is growing evidence that EHF thresholds in humans provide an early indicator of subclinical auditory damage that may coincide with noise-induced cochlear synaptopathy or other causes, for example, ototoxicity and aging. This has led some authors to recommend that EHF thresholds be included as part of standard testing (Rodríguez Valiente et al. 2016; Moore et al. 2017). Our findings provide further evidence that when clients present with difficulty understanding speech in noise (with or without a history of noise exposure) and are found to have normal thresholds for standard audiometric frequencies (≤20 dB HL, 0.25 to 4 kHz), best clinical practice would be to measure the client’s EHF thresholds, rather than reassure them that their hearing is “normal”.
Our results also indicate that tests of working memory could have a valuable role to play in the diagnosis of speech-in-noise difficulties. In our cohort, poorer RST results were a highly significant predictor of poorer CSS, a finding that is consistent with several other studies involving normal hearers (Besser et al. 2013; Keidser et al. 2015; Gordon-Salant & Cole 2016). Interestingly, we recorded more variation in working memory scores for the low CSS group (11 to 67%) than for the high CSS group (41 to 76%), and the range of scores shared by the two groups (41 to 67%) suggests that for some people working memory plays a greater contributory role in understanding speech in noise than it does for others.
To be clinically viable, a test of working memory would probably need to be shorter than the RST or the Listening Span task (Daneman & Carpenter 1980). Ideally, there is a need for a standardized clinical test that can efficiently identify those with lower working memory capacity and also differentiate performance within the middle range. One possible candidate has been developed by Smith et al. (2016). The Word Auditory Recognition and Recall Measure combines a working memory span and a word-recognition test which can be administered easily in the clinic. This test allows audiologists to simultaneously obtain information about both word-recognition ability and the cognitive processing required to recall words, which can then be used clinically for planning rehabilitation. Initial evaluation of the Word Auditory Recognition and Recall Measure with younger (18 to 38 years) and older (60 to 84 years) listeners with normal hearing and older (60 to 85 years) listeners with hearing loss showed that the test is clinically feasible and provides useful additional information about listening abilities (Smith et al. 2016).
Perhaps most importantly, use of the diagnostic criterion proposed here could provide a new avenue for counseling clients who present with speech-in-noise difficulty. For those who have a history of noise exposure, clinicians could point out that poor EHF thresholds are often associated with noise exposure and focus on the importance of avoiding excessive noise exposure, or using hearing protection when avoidance is not possible. For those clients without significant previous noise exposure, clinicians could discuss other possible causes of hearing damage such as ototoxicity, aging, and the interaction of these factors. For all clients, measuring EHF thresholds provides a baseline to enable regular monitoring and early identification of hearing deterioration.
In time, future rehabilitation strategies may be developed on the basis of the diagnostic criterion provided here. For those with poor EHF thresholds, one approach could be to fit devices that extend the signal bandwidth. Devices such as the Earlens Photonic Transducer (Perkins et al. 2010) have been reported to significantly improve normal hearers’ ability to hear target speech in complex environments (Perkins et al. 2011; Levy et al. 2015; Struck & Prusick 2017). Several studies have demonstrated that extended bandwidth improved nonsense syllable and speech test results for normal hearers (Füllgrabe et al. 2010; Levy et al. 2015), who also prefer the sound quality these signals provide (Beck & Olsen 2008; Ricketts et al. 2008; Füllgrabe et al. 2010).
Another remediation approach, used alone or in combination with a device, could be to develop training packages that focus on working memory. To date there have been mixed research findings in relation to the efficacy of working memory training (Owen et al. 2010; Melby-Lervåg & Hulme 2013; Ferguson & Henshaw 2015; Ingvalson et al. 2015; Mackenzie 2015; Whitton et al. 2017), implying that further work is needed to develop training packages that cater to individual client needs and motivation levels; are sufficiently rewarding; and produce sustainable outcomes that withstand rigorous evaluation. Even if such evaluation reveals that working memory training provides only modest improvements in performance, offering it to clients may help to “legitimize” their concerns. This would be preferable to the status quo, which can leave clients questioning whether they are in fact experiencing an actual communication problem. In any case, it has been noted that even when significant improvements in post-training test scores only translate to small real-world effects, a client’s levels of confidence and self-efficacy may be significantly enhanced (Mackenzie 2015). For many clients, this may provide enough justification to undertake training.
The results of this investigation may have been affected by several factors, which should be taken into account when interpreting the findings. First, the procedure we used to segregate high and low performers meant that not everyone in the low CSS group was necessarily experiencing significant real-world listening difficulties. This may have influenced the results; even so, the diagnostic criterion still yielded reasonable predictive accuracy. Second, it could be argued that, rather than relying on a cross-validation procedure (which can overestimate model performance), it would be preferable to evaluate the diagnostic criterion by testing it on a new population. Recruiting more participants was beyond the scope of this study; however, existing data sets from other research groups could potentially be used in this way.
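The distinction between internal cross-validation and evaluation on a new sample can be illustrated with a minimal sketch. It is not the analysis used in this study: the data are synthetic, the two predictors (`ehf` and `wm`, standing in for standardized EHF threshold and working memory score) and their coefficients are hypothetical, and the logistic model is fitted with a simple gradient descent written only for illustration. The sketch estimates the area under the ROC curve (AUC) once by k-fold cross-validation within a development sample and once by applying the fitted model to an independent sample.

```python
import math
import random


def predict(w, x):
    """Logistic probability for one case; w[0] is the intercept."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit a logistic regression by batch gradient descent (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def auc(scores, labels):
    """Probability that a random positive case outscores a random negative case."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n, rng):
    """Synthetic cohort; coefficients are hypothetical, not estimates from the study."""
    X, y = [], []
    for _ in range(n):
        ehf = rng.gauss(0, 1)  # hypothetical standardized EHF threshold
        wm = rng.gauss(0, 1)   # hypothetical standardized working memory score
        p_low = 1.0 / (1.0 + math.exp(-(1.2 * ehf - 1.0 * wm)))
        X.append([ehf, wm])
        y.append(1 if rng.random() < p_low else 0)  # 1 = low performer
    return X, y


def cross_validated_auc(X, y, k=5):
    """Score each case with a model fitted on the other k-1 folds, then pool."""
    scores = [0.0] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i in test:
            scores[i] = predict(w, X[i])
    return auc(scores, y)


def external_auc(X_dev, y_dev, X_new, y_new):
    """Fit on the development sample, evaluate on an independent sample."""
    w = fit_logistic(X_dev, y_dev)
    return auc([predict(w, x) for x in X_new], y_new)


rng = random.Random(0)
X_dev, y_dev = make_synthetic(120, rng)
X_new, y_new = make_synthetic(120, rng)
cv_auc = cross_validated_auc(X_dev, y_dev)
ext_auc = external_auc(X_dev, y_dev, X_new, y_new)
print(f"Cross-validated AUC: {cv_auc:.2f}")
print(f"External-sample AUC: {ext_auc:.2f}")
```

In this sketch both samples are drawn from the same synthetic population, so the two estimates should be broadly similar. With real data, a new population may differ from the development cohort in ways that only external testing would reveal.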
Our study demonstrated that EHF hearing levels and working memory are significant predictors of the ability to understand speech in noise in normal-hearing adult listeners. The diagnostic criterion we developed based on these two factors was found to have reasonable predictive accuracy and could form the basis of future clinical tests and rehabilitation tools for those experiencing communication difficulties in noise. It may also help clinicians provide evidence-based counseling for normal hearers who present with otherwise unexplained communication difficulties in noise.
The authors thank Jermy Pang for data collection and Mark Seeto for advice on statistical analysis and interpretation. I.Y. designed and performed experiments, analyzed data, and wrote the article; E.F.B. and M.S. designed experiments and provided interpretative analysis and critical revision. All authors discussed the results and implications and commented on the manuscript at all stages.
Abrams H. B., Kihm J. (2015). An introduction to MarkeTrak IX: A new baseline for the hearing aid market: MT9 reveals renewed encouragement as well as obstacles for consumers with hearing loss. The Hearing Review, 22, 16.
Badri R., Siegel J. H., Wright B. A. (2011). Auditory filter shapes and high-frequency hearing in adults who have impaired speech in noise performance despite clinically normal audiograms. J Acoust Soc Am, 129, 852–863.
Beck D. L., Olsen J. (2008). Extended bandwidths in hearing aids. The Hearing Review, 15(11), 22–26.
Besser J., Koelewijn T., Zekveld A. A., et al. (2013). How linguistic closure and verbal working memory relate to speech recognition in noise–a review. Trends Amplif, 17, 75–93.
Best V., Keidser G., Freeston K., et al. (2018). Evaluation of the NAL Dynamic Conversations Test in older listeners with hearing loss. Int J Audiol, 57, 221–229.
Bramhall N. F., Konrad-Martin D., McMillan G. P. (2018). Tinnitus and auditory perception after a history of noise exposure: Relationship to auditory brainstem response measures. Ear Hear. doi: 10.1097/AUD.0000000000000544.
Bramhall N. F., Konrad-Martin D., McMillan G. P., et al. (2017). Auditory brainstem response altered in humans with noise exposure despite normal outer hair cell function. Ear Hear, 38, e1–e12.
Bressler S., Goldberg H., Shinn-Cunningham B. (2017). Sensory coding and cognitive processing of sound in Veterans with blast exposure. Hear Res, 349, 98–110.
Cameron S., Glyde H., Dillon H. (2011). Listening in Spatialized Noise-Sentences Test (LiSN-S): Normative and retest reliability data for adolescents and adults up to 60 years of age. J Am Acad Audiol, 22, 697–709.
Chin T., Rickard N. S. (2012). The Music USE (MUSE) Questionnaire: An instrument to measure engagement in music. Music Perception, 29, 429–446.
Classon E., Rudner M., Rönnberg J. (2013). Working memory compensates for hearing related phonological processing deficit. J Commun Disord, 46, 17–29.
da Rocha R. L., Atherino C. C., Frota S. M. (2010). High-frequency audiometry in normal hearing military firemen exposed to noise. Braz J Otorhinolaryngol, 76, 687–694.
Daneman M., Carpenter P. A. (1980). Individual differences in working memory and reading. J Verbal Learn Verbal Behav, 19, 450–466.
Dryden A., Allen H. A., Henshaw H., et al. (2017). The association between cognitive performance and speech-in-noise perception for adult listeners: A systematic literature review and meta-analysis. Trends Hear, 21, 2331216517744675.
Ferguson M. A., Henshaw H. (2015). Auditory training can improve working memory, attention, and communication in adverse conditions for adults with hearing loss. Front Psychol, 6, 556.
Fulbright A. N. C., Le Prell C. G., Griffiths S. K., et al. (2017). Effects of recreational noise on threshold and suprathreshold measures of auditory function. Semin Hear, 38, 298–318.
Füllgrabe C., Rosen S. (2016). On the (un)importance of working memory in speech-in-noise processing for listeners with normal hearing thresholds. Front Psychol, 7, 1268.
Füllgrabe C., Baer T., Stone M. A., et al. (2010). Preliminary evaluation of a method for fitting hearing aids with extended bandwidth. Int J Audiol, 49, 741–753.
Gordon-Salant S., Cole S. S. (2016). Effects of age and working memory capacity on speech recognition performance in noise among listeners with normal hearing. Ear Hear, 37, 593–602.
Grinn S. K., Wiseman K. B., Baker J. A., et al. (2017). Hidden hearing loss? No effect of common recreational noise exposure on cochlear nerve response amplitude in humans. Front Neurosci, 11, 465.
Grose J. H., Buss E., Hall J. W. 3rd. (2017). Loud music exposure and cochlear synaptopathy in young adults: Isolated auditory brainstem response effects but no perceptual consequences. Trends Hear, 21, 2331216517737417.
Guest H., Munro K. J., Prendergast G., et al. (2017). Tinnitus with a normal audiogram: Relation to noise exposure but no evidence for cochlear synaptopathy. Hear Res, 344, 265–274.
Hanley J. A., McNeil B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143, 29–36.
Hastie T. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). New York, NY: Springer.
Heinrich A., Henshaw H., Ferguson M. A. (2016). Only behavioral but not self-report measures of speech perception correlate with cognitive abilities. Front Psychol, 7, 576.
Hind S. E., Haines-Bazrafshan R., Benton C. L., et al. (2011). Prevalence of clinical referrals having hearing thresholds within normal limits. Int J Audiol, 50, 708–716.
Honaker J., King G., Blackwell M. (2011). Amelia II: A program for missing data. J Stat Software, 45, 1–47.
Ingvalson E. M., Dhar S., Wong P. C., et al. (2015). Working memory training to improve speech perception in noise across languages. J Acoust Soc Am, 137, 3477–3486.
Kalikow D. N., Stevens K. N., Elliott L. L. (1977). Development of a test of speech intelligibility in noise using sentence materials with controlled word predictability. J Acoust Soc Am, 61, 1337–1351.
Kaufman A. S., Kaufman N. L. (2004). Kaufman Brief Intelligence Test Manual (2nd ed., pp. 5–20). Sydney, Australia: Pearson Australia Group Pty Ltd.
Keidser G., Best V., Freeston K., et al. (2015). Cognitive spare capacity: Evaluation data and its association with comprehension of dynamic conversations. Front Psychol, 6, 597.
Kuhn M., Johnson K. (2013). Applied Predictive Modeling. New York, NY: Springer.
Kujala T., Shtyrov Y., Winkler I., et al. (2004). Long-term exposure to noise impairs cortical sound processing and attention control. Psychophysiology, 41, 875–881.
Kujawa S. G., Liberman M. C. (2009). Adding insult to injury: cochlear nerve degeneration after “temporary” noise-induced hearing loss. J Neurosci, 29, 14077–14085.
Kumar P., Upadhyay P., Kumar A., et al. (2017). Extended high frequency audiometry in users of personal listening devices. Am J Otolaryngol Head Neck Med Surg, 38, 163–167.
Le Prell C. G., Spankovich C., Lobariñas E., et al. (2013). Extended high-frequency thresholds in college students: Effects of music player use and other recreational noise. J Am Acad Audiol, 24, 725–739.
Levy C. S., Freed J. D., Nilsson J. M., et al. (2015). Extended high-frequency bandwidth improves speech reception in the presence of spatially separated masking speech. Ear Hear, 36, e214–e224.
Liberman M. C., Epstein M. J., Cleveland S. S., et al. (2016). Toward a differential diagnosis of hidden hearing loss in humans. PLoS One, 11, e0162726.
Lunner T. (2003). Cognitive function in relation to hearing aid use. Int J Audiol, 42(Suppl 1), S49–S58.
Mackenzie D. (2015). Sound advice. New Scientist, 227, 36–38.
Marshall L., Lapsley Miller J. A., Guinan J. J., et al. (2014). Otoacoustic-emission-based medial-olivocochlear reflex assays for humans. J Acoust Soc Am, 136, 2697–2713.
Mehrparvar A. H., Mirmohammadi S. J., Ghoreyshi A., et al. (2011). High-frequency audiometry: A means for early diagnosis of noise-induced hearing loss. Noise Health, 13, 402–406.
Melby-Lervåg M., Hulme C. (2013). Is working memory training effective? A meta-analytic review. Dev Psychol, 49, 270–291.
Mimosa Acoustics. (2014). MOCR User Manual Help Version 1.0. Champaign, IL: Mimosa Acoustics, Inc.
Moore B. C., Sek A. (2009). Development of a fast method for determining sensitivity to temporal fine structure. Int J Audiol, 48, 161–171.
Moore B. C., Creeke S., Glasberg B. R., et al. (2012). A version of the TEN Test for use with ER-3A insert earphones. Ear Hear, 33, 554–557.
Moore D., Hunter L., Munro K. (2017). Benefits of extended high-frequency audiometry for everyone. Hear J, 70, 50–55.
Noble W., Jensen N. S., Naylor G., et al. (2013). A short form of the Speech, Spatial and Qualities of Hearing scale suitable for clinical use: The SSQ12. Int J Audiol, 52, 409–412.
Owen A. M., Hampshire A., Grahn J. A., et al. (2010). Putting brain training to the test. Nature, 465, 775–778.
Park S. H., Goo J. M., Jo C. H. (2004). Receiver operating characteristic (ROC) curve: Practical review for radiologists. Korean J Radiol, 5, 11–18.
Perkins R., Fay J. P., Rucker P., et al. (2010). The EarLens system: New sound transduction methods. Hear Res, 263, 104–113.
Perkins R. C., Fay J., Nilsson M. J., et al. (2011). The EarLens Photonic Transducer: Extended bandwidth. Otolaryngol Head Neck Surg, 145(Suppl), P102.
Prendergast G., Guest H., Léger A., et al. (2016). Evidence that hidden hearing loss does not vary systematically as a function of noise exposure in young adults with normal audiometric hearing. J Acoust Soc Am, 139, 2122.
Prendergast G., Guest H., Munro K. J., et al. (2017). Effects of noise exposure on young adults with normal audiograms I: Electrophysiology. Hear Res, 344, 68–81.
Pryce H. (2006). The process of coping in King-Kopetzky Syndrome. Audiol Med, 4, 60–67.
Pryce H. (2015). King-Kopetzky syndrome? A bio-psychosocial approach to adult “APD”. Persp Hear Hear Disorders Res Diagn, 19, 22.
Pryce H., Hall A. (2014). The role of shared decision-making in audiologic rehabilitation. Persp Aural Rehab Instrument, 21, 15.
Pryce H., Wainwright D. (2008). Help-seeking for medically unexplained hearing difficulties: A qualitative study. Int J Ther Rehab, 15, 343–349.
R Core Team (2016). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
Ricketts T. A., Dittberner A. B., Johnson E. E. (2008). High-frequency amplification and sound quality in listeners with normal through moderate hearing loss. J Speech Lang Hear Res, 51, 160–172.
Robertson I. H., Ward T., Ridgeway V., et al. (1994). The Test of Everyday Attention Manual. London, United Kingdom: Pearson Assessment.
Robin X., Turck N., Hainard A., et al. (2011). pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics, 12, 77.
Rodríguez Valiente A., Roldán Fidalgo A., Villarreal I. M., et al. (2016). Extended high-frequency audiometry (9000–20 000 Hz). Usefulness in audiological diagnosis. Acta Otorrinolaringol (English Edition), 67, 40–44.
Rubin D. B. (1987). Multiple Imputation for Nonresponse in Surveys. Canada: John Wiley & Sons, Inc.
Rudner M., Lunner T. (2014). Cognitive spare capacity and speech communication: A narrative overview. Biomed Res Int, 2014, 869726.
Rudner M., Rönnberg J., Lunner T. (2011). Working memory supports listening in noise for persons with hearing impairment. J Am Acad Audiol, 22, 156–167.
Ruggles D., Bharadwaj H., Shinn-Cunningham B. G. (2012). Why middle-aged listeners have trouble hearing in everyday settings. Curr Biol, 22, 1417–1422.
Saunders G. H., Haggard M. P. (1989). The clinical assessment of obscure auditory dysfunction–1. Auditory and psychological factors. Ear Hear, 10, 200–208.
Saunders G. H., Haggard M. P. (1992). The clinical assessment of “Obscure Auditory Dysfunction” (OAD) 2. Case control analysis of determining factors. Ear Hear, 13, 241–254.
Schoof T., Rosen S. (2014). The role of auditory and cognitive factors in understanding speech in noise by normal-hearing older listeners. Front Aging Neurosci, 6, 307.
Smith L. S., Pichora-Fuller K. M., Alexander K. G. (2016). Development of the word auditory recognition and recall measure: A working memory test for use in rehabilitative audiology. Ear Hear, 37, e360–e376.
Somma G., Pietroiusti A., Magrini A., et al. (2008). Extended high-frequency audiometry and noise induced hearing loss in cement workers. Am J Ind Med, 51, 452–462.
Spankovich C., Gonzalez V. B., Su D., et al. (2018). Self reported hearing difficulty, tinnitus, and normal audiometric thresholds, the National Health and Nutrition Examination Survey 1999–2002. Hear Res, 358, 30–36.
Stamper G. C., Johnson T. A. (2015). Auditory function in normal-hearing, noise-exposed human ears. Ear Hear, 36, 172–184.
Stenbäck V., Hällgren M., Larsby B. (2016). Executive functions and working memory capacity in speech communication under adverse conditions. Speech Lang Hear, 19, 218–226.
Stephens D., Zhao F., Kennedy V. (2003). Is there an association between noise exposure and King Kopetzky Syndrome? Noise Health, 5, 55–62.
Struck C. J., Prusick L. (2017). Comparison of real-world bandwidth in hearing aids vs Earlens light-driven hearing aid system. The Hearing Review, 24, 24.
Sulaiman A. H., Husain R., Seluakumaran K. (2014). Evaluation of early hearing damage in personal listening device users using extended high-frequency audiometry and otoacoustic emissions. Eur Arch Otorhinolaryngol, 271, 1463–1470.
Swets J. A. (1988). Measuring the accuracy of diagnostic systems. Science, 240, 1285–1293.
Tremblay K. L., Pinto A., Fischer M. E., et al. (2015). Self-reported hearing difficulties among adults with normal audiograms: The Beaver Dam Offspring Study. Ear Hear, 36, e290–e299.
Valderrama J. T., Beach E. F., Yeend I., et al. (2018). Effects of lifetime noise exposure on the middle-age human auditory brainstem response, tinnitus and speech-in-noise intelligibility. Hear Res, 365, 36–48.
Whitton J. P., Hancock K. E., Shannon J. M., et al. (2017). Audiomotor perceptual training enhances speech intelligibility in background noise. Curr Biol, 27, 3237–3247.e6.
Yeend I., Beach E. F., Sharma M., et al. (2017). The effects of noise exposure and musical training on suprathreshold auditory processing and speech perception in noise. Hear Res, 353, 224–236.
Zekveld A. A. (2017). Ten years of measuring Text Reception Thresholds: What are we actually measuring? In Fourth International Conference on Cognitive Hearing Science for Communication. Linkoping, Sweden.
Zekveld A. A., George E. L., Kramer S. E., et al. (2007). The development of the text reception threshold test: A visual analogue of the speech reception threshold test. J Speech Lang Hear Res, 50, 576–584.
Zhao F., Stephens D., Pryce H., et al. (2008). Rehabilitative management strategies in patients with King-Kopetzky Syndrome. Aus N Z J Audiol, 30, 119–127.
Keywords: Cochlear synaptopathy; Cognition; Extended high-frequency hearing; Normal hearing; Speech in noise; Working memory
Copyright © 2019 Wolters Kluwer Health, Inc. All rights reserved.