

Working Memory in Realistic Listening Environments

Miller, Christi W. PhD

doi: 10.1097/

Individuals with sensorineural hearing loss experience greater difficulty understanding speech in background noise compared with their normal-hearing peers. Although sensorineural hearing loss is primarily characterized by sensory impairments (e.g., increased thresholds and distortion of the auditory signal), cognitive factors also play a role in understanding speech in demanding environments (JASA. 2002;112:1112). While many domains of cognition have been explored to a limited degree, working memory capacity has consistently been linked to speech understanding in noise (Intl J Audiol. 2008;47(S2):S53).


Working memory (WM) refers to the temporary storage and manipulation of information during ongoing cognitive processing. During speech understanding, listeners are required to process the incoming speech signal and integrate it with stored knowledge, while simultaneously anticipating impending signals. Individuals with hearing loss may need to allocate additional resources to comprehend a degraded incoming signal, thereby reducing their available capacity to process new information (Front Syst Neurosci. 2013;7:31). In fact, a strong relationship between WM and speech understanding in noise has been established among those with hearing loss (JAAA. 2011;(3):156). People with lower WM capacity tend to have poorer speech understanding in noise than those with a higher capacity, even when controlling for age and hearing loss severity.


Decisions on implementing new clinical protocols depend not only on statistical significance but also on the practical significance of reported findings (including effect sizes). While previous research demonstrates a statistically significant relationship between speech perception and WM, the effect sizes are generally small (approximately 2-17%), and it is unknown whether these effects will hold in more realistic listening environments (Intl J Audiol. 2008; Front Psychol. 2016;7:1268). Our recent study aimed to determine whether the relationship between WM and speech understanding in noise was maintained under test conditions that more closely resemble real-world listening conditions than those used in prior studies (J Sp Lang Hear Res. 2017;60:2310).
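To make the idea of a small effect size concrete, the sketch below shows one common way to express how much a WM score adds to a prediction of speech understanding: fit a baseline regression on the control variables, fit a second model that also includes WM, and take the difference in variance explained (the change in R²). This is an illustration only and not necessarily the analysis used in the studies cited here. The data are synthetic, and the variable names (`age`, `pta`, `wm`, `speech`), the sample size, and the coefficients are assumptions chosen for demonstration.

```python
import numpy as np


def r_squared(predictors, outcome):
    """Proportion of outcome variance explained by an ordinary least-squares fit."""
    # Add an intercept column to the predictor matrix.
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((outcome - outcome.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def incremental_r_squared(baseline, added, outcome):
    """Return (baseline R^2, full R^2, gain in R^2 from the added predictor)."""
    r2_base = r_squared(baseline, outcome)
    r2_full = r_squared(np.column_stack([baseline, added]), outcome)
    return r2_base, r2_full, r2_full - r2_base


# Synthetic, purely illustrative data (NOT taken from any study).
rng = np.random.default_rng(0)
n = 200                                  # hypothetical sample size
age = rng.normal(70, 8, n)               # years
pta = rng.normal(40, 10, n)              # hypothetical pure-tone average, dB HL
wm = rng.normal(0, 1, n)                 # hypothetical standardized WM score
speech = 90 - 0.2 * age - 0.5 * pta + 1.0 * wm + rng.normal(0, 5, n)

controls = np.column_stack([age, pta])
r2_base, r2_full, delta_r2 = incremental_r_squared(controls, wm, speech)
print(f"R^2 (age + hearing loss): {r2_base:.3f}")
print(f"R^2 (adding WM):          {r2_full:.3f}")
print(f"Gain in R^2 from WM:      {delta_r2:.3f}")
```

A gain of 0.02, for example, would mean that WM accounts for 2 percent of the variance in speech scores beyond what the control variables already explain, which is the kind of figure a clinician would weigh when judging practical significance.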

Seventy-six adults with symmetrical, mild-to-moderate, sensorineural hearing loss participated in the study. The tasks consisted of two measures of WM capacity and a measure of speech understanding under realistic listening conditions that included: (1) greater variability in the target speech stimuli to reduce indexical cue learning, (2) multi-talker background noise, and (3) visual cues to supplement the auditory stimuli. Contrary to previous findings, neither of the WM measures had a statistically significant effect on speech understanding, even after controlling for age and hearing loss severity.


Speech understanding is usually measured using a single target-talker (e.g., a female voice in the Quick Speech-in-Noise test), which can lead to the learning of voice-specific features influenced by age, gender, regional dialect, native language, and anatomical differences in the vocal tract between individuals (i.e., indexical cues; J Exper Psych. 1952;44(1):51). Learning of indexical cues leads to better speech recognition results and faster response times than when indexical cues are not available. This is an important aspect to consider because individuals communicate with multiple talkers throughout the day or even within a single conversation. In fact, performance with multiple target-talkers (i.e., a different person speaking each sentence) correlates with self-reported hearing abilities, whereas performance in single target-talker conditions does not.

A recent survey reported inconsistent trends in the effects of WM on speech understanding across the difficulty of the listening condition; some studies showed greater reliance on WM for harder tasks while others showed the opposite effect (greater reliance on WM for easier tasks; Front Psychol. 2016). Previous work using single-talker, auditory-only stimuli (e.g., Quick Speech-in-Noise test) has shown WM to significantly explain two percent of variance over that explained by audibility (Int J Audiol. 2015;54(10):705). Others have evaluated whether linguistic complexity (words, sentences, discourse) resulted in differential effects of WM on performance (Front Psychol. 2015;6:1394). Investigating a group of older adults with hearing loss similar to those in the current study, researchers did not find any significant relationship between WM and speech recognition tasks. In our study, we used a speech test that minimized the learning of indexical cues by including multiple target-talkers, and we also found no effect of WM. Collectively, the role of WM in speech communication may be grounded in processing talker-specific characteristics (e.g., dialect, speech rate, gender) and not linguistic characteristics.


Researchers have demonstrated an interaction between noise type and WM on speech understanding in noise. Specifically, higher WM capacity leads to better speech understanding performance when the background contains amplitude modulations, whereas WM plays a smaller role in performance with steady-state backgrounds (e.g., J Am Acad Audiol. 2011). However, not all findings have supported such an interaction between noise type and WM, and the effects of noise on WM may be task-dependent (e.g., open or closed set). In our study, the difference in variance explained by WM between unmodulated and modulated noise was not significant. It is possible that the task difficulty was already high with multiple target-talkers compared with the single target-talker speech used in previous research, which perhaps overrode the stronger WM effects that we expected with modulated noise.


Visual cues are often available in real-world environments and lead to significantly better speech understanding, especially when audibility is reduced or the listener cannot take advantage of the audibility provided. For example, adding visual cues reduces the hearing aid bandwidth required for speech recognition. Electrophysiological evidence supports faster and more efficient brain processing in auditory-visual conditions than in auditory-only conditions and suggests that WM facilitates this effect (Front Psychol. 2016;7:490). Despite evidence suggesting that visual cues relieve some of the processing demands on WM, listening effort may be higher in auditory-visual tasks under challenging conditions when speech performance is equal. Therefore, if the task demands a high level of cognitive resources (e.g., speech understanding with multiple target-talkers), the additional resources required for integration of audio-visual information may reduce resources that could be allocated to processing the auditory-only information.

Because the listeners in our study had a degraded auditory input from hearing loss, we predicted that adding visual cues would improve speech understanding, yet with a greater reliance on WM due to reconstruction of the auditory and visual peripheral input. Speech understanding significantly improved with the addition of visual cues in our study, but WM did not predict variance in performance, suggesting that bimodal processing did not require additional WM resources. One explanation is that although a +8 dB signal-to-noise ratio with visual cues is representative of real-world experiences, it may not have been challenging enough to stress top-down processing.

Under realistic listening conditions, the influence of WM on speech understanding appears to be negligible for unaided listening, and the effect sizes were insufficient to warrant clinical use of WM tests. However, after enhancing audibility with amplification, WM may play a more prominent role in explaining speech understanding variability. Future studies should evaluate the relevance of WM to aided performance under similar test conditions. If effect sizes for WM are larger for aided outcomes in similar realistic situations, substantial evidence may exist to change clinical practice regarding counseling, selecting, and fitting of hearing aids based on WM performance.

Copyright © 2018 Wolters Kluwer Health, Inc. All rights reserved.