Listening Effort in Younger and Older Adults: A Comparison of Auditory-Only and Auditory-Visual Presentations

Sommers, Mitchell S.; Phelps, Damian

doi: 10.1097/AUD.0000000000000322
Behavioral approaches: cognition and listening

One goal of the present study was to establish whether providing younger and older adults with visual speech information (both seeing and hearing a talker compared with listening alone) would reduce listening effort for understanding speech in noise. In addition, we used an individual differences approach to assess whether changes in listening effort were related to changes in visual enhancement, defined as the improvement in speech understanding in going from an auditory-only (A-only) to an auditory-visual (AV) condition. To compare word recognition in A-only and AV modalities, younger and older adults identified words in both A-only and AV conditions in the presence of six-talker babble. Listening effort was assessed using a modified version of a serial recall task. Participants heard (A-only) or saw and heard (AV) a talker producing individual words without background noise. List presentation was stopped randomly and participants were then asked to repeat the last three words that were presented. Listening effort was assessed using recall performance in the two- and three-back positions. Younger, but not older, adults exhibited reduced listening effort as indexed by greater recall in the two- and three-back positions for the AV compared with the A-only presentations. For younger, but not older, adults, changes in performance from the A-only to the AV condition were moderately correlated with visual enhancement. Results are discussed within a limited-resource model of both A-only and AV speech perception.

Department of Psychology, Washington University in St. Louis, St. Louis, Missouri, USA.

The authors have no conflicts of interest to disclose.

Received December 10, 2015; accepted February 15, 2016.

Address for correspondence: Mitchell S. Sommers, Department of Psychology, Washington University in St. Louis, St. Louis, Missouri, USA. E-mail:

Adult-onset hearing loss is among the most prevalent and burdensome conditions world-wide, especially for individuals over age 40. By age 65, approximately 30% of individuals in the US will have a hearing loss significant enough to qualify for a hearing aid (Schoenborn & Marano 1988). One of the principal consequences of impaired hearing is greater difficulty understanding speech, especially under difficult listening conditions. Well-fit hearing aids have been shown to provide long-term benefits for individuals with hearing impairment (Humes et al. 2002), but even when amplification provides significant improvements in performance, individuals with hearing loss often report that speech perception becomes increasingly difficult and effortful as their hearing loss progresses (Tye-Murray et al. 2012).

In addition to sensory aids, one of the most effective methods for improving speech perception in individuals with hearing impairment is to allow them to both see and hear a talker rather than only listening. The increased performance for auditory-visual (AV) compared with auditory-only (A-only) or visual-only (V-only) presentations is at least partially a result of complementary information available in the auditory and visual speech signals (Summerfield 1987; Grant & Seitz 1998; Grant et al. 1998; Grant 2002). Thus, for example, if a listener is unable to perceive the acoustic cues for place of articulation, accurate intelligibility can be maintained if the speaker is visible because speechreading provides an additional opportunity to extract information about place of articulation.

One of the main goals of the current investigation is to examine whether, in addition to improving intelligibility, providing both auditory and visual speech information will also reduce the listening effort needed to understand speech. As noted, individuals who are hard of hearing often report that, even when they can correctly identify what is being said, listening to speech is more difficult than it was when they were younger and had better hearing. Consistent with these subjective reports, a number of researchers (McCoy et al. 2005; Nachtegaal et al. 2009; Pichora-Fuller & Singh 2006) have reported that, in addition to reduced speech intelligibility, individuals with hearing loss often report increased effort and fatigue, particularly when listening in noise, even when changes in the listening environment do not produce changes in overall performance (Hick & Tharpe 2002; Bologna et al. 2013).

The concept of listening effort is based on a limited-capacity resource model (Kahneman 1973) in which current ongoing cognitive operations engage a given percentage of total cognitive capacity (see also Rudner 2016, this issue, pp. 69S–76S, for discussions of listening effort and cognitive spare capacity). Increasing task demands or decreasing the quality of the input signal, such as by increasing background noise or reverberation, will require additional resources to be allocated for successful completion of initial encoding. It is this reallocation of resources toward front-end encoding of the speech signal that is perceived subjectively as increased effort and that can produce decrements in ongoing activities. Thus, the concept of listening effort provides a useful general framework for understanding why individuals with impaired hearing report greater difficulty perceiving speech despite little or no change in listening performance. Specifically, when listening conditions become more difficult, individuals can maintain a given level of performance by increasing the overall percentage of resources or effort that are allocated or engaged to complete the listening task. Within a limited capacity system, however, reallocation of resources to initial encoding will leave fewer resources available for maintenance of ongoing cognitive activities or tasks and may therefore produce decrements in one or more concurrent tasks (for a description of a dual-task paradigm see Phillips 2016, this issue, pp. 44S–51S).

Of particular importance to the present study is that listening effort can also be reduced by manipulations, such as improving signal to noise ratio (SNR), that make initial encoding easier (Zekveld et al. 2011). Within a limited-capacity model, reduced perceptual effort as a consequence of easier encoding results from individuals being able to reallocate resources from encoding to other ongoing activities. Thus, it is the extent of resources required for initial encoding of speech signals that determines perceptions of effort needed to understand speech.

In addition to changes in SNR, one manipulation that has been shown to reduce initial encoding demands is allowing listeners to both see and hear a talker, compared with listening alone (e.g., Sommers et al. 2005). Thus, one goal of the present study is to compare measures of listening effort for A-only versus AV presentations. Within a limited-capacity system, reducing the resources required for up-front encoding will increase the availability of cognitive resources available for other ongoing activities and may therefore result in performance improvements in those activities. Thus, we hypothesize that listening effort will be reduced for AV compared with A-only presentations.

Rabbitt (1990) was one of the first to test what has become known as the “effortfulness hypothesis”—the idea that differences in the difficulty or effortfulness of initial encoding can have downstream consequences for ongoing cognitive functions. In one study, for example, normal-hearing younger adults were presented with lists of digits either in a quiet background or in the presence of noise. Participants in both groups were able to shadow the digits perfectly, but those presented with the stimuli in a background noise showed significantly lower memory for the digits on a subsequent recall test. Rabbitt attributed this difference in recall performance to the increased effort needed to encode words presented in noise compared with those heard in quiet. This additional effort, according to Rabbitt, reduced available resources for rehearsal and other processing that would support memory of the items, thereby reducing recall in the noise condition despite equivalent and nearly perfect intelligibility.

Recently, McCoy et al. (2005) reported a similar pattern of results using a modified version of a serial recall task (see also, Pichora-Fuller 1996 and Frtusova et al. 2013 for additional examples of the relationship between working memory and AV speech perception). They tested two groups of older adults differing in the extent of age-related hearing loss on a task in which highly familiar words were presented individually without background noise. List presentation was stopped randomly and participants were asked to recall the last three words that had been presented before the list was stopped. Both groups of older adults were able to recall the most recently presented word nearly perfectly, suggesting that both groups were able to encode the items. Group differences were observed, however, in recall of the three-back word (i.e., the one presented least recently), with those in the group having greater amounts of hearing loss demonstrating poorer three-back recall than those with better hearing. McCoy et al. argued that this difference could be attributed to greater effort at encoding for the group with more impaired hearing. Specifically, relative to individuals with better hearing the individuals with more hearing loss had to devote a greater percentage of processing resources to encoding, leaving fewer resources available to update and rehearse the three most recently presented words.

Other studies, using a number of different procedures for assessing listening effort during speech perception (McCoy et al. 2005; Tun et al. 2009; Gosselin & Gagné 2011a, b), have also reported that older adults expend more listening effort than younger adults to obtain comparable performance levels. Gosselin and Gagné (2011a), for instance, used a dual-task paradigm to compare listening effort in younger and older adults with similar auditory thresholds through 3000 Hz. Participants were instructed to maximize performance on the primary task, which was a closed-set speech-in-noise test, and effort was indexed by measuring performance on the secondary task (a vibrotactile pattern identification task). The rationale for this methodology was that increased difficulty or effort in performing the primary (speech perception) task would recruit cognitive resources from the secondary task, thereby producing reduced performance on the secondary task as effort for achieving the primary task was increased. Gosselin and Gagné (2011a) reported that performance on their vibrotactile secondary task was significantly poorer for older compared with younger adults under conditions in which performance on the primary task (speech recognition) was equated. They interpreted this age-related difference as indicating that older adults expend greater effort than younger adults to achieve comparable accuracy levels on the speech-in-noise test. In a follow-up investigation, Gosselin and Gagné (2011b) compared listening effort in older and younger adults for both A-only and AV presentations. Listening effort, as indexed by the same dual-task paradigm used in their earlier investigation (Gosselin & Gagné 2011a), was significantly greater for AV compared with A-only presentations. In addition, older adults exhibited greater effort than younger adults in both A-only and AV presentations.

The finding that greater listening effort is expended for speech tasks in AV compared with A-only presentations is somewhat surprising in light of the results noted above that under degraded listening conditions (such as the speech in noise test used by Gosselin and Gagné 2011b) AV speech perception is superior to A-only. Furthermore, Sommers et al. (2005) found that visual enhancement—the benefit obtained from both seeing and hearing a talker compared with listening alone—was similar for older and younger adults. Given the absence of age-related differences in visual enhancement and the nearly universal finding of better AV than A-only word recognition in noise, one might have anticipated that less effort would be expended in AV than A-only conditions, with similar reductions in effort for younger and older adults in moving from A-only to AV presentations.

One possible explanation for the finding of greater effort for AV than for A-only presentations (Gosselin & Gagné 2011b) is that speech perception (the primary task) was assessed using a closed set test format which does not reflect the demands typically encountered during open-set speech perception. There is some evidence (Mishra et al. 2013) that under excellent listening conditions, such as when younger adults listen in quiet or very low levels of noise, performance in an AV condition can be lower than in A-only conditions because of the additional requirements of integrating auditory and visual speech information from bimodal presentations. That is, when speech perception is relatively easy and automatic listeners gain very little from adding visual speech information (Tye-Murray et al. 2010) and the additional costs of having to integrate across modalities may even impose a cost that is reflected in greater effort in the AV than the A-only condition. In the Gosselin and Gagné (2011b) study, participants had to select from one of seven response alternatives for each of three key word positions. If the use of such a closed set response format reduces the difficulty of the speech perception test sufficiently, then the additional demands of integrating the auditory and visual speech information will be greater than any benefits obtained by adding visual speech information to an already relatively easy A-only speech perception task. Under such conditions, measures of effort may be greater in the AV than the A-only condition.

Methodological issues may also account for the finding (Gosselin & Gagné 2011b) that older adults expend more effort than younger listeners in the AV condition, despite evidence (Sommers et al. 2005) of similar visual enhancement for both age groups. Specifically, age-related differences in listening effort as reflected in secondary task performance could arise for at least two different, but not mutually exclusive, reasons. First, it is possible that the age-related differences in secondary task performance reflect true differences in the effort required to perform the primary speech perception tasks. However, it is also possible that the age-related differences reflect differences in the total amount of resources available to the two age groups. There is now considerable evidence (Craik & Salthouse 2000; Bialystok & Craik 2006) that older adults have reduced overall cognitive resources relative to younger adults. Thus, older and younger adults may have expended similar effort in performing the primary task, but older adults had fewer remaining resources available to perform the secondary task. That is, the difference in secondary task performance between older and younger adults may reflect differences in the total pool of resources rather than the amount of effort expended during the primary listening task.

In the present study, we extend and reevaluate the findings reported by Gosselin and Gagné (2011b) by using a serial recall measure of listening effort and examining whether there is a correlation between visual enhancement and listening effort. Specifically, we used the modified version of the serial recall task developed by McCoy et al. (2005) as a measure of listening effort to avoid the difficulties in interpretation (see above) that can result from use of dual-task paradigms. We also obtained measures of visual enhancement to examine the correlation between the benefits of adding visual speech information for word recognition and reductions in listening effort. Our hypothesis is that those individuals demonstrating greater visual enhancement will also show the greatest reductions in listening effort when moving from A-only to AV presentations. Such a finding would be evidence of the relationship between perceptual abilities and listening effort.

Thirty-two younger adults (mean age, 20.1, SD = 2.1, 19 female) and 34 older adults (mean age 70.2, SD = 6.8, 22 female) served as participants. Younger adults were all students at a private university and were recruited through posted advertisements. Older adults were all community-dwelling residents and were recruited through a database maintained by the Aging and Development Program at Washington University. Testing required one 1.5-hr session. All participants reported that English was their first language and that they had never had any lipreading training. Verbal abilities were assessed using the vocabulary subtest of the Wechsler Adult Intelligence Scale (Wechsler 1955). Mean scores for older and younger adults were 55.3 (SD = 8.7) and 46.2 (SD = 4.2), respectively. Older participants were also screened for dementia using the Mini Mental Status Exam (Cockrell & Folstein 1988). Participants who scored below 26 (of 30) were excluded from further testing. All participants were paid for participation.

Participants were also screened for vision and hearing before testing. Participants whose normal or corrected visual acuity, as assessed with a Snellen eye chart, exceeded 20/40 were excluded from participating to minimize the influence of reduced visual acuity on the ability to encode visual speech information. Visual contrast sensitivity was measured using the Pelli–Robson contrast sensitivity chart (Pelli et al. 1998) and participants whose score exceeded 1.8 were also excluded from further participation. Pure-tone air conduction thresholds were obtained for all participants at octave frequencies from 250 to 4000 Hz, using a portable audiometer and headphones (TDH 39). Pure-tone averages (500, 1000, and 2000 Hz) for younger adults were 0.33 (SD = 4.3) and 0.47 (SD = 4.2) dB HL for the left and right ears, respectively. Corresponding values for older adults were 17.1 (SD = 6.6) and 18.2 (SD = 6.3) dB HL. A comparison of pure-tone averages revealed that thresholds were significantly greater for older than for younger adults (left ear: t(64) = 10.4, p < 0.001; right ear: t(64) = 10.1, p < 0.001).
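As a minimal sketch of the pure-tone average described above, the following computes the mean threshold at 500, 1000, and 2000 Hz for one ear. The function name and the example audiogram are hypothetical and are not data from the study.

```python
from statistics import mean

# Frequencies that enter the pure-tone average, as stated in the text.
PTA_FREQUENCIES_HZ = (500, 1000, 2000)


def pure_tone_average(thresholds_db_hl):
    """Return the mean threshold (dB HL) across 500, 1000, and 2000 Hz.

    `thresholds_db_hl` maps frequency in Hz to threshold in dB HL for one ear.
    """
    return mean(thresholds_db_hl[f] for f in PTA_FREQUENCIES_HZ)


# Hypothetical audiogram for one ear (illustrative values only).
left_ear = {250: 10, 500: 15, 1000: 20, 2000: 25, 4000: 40}
print(pure_tone_average(left_ear))
```

Thresholds at 250 and 4000 Hz were measured but do not enter this average, which is why the sketch ignores them.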

Stimuli and Procedures

Participants completed 2 experimental tasks, 1 assessing speech perception in noise and 1 measuring perceptual effort, during sessions that lasted approximately 1.5 hr. The order in which the two tasks were completed was counterbalanced across participants.

Speech Perception Tests

Participants were presented with individual words in A-only, V-only, and AV conditions (only data from the AV and A-only conditions are reported in the present study). Three lists of 50 words each were created for testing in each of the 3 modality conditions. Each of the word lists had 25 monosyllabic and 25 bisyllabic words. Mean familiarity rating (Luce & Pisoni 1998) for the words (with 7 representing a response of “I know the word and its definition”) was 6.6 (SD = 0.31). Mean log HAL frequency was equated as closely as possible across lists. Mean HAL log frequencies were 9.5 (SD = 2.1), 9.39 (SD = 1.8), and 9.57 (SD = 2.1), for lists 1, 2, and 3, respectively. Words were produced by a female talker and were recorded at a sampling rate of 44.1 kHz.

Words were presented to participants in the carrier phrase “please say the word _________,” with the target word presented in sentence-final position. The word lists were rotated across presentation modalities such that approximately an equal number of participants in each group received a given list in the A-only, V-only, and AV conditions. Order of testing in the three conditions was counterbalanced such that approximately equal numbers of participants from each age group were tested in each possible testing order.

All testing was conducted in a double-walled sound attenuating chamber. Auditory stimuli for the A-only and AV conditions were presented binaurally over headphones in a six-talker background babble. Babble level was set independently for each participant using a modified version of the procedure for establishing speech reception thresholds (American Speech and Hearing Association 1988). The goal of this pretesting was to determine the SNR that produced approximately 50% correct in the A-only condition for each participant. Using this customized SNR served to equate overall audibility across the younger and older adults and avoided ceiling level performance in the AV condition. Signal level remained constant at 60 dB SPL for the A-only and AV conditions and SNR for both the A-only and AV conditions was set to the SNR established individually during pretesting. Visual stimuli for the V-only and AV conditions were presented on a 17-in color touch screen monitor with participants sitting approximately 0.5 m from the display. The video image completely filled the 17-in monitor. Participants viewed the head and neck of the talker as she articulated the stimuli.

Presentation modality (A-only, V-only, and AV) was blocked. At the beginning of each trial, participants saw “press space bar to begin next trial” presented on the screen. Participants then either saw, heard, or both saw and heard a talker producing the carrier phrase with the target word in sentence-final position. Participants were instructed to say the target word aloud and scoring was based on exact phonetic matches (i.e., adding, deleting, or substituting a single phoneme counted as an incorrect response). All experimental sessions were recorded and recordings were checked if there was any ambiguity about the individual’s response. If participants were unsure of the target word, they were encouraged to guess. Participants received three practice trials in each condition (A-only, V-only, and AV) before testing and none of the practice words appeared during the actual testing.

Measure of Listening Effort

Listening effort was assessed using a modified serial recall task developed by McCoy et al. (2005). Participants heard 8 lists of highly familiar English words presented without background noise in both A-only and AV conditions (total of 16 lists). Words were presented with an interstimulus interval of 1 sec. Within a list, modality (A-only, AV) was held constant and all eight lists of one modality were presented before the eight in the other modality (i.e., modality was blocked). The words were all spoken by a male talker. The lists contained varying numbers of words (10, 7, 13, 5, 12, 15, 8, and 14 for lists 1 to 8, respectively, and 14, 9, 12, 17, 6, 11, 10, and 13, for lists 9 to 16, respectively). Mean word frequency was equated as closely as possible across lists. Half of the participants received one set of lists (1 to 8) in the A-only condition and the remainder of the lists in the AV condition. The other half received the opposite mapping of lists to condition. Following the last word in each list, the participant saw “please repeat the last three words” on the computer screen. Participants were unaware of the number of items in each list and were therefore required to update the last three items in memory after each word was presented. Order of testing (A-only and AV) was counterbalanced across both older and younger adults.
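The scoring logic of this task can be sketched as follows. The text does not state whether a recalled word had to be reported in its original serial position, so this sketch assumes item scoring (a target counts as recalled if it appears anywhere in the response); that rule, the function names, and the example words are all hypothetical.

```python
def score_list(presented, recalled):
    """Score one list: was each of the last three presented words recalled?

    Assumes item scoring (order of report is ignored), which is an
    assumption of this sketch rather than a rule stated in the text.
    """
    three_back, two_back, one_back = presented[-3:]
    said = {word.lower() for word in recalled}
    return {
        "1-back": one_back.lower() in said,
        "2-back": two_back.lower() in said,
        "3-back": three_back.lower() in said,
    }


def recall_totals(trials):
    """Sum correct recalls per position across lists (maximum 8 per condition)."""
    totals = {"1-back": 0, "2-back": 0, "3-back": 0}
    for presented, recalled in trials:
        for position, correct in score_list(presented, recalled).items():
            totals[position] += int(correct)
    return totals


# Two hypothetical lists of unpredictable length with the spoken responses.
trials = [
    (["cat", "dog", "sun", "map", "key"], ["key", "map"]),
    (["pen", "cup", "hat", "box"], ["box", "hat", "cup"]),
]
print(recall_totals(trials))
```

Because list length is unpredictable, only the final three items of each list are scored; the totals for the two- and three-back positions are the values treated as the index of listening effort.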

All procedures were approved by the Institutional Review Board at Washington University in St. Louis.

Speech Perception Test

Figure 1 displays percent correct word recognition for the A-only and AV conditions along with the computed values for visual enhancement, ((AV − A-only)/(1 − A-only)) × 100. Recall that SNRs were customized in the A-only condition such that performance was approximately 50% for all participants. Thus, the finding that scores in the A-only condition did not differ across age groups and were very close to our intended target level of 50% simply indicates that the procedure for setting SNRs for individual participants was effective. Independent measures t tests indicated that neither performance in the AV condition nor visual enhancement differed between age groups (all p’s > 0.5). These findings are consistent with previous results (Sommers et al. 2005) demonstrating similar visual enhancement for older and younger adults.
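A minimal sketch of the visual enhancement computation, assuming that AV and A-only scores are expressed as proportions correct between 0 and 1. The function name and the example scores are hypothetical.

```python
def visual_enhancement(av, a_only):
    """Visual enhancement in percent: gain from adding vision, scaled by
    the room for improvement left over from A-only performance.

    Both arguments are proportions correct in [0, 1]; a_only must be < 1
    because the denominator (1 - a_only) would otherwise be zero.
    """
    if not 0 <= a_only < 1:
        raise ValueError("a_only must be in [0, 1)")
    if not 0 <= av <= 1:
        raise ValueError("av must be in [0, 1]")
    return (av - a_only) / (1 - a_only) * 100


# Hypothetical participant: 50% correct A-only, 80% correct AV.
# The gain of 0.30 is 60% of the 0.50 that was available to gain.
print(round(visual_enhancement(0.80, 0.50), 1))
```

Normalizing by 1 − A-only is what makes the measure comparable across participants, and setting A-only performance near 50% for everyone keeps that denominator roughly constant.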

Fig. 1

Measures of Listening Effort

Figure 2 displays the mean number of words recalled for each of the 3 test positions (1-back, 2-back, and 3-back) for the A-only and AV conditions across the 2 age groups. Recall that performance for the most recently presented position (one-back) was used as an index of participants’ ability to hear and encode the words correctly. Inspection of Figure 2 shows that performance for both groups in both conditions was greater than seven (out of a possible eight items) for the one-back position. Mean performance for younger and older adults in the one-back position in the A-only presentation condition was 7.4 (SD = 0.5) and 7.3 (SD = 0.5), respectively. The mean for the AV condition in younger adults was 7.5 (SD = 0.3). For older adults, mean recall in the AV condition for the 1-back position was 7.4 (SD = 0.5). A two-way analysis of variance with group as a between-participants variable and condition as a repeated measures variable found no significant main effects or interactions for performance in the 1-back position (all p’s > 0.2). Overall then, performance for the one-back position indicated that older and younger adults could encode the words equally well and that performance was near ceiling for both the A-only and AV conditions.

Fig. 2

To examine changes in listening effort as a function of both condition (A-only, AV) and age, we conducted an analysis of variance with condition and position (two- and three-back) as repeated measures variables and age as a between-participants variable on the memory performance in the two- and three-back positions. Of principal importance to the current investigation, the analysis revealed a significant interaction between group and condition, F(1, 64) = 26.7, p < 0.001. To further examine the source of this interaction, we analyzed the memory scores for the final two positions separately for younger and older adults. For younger adults, memory performance was significantly poorer for the 3-back than for the 2-back position, F(1, 31) = 33.7, p < 0.001, and was significantly lower for A-only than for AV presentations, F(1, 31) = 50.3, p < 0.01. The difference between the A-only and AV conditions was also significantly greater for the 3-back than for the 2-back position (i.e., a significant interaction between modality and recall position), F(1, 31) = 6.8, p < 0.05, suggesting that the benefits of going from an A-only to an AV condition were greater for the more difficult 3-back position. Post-hoc pairwise comparisons with a Bonferroni correction for multiple contrasts indicated that memory scores were significantly better in the AV than in the A-only conditions for both the 2- and 3-back positions (all p’s < 0.01).

In contrast to the findings with younger adults, the parallel analysis for older adults indicated that only the difference between the 2-back and 3-back positions was significant, F(1, 33) = 70.5, p < 0.001. Of particular importance, recall scores did not differ as a function of condition, F < 1, and there was no significant interaction between condition and position, F(1, 33) = 2.3, p = 0.15.

Relationship Between Listening Effort and Visual Enhancement

We interpreted the improved recall performance in the AV compared with the A-only condition for younger adults as indicating that the addition of visual speech information reduced encoding difficulty, thereby easing listening effort and allowing younger adults to reallocate resources from encoding to memory. To further investigate this issue, we examined whether differences between the two modalities in the memory test were related to visual enhancement. That is, if the benefit of AV compared with A-only performance on the memory task is in part a consequence of easier encoding in the AV condition, then individuals who show the greatest benefit of going from an A-only to an AV condition on the speech perception test (i.e., visual enhancement) should also show the biggest improvement in memory performance in moving from the A-only to the AV presentations. To examine this relationship, we combined memory scores for the 2- and 3-back positions separately for each of the A-only and AV presentations in younger adults. We then calculated the difference between these combined memory scores for the AV and A-only presentations and correlated this value with visual enhancement. These two measures correlated significantly (r = 0.37, p < 0.01), suggesting that, at least in younger adults, individuals who showed the greatest visual enhancement also tended to show the largest differences between the A-only and AV conditions in the memory test.
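The analysis in this paragraph can be sketched as a Pearson correlation between each participant's visual enhancement and the AV minus A-only difference in combined two- and three-back recall. The helper names and the example values below are hypothetical illustrations, not the study's data.

```python
from math import sqrt


def memory_gain(av_2back, av_3back, a_2back, a_3back):
    """AV minus A-only difference in combined 2- and 3-back recall."""
    return (av_2back + av_3back) - (a_2back + a_3back)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical participants: visual enhancement (%) and per-position recall
# given as (AV 2-back, AV 3-back, A-only 2-back, A-only 3-back).
enhancement = [35.0, 50.0, 62.0, 48.0, 70.0]
recall = [(6, 4, 5, 4), (6, 5, 5, 3), (7, 5, 5, 3), (6, 4, 6, 3), (7, 6, 5, 3)]
gains = [memory_gain(*scores) for scores in recall]

print(gains)
print(round(pearson_r(enhancement, gains), 2))
```

A positive coefficient here would mean that participants who gained more from visual speech in word recognition also gained more in recall, which is the pattern reported for younger adults.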

The findings from the present study suggest that younger, but not older, adults have reduced listening effort for AV compared with A-only presentations, at least as assessed with the modified serial recall task used in the present study. The lack of improvement in memory scores for older adults in the AV relative to the A-only condition was observed despite similar visual enhancement scores for the two groups. Finally, at least for younger adults, the improvement in memory performance from the A-only to the AV conditions was moderately correlated with visual enhancement.

Perhaps the most puzzling finding from the present experiment is that older adults did not demonstrate a reduction in listening effort as a result of providing visual speech information. The result is surprising because one factor that has been shown to affect listening effort in the current paradigm is encoding difficulty (McCoy et al. 2005). Recall that McCoy et al. found reduced recall scores for the two- and three-back positions in a group of older adults with impaired hearing compared with an age-matched group with better hearing. They attributed the reduced memory performance to greater encoding difficulty for the more impaired group that necessitated diverting resources from functions supporting memory (such as rehearsal) to those supporting initial stimulus perception. Based on these and other findings suggesting that factors which reduce initial encoding demands should also reduce listening effort (Rabbitt 1990; McCoy et al. 2005; Tun et al. 2009; Zekveld et al. 2011; Picou et al. 2013), one might expect similar reductions in listening effort for AV compared with A-only presentations in both older and younger adults. That is, both groups exhibited comparable benefits in identification performance for AV compared with A-only presentations and therefore similar beneficial effects on listening effort might also be expected.

One possible explanation for the absence of changes in listening effort for older adults in the AV compared with the A-only condition is that the multimodal presentations imposed additional demands that offset any benefits in encoding. For example, if older adults have greater difficulty combining or integrating the auditory and visual speech information compared with younger adults, then this could negate any benefits older adults obtained from easier speech perception in the AV compared with the A-only condition. Consistent with this explanation, Gosselin and Gagné (2011b) reported that listening effort for AV presentations, as measured with a dual-task paradigm, was greater for older compared with younger adults. However, Mishra et al. (2013) found that adding visual speech information to an auditory speech signal increased cognitive spare capacity, a measure of available cognitive resources, to a greater extent for older than younger adults. Currently, it is not possible to adjudicate between these somewhat contradictory findings owing to the absence of an independent measure of integration costs. Therefore, additional research is required before we can provide a clear explanation for the failure of older adults to reduce listening effort for the AV compared with the A-only presentations.

Although older adults did not benefit from the addition of visual speech information, younger adults exhibited a gain of approximately 1 to 1.5 words in the two- and three-back positions for AV compared with A-only presentations. One model that provides a useful framework for understanding this improvement is the Ease of Language Understanding (ELU) model (Rönnberg et al. 2013). In brief, in the ELU model, when speech is presented under good conditions, such as normal-hearing adults listening under favorable SNRs, matching incoming signals with lexical representations stored in long-term memory is relatively automatic and requires minimal cognitive effort. However, when listening situations become more difficult, for example when individuals have reduced auditory sensitivity or background noise becomes louder, individuals must engage explicit cognitive abilities such as working memory to match the degraded acoustic signal with the stored representations. Adding visual speech information may lessen these additional cognitive demands by improving listeners’ ability to match the degraded acoustic signals to stored representations. That is, according to the ELU model, providing visual speech information may move individuals from the more explicit pathway, which engages a number of resource-demanding cognitive abilities, back to the pathway in which speech perception takes place automatically and with relatively little effort.


The construct of listening effort has become increasingly important for understanding how speech perception varies across a number of different listening conditions and populations. It has the potential to advance our understanding of the basic mechanisms mediating spoken language processing and to motivate new clinical treatments for listeners who are older and/or hard of hearing. Measures of perceptual effort, for example, can provide a quantitative index of listening difficulty across different acoustic environments that may be useful in guiding clinical interventions. One difficulty with our current understanding of the construct, however, is that there is little agreement on how to measure listening effort. Complicating the picture, the few studies that have used multiple measures of effort (Gosselin & Gagné 2011b) have generally failed to find correlations between them. Thus, there is a critical need for research that addresses how different measures of listening effort are related and the extent to which each provides a valid index of the overall difficulty of speech encoding. Assessing the effort required to comprehend spoken language is a potentially powerful new approach to research on speech perception, and we look forward to refinements in both its theoretical importance and clinical utility.


Supported by Grant R01 AG018029-12 and by a T35 training grant to the Department of Otolaryngology.


Bialystok E., Craik F. I. M. Lifespan Cognition: Mechanisms of Change. (2006). Oxford; New York, NY: Oxford University Press.
Bologna W. J., Chatterjee M., Dubno J. R. Perceived listening effort for a tonal task with contralateral competing signals. J Acoust Soc Am, (2013). 134, EL352–EL358.
Cockrell J. R., Folstein M. F. Mini-mental state examination (MMSE). Psychopharmacol Bull, (1988). 24, 689–692.
Craik F. I. M., Salthouse T. A. The Handbook of Aging and Cognition (2000). (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Frtusova J. B., Winneke A. H., Phillips N. A. ERP evidence that auditory-visual speech facilitates working memory in younger and older adults. Psychol Aging, (2013). 28, 481–494.
Gosselin P. A., Gagné J. P. Older adults expend more listening effort than young adults recognizing speech in noise. J Speech Lang Hear Res, (2011a). 54, 944–958.
Gosselin P. A., Gagné J. P. Older adults expend more listening effort than young adults recognizing audiovisual speech in noise. Int J Audiol, (2011b). 50, 786–792.
Grant K. W. Measures of auditory-visual integration for speech understanding: A theoretical perspective. J Acoust Soc Am, (2002). 112, 30–33.
Grant K. W., Seitz P. F. Measures of auditory-visual integration in nonsense syllables and sentences. J Acoust Soc Am, (1998). 104, 2438–2450.
Grant K. W., Walden B. E., Seitz P. F. Auditory-visual speech recognition by hearing-impaired subjects: Consonant recognition, sentence recognition, and auditory-visual integration. J Acoust Soc Am, (1998). 103(5 pt 1), 2677–2690.
Hick C. B., Tharpe A. M. Listening effort and fatigue in school-age children with and without hearing loss. J Speech Lang Hear Res, (2002). 45, 573–584.
Humes L. E., Wilson D. L., Barlow N. N., et al. Changes in hearing-aid benefit following 1 or 2 years of hearing-aid use by older adults. J Speech Lang Hear Res, (2002). 45, 772–782.
Kahneman D. Attention and Effort. (1973). Englewood Cliffs, NJ: Prentice-Hall.
Luce P. A., Pisoni D. B. Recognizing spoken words: The neighborhood activation model. Ear Hear, (1998). 19, 1–36.
McCoy S. L., Tun P. A., Cox L. C., et al. Hearing loss and perceptual effort: Downstream effects on older adults’ memory for speech. Q J Exp Psychol A, (2005). 58, 22–33.
Mishra S., Lunner T., Stenfelt S., et al. Seeing the talker’s face supports executive processing of speech in steady state noise. Front Syst Neurosci, (2013). 7, 96.
Nachtegaal J., Kuik D. J., Anema J. R., et al. Hearing status, need for recovery after work, and psychosocial work characteristics: Results from an internet-based national survey on hearing. Int J Audiol, (2009). 48, 684–691.
Pelli D., Robson J., Wilkins A. The design of a new letter chart for measuring contrast sensitivity. Clin Vision Sci, (1998). 2, 187–199.
Phillips N. The implications of cognitive aging for listening and the FUEL model. Ear Hear, (2016). 37, 44S–51S.
Pichora-Fuller M. K. Working memory and speechreading. Berlin Nato As1 Series P8, (1996). 257–274.
Pichora-Fuller M. K., Singh G. Effects of age on auditory and cognitive processing: Implications for hearing aid fitting and audiologic rehabilitation. Trends Amplif, (2006). 10, 29–59.
Picou E. M., Ricketts T. A., Hornsby B. W. How hearing aids, background noise, and visual cues influence objective listening effort. Ear Hear, (2013). 34, e52–e64.
Rabbitt P. Mild hearing loss can cause apparent memory failures which increase with age and reduce with IQ. Acta Otolaryngol Suppl, (1990). 476, 167–175; discussion 176.
Rönnberg J., Lunner T., Zekveld A., et al. The Ease of Language Understanding (ELU) model: Theoretical, empirical, and clinical advances. Front Syst Neurosci, (2013). 7, 31.
Rudner M. Cognitive spare capacity as an index of listening effort. Ear Hear, (2016). 37, 69S–76S.
Schoenborn C. A., Marano M. Current estimates from the National Health Interview Survey. Vital Health Stat, (1988). 10, 1–233.
Sommers M. S., Tye-Murray N., Spehar B. Auditory-visual speech perception and auditory-visual enhancement in normal-hearing younger and older adults. Ear Hear, (2005). 26, 263–275.
Summerfield Q. Some preliminaries to a comprehensive account of audio-visual speech perception. In Dodd B., Campbell R. (Eds.), Hearing by Eye: The Psychology of Lip-Reading (1987). (pp. 3–51). Hillsdale, NJ: Lawrence Erlbaum Associates.
Tun P. A., McCoy S., Wingfield A. Aging, hearing acuity, and the attentional costs of effortful listening. Psychol Aging, (2009). 24, 761–766.
Tye-Murray N., Sommers M., Spehar B., et al. Aging, audiovisual integration, and the principle of inverse effectiveness. Ear Hear, (2010). 31, 636–644.
Tye-Murray N., Sommers M. S., Mauzé E., et al. Using patient perceptions of relative benefit and enjoyment to assess auditory training. J Am Acad Audiol, (2012). 23, 623–634.
Wechsler D. Manual for the Wechsler Adult Intelligence Scale. (1955). New York, NY: Psychological Corporation.
Zekveld A. A., Kramer S. E., Festen J. M. Cognitive load during speech perception in noise: the influence of age, hearing loss, and cognition on the pupil response. Ear Hear, (2011). 32, 498–510.

Aging; Auditory-visual presentations; Listening effort; Visual enhancement

Copyright © 2016 Wolters Kluwer Health, Inc. All rights reserved.