The objective of this study was to evaluate the sensitivity and reliability of one subjective (rating scale) and three objective (dual-task paradigm, pupillometry, and skin conductance response amplitude) measures of listening effort across multiple signal-to-noise ratios (SNRs).
Twenty adults with normal hearing attended two sessions and listened to sentences presented in quiet and in stationary noise at three SNRs: 0, –3, and –5 dB. Listening effort was assessed by examining change in reaction time (dual-task paradigm), change in peak-to-peak pupil diameter (pupillometry), and change in mean skin conductance response amplitude; self-reported listening effort on a scale from 0 to 100 was also evaluated. Responses were averaged within each SNR and within three word-recognition-ability categories (≤50%, 51% to 71%, and >71%) pooled across all SNRs. Measures were considered reliable if there were no significant changes between sessions and intraclass correlation coefficients were at least 0.40. Effect sizes were calculated to compare the sensitivity of the measures.
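The test–retest criterion above (intraclass correlation coefficient of at least 0.40) can be illustrated with a minimal sketch. The specific ICC form used in the study is not stated here, so a two-way random-effects, absolute-agreement, single-measurement ICC(2,1) computed from ANOVA mean squares is assumed; the example data are hypothetical.

```python
import numpy as np

def icc_2_1(data):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    data: array of shape (n_subjects, k_sessions), one score per
    subject per session (e.g., two test sessions per participant).
    """
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    grand = data.mean()
    # ANOVA sums of squares for subjects (rows), sessions (columns), and error
    ss_rows = k * ((data.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((data.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((data - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical session-1 vs. session-2 scores for five participants
scores = [[62, 60], [45, 48], [71, 69], [55, 58], [80, 77]]
print(icc_2_1(scores) >= 0.40)  # reliable by the study's criterion
```

With perfectly consistent scores across sessions the function returns 1.0; values below 0.40 would flag a measure as unreliable under the criterion used in this study.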
Intraclass correlation coefficient values indicated fair-to-moderate reliability for all measures, whereas the sensitivity of individual measures varied. Self-report was sensitive to changes in listening effort but less reliable, as reported effort was greater during the dual task than during either physiologic measure. The dual task was sensitive to a narrow range of word recognition abilities but less reliable, as reaction times decreased globally across sessions. Pupillometry was consistently sensitive to changes in listening effort and reliable across sessions. Skin conductance response amplitude was neither sensitive nor reliable while participants listened to the sentences. Skin conductance response amplitude during the verbal response was sensitive to poor (≤50%) speech recognition abilities; however, it was less reliable because amplitude changed significantly across sessions.
In this study, pupillometry was the most sensitive and reliable objective measure of listening effort. Intersession variability significantly influenced the other objective measures, which poses challenges for cross-study comparability. Therefore, intraclass correlation coefficients combined with other statistical tests more fully describe the reliability of listening-effort measures across multiple levels of difficulty. Minimizing intersession variability will increase measurement sensitivity. Further work toward standardized methods and analyses will strengthen our understanding of the reliability and sensitivity of measures of listening effort and better facilitate cross-modal and cross-study comparisons.