The most commonly employed speech processing strategies in cochlear implants (CIs) only extract and encode amplitude modulation (AM) in a limited number of frequency channels. Zeng et al. (2005) proposed a novel speech processing strategy that encodes both frequency modulation (FM) and AM to improve CI performance. Using behavioral tests, they reported better speech, speaker, and tone recognition with this novel strategy than with the AM-alone strategy. Here, we used the scalp-recorded human frequency following responses (FFRs) to examine the differences in the neural representation of vocoded speech sounds with AM alone and AM + FM as the spectral and temporal cues were varied. Specifically, we were interested in determining whether the addition of FM to AM improved the neural representation of envelope periodicity (FFRENV) and temporal fine structure (FFRTFS), as reflected in the temporal pattern of the phase-locked neural activity generating the FFR.
FFRs were recorded from 13 normal-hearing, adult listeners in response to the original unprocessed stimulus (a synthetic diphthong /au/ with a 110-Hz fundamental frequency or F0 and a 250-msec duration) and the 2-, 4-, 8- and 16-channel sine vocoded versions of /au/ with AM alone and AM + FM. Temporal waveforms, autocorrelation analyses, fast Fourier Transform, and stimulus-response spectral correlations were used to analyze both the strength and fidelity of the neural representation of envelope periodicity (F0) and TFS (formant structure).
The periodicity strength in the FFRENV decreased more for the AM stimuli than for the relatively resilient AM + FM stimuli as the number of channels was increased. Regardless of the number of channels, a clear spectral peak of FFRENV was consistently observed at the stimulus F0 for all the AM + FM stimuli but not for the AM stimuli. Neural representation as revealed by the spectral correlation of FFRTFS was better for the AM + FM stimuli when compared to the AM stimuli. Neural representation of the time-varying formant-related harmonics as revealed by the spectral correlation was also better for the AM + FM stimuli as compared to the AM stimuli.
These results are consistent with previously reported behavioral results and suggest that the AM + FM processing strategy elicited brainstem neural activity that better preserved periodicity, temporal fine structure, and time-varying spectral information than the AM processing strategy. The relatively more robust neural representation of AM + FM stimuli observed here likely contributes to the superior performance on speech, speaker, and tone recognition with the AM + FM processing strategy. Taken together, these results suggest that neural information preserved in the FFR may be used to evaluate signal processing strategies considered for CIs.