Several studies have shown that the ability to identify the timbre of musical instruments is reduced in cochlear implant (CI) users compared with normal-hearing (NH) listeners. However, most of these studies have focused on tasks that require specific musical knowledge. In contrast, the present study investigates the perception of timbre by CI subjects using a multidimensional scaling (MDS) paradigm. The main objective was to investigate whether CI subjects use the same cues as NH listeners do to differentiate the timbre of musical instruments.
Three groups of 10 NH subjects and one group of 10 CI subjects were asked to make dissimilarity judgments between pairs of instrumental sounds. The stimuli were 16 synthetic instrument tones spanning a wide range of instrument families. All sounds had the same fundamental frequency (261 Hz) and were balanced in loudness and in perceived duration before the experiment. One group of NH subjects listened to unprocessed stimuli. The other two groups of NH subjects listened to the same stimuli passed through a four-channel or an eight-channel noise vocoder, designed to simulate the signal processing performed by a real CI. Subjects were presented with all possible combinations of pairs of instruments and had to estimate, for each pair, the amount of dissimilarity between the two sounds. These estimates were used to construct dissimilarity matrices, which were further analyzed using an MDS model. The model output gave, for each subject group, an optimal graphical representation of the perceptual distances between stimuli (the so-called “timbre space”).
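As an illustration of the analysis pipeline described above, a dissimilarity matrix can be mapped to a low-dimensional "timbre space" with metric multidimensional scaling. The sketch below uses scikit-learn and a toy 4 × 4 matrix; the study's actual MDS model and data are not reproduced here, and the package choice is an assumption for illustration only.

```python
# Illustrative sketch: recovering a 2-D "timbre space" from averaged
# dissimilarity ratings with metric MDS (scikit-learn).
# The 4x4 matrix below is a toy example, NOT data from the study.
import numpy as np
from sklearn.manifold import MDS

# Symmetric dissimilarity matrix (zero diagonal), e.g. mean ratings
# pooled over subjects for four hypothetical instrument tones.
D = np.array([
    [0.0, 2.0, 6.0, 7.0],
    [2.0, 0.0, 5.0, 6.5],
    [6.0, 5.0, 0.0, 2.5],
    [7.0, 6.5, 2.5, 0.0],
])

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)  # one 2-D point per stimulus
```

Each row of `coords` is a stimulus position; inter-point Euclidean distances approximate the judged dissimilarities, and the axes are then interpreted by correlating them with candidate acoustic descriptors.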
For all groups, the first two dimensions of the timbre space were strikingly similar and correlated strongly with the logarithm of the attack time and with the center of gravity of the spectral envelope, respectively. The acoustic correlate of the third dimension differed across groups but only accounted for a small proportion of the variance explained by the MDS solution. Surprisingly, CI subjects and NH subjects listening to noise-vocoded simulations gave relatively more weight to the spectral envelope dimension and less weight to the attack-time dimension when making their judgments than NH subjects listening to unprocessed stimuli. One possible reason for the relatively higher salience of spectral envelope cues in real and simulated CIs may be that the degradation of local fine spectral details produced a more stable spectral envelope across the stimulus duration.
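The two acoustic correlates named above can be computed in a few lines. The sketch below uses one common formulation (an illustrative choice, not necessarily the study's exact analysis): the spectral centroid as the amplitude-weighted mean frequency of the magnitude spectrum, and the attack time as the time for the amplitude to rise from 10% to 90% of its peak, taken on a synthetic tone generated for the example.

```python
# Illustrative computation of the two main acoustic correlates:
# spectral centroid and (log) attack time, on a synthetic tone.
import numpy as np

fs = 16000
t = np.arange(0, 0.5, 1.0 / fs)
# Synthetic tone: 261 Hz fundamental plus harmonics, exponential attack.
env = 1.0 - np.exp(-t / 0.02)  # attack envelope, ~20 ms time constant
tone = env * sum(np.sin(2 * np.pi * 261 * k * t) / k for k in range(1, 6))

# Spectral centroid: amplitude-weighted mean of the magnitude spectrum
# (the "center of gravity of the spectral envelope").
spec = np.abs(np.fft.rfft(tone))
freqs = np.fft.rfftfreq(len(tone), 1.0 / fs)
centroid = np.sum(freqs * spec) / np.sum(spec)

# Attack time: time from 10% to 90% of peak amplitude; timbre studies
# typically use its logarithm as the perceptual correlate.
amp = np.abs(tone)
peak = amp.max()
t10 = t[np.argmax(amp >= 0.1 * peak)]  # first sample above 10% of peak
t90 = t[np.argmax(amp >= 0.9 * peak)]  # first sample above 90% of peak
log_attack = np.log10(t90 - t10)
```

In an MDS analysis, descriptors such as `centroid` and `log_attack` would be computed for every stimulus and correlated against the coordinates of each recovered dimension.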
The internal representation of musical timbre for isolated musical instrument sounds was found to be similar in NH and in CI listeners. This suggests that training procedures designed to improve timbre recognition in CIs will indeed train CI subjects to use the same cues as NH listeners. Furthermore, NH subjects listening to noise-vocoded sounds appear to be a good model of CI timbre perception as they show the same first two perceptual dimensions as CI subjects do and also exhibit a similar change in perceptual weights applied to these two dimensions. This last finding validates the use of simulations to evaluate and compare training procedures to improve timbre perception in CIs.
This study aimed to investigate the cues that cochlear implant subjects use to differentiate musical instruments based on their timbre. Several groups of subjects, including cochlear implant (CI) and normal-hearing (NH) listeners, were asked to rate the dissimilarity of pairs of synthesized instrument sounds. These dissimilarities were analyzed using a multidimensional scaling model that produced a geometrical representation of the internal timbre space as perceived by the subjects. The results showed very similar timbre spaces for CI and NH subjects. This finding supports the use of noise-vocoder simulations to develop and evaluate training procedures for musical timbre perception in CIs.
1Medical Research Council Cognition and Brain Sciences Unit, Cambridge, United Kingdom; 2Laboratoire de Mécanique et d’Acoustique, Centre National de la Recherche Scientifique, Aix-Marseille University, Centrale Marseille, France; and 3Service de Neurophysiologie de l’audition, CHU Brugmann, Bruxelles, Belgium.
ACKNOWLEDGMENTS: The authors thank the ExpORL Lab in Leuven, Belgium, for their help with the calibration of the sound delivery system, and Mitsuko Aramaki for helpful discussions.
This study was partly supported by a Wellcome Trust grant (#080216) awarded to R.P. Carlyon, under which O.M. was employed.
O.M. is funded by the French National Research Agency (ANR postdoctoral grant “DAIMA”).
The authors declare no conflicts of interest.
Address for correspondence: Olivier Macherey, Laboratoire de Mécanique et d’Acoustique, 31 Chemin Joseph Aiguier, 13402 Marseille Cedex 20, France. E-mail: firstname.lastname@example.org
Received January 26, 2012
Accepted September 19, 2012