Modern digital hearing aids offer many advantages to hearing aid users. These include adaptive noise-reduction systems, adaptive directionality, adaptive feedback suppression, and highly flexible control of numerous amplification characteristics, including complex forms of compression. Unfortunately, digital reproduction also introduces the possibility of new forms of distortion, arising from either inappropriate adaptation to the environment or the analysis of sound into different frequency regions and subsequent re-synthesis to create a single analog signal for presentation to the hearing aid user.
As the purpose of digital signal processing in hearing aids is to alter sound, it is far from straightforward to measure different types of distortion by objective means. The aim of our project was to compare the perceived sound quality of several current advanced hearing aids while they amplified a range of different signals. We also made various objective measurements of distortion and signal quality so that we could relate them to the subjective ratings.
Ten adults with normal hearing (three women and seven men) and nine experienced adult hearing aid users, as well as one hearing-impaired adult without previous hearing aid experience (three women and seven men in all), served as subjects. The normal-hearing subjects had bilateral thresholds better than 20 dB for all frequencies from 250 through 6000 Hz. The hearing-impaired subjects all had symmetrical sensorineural hearing losses that met the criteria of HTL <45 dB at 500 Hz and HTL >30 dB at 2000 Hz. With respect to symmetry, thresholds in the two ears differed by no more than 10 dB at any frequency from 250 to 3000 Hz and by no more than 15 dB outside that range. The nine experienced hearing aid users had all been fitted binaurally with digitally programmable instruments.
Hearing aids and earmolds
We used five behind-the-ear (BTE) fully digital hearing aids: the Bernafon Symbio 100, the Phonak Claro 211 dAZ, the Widex Senso Diva SD-9M, the GN ReSound Canta 7 (770-D), and the Siemens Triano S.
We set all the instruments to best match the NAL-NL1 (reference 1) 2-cc coupler gain targets for a flat 40-dB sensorineural loss for the normal-hearing subjects and for the average loss of the two ears for the hearing-impaired subjects. The aids were programmed to match targets for 50-dB, 65-dB, and 80-dB speech inputs using a speech-shaped noise as the test signal. Where very close matches could not be achieved for all three input levels, we tried to obtain the best match for a 65-dB input.
In addition, we checked that all test devices produced the same long-term average output spectra, as recorded on an SR780 spectrum analyzer, for continuous discourse played back at 65 dB SPL through the test setup. For the subjects with normal hearing, the differences between these spectra were large enough to warrant a further small adjustment to two devices (Claro and Symbio). Earmolds were hard acrylic shell earmolds with a 2-mm vent and a 3-mm Libby horn. For two of the hearing-impaired subjects whose earmolds did not arrive on time, we used foam plugs with 3-mm Libby horns and 2-mm tubing vents.
We used six stimuli in the experiment: a male voice, a female voice, piano music, each subject's own voice, a male voice in impulse noise, and no sound at all (i.e., a quiet room). The male and female voices, as well as the impulse noise, were taken from NAL's Speech and Noise for Hearing Aid Evaluation CD.2 In these three situations, speech was presented at 70 dB SPL and the impulse noise at 80 dB SPL. The piano music used was from Gardens in the Rain.3 The Colours of the Piano, tracks 7 through 10 and 15 on CD 2, were presented at 80 dB SPL. To rate their own voices, the subjects read Grandfather by Hans Christian Andersen. For the quiet room situation, we switched off all sound sources in the test room.
The experiment was conducted in a sound-attenuating test booth with the subjects seated about 1 meter in front of a loudspeaker. Figure 1 shows a block diagram of the test setup. The stimuli from the CDs were directed through an Ultra-Curve Pro digital equalizer, a Madsen OB 822 audiometer, and a Yamaha power amplifier to a loudspeaker. The equalizer was used to compensate for the frequency response of the setup.
The free-field output of the speaker was picked up by the microphones of the hearing aids, which were lined up close together on a stand. We placed the stand next to the listener's right ear, such that all five hearing aids were positioned as close as possible to the subject's ear, equidistant from the subject's mouth. To avoid feedback, we placed a baffle between the hearing aids and the listener's head, as shown in Figure 2.
From the hearing aids, the signal went via HA-2 2-cc couplers and a purpose-designed microphone amplifier to a purpose-designed switch box. From the switch box, the experimenter could select the sound from any two of the five hearing aids to be passed to the subject's response box. The signal then passed through an Ultra-Curve Pro digital equalizer, an adder, a Technics amplifier, and into 3A insert earphones that were connected to the subject's earmolds (binaurally).
The subject's response box had a switch that enabled him to switch between two hearing aids (labeled “A” and “B”), six selection buttons, and a volume control used to adjust the overall level presented in the insert earphones. The equalizer was used to flatten the frequency response such that the spectrum in the ear of the average subject (as represented by a Zwislocki coupler) was the same as what would be received if each hearing aid were directly connected to the subject's own earmold. The last amplifier was also used to balance the sound between the two ears if needed. At the adder, low-level (45 dB SPL) white noise coming from an audiometer was introduced during the playback of male voice, female voice, piano music, and own voice to mask possible internal noise in the hearing aids.
Each subject attended the laboratory twice. During the first appointment, we completed the audiogram and took impressions for earmolds. During the second appointment, we completed paired comparison tests. Prior to the second appointment, we programmed the five hearing aids in a 2-cc coupler as described above.
The hearing aids were programmed by adjusting parameters such as gain, compression threshold, and compression ratio to meet the prescribed coupler targets. Directionality was set to omnidirectional, and feedback suppression (where applicable and possible) was switched off. For the first four test conditions (female voice, male voice, piano music, and own voice), noise-suppression systems were switched off, except in the Senso Diva, for which the noise reduction (Speech Intensification System) could not be deactivated. In Phonak's Claro, the Digital Perception Processing (DPP) was set to the default of Fast-adaptive DPP. For the Senso Diva, feedback canceling had to be left on. Programming the Diva required a feedback test, which we performed with the microphone ports and the earhook plugged with putty so that the feedback canceling was effectively disabled. The feedback margin was set to 6 dB. The occlusion manager was activated if needed to best match targets.
For the impulse noise and quiet room conditions, the frequency responses were maintained and the noise-suppression systems were activated (where applicable) and set to moderate (or medium).
For each subject the five hearing aids were randomly assigned a number from 1 to 5. We then performed a round-robin test in which every hearing aid was compared to each of the others twice, creating 20 comparisons per round robin. The order of pairs in each round robin was randomized. In all, each subject completed seven round robins of the six stimuli.
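The pairing scheme described above can be sketched as follows. This is a minimal illustration only, not the software used in the study; the function name `make_round_robin` and the use of Python's `random` module are assumptions made for the example. The random A/B labeling follows the procedure described for each comparison.

```python
import itertools
import random

DEVICES = ["Canta", "Claro", "Diva", "Symbio", "Triano"]

def make_round_robin(devices, rng):
    """Return a shuffled list of (A, B) pairs in which every pair of devices occurs twice."""
    # 5 devices give 10 unique pairs; presenting each pair twice gives 20 comparisons.
    pairs = list(itertools.combinations(devices, 2)) * 2
    rng.shuffle(pairs)  # randomize the order of pairs within the round robin
    # Randomly decide which device of each pair is labeled "A" and which "B".
    return [pair if rng.random() < 0.5 else (pair[1], pair[0]) for pair in pairs]

schedule = make_round_robin(DEVICES, random.Random(0))
print(len(schedule))
for aid_a, aid_b in schedule[:3]:
    print("A =", aid_a, "| B =", aid_b)
```

Seven such schedules, one per round, would give each subject 140 comparisons in a session.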
The first round was considered a practice, and we encouraged subjects to change the overall output level to suit their comfort level. They were also asked if the sound was balanced between the ears. No balance adjustments were needed for any of the subjects.
For each comparison, the two hearing aids were randomly labeled “A” and “B.” The subject had 30 seconds to switch between hearing aids “A” and “B” on the response box. We introduced the time limit to keep the test session within 2 to 2.5 hours. Once the subject decided which hearing aid he preferred, he pushed one of six buttons on the response box, which were labeled 1 through 6. A poster hanging directly in front of the subject explained that 1 meant A was much better than B, 2 meant A was moderately better than B, 3 meant A was slightly better than B, 4 meant B was slightly better than A, 5 meant B was moderately better than A, and 6 meant B was much better than A.
The first four rounds were some order of male voice, female voice, piano music, and own voice, with the fifth round being a repeat of whichever stimulus was presented first. After the fifth round, we measured real-ear insertion gain using the subject's left ear with each hearing aid attached to the earmold. We then programmed the hearing aids to activate the various noise-suppression systems. The sixth round used the impulse noise and the seventh round the quiet room. During the seventh round, the subject was asked, “Which aid is quieter?” The choices were “Aid A,” “Aid B,” or “they are the same.”
We performed several objective measures: real-ear insertion gain (REIG), coherence, time delay, distortion, internal noise, and peak maximum output (OSPL90). These measures were recorded for each device and each subject's hearing aid settings. As all the normal-hearing subjects used the same hearing aid settings, in this group we recorded only one measurement per device, except for the real-ear insertion gain curves.
Using Aurical, we measured REIG with a broadband, speech-shaped signal for a 70-dB-SPL input. The resulting insertion gains are shown in Figure 3. Although it appears that the normal-hearing group received as much gain as the hearing-impaired group, these measurements were performed with the aid directly connected to the subject's earmold. As described previously, for the subjective judgments the aid was connected within a relay chain that included a volume control adjusted to each subject's comfort level. Figure 3 therefore shows the gain-frequency response of each aid relative to the others, rather than the absolute gain experienced by the subjects.
The higher mid-frequency gain evident for the Symbio for the normal-hearing subjects was the result of the adjustment made so that all hearing aids produced similar long-term spectra for a continuous-discourse input signal.
We measured coherence as a function of frequency using an SR780 Spectrum analyzer with the hearing aid attached to an HA-2 2-cc coupler in a test box. A pink noise signal at 80 dB SPL was used as input. Coherence values at frequencies between 128 and 5024 Hz were extracted. The mean coherence value was calculated for each subject and each hearing aid by averaging the values obtained at each frequency.
We measured the time delay in two different ways with the hearing aid connected to an HA-2 2-cc coupler. In the first method, we produced an impulse sound by hitting a mug with a spoon in front of the hearing aid. The time waveforms of the input and output signals were viewed on an oscilloscope and the delay between them was estimated. We did this measurement three times and recorded the average group delay in ms.
The second method used a swept pure tone at 80 dB SPL to measure the transfer function of the hearing aid on an SR780 spectrum analyzer. We determined the phase response (unwrapped) of the hearing aid, and calculated the derivative of the phase response with respect to angular frequency. This produced a graph of group delay versus frequency. For statistical analyses, we obtained a single value of the group delay in ms by averaging the group delay measured at each frequency across subjects and then averaging across the one-third octave frequencies from 500 to 4000 Hz.
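The calculation in the second method can be illustrated with a short sketch. It assumes the standard definition of group delay as the negative derivative of the unwrapped phase with respect to angular frequency, approximated here by finite differences between adjacent frequency points. The function names and the synthetic 5-ms test delay are hypothetical and are not taken from the analyzer or from the study data.

```python
import math

def unwrap(phases):
    """Remove 2*pi jumps from a sequence of phase values given in radians."""
    out = [phases[0]]
    for p in phases[1:]:
        step = p - out[-1]
        step -= 2 * math.pi * round(step / (2 * math.pi))  # bring the step into [-pi, pi]
        out.append(out[-1] + step)
    return out

def group_delay_ms(freqs_hz, phases_rad):
    """Group delay, -d(phase)/d(omega), between adjacent frequency points, in ms."""
    phi = unwrap(phases_rad)
    delays = []
    for i in range(len(freqs_hz) - 1):
        d_omega = 2 * math.pi * (freqs_hz[i + 1] - freqs_hz[i])
        delays.append(-(phi[i + 1] - phi[i]) / d_omega * 1000.0)
    return delays

# Synthetic check (hypothetical values): a pure 5-ms delay has phase -omega * 0.005,
# which a phase measurement would report wrapped into the range (-pi, pi].
TRUE_DELAY_S = 0.005
freqs = [500 + 20 * k for k in range(176)]  # 500 to 4000 Hz in 20-Hz steps
wrapped = [
    math.atan2(math.sin(-2 * math.pi * f * TRUE_DELAY_S),
               math.cos(-2 * math.pi * f * TRUE_DELAY_S))
    for f in freqs
]
delays = group_delay_ms(freqs, wrapped)
print(sum(delays) / len(delays))  # average group delay across the band, in ms
```

Unwrapping only works if the phase changes by less than pi between adjacent frequency points, so the frequency spacing must be fine enough for the longest delay being measured.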
Figure 4 shows, for each device, the average time delays measured at each frequency. For three of the devices (Canta, Claro, and Symbio), the time delay is constant across frequency, at least above 500 Hz, the frequency range where the measurements seemed most reliable. For the other two devices (Diva and Triano), the time delay decreases with frequency. Presumably, the decrease in time delay with frequency is caused by multiband filtering in these hearing aids, in which a longer group delay would be expected in the lower frequency channels because of their narrower bandwidth.
For the remaining measurements (harmonic distortion, internal noise, and peak OSPL90), the hearing aids were placed in an Aurical test box. To obtain the harmonic distortion measurements, we ran frequency sweeps at 70 and 80 dB SPL. The percentage distortion was recorded for each input level, and the average percentage is reported. To measure the internal noise of each hearing aid, we activated the noise-suppression systems. Using the Aurical Hearing Instrument Test module, we conducted the measurements at the default settings, which calculate the equivalent input noise level in dB by subtracting the nominal reference test gain from the measured output noise level. Finally, we recorded the peak maximum output (OSPL90) using a 90-dB-SPL pure-tone sweep.
Tables 1 and 2 report the results of the objective measures obtained for the normal-hearing and hearing-impaired subject groups, respectively. Where applicable, the mean and standard deviation values measured across subjects are reported.
To quantify the response to each paired comparison when subjects were listening to the male, female, and own voices, piano music, and impulse noise, we assigned 3 points to the favored device if it was judged much better, 2 points if it was judged moderately better, and 1 point if it was judged slightly better. When subjects were listening in a quiet room, the favored device was assigned 1 point. In all cases, the disfavored device received the equivalent, negative number of points. This method of scoring the data increases the likelihood that the observations lie on an interval scale.4
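The scoring rule for the five stimuli rated with the six-button response box can be written out as a small sketch. The mapping follows the button labels on the poster and the point values given above. The function names and the example responses are hypothetical, and the quiet-room scoring is not covered.

```python
# Button pressed (1 to 6) -> points credited to aid A; aid B receives the negative.
BUTTON_TO_POINTS_A = {1: 3, 2: 2, 3: 1, 4: -1, 5: -2, 6: -3}

def score_comparison(aid_a, aid_b, button):
    """Return the points each aid receives for one paired comparison."""
    points_a = BUTTON_TO_POINTS_A[button]
    return {aid_a: points_a, aid_b: -points_a}

def total_scores(comparisons):
    """Sum the points per aid over a list of (aid_a, aid_b, button) responses."""
    totals = {}
    for aid_a, aid_b, button in comparisons:
        for aid, points in score_comparison(aid_a, aid_b, button).items():
            totals[aid] = totals.get(aid, 0) + points
    return totals

# Hypothetical responses, for illustration only (not study data).
responses = [("Symbio", "Claro", 2), ("Diva", "Symbio", 4), ("Claro", "Diva", 6)]
print(total_scores(responses))
```

Because each comparison gives equal and opposite points to the two aids, the totals across all devices always sum to zero.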
The five stimuli that received preference scores from −3 to 3 at each comparison were analyzed by means of an analysis of variance for repeated measures. Using device and stimulus as repeated measures factors and subject group as a between-group factor, this analysis revealed no significant effect of device (p = 0.72) across subject groups and stimuli. However, it showed a highly significant interaction between stimulus, device, and subject group (p = 0.001).
A subsequent analysis of variance performed separately on observations from normal-hearing and hearing-impaired subjects revealed no significant effect of device across stimuli (p = 0.56 and p = 0.87, respectively), but a significant interaction between stimulus and device (p <0.01).
For each subject group, Figure 5 reports the average number of points and the standard error that each device received for each of the six stimuli. For the normal-hearing subjects, Diva obtained the highest average scores for male and female voices. Triano and Claro received the highest average scores for own voice and piano music, respectively, and Symbio got the highest average scores for listening to impulse noise and in a quiet room. For the hearing-impaired listeners, Symbio received the highest average scores for male and female voices and piano music. Diva received the highest scores for own voice, and Canta received the highest scores for listening to impulse noise and in a quiet room.
To assess the significance of these findings, we performed a separate analysis of variance for each subject group and each stimulus. Table 3 summarizes the results.
In four cases (normal hearing and impulse noise, normal hearing and quiet room, hearing-impaired and piano music, and hearing-impaired and quiet room), the differences observed between devices were statistically significant. A post-hoc comparison of means test (Newman-Keuls) revealed that in only one of these cases (normal-hearing listeners and quiet room) was the highest-rated device (Symbio) significantly favored over all other devices.
When listening to impulse noise, the normal-hearing subjects favored Symbio and Triano significantly more than Claro and Diva. However, the difference in average points allocated to Symbio and Triano was not statistically significant. When listening to piano music, the hearing-impaired subjects favored Symbio significantly more than Claro, but Symbio was not rated significantly higher than Triano, Diva, or Canta, and Claro was not rated significantly lower than Triano, Diva, or Canta. Finally, when listening in the quiet room, the hearing-impaired subjects significantly disfavored Diva relative to all the other devices, but there was no significant difference in the average preference points measured for Canta, Symbio, Claro, and Triano.
Of the objective measurements reported in Tables 1 and 2, three were correlated with each other (p <0.01). These parameters were the coherence and the two time-delay measures. Presumably, coherence decreases when the delay in the device causes the input and output to appear in different analysis frames during the coherence measurement.5
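The presumed link between delay and coherence can be illustrated with a small simulation. This is a sketch under stated assumptions and does not reproduce the SR780's algorithm: it uses white rather than pink noise, a rectangular window, non-overlapping 32-sample analysis frames, and a hypothetical device modeled as a pure gain with or without a delay of half a frame. All names and parameter values are illustrative.

```python
import cmath
import math
import random

def dft(frame):
    """Plain DFT of one frame, returning bins 0..N/2 (rectangular window)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]

def coherence(x, y, frame_len):
    """Magnitude-squared coherence per bin, averaged over non-overlapping frames."""
    n_bins = frame_len // 2 + 1
    pxx = [0.0] * n_bins
    pyy = [0.0] * n_bins
    pxy = [0j] * n_bins
    for start in range(0, len(x) - frame_len + 1, frame_len):
        fx = dft(x[start:start + frame_len])
        fy = dft(y[start:start + frame_len])
        for k in range(n_bins):
            pxx[k] += abs(fx[k]) ** 2
            pyy[k] += abs(fy[k]) ** 2
            pxy[k] += fx[k].conjugate() * fy[k]
    return [abs(pxy[k]) ** 2 / (pxx[k] * pyy[k]) for k in range(n_bins)]

def mean(values):
    return sum(values) / len(values)

rng = random.Random(0)
FRAME = 32
x = [rng.gauss(0.0, 1.0) for _ in range(64 * FRAME)]            # input noise
y_no_delay = [2.0 * v for v in x]                                # gain only
y_delayed = [0.0] * (FRAME // 2) + y_no_delay[:-(FRAME // 2)]    # gain plus half-frame delay

print(mean(coherence(x, y_no_delay, FRAME)))  # mean coherence without delay
print(mean(coherence(x, y_delayed, FRAME)))   # mean coherence with delay
```

In this toy setup the delayed output is a perfectly linear, noise-free copy of the input, yet its measured coherence falls well below that of the undelayed copy, because each output frame only partly overlaps the input frame it is compared with.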
A correlation analysis of the objective measures and the preference scores obtained for each of the six stimuli revealed, for the normal-hearing subject group, a significant correlation between the peak OSPL90 and the average preference score obtained for piano music (p = 0.01). Although the variation in peak OSPL90 measured across devices was small (within 4 dB), devices with higher peak OSPL90 received, on average, higher preference scores (Figure 6a). The average preference score obtained for the male voice was also significantly correlated with both time delay and coherence (p <0.04): the normal-hearing listeners preferred the devices with shorter time delays.
The correlations between the average preference score for the female voice and both coherence and time delay are also moderately high, but not significant (p = 0.13 to 0.24). Figure 6b shows how the average preference scores obtained for the male and female voices decrease as the time delay measured with the impulse method increases. When the two sets of data are combined, the correlation coefficient between preference score and time delay is −0.81, which is highly significant (p = 0.004).
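For readers who want to see how such a coefficient is obtained, the sketch below computes a Pearson correlation and the t statistic used to test it. The delay and score values are hypothetical placeholders, not the study data. The sample size of 10 (five devices times two voices) is an assumption about how the combined data set was formed.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical delays (ms) and mean preference scores, for illustration only.
delays_ms = [2.0, 3.5, 5.0, 7.0, 10.0]
scores = [0.4, 0.3, 0.0, -0.2, -0.5]
print(pearson_r(delays_ms, scores))

# t statistic for the reported coefficient, assuming n = 10 data points.
print(t_statistic(-0.81, 10))
```

Under that assumed sample size the reported coefficient gives a t value of about −3.9 with 8 degrees of freedom, which is consistent with a two-tailed p value near 0.004.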
For the hearing-impaired subject group, analysis showed a significant correlation between the internal noise and the average preference score measured in the quiet room (p = 0.01). The devices with lower internal noise received higher preference scores (Figure 6c). Interestingly, for the hearing aid settings of the subjects with normal hearing, the internal noise measure was not very different among Symbio, Canta, and Triano, yet the normal-hearing subjects rated Symbio significantly higher in the quiet room than any of the other devices. Internal noise measurements with any non-linear device must, however, be treated with caution. Non-linearity at low levels (squelch or expansion) affects the measurement of noise at the output of the hearing aid, and non-linearity at high levels (compression) affects the measurement of gain that is used to convert the measured output noise into an equivalent input noise level. It is possible, therefore, that the equivalent input noise present during the objective measurement is different from the noise when the aid is listened to in quiet.
DISCUSSION AND SUMMARY
The sound quality of the five hearing aids was evaluated subjectively in six different listening situations by 10 normal-hearing and 10 hearing-impaired subjects. A statistical analysis of the subjective preference scores revealed no overall significant difference among the devices. Only for listening in a quiet room was one device (Symbio) significantly preferred to all the other devices by normal-hearing subjects (p <0.001). Correlations with objective measurements imply that devices with preferred sound quality have low internal noise, high peak OSPL90, and low internal delay.
The conclusion regarding internal noise seems uncontroversial. With respect to the higher peak OSPL90, the conclusion must be interpreted with caution, as the difference between OSPL90 levels was small and because the result was observed only for the normal-hearing group listening to piano music.
The conclusion regarding time delay should also be treated with caution, both because the effect of time delay appeared to be significant only for the male voice and for the male and female voices combined, and because of the small number of hearing aids tested. It has previously been shown that excessive time delays have adverse consequences for hearing aid users.6–8 These studies found that, to be disturbing, delays must be longer than 14, 20, and 15 ms, respectively.
The results of this study suggest that delays of 10 ms could have a detrimental effect on sound quality. On the one hand, the evidence from the present study is not strong, as the findings could have been caused by unknown differences among the hearing aids other than time delay. On the other hand, the conditions in the present study most closely replicated real-world conditions. The aids were vented and had conventional earmold canal lengths (unlike reference 8); the stimuli were presented acoustically in real time, allowing delayed and undelayed signals to mix in the ear canal (unlike reference 7); and the hearing aid had a response characteristic individually prescribed to be appropriate for each listener (unlike reference 6). It would be worth performing further sound quality comparisons under these conditions to investigate more carefully if time delay in the range of 2 to 10 ms differentially affects sound quality.
This study was performed with funding provided by Bernafon, whose support is gratefully acknowledged. Thanks to Tom Scheller for constructive comments on the manuscript and to Liz Convery for performing some data manipulations.