Solving the trade-off between speech understanding and listening comfort

Schaub, Arthur

doi: 10.1097/01.HJ.0000386585.91129.c1

When a technique has been around for some time, it is usually assumed to be mature. This might not be true, however, in the case of wide dynamic range compression (WDRC). Compression is certainly seen as a suitable compensation for loudness recruitment, but at that point the agreement ends.

In fact, Moore writes in an article on compression that the “controversy continues about...whether it should be fast acting or slow acting.”1 Likewise, Bor et al. say about multichannel compression that “the appropriate number of channels remains an unanswered question.”2

Such uncertainties suggest room for improvement. Indeed, improvements are necessary if hearing instruments are to increase user satisfaction. And improvements are also possible, as this article will show.

COMMON LIMITATIONS

It is well known that hearing instruments cannot fully compensate for hearing impairment. For example, they take much longer to react to changes in sound pressure level (SPL) than the healthy ear.

Several sources provide evidence. Moore et al., for example, find the outcome of an experiment “consistent with the idea that loudness recruitment results from the loss of a fast-acting compressive nonlinearity that operates in the normal peripheral auditory system.”3 Moore also says of fast compression that it “improves the ability to detect a weak consonant following a relatively intense vowel.”1 Short consonants in running speech last no longer than 20 milliseconds. In contrast, hearing instruments are slow. Their relevant time constants range from tens of milliseconds to seconds.

Figure 1 illustrates the situation. A typical test signal exhibits abrupt changes in SPL (top). After such a change, a slow compression system (middle) takes longer to settle to the appropriate output SPL than a fast system (bottom). During the crucial periods (highlighted by red ellipses), compression fails to amplify enough. This effect, which is sometimes referred to as “shadowing,” favors fast compression. But slow compression is still widely used for a simple reason: Making fast compression work properly is difficult.

Figure 1: Sound pressure level (SPL) of various signals. Top: SPL of a typical test signal at the input of a compression system. Middle: SPL of the output signal produced by slow compression. Bottom: SPL of the output signal produced by fast compression.
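
The behavior sketched in Figure 1 can be reproduced with a toy model. The following is a minimal sketch, not the processing of any actual hearing instrument: it assumes a one-pole level smoother working directly on dB values, one level sample per millisecond, and a hypothetical compressor with a 50 dB SPL threshold, a 2:1 ratio, and 30 dB of gain below threshold. The time constants of 200 ms and 10 ms are likewise illustrative choices.

```python
import math

def smooth_level(levels_db, time_constant_ms):
    """One-pole smoothing of a level track sampled once per millisecond."""
    a = math.exp(-1.0 / time_constant_ms)  # per-sample decay factor
    estimate = levels_db[0]
    smoothed = []
    for level in levels_db:
        estimate = a * estimate + (1.0 - a) * level
        smoothed.append(estimate)
    return smoothed

def output_spl(levels_db, time_constant_ms,
               threshold_db=50.0, ratio=2.0, max_gain_db=30.0):
    """Output SPL of a toy compressor (all parameter values are assumptions)."""
    result = []
    smoothed = smooth_level(levels_db, time_constant_ms)
    for level, estimate in zip(levels_db, smoothed):
        excess = max(estimate - threshold_db, 0.0)
        gain = max_gain_db - excess * (1.0 - 1.0 / ratio)
        result.append(level + gain)
    return result

# Test signal: 200 ms at 80 dB SPL, then an abrupt drop to 50 dB SPL.
signal = [80.0] * 200 + [50.0] * 200
slow = output_spl(signal, time_constant_ms=200.0)
fast = output_spl(signal, time_constant_ms=10.0)

# About 20 ms after the drop, the settled output would be 50 + 30 = 80 dB SPL.
print(round(slow[220], 1), round(fast[220], 1))
```

Under these assumptions, the slow system is still more than 10 dB short of the settled output about 20 ms after the drop, whereas the fast system is within about 2 dB. That interval is comparable to the duration of a short consonant.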

THE CURRENT PARADIGM

Today's hearing care professionals tend to think of speech understanding and listening comfort as a trade-off: slow compression for listening comfort and fast compression for speech understanding, but at the expense of listening comfort.

This way of thinking dates back some 20 years. Moore et al., for example, acknowledge the need for compression “to be fast acting to ensure that weak sounds are audible when they occur just after strong sounds.”3 Yet, not even a time constant of 20 ms was short enough to fully achieve the desired result in their experiments. And the authors comment that “shorter release times than this are generally avoided because they can lead to significant harmonic and intermodulation distortion”—a warning that has become part of the collective mindset.

More recently, Gatehouse et al. have almost carved the paradigm in stone, saying that slow compression “outperformed the fast-acting WDRC fittings for listening comfort, while for reported and measured speech intelligibility the converse was true.”4 Figure 2 shows the outcome of their tests.

Figure 2: Number of test subjects who achieved their best result with each of several schemes of amplification. Top: with regard to listening comfort. Bottom: with regard to speech understanding.

These researchers investigated several schemes of amplification in hearing instruments: linear with two different gain shapes (NAL-RP, linear) and two-channel compression in three different versions (slow in both channels, fast in both channels, and fast in one and slow in the other channel). Then they determined the number of test subjects who achieved their best result with each scheme with regard to listening comfort (top) and speech understanding (bottom).

AVOIDABLE SOUND DEGRADATIONS

In previous studies, researchers have rarely questioned the hearing instruments that they used in their experiments. But pitfalls are obvious when tuning a slow system to compress much faster. Such pitfalls include overshoot and spectral distortion.

Overshoot

The output signal of a compression system overshoots when a sudden loud sound follows a soft sound. The loud sound is initially amplified too much, namely by the gain that was correct for the preceding soft sound, and it takes a while until the output SPL settles to the appropriate value. Like shadowing, this effect results from the finite reaction time of compression systems.

Looking at disadvantages of fast compression, Moore says that “it can introduce spurious changes in the shape of the temporal envelope of sounds (e.g., overshoot and undershoot effects...).”1 But he also explains how to mitigate the problem: “Delaying the audio signal by a small amount relative to the gain-control signal can reduce such effects.” Figure 3 indeed shows that the adverse effect (top, highlighted by a red ellipse) is significantly reduced with time alignment (bottom).

Figure 3: SPL of the responses to a test signal, as produced by two different compression systems. Top: one without time alignment. Bottom: one with time alignment between the audio signal and the gain applied to it.
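
The effect of time alignment can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the implementation of any particular product: the level track has one sample per millisecond, the smoother is a one-pole filter on dB values with a hypothetical 5 ms time constant, the compressor uses an assumed 50 dB SPL threshold, 2:1 ratio, and 30 dB maximum gain, and the audio delay of 10 ms is an arbitrary illustrative value.

```python
import math

def smooth_level(levels_db, time_constant_ms):
    """One-pole smoothing of a level track sampled once per millisecond."""
    a = math.exp(-1.0 / time_constant_ms)
    estimate = levels_db[0]
    smoothed = []
    for level in levels_db:
        estimate = a * estimate + (1.0 - a) * level
        smoothed.append(estimate)
    return smoothed

def gain_db(level_db, threshold_db=50.0, ratio=2.0, max_gain_db=30.0):
    """Static compression gain (hypothetical fitting values)."""
    return max_gain_db - max(level_db - threshold_db, 0.0) * (1.0 - 1.0 / ratio)

def compress(levels_db, time_constant_ms, delay_ms=0):
    """Apply the gain derived from the undelayed signal to audio delayed by delay_ms."""
    control = smooth_level(levels_db, time_constant_ms)
    output = []
    for n in range(delay_ms, len(levels_db)):
        delayed_audio = levels_db[n - delay_ms]
        output.append(delayed_audio + gain_db(control[n]))
    return output

# Test signal: 100 ms of soft sound (50 dB SPL), then a sudden loud sound (80 dB SPL).
signal = [50.0] * 100 + [80.0] * 100
settled = 80.0 + gain_db(80.0)  # output SPL once the gain has settled

plain = compress(signal, time_constant_ms=5.0)
aligned = compress(signal, time_constant_ms=5.0, delay_ms=10)

overshoot_plain = max(plain) - settled
overshoot_aligned = max(aligned) - settled
print(round(overshoot_plain, 1), round(overshoot_aligned, 1))
```

In this sketch the delay gives the gain time to drop before the loud sound reaches the gain stage, so the peak overshoot shrinks from more than 10 dB to less than 3 dB. The price in this simple model is that the last few milliseconds of the soft sound receive reduced gain, which is consistent with Moore's wording that a delay can reduce, rather than eliminate, such envelope effects.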

There remains an important question: Why is overshoot a problem as compression speed increases? In running speech, fast compression detects the level differences between soft consonants and loud vowels. As a consequence, fast compression amplifies consonants more than vowels. And therefore overshoot occurs repeatedly, namely at the onset of each vowel.

With slow compression, however, the long time constants cause the gain to remain virtually constant while someone is speaking. Hence, overshoot occurs quite rarely, in fact only after a sufficiently long period of silence.

Spectral distortion

Another degradation occurs when fast compression proceeds in multiple channels. The situation is illustrated in Figure 4, for the case of two-channel compression.

Figure 4: Gain applied to consecutive phonemes by two fast-acting compression systems. Top: two-channel compression, gain curves crossing each other at the split frequency between channels. Bottom: alternative amplification scheme, gain curves as determined by the SPL of each phoneme.

The dotted lines in Figure 4 indicate the target gains for various SPLs of the input signal, as determined for a particular hearing-impaired person, by means of the NAL-NL1 fitting rationale. The blue and pink lines show how a fast two-channel system (top) amplifies the consecutive phonemes /sh/ and /oe/ in the word “shoe.” Interestingly, the gain curves cross each other (highlighted by a red ellipse) at the split frequency between the two channels.

This effect originates from two opposite spectral shapes: From low to high frequencies, the sibilant has a rising spectrum and the vowel has a falling spectrum. In the low-frequency channel, the vowel consequently exhibits a higher signal level than the sibilant, resulting in less gain. In the high-frequency channel, it's the opposite. The vowel exhibits a lower signal level, resulting in more gain.
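
The crossing can be made concrete with a small numerical sketch. All numbers below are hypothetical: the in-channel levels for the two phonemes, the gain at 50 dB SPL, and the 2:1 compression ratio are chosen only to illustrate opposite spectral shapes and do not come from Figure 4 or from any fitting rationale.

```python
def channel_gain_db(level_db, gain_at_50_db, ratio):
    """Gain of one fast compression channel for a given in-channel level (dB SPL)."""
    return gain_at_50_db - (level_db - 50.0) * (1.0 - 1.0 / ratio)

# Hypothetical in-channel levels (dB SPL) for the two phonemes of "shoe".
levels = {
    "sh": {"low": 45.0, "high": 65.0},  # sibilant: rising spectrum
    "oe": {"low": 70.0, "high": 50.0},  # vowel: falling spectrum
}

# Hypothetical fitting per channel: (gain at 50 dB SPL, compression ratio).
fitting = {"low": (15.0, 2.0), "high": (30.0, 2.0)}

gains = {
    phoneme: {
        channel: channel_gain_db(in_channel[channel], *fitting[channel])
        for channel in ("low", "high")
    }
    for phoneme, in_channel in levels.items()
}

for phoneme, per_channel in gains.items():
    print(phoneme, per_channel)
```

With these assumed values, the vowel receives less gain than the sibilant in the low channel but more gain in the high channel, so the two gain curves must cross near the split frequency.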

Looking at disadvantages of fast compression in multiple channels, Moore comments, “Short-term changes in the spectral pattern of sounds may be distorted because the pattern of gains across frequency changes rapidly with time.”1

Another approach makes more sense: applying the target gain curves that correspond to the SPL of each consecutive phoneme (bottom). That approach avoids spectral distortion, whatever the speed of compression.

Again, an important question remains: What about spectral distortion with slow multichannel compression? There is none. The long time constants turn compression into close-to-linear amplification when processing speech.

AN ALTERNATIVE COMPRESSION SYSTEM

The foregoing illustrates that details matter. But along with overshoot and spectral distortion, there are other shortcomings to deal with. One crucial issue is to measure SPL more quickly and reliably than in conventional systems. With all potential improvements taken together, a system would no longer demand a trade-off between speech understanding and listening comfort; it would instead bring them together.

That is the goal of ChannelFree™ compression, a system developed by Bernafon AG that is available in its Veras and Vérité hearing aid families.

Sound quality

One distinctive feature of ChannelFree processing is an extremely short reaction time of 10 ms. Such short reaction times can cause a distortion problem in conventional compression schemes, which is why Bernafon had an extensive test conducted to address concerns about distortion.

In their test report, Dillon et al. write, “The aim of our project was to compare the perceived sound quality of several current advanced hearing aids while they are amplifying a range of different signals.”5 This range was extensive, including signals that are increasingly dealt with by special features: soft background noise by silencer circuits, impulse noise by transient-noise reduction, and own voice by open fittings. Moreover, half the test ratings came from listeners with normal hearing. The problem with these test conditions is that they explore compression only marginally or, in the case of listeners with normal hearing, not at all.

Yet among the test signals were also speech and music, signals that directly rely on compression when processed for the hearing impaired, which is the realm at which ChannelFree aims. Indeed, Dillon et al. state, “For the hearing-impaired listeners, Symbio [the first hearing aid with ChannelFree compression] received the highest average scores for male and female voices and piano music.”5 An extract of the test results is shown in Figure 5.

Figure 5: Extract of test results: preference ratings for speech and music by hearing-impaired listeners.

System design

Figure 6 depicts the structure of the ChannelFree compression system. The diagram shows a microphone, a receiver, and four processing blocks that are arranged on two parallel signal paths. The audio signal flows on the lower path, passing a block labeled Synchronization and a Controllable Filter. The Synchronization delays the audio signal by a small period, thus aligning the audio signal with the gain applied to it.

Figure 6: Block diagram of the ChannelFree compression system.

The Controllable Filter acts on the wideband audio signal, i.e., without splitting it into channels. It applies a smooth gain curve that evolves continuously according to the control signals received from the Filter Control.

The Filter Control continuously determines the required gain curve in accordance with the measured SPL that it receives from the block Level Measurement. To achieve this, the Filter Control holds client-specific gain data that the hearing care professional determines during a fitting session and stores in the hearing instrument.

Finally, the Level Measurement continuously measures the SPL of the wideband audio signal, using advanced techniques that are beyond the scope of this article. For details, see Schaub 2008.6
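
One way to picture the role of the Filter Control is as a lookup that turns a single measured wideband SPL into a complete gain curve. The sketch below is an assumption made for illustration only and is not the ChannelFree implementation: it supposes that the stored client data consist of target gains at a few audiometric frequencies for three input SPLs, and that the gain curve for any other SPL is obtained by linear interpolation. The table values and the names `FREQS_HZ`, `TARGETS`, and `gain_curve` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical audiometric frequencies and target gains (dB) per input SPL.
FREQS_HZ = [250, 500, 1000, 2000, 4000]
TARGETS = {
    50.0: [10.0, 15.0, 25.0, 35.0, 40.0],
    65.0: [6.0, 10.0, 18.0, 26.0, 30.0],
    80.0: [2.0, 5.0, 10.0, 16.0, 19.0],
}

def gain_curve(spl_db):
    """Interpolate a whole gain curve from one wideband SPL measurement."""
    spls = sorted(TARGETS)
    spl = min(max(spl_db, spls[0]), spls[-1])  # clamp to the stored range
    upper = min(bisect_right(spls, spl), len(spls) - 1)
    lower = upper - 1
    weight = (spl - spls[lower]) / (spls[upper] - spls[lower])
    return [
        (1.0 - weight) * g_lower + weight * g_upper
        for g_lower, g_upper in zip(TARGETS[spls[lower]], TARGETS[spls[upper]])
    ]

# A softer and a louder phoneme each receive one consistent curve.
soft_curve = gain_curve(57.5)
loud_curve = gain_curve(72.0)
for freq, g_soft, g_loud in zip(FREQS_HZ, soft_curve, loud_curve):
    print(freq, round(g_soft, 1), round(g_loud, 1))
```

Because the whole curve is selected by one level, the louder phoneme receives less gain than the softer one at every frequency in this sketch, so the curves for consecutive phonemes do not cross.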

Gain shaping

ChannelFree compression has sometimes been misunderstood as single-channel wideband compression. However, single-channel compression is outdated because of a severe limitation: a single fixed compression ratio across frequency. This limitation originally motivated the introduction of two-channel compression, and later of multichannel compression.

But ChannelFree is different, as can be seen from its gain-shaping capability illustrated in Figure 7. The gain curves show a continuously varying compression ratio (highlighted by red arrows of different length). The compression ratio can be adjusted independently at each audiometric frequency.

Figure 7: Gain curves for various SPLs of the input signal, as produced by ChannelFree compression.
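
The link between a family of gain curves and the compression ratio can be worked out per frequency: if the input SPL rises by a given amount and the gain drops at the same time, the compression ratio is the input level change divided by the resulting output level change. The sketch below applies this standard definition to hypothetical gain values at five audiometric frequencies; the numbers are illustrative and are not read from Figure 7.

```python
def compression_ratio(gain_soft_db, gain_loud_db,
                      spl_soft_db=50.0, spl_loud_db=80.0):
    """Input level change divided by output level change between two input SPLs."""
    delta_in = spl_loud_db - spl_soft_db
    delta_out = delta_in + (gain_loud_db - gain_soft_db)
    return delta_in / delta_out

# Hypothetical gains (dB) at 50 and 80 dB SPL input for each frequency.
freqs_hz = [250, 500, 1000, 2000, 4000]
gain_at_50 = [10.0, 15.0, 25.0, 35.0, 40.0]
gain_at_80 = [2.0, 5.0, 10.0, 16.0, 19.0]

ratios = {
    freq: compression_ratio(g_soft, g_loud)
    for freq, g_soft, g_loud in zip(freqs_hz, gain_at_50, gain_at_80)
}

for freq, ratio in ratios.items():
    print(freq, round(ratio, 2))
```

In this example the ratio grows from roughly 1.4:1 at 250 Hz to roughly 3.3:1 at 4000 Hz, which shows how the spacing of the gain curves at each frequency determines a frequency-dependent compression ratio even though the signal is never split into channels.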

CONCLUSION

Researchers and engineers have long been striving for fast compression—fast enough to deal appropriately with soft and loud phonemes in running speech. At the same time, they have been striving for clear sound and flexibility in gain shaping. ChannelFree is designed to provide all three.

REFERENCES

1. Moore BCJ: The choice of compression speed in hearing aids: Theoretical and practical considerations and the role of individual differences. Trends Amplif 2008;12(2):103–112.
2. Bor S, Souza P, Wright R: Multichannel compression: Effects of reduced spectral contrast on vowel identification. J Speech Lang Hear Res 2008;51:1315–1327.
3. Moore BCJ, Wojtczak M, Vickers DA: Effect of loudness recruitment on the perception of amplitude modulation. J Acoust Soc Am 1996;100:481–489.
4. Gatehouse S, Naylor G, Elberling C: Linear and nonlinear hearing aid fittings. 1. Patterns of benefit. Int J Audiol 2006;45(3):130–152.
5. Dillon H, Keidser G, O'Brien A, Silberstein H: Sound quality comparisons of advanced hearing aids. Hear J 2003;56(4):30–40.
6. Schaub A: Digital Hearing Aids. New York: Thieme Medical Publishers, 2008.
Copyright © 2010 Wolters Kluwer Health, Inc. All rights reserved.