Hearing is a matter of time. Communication signals such as speech and music are remarkable in that they simultaneously convey information across timescales, and the human auditory system is tuned in to these temporal cues: within a single vocalization, we pick up on the slow fluctuations of sentences, the steady pulse of syllabic stress, and the fast changes that convey phonemes. This temporal hierarchy pervades all communication sounds. For example, imagine the slow phrases, the steady beats, and the fast-changing notes, trills, and hi-hats that combine to make music. This temporal structure, in fact, is so fundamental that it even occurs in the environment. When walking through the woods, we hear slow footsteps, the unfolding crunch of a leaf underfoot, and the rapid snap of a twig.
Auditory-evoked potentials (AEPs) allow us to index the integrity of auditory processing across these timescales. Within a single electrophysiological response to speech, there is a plethora of information, and we can conceptualize different parts of this information as reflecting the neural processing of each element of speech, including these temporal elements.
Classically, we divide AEPs into early, middle, and late components, corresponding to the auditory brainstem, middle-latency, and late-latency responses, and we think about each of these responses' generators as acting in isolation (Picton. Ear Hear 2013;34:385-401 http://journals.lww.com/ear-hearing/Abstract/2013/07000/Hearing_in_Time___Evoked_Potential_Studies_of.1.aspx). However, an emerging view characterizes the auditory system as a distributed, but integrated, circuit (Kraus. Trends Cogn Sci 2015;19:642-654 http://www.sciencedirect.com/science/article/pii/S1364661315002089). Thus, instead of focusing on any single anatomical structure as a unique locus of expertise or disorder, we think of the auditory system as an interactive whole, and we think that auditory processing relies on the successful integration of information across the system. We fear that the terminology of “early, middle, late” or “brainstem, thalamus, cortex” compromises this view by ignoring the fact that each of these structures is only one part of a dynamic system.
A complementary framework emphasizes how timescales of neural activity align with timescales of sound. An AEP can be filtered to select the fast information (auditory brainstem responses occurring at roughly 100–2,000 Hz, corresponding to cues faster than 0.1 ms), the slow information (cortical responses occurring below 40 Hz, corresponding to cues around 1 sec), or the “in between.” Each of these distinct timescales corresponds to physiological constraints of the generating systems. For example, neurons in the brainstem can phaselock up to about 1,000 Hz, whereas cortical neurons rarely phaselock beyond 100 Hz. Again, however, these are two extremes of a temporal continuum. Thus, instead of thinking about these as “different” evoked potentials, we need to think about them as distinct windows into biological sound processing, each reflecting the integrative circuit.
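To make the idea of filtering one response into timescales concrete, here is a minimal sketch in Python. It is an illustration only, not the analysis used in the studies cited here. The signal is synthetic (a 10 Hz "slow" component plus a 500 Hz "fast" component), and the sampling rate, the function names, and the crude spectral masking are all assumptions chosen for clarity. The band edges of 40 Hz and 100–2,000 Hz simply echo the ranges mentioned above. Real AEP pipelines would normally use properly designed filters rather than zeroing spectral bins.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(n^2); adequate for a short demo."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def band_select(signal, fs, low_hz, high_hz):
    """Keep only the spectral components between low_hz and high_hz (inclusive).

    This is a hypothetical helper: it masks DFT bins and inverts the result,
    which is a simplification of real band-pass filtering.
    """
    n = len(signal)
    spectrum = dft(signal)
    kept = {}
    for k, coeff in enumerate(spectrum):
        # Bins above n/2 are the mirrored (negative-frequency) components.
        freq = min(k, n - k) * fs / n
        if low_hz <= freq <= high_hz:
            kept[k] = coeff
    # Inverse DFT over the retained bins; the result is real up to rounding.
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in kept.items()).real / n
        for t in range(n)
    ]


# Synthetic "recording": assumed 5 kHz sampling rate, 0.1 s long.
fs = 5000
n = 500
times = [i / fs for i in range(n)]
slow = [math.sin(2 * math.pi * 10 * t) for t in times]        # cortical-like, 10 Hz
fast = [0.2 * math.sin(2 * math.pi * 500 * t) for t in times]  # brainstem-like, 500 Hz
recording = [s + f for s, f in zip(slow, fast)]

# Two windows into the same recording.
slow_band = band_select(recording, fs, 0, 40)
fast_band = band_select(recording, fs, 100, 2000)
```

In this toy case, `slow_band` recovers the 10 Hz component and `fast_band` recovers the 500 Hz component from the same recording, which is the sense in which the two are windows into one response rather than separate potentials.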
By analogy, an athlete may go to the doctor complaining of a sore lower back. A good physician will think about how the lower back is working as a part of an integrative system, along with that patient's core, the hip muscles, upper body strength, and flexibility. To address that pain, the athlete and physician may need to look beyond the lower back to strengthen the entire physiological system that controls movement. At first, this work may compensate for the lower back's weakness and still allow the athlete to perform well. Eventually, however, good habits and exercise will strengthen the lower back itself.
We need to think the same way about the auditory system. When a patient presents with listening difficulties, each assessment needs to be placed into the context of the entire auditory system. If someone struggles to process certain types of information, such as rapid features in speech, auditory training to improve information processing across timescales may eventually boost his or her listening skills.
A FAST-TIMING DEFICIT
A prediction of this framework is that a patient with a specific neurological insult may struggle to process certain timescales of auditory information. Consider auditory neuropathy, which occurs because of completely dyssynchronous fast subcortical neural activity (Starr. Brain 2003;126[pt 7]:1604-1619 http://brain.oxfordjournals.org/content/brain/126/7/1604.full.pdf). These patients have absent auditory brainstem responses and acoustic reflexes (i.e., fast processing), often despite normal otoacoustic emissions and, in up to half of cases, normal slower cortical processing (Rance. Ear Hear 2002;23:239-253 http://journals.lww.com/ear-hearing/Abstract/2002/06000/Speech_Perception_and_Cortical_Event_Related.8.aspx).
What are the perceptual consequences of this fast-timing deficit? It should not come as a surprise that these patients struggle to process rapid temporal cues, such as a brief gap in a noise burst (Zeng. NeuroReport 1999;10:3429-3435 http://journals.lww.com/neuroreport/pages/articleviewer.aspx?year=1999&issue=11080&article=00031&type=abstract; Kraus. J Assoc Res Otolaryngol 2000;1:33-45 http://link.springer.com/article/10.1007/s101620010004). They also experience difficulty understanding fast-changing speech cues, such as the acoustic contrast between /b/ and /g/. In addition, these patients are essentially deaf when there is background noise, suggesting that fast auditory processing is necessary to understand speech in noise.
This perceptual profile resembles that of patients with auditory processing disorder, who experience similar, albeit less severe, challenges in speech understanding. Many individuals with listening difficulties, such as children with auditory processing disorder and older adults, have moderately dyssynchronous subcortical responses to speech sounds, which again reflects fast neural processing (Anderson. J Neurosci 2012;32:14156-14164 http://www.researchgate.net/publication/232232076_Aging_Affects_Neural_Precision_of_Speech_Encoding; Hornickel. J Neurosci 2013;33:3500-3504 http://www.jneurosci.org/content/32/41/14156.full). Millisecond-level dyssynchronies in neural processing may therefore lead to a specific perceptual profile. Importantly, improving access to sound improves this processing by facilitating directed attention to all sound features and sound-to-meaning connections, eventually boosting perception (Hornickel. Proc Natl Acad Sci U S A 2012;109:16731-16736 http://www.brainvolts.northwestern.edu/documents/FM_PNAS12.pdf).