
Real-Time Speech Decoding from Brain Waves

Article In Brief

Researchers used high-density electrocorticography recordings to decode speech directly from the brains of human participants, with accuracy rates as high as 61 percent for spoken words and 76 percent for perceived words.


Dr. Eddie Chang (right) and David Moses, PhD, worked together on the study on decoding speech dialogue using human cortical activity.

Accurate, real-time decoding of brain waves to produce speech is the Holy Grail of assistive communication for people without the ability to speak, whether from stroke, traumatic brain injury, or neurodegenerative disease. While that prize remains a long way off, researchers have taken a new and potentially important step toward it by showing that, when the question is known, high-density electrocorticography can be used to decode a subject's answer with high accuracy.

“This is the first study to demonstrate real-time decoding of speech, and that is significant,” commented Chethan Pandarinath, PhD, of Georgia Tech, who was not involved in the work. “The new concept this group is bringing to the table is using additional information—knowledge of the question being asked—to help the system perform better. The question constrains the type of thing you might say in response, and this information can help the system figure out what you want to say.”

Technologies for restoring communication include pointing devices controlled by a mouse or eye movements, to spell out words or point at objects or simple commands; scalp electrodes to detect a specific brain wave associated with making a choice (the P300 wave, often called the “aha!” wave); and implanted electrodes to detect and decode more varied and complex electrical activity associated with intended speech. The latter two are both types of brain-computer interfaces (BCIs), but the ability of implantable electrodes to capture complex brain activity far surpasses that of scalp electrodes.

Study Methods, Findings

In the new study, Edward Chang, MD, PhD, professor of neurological surgery, and colleagues at the University of California, San Francisco, studied three subjects undergoing presurgical evaluation for intractable epilepsy. Each had received either one 256-channel or two 128-channel electrocorticography arrays implanted on the surface of auditory and sensorimotor areas. Electrical activity during a listening and responding task was captured and processed to predict the actual responses given by the subjects to questions posed by the researchers.

That processing is the key to the advance reported here. The group led by Dr. Chang has been developing this approach for several years, combining fundamental advances in understanding how the brain encodes the motor commands that lead to speech with multi-level processing of recorded signals to extract information about the phonemes—the individual sounds that make up words—the subject intends to create during speech.

In the new study, published in the July 30 issue of Nature Communications, the research team asked subjects a set of questions and presented them with a set of possible answers, from which they chose one to say aloud. For instance, for the question, “Which musical instrument do you like listening to?” the subject could say “piano,” “violin,” “electric guitar,” “drums,” “synthesizer,” or “none of these.”

With the knowledge of which question was being asked, the speech decoding program could narrow down its analysis of the recorded waveform to make a guess as to the phonemes spoken by the subject in response; similarly, detection of each successive phoneme, together with knowledge of the words in which that phoneme occurs, could be used to constrain the possibilities for those that followed it.
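The idea of letting a known question constrain the decoded answer can be illustrated with a minimal Python sketch. It is not the study's decoder: the function `decode_answer`, the `"room"` question and its answers, and all likelihood values are hypothetical, and only the instrument answer set comes from the article. The sketch assumes some upstream model has already turned the neural recording into a score for each candidate answer.

```python
# Candidate answers per question. Only the "instrument" set comes from the
# article; the "room" question and its answers are hypothetical.
ANSWER_SETS = {
    "instrument": ["piano", "violin", "electric guitar", "drums",
                   "synthesizer", "none of these"],
    "room": ["bright", "dark", "hot", "cold", "fine"],
}
ALL_ANSWERS = [a for answers in ANSWER_SETS.values() for a in answers]


def decode_answer(likelihoods, question=None):
    """Pick the most probable answer from per-answer neural likelihoods.

    likelihoods: dict mapping candidate answers to non-negative scores
                 (assumed to come from some upstream neural decoder).
    question:    key into ANSWER_SETS when the question is known, else None.
    """
    candidates = ANSWER_SETS[question] if question is not None else ALL_ANSWERS
    # Restricting to the question's answer set acts like a uniform prior over
    # valid answers and a zero prior over everything else.
    scores = {a: likelihoods.get(a, 0.0) for a in candidates}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no candidate answer has non-zero likelihood")
    posterior = {a: s / total for a, s in scores.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# Made-up likelihoods: the neural evidence alone slightly favors "fine".
example = {"piano": 0.25, "violin": 0.10, "drums": 0.15,
           "fine": 0.30, "cold": 0.20}

print(decode_answer(example)[0])                # unconstrained guess
print(decode_answer(example, "instrument")[0])  # guess constrained by the question
```

With these made-up numbers, the unconstrained guess is an answer that does not fit the question at all, while the constrained guess falls back to the best-scoring instrument. That is the sense in which knowing the question narrows the search.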

The accuracy rate for spoken words was 61 percent, compared with chance accuracy of 7 percent. Remarkably, the program could be trained to the individual speaker's brain activity within several minutes. In the experiment, the decoding program also guessed at the question based on recordings from the subject's auditory cortex, although in actual application it would be more likely for the question to be known with certainty. The accuracy rate for perceived words was 76 percent, compared with chance accuracy of 20 percent.
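When the question itself is only a guess decoded from auditory cortex, one plausible way to use it is as a soft prior: each possible question contributes to the answer prior in proportion to how likely that question is. The sketch below illustrates this under stated assumptions and is not a description of the study's actual pipeline. The function `answer_posterior`, the question labels, the answer sets, and all probabilities are hypothetical.

```python
def answer_posterior(answer_likelihoods, question_probs, answers_by_question):
    """Combine answer likelihoods with a prior derived from uncertain questions.

    answer_likelihoods:  dict answer -> non-negative score from the speech
                         decoder (hypothetical values).
    question_probs:      dict question -> probability that it was the one asked.
    answers_by_question: dict question -> list of valid answers.
    """
    # Answer prior: sum over questions of P(question) * P(answer | question),
    # assuming each valid answer to a question is equally likely a priori.
    prior = {}
    for question, p_q in question_probs.items():
        valid = answers_by_question[question]
        for answer in valid:
            prior[answer] = prior.get(answer, 0.0) + p_q / len(valid)

    # Posterior is proportional to likelihood times prior.
    unnormalized = {a: answer_likelihoods.get(a, 0.0) * p for a, p in prior.items()}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no answer has non-zero posterior mass")
    return {a: v / total for a, v in unnormalized.items()}


# Hypothetical answer sets and decoder outputs, for illustration only.
answers_by_question = {
    "instrument": ["piano", "violin", "drums"],
    "room": ["fine", "cold"],
}
likelihoods = {"piano": 0.25, "violin": 0.10, "drums": 0.15,
               "fine": 0.30, "cold": 0.20}

# The auditory decoder is fairly sure the instrument question was asked.
posterior = answer_posterior(likelihoods,
                             {"instrument": 0.9, "room": 0.1},
                             answers_by_question)
print(max(posterior, key=posterior.get))
```

If the question probabilities are reversed, so that the hypothetical room question is the likely one, the same likelihoods yield a different top answer. The final guess therefore depends on both the heard and the spoken signal.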

“By integrating what participants hear and say, we leveraged an interactive question-and-answer behavioral paradigm that can be used in a real-world assistive communication setting,” the authors concluded. “Together, these results represent an important step in the development of a clinically viable speech neuroprosthesis.”

Expert Commentary

“There has been a lot of work done on developing brain-computer interfaces to read signals directly from the brain,” said Dr. Pandarinath, assistant professor of biomedical engineering at Georgia Tech. “A lot of that work, including my own, has focused on assisted typing, but this is much less efficient than being able to speak directly,” which is the process Dr. Chang is working on.

Dr. Pandarinath noted that the choice to focus on the motor cortex to understand what words the subject is intending to form, rather than further upstream where thoughts are formed, has two advantages. “It makes sense to be as close as you can to the motor output, since there is a long history of how the brain controls movement that we can leverage. We certainly understand that process better than we understand the elements of thinking a thought. And we probably don't want a system that detects thoughts,” rather than intended utterances. “The people using these devices will probably want to limit what the device has access to.”

The work in the new study is a step forward, he added, “but there are some significant technical advances that we would need” before a speech detection system could be practical for those who might need it. First, he noted, the electrodes “aggregate activity of hundreds to thousands of neurons simultaneously,” including from many that are likely not involved in speech production.

“Complementary technologies might help out here to get more localized information,” to produce a more accurate signal with fewer electrodes.

Second, the subjects in this study had undergone a major craniotomy in order to implant electrodes for epilepsy monitoring but were otherwise healthy. Such an operation would be daunting for any patient, but especially for those with significant brain damage or neurodegenerative disease.

Nonetheless, one ALS patient has received a BCI device implanted on the surface of her cortex, which has allowed her to control an eye-tracking device linked to a virtual keyboard. That work was led by Nick Ramsey, PhD, professor of neurology and neurosurgery at University Medical Center Utrecht, in the Netherlands. The patient has been using the device at home for over three years, Dr. Ramsey said, “and the signals are as crisp as ever,” despite presumed ongoing neurodegeneration. A consortium of researchers under the name BrainGate is developing a trial of implantable BCI devices in 20 ALS patients.

“The advance in speech decoding in the new study is very good academic work,” Dr. Ramsey said, “but it is a long way from being ready for home use. A central problem is that the range of questions that would be functionally useful, and the answers to those questions, is enormously larger than the handful of each tested here. Real-time decoding becomes far harder as the universe of possible questions and answers expands.”

“The whole BCI implant field is too removed from patients,” he said. “We keep on exploring the horizons of what is possible and leaving patients behind. It would be good if more of the research would focus on what it takes to get working devices into the patient's home, because that is what is going to help people.”


Drs. Chang, Pandarinath, and Ramsey had no conflicts of interest.

Link Up for More Information

• Moses DA, Leonard MK, Makin JG, Chang EF. Real-time decoding of question-and-answer speech dialogue using human cortical activity. Nat Commun 2019;10(1):3096.
• Vansteensel MJ, Pels EGM, Bleichner MG, et al. Fully implanted brain-computer interface in a locked-in patient with ALS. N Engl J Med 2016;375(21):2060–2066.