Translating brain waves into speech
The ability to speak is something we often take for granted, but not everyone can. So can technology give a voice back to the voiceless? Izzie Clarke caught up with Nima Mesgarani, from Columbia University’s Zuckerman Institute, who may have found a solution...
Izzie - Speech is the most natural way for us to communicate. It's faster than emails or instant messaging, and it helps us to connect with those around us. Which is why losing our ability to talk because of an injury or a disease can be so devastating.
Nima - For example, locked in syndrome or ALS. An example of that would be Stephen Hawking, who was also losing the ability to talk.
Izzie - That’s Nima Mesgarani from the Zuckerman Mind Brain Behavior Institute at Columbia University.
Nima - So in these cases the brain is fine. The pattern of activity that produces or hears speech is OK. It's just the connection between that and the speech generation muscles that is affected. What we're hoping to do is to directly read speech from the brain of a person so that they do not have to actually say it but we can go one step before that. And as the brain activity is produced we can directly detect and decode it.
Izzie - When we hear or imagine speech, our brain kicks into gear, generating a specific pattern of neural activity as it processes this information. That goes on in an area called the auditory cortex. The brain pattern depends on who is speaking, what we're hearing and the quality of the sound. What's impressive is that Nima and his team measured those brain signals and came up with a method that decodes them and turns them back into speech.
Nima - We've used a machine learning algorithm. These are models that are loosely based on the properties of neurons in the brain and they are able to learn extremely complex patterns of relationships. We also use the latest technologies in speech synthesis and we basically ask the algorithm to learn how to translate, how to go from the brainwaves back to the speech synthesis parameters, and from there we can go to the sound itself.
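The idea Nima describes — learning a mapping from recorded brain activity to a representation of the sound, which a synthesizer then turns back into audio — can be sketched in miniature. Everything here is a stand-in: the "neural" recordings and spectrograms are synthetic, and a simple linear decoder takes the place of the deep networks the team actually used, just to show the shape of the approach.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 1000 time frames of 64-channel "neural" activity,
# paired with the 32-bin spectrogram frames of the speech heard at the
# same moments. (Real data would come from electrodes over auditory cortex.)
n_frames, n_electrodes, n_freq = 1000, 64, 32
true_w = rng.normal(size=(n_electrodes, n_freq))
neural = rng.normal(size=(n_frames, n_electrodes))
spectrogram = neural @ true_w + 0.1 * rng.normal(size=(n_frames, n_freq))

# Train: ridge regression from brain activity to spectrogram frames.
# (The study used deep neural networks; a linear decoder illustrates
# the same brainwaves-to-sound-representation mapping.)
lam = 1.0
w = np.linalg.solve(neural.T @ neural + lam * np.eye(n_electrodes),
                    neural.T @ spectrogram)

# Decode: predict the spectrogram for the recorded brain activity.
# A vocoder or similar synthesis step would then turn this predicted
# spectrogram into an audible waveform.
reconstructed = neural @ w
corr = np.corrcoef(reconstructed.ravel(), spectrogram.ravel())[0, 1]
print(f"reconstruction correlation: {corr:.2f}")
```

On this toy data the decoder recovers the spectrogram almost perfectly; real neural recordings are far noisier, which is why the deep-learning and vocoder machinery matters.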
Izzie - That sounds too good to be true. So where did people come into this? How were you able to test that?
Nima - We teamed up with neurosurgeons and whenever they had patients, for example, with epilepsy, the surgeons implant a bunch of electrodes in their brain. And these patients are in the hospital, they're connected to a recording device and we play them sound and we record their brain waves simultaneously.
Izzie - The first part of this experiment involved playing children's storybooks to patients. This helped the algorithm to recognise their all-important brain patterns when hearing speech. But then it was time to see if the algorithm could reverse this, turning those brain patterns into something audible.
Nima - We asked them to listen to numbers from 0 to 9 and the algorithm was never trained on numbers but, looking at half an hour of the speech, it was able to figure out what sort of brainwave activity corresponds to what sort of speech sounds.
And then this algorithm is able to reconstruct the sound that is most similar to what the person actually heard.
And of course, because we know what the person actually heard, we're able to compare the two to determine whether it did a good job or not. And when we did that test we found that what we reconstructed from the brain is highly intelligible.
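That comparison — scoring a reconstruction against the sound the person actually heard — can be done with a simple objective measure. The sketch below uses mean per-frame correlation between spectrograms on synthetic data; the study itself also ran formal intelligibility tests with human listeners, which this toy metric only stands in for.

```python
import numpy as np

def frame_correlation(original, reconstructed):
    """Mean Pearson correlation between matching spectrogram frames:
    a simple objective proxy for how faithful a reconstruction is."""
    scores = []
    for o, r in zip(original, reconstructed):
        o = o - o.mean()
        r = r - r.mean()
        scores.append((o @ r) / (np.linalg.norm(o) * np.linalg.norm(r)))
    return float(np.mean(scores))

rng = np.random.default_rng(1)
orig = rng.normal(size=(100, 32))              # 100 frames, 32 frequency bins
good = orig + 0.1 * rng.normal(size=orig.shape)  # close reconstruction
bad = rng.normal(size=orig.shape)                # unrelated "reconstruction"

print(f"good: {frame_correlation(orig, good):.2f}")
print(f"bad:  {frame_correlation(orig, bad):.2f}")
```

A faithful reconstruction scores near 1, while an unrelated signal hovers near 0 — the same logic, with better metrics and real listeners, is how the team judged their decoded digits.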
Izzie - How intelligible? Well, using the latest speech synthesizers, the algorithm translated their thoughts into this.
Clear robotic voice - zero, one, two, three…
Izzie - And those speech synthesizers, similar to Siri, Google Assistant or any other minion living in your device, helped make this a vast improvement on any previous attempts.
Fuzzy robotic voice - Zero, one, two, three…
Izzie - See what I mean? But Nima seemed very positive that one day this system could translate brain signals into more complex sentences. So could this technology give a voice to those with ALS, locked in syndrome or any other speech impairment?
Nima - So I would say that this is definitely a big step in that direction but obviously there is a lot more that has to be done. Previous studies have shown that there is a lot of similarity between actually listening to speech and imagining listening to speech. But of course it has to be tested and that's also another future direction of our work.
The ideal goal would be to have an implantable device that is able to detect and decode the brain activity that reflects the internal voice of a person. So when the person tries to say something internally, we would be able to decode and translate it into speech, so that the person will have the ability to communicate using speech.