Machine learning turns thoughts into words

How understandable speech has been reconstructed from brain activity...
09 February 2019


The wiring of the brain


Understandable speech has been reconstructed from brain cell activity using machine learning, raising the possibility of giving a voice to the voiceless.

When we speak, or listen to someone speaking, a part of the brain called the auditory cortex becomes active as it processes the speech. When someone loses the ability to speak through disease or trauma, the parts of the nervous system required to make speech movements may no longer work, yet the auditory cortex can remain unharmed. For instance, a person with motor neurone disease is likely to have a perfectly healthy auditory cortex yet be unable to speak because of paralysed muscles. But by studying the patterns of brain cell activity that arise in the auditory cortex when a patient hears speech, scientists have recreated speech artificially with greater clarity than ever before, in an important step towards restoring speech to those who have lost it.

The research, published in the journal Scientific Reports, comes from Nima Mesgarani’s group at the Neural Acoustic Processing Lab, Columbia University. According to Mesgarani, "what we're hoping to do is to directly read speech from the brain of a person so that they do not have to actually say it… As the brain activity is produced we can directly detect and decode it."

This kind of speech reconstruction has been studied before, but Mesgarani and his team brought together two state-of-the-art technologies, machine learning and modern speech synthesis software (of the kind used in virtual assistants such as Siri and Alexa), to reproduce much clearer speech than previous attempts.

To achieve this, the researchers worked with five patients who were undergoing brain surgery to treat epilepsy, all of whom had normal hearing, and placed electrodes directly onto the surface of the auditory cortex in each of their brains. While the patients then listened to 30 minutes of short stories, the team recorded the neural activity as their brains processed the speech. The data were fed through a machine learning algorithm called a neural network, a complex set of instructions that mimics the dense network of connections found in the brain, to teach the algorithm how the brain processes speech.
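For readers who want a feel for the idea, here is a minimal sketch of decoding sound from neural activity. It is not the team's method: it swaps their neural network for a much simpler technique, ridge regression, and it runs on made-up data rather than real recordings. The electrode count, the number of sound-frequency bins, the noise level and every function name are assumptions chosen purely for illustration.

```python
import numpy as np


def fit_ridge_decoder(neural, spectrogram, alpha=1.0):
    """Closed-form ridge regression from electrode signals to sound-frequency bins."""
    n_features = neural.shape[1]
    gram = neural.T @ neural + alpha * np.eye(n_features)
    return np.linalg.solve(gram, neural.T @ spectrogram)


def mean_correlation(predicted, actual):
    """Average Pearson correlation between predicted and actual bins."""
    scores = [
        np.corrcoef(predicted[:, i], actual[:, i])[0, 1]
        for i in range(actual.shape[1])
    ]
    return float(np.mean(scores))


# Synthetic stand-in data: these are NOT real brain recordings.
rng = np.random.default_rng(0)
n_train, n_test, n_electrodes, n_bins = 2000, 500, 64, 16

# Hypothetical mapping from sound to electrode activity (an assumption).
true_encoding = rng.normal(size=(n_bins, n_electrodes))


def simulate(n_samples):
    """Generate fake 'heard sound' and the noisy neural activity it evokes."""
    spectrogram = rng.normal(size=(n_samples, n_bins))
    noise = 0.5 * rng.normal(size=(n_samples, n_electrodes))
    neural = spectrogram @ true_encoding + noise
    return neural, spectrogram


# Fit on one set of recordings, then evaluate on data the model has never seen.
train_neural, train_spec = simulate(n_train)
test_neural, test_spec = simulate(n_test)

weights = fit_ridge_decoder(train_neural, train_spec)
reconstruction = test_neural @ weights
score = mean_correlation(reconstruction, test_spec)
print(f"Held-out reconstruction correlation: {score:.2f}")
```

The point of the sketch is the workflow rather than the model: learn a mapping from neural activity to sound on one set of recordings, then judge it only on recordings it was not trained on.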

This allowed the researchers to play the patient a reading like this:

And reproduce the algorithm’s best guess at what the patient had heard using the neural activity recorded at the auditory cortex:

The researchers went a step further, by combining this machine learning algorithm with recent advances in speech creation software, to give a much clearer, more intelligible, output:

The short story segments were used to train the algorithm, teaching it how to convert neural activity into recognisable speech. Once the algorithm had been trained, the challenge was to feed it new neural activity data and see whether it could reproduce the speech from that data. In this case, the researchers read out the numbers zero to nine to the patients, and you can hear for yourself how it did at reproducing the spoken numbers:

Impressive as it sounds, the model will nevertheless need adapting and fine-tuning to be able to reproduce not just speech that is heard, but speech that is thought or imagined, so you won’t have an algorithm speaking for you any time soon. But Mesgarani is optimistic about the benefits the technology will bring.

"The ideal goal would be to have an implantable device, that is able to detect and decode the brain activity that reflects the internal voice of a person. So when the person tries to say something internally we would be able to decode and translate it into speech so that the person will have the ability to communicate using speech."




