Monday, May 1, 2023

Using an LLM to decode fMRI images of continuous speech

Oliver Whang, "A.I. Is Getting Better at Mind-Reading," NYTimes, May 1, 2023.

The study centered on three participants, who came to Dr. Huth’s lab for 16 hours over several days to listen to “The Moth” and other narrative podcasts. As they listened, an fMRI scanner recorded the blood oxygenation levels in parts of their brains. The researchers then used a large language model to match patterns in the brain activity to the words and phrases that the participants had heard.

Large language models like OpenAI’s GPT-4 and Google’s Bard are trained on vast amounts of writing to predict the next word in a sentence or phrase. In the process, the models create maps indicating how words relate to one another. A few years ago, Dr. Huth noticed that particular pieces of these maps — so-called context embeddings, which capture the semantic features, or meanings, of phrases — could be used to predict how the brain lights up in response to language.
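The idea of using embeddings to predict brain responses can be illustrated with a small encoding-model sketch. This is not the authors' code: it uses synthetic data in place of real embeddings and fMRI recordings, and it assumes a plain ridge regression as the mapping. The names `fit_ridge` and `simulate`, and all sizes and the noise level, are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def simulate(n_timepoints=400, n_features=16, n_voxels=5, noise=0.1, seed=0):
    """Synthetic stand-ins (assumption): random 'embeddings' X and 'voxel responses' Y."""
    rng = np.random.default_rng(seed)
    true_W = rng.normal(size=(n_features, n_voxels))
    X = rng.normal(size=(n_timepoints, n_features))
    Y = X @ true_W + noise * rng.normal(size=(n_timepoints, n_voxels))
    return X, Y

X, Y = simulate()
X_train, Y_train = X[:300], Y[:300]   # fit on "heard" stories
X_test, Y_test = X[300:], Y[300:]     # evaluate on held-out stories

W = fit_ridge(X_train, Y_train, alpha=1.0)
Y_pred = X_test @ W

# Per-voxel correlation between predicted and held-out responses.
voxel_corr = np.array(
    [np.corrcoef(Y_pred[:, v], Y_test[:, v])[0, 1] for v in range(Y.shape[1])]
)
print(voxel_corr.round(3))
```

On this toy data the held-out correlations are high because the synthetic responses really are a linear function of the features; real fMRI data is far noisier.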

In a basic sense, said Shinji Nishimoto, a neuroscientist at Osaka University who was not involved in the research, “brain activity is a kind of encrypted signal, and language models provide ways to decipher it.”

In their study, Dr. Huth and his colleagues effectively reversed the process, using another A.I. to translate the participants’ fMRI images into words and phrases. The researchers tested the decoder by having the participants listen to new recordings, then seeing how closely the translation matched the actual transcript.

Almost every word was out of place in the decoded script, but the meaning of the passage was regularly preserved. Essentially, the decoders were paraphrasing.
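One way to picture the decoding step described above is as a search: propose candidate words, predict the brain response each would evoke, and keep the candidates whose predictions best match the recording. The sketch below is a toy version under stated assumptions, not the published method. The vocabulary, the random embeddings, the matrix `W` (a stand-in for a fitted encoding model), and the functions `predict_response` and `decode` are all hypothetical, and no language model is involved, so every vocabulary word is a candidate at every step.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = ["i", "drove", "home", "she", "left", "quietly"]  # hypothetical vocabulary
n_features, n_voxels = 8, 50
embeddings = {w: rng.normal(size=n_features) for w in vocab}  # random stand-in embeddings
W = rng.normal(size=(n_features, n_voxels))  # stand-in for a fitted encoding model

def predict_response(word):
    """Predicted voxel response for a word under the toy encoding model."""
    return embeddings[word] @ W

def decode(recorded, beam_width=3):
    """Beam search: extend each kept sequence by every word, score by fit to the recording."""
    beams = [([], 0.0)]
    for response in recorded:
        candidates = []
        for words, score in beams:
            for w in vocab:
                err = np.sum((predict_response(w) - response) ** 2)
                candidates.append((words + [w], score - err))
        # Keep only the best-scoring sequences (higher score = smaller total error).
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]

heard = ["she", "drove", "home"]
# Simulated recording: predicted response plus measurement noise.
recorded = [predict_response(w) + 0.5 * rng.normal(size=n_voxels) for w in heard]

decoded = decode(recorded)
print(decoded)
```

With this little noise the toy decoder recovers the words it was given, which is unlike the real result: the article reports that the actual decoder mostly recovered the gist rather than the exact words.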

Limitations: training the model is a long, tedious process, and to be effective it must be done separately for each individual. When the researchers tried to use a decoder trained on one person to read the brain activity of another, it failed, suggesting that every brain has unique ways of representing meaning.

Participants were also able to shield their internal monologues, throwing off the decoder by thinking of other things. A.I. might be able to read our minds, but for now it will have to read them one at a time, and with our permission.

* * * * *

The original research:

Tang, J., LeBel, A., Jain, S. et al. Semantic reconstruction of continuous language from non-invasive brain recordings. Nat Neurosci (2023).

Abstract: A brain–computer interface that decodes continuous language from non-invasive recordings would have many scientific and practical applications. Currently, however, non-invasive language decoders can only identify stimuli from among a small set of words or phrases. Here we introduce a non-invasive decoder that reconstructs continuous language from cortical semantic representations recorded using functional magnetic resonance imaging (fMRI). Given novel brain recordings, this decoder generates intelligible word sequences that recover the meaning of perceived speech, imagined speech and even silent videos, demonstrating that a single decoder can be applied to a range of tasks. We tested the decoder across cortex and found that continuous language can be separately decoded from multiple regions. As brain–computer interfaces should respect mental privacy, we tested whether successful decoding requires subject cooperation and found that subject cooperation is required both to train and to apply the decoder. Our findings demonstrate the viability of non-invasive language brain–computer interfaces.
