The brain is a wonderful, mysterious thing: three pounds of soft gelatinous tissue through which we interact with the world, generate ideas and construct meaning and representation. Understanding where and how this happens has long been among neuroscience’s fundamental goals.
In recent years, researchers have turned to artificial intelligence to make sense of brain activity measured by fMRI, applying AI models to the data to decode, with increasing specificity, what people are thinking and what those thoughts look like in their brains.
An interdisciplinary team at UC Santa Barbara has pushed those boundaries, applying deep learning to fMRI data to create complex reconstructions of what study subjects saw.
“There are several projects that try to translate fMRI signals into images, mostly because neuroscientists want to understand how brains process visual information,” said Sikun Lin, the lead author of a paper presented at the NeurIPS conference in November 2022.
According to Lin, UCSB computer science professor Ambuj Singh and cognitive neuroscientist Thomas Sprague, the images generated in this study are both photorealistic and faithful to the original “ground truth” images. Previous reconstructions, they noted, did not achieve the same level of fidelity.
Key to their approach is a layer of information added on top of the images: textual descriptions, which Lin said provide additional data for training their deep learning model.
Building on a publicly available dataset, they used CLIP (Contrastive Language-Image Pre-training) to encode objective, high-quality text descriptions paired with the observed images, and then mapped the fMRI data of those observed images onto the CLIP space.
From there, they used the output of the mapping models as conditions to train a generative model to reconstruct the image. The resulting reconstructions came remarkably close to the original images viewed by the subjects — closer, in fact, than any previous attempt to reconstruct images from fMRI data. Studies that have followed, including a notable one from Japan, have outlined methods for efficiently reconstructing clear images from limited data.
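The pipeline described above can be sketched roughly in code. The following is a toy illustration with synthetic data, not the team's implementation: a ridge regression stands in for their fMRI-to-CLIP mapping model, and nearest-neighbor retrieval in CLIP space stands in for the conditional generative model that actually produces images. All dimensions and data here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 'n_voxels' fMRI features per trial, 'd_clip' CLIP
# embedding dimension. All data below is synthetic, not the study's dataset.
n_trials, n_voxels, d_clip = 200, 500, 64

# Synthetic stand-in: each trial's fMRI pattern is a noisy linear rendering
# of the viewed image's CLIP embedding.
clip_embeddings = rng.standard_normal((n_trials, d_clip))
true_map = rng.standard_normal((d_clip, n_voxels))
fmri = clip_embeddings @ true_map + 0.1 * rng.standard_normal((n_trials, n_voxels))

# Step 1: learn a linear (ridge) mapping from fMRI space into CLIP space,
# analogous to mapping brain signals onto CLIP embeddings.
lam = 1.0
A = fmri.T @ fmri + lam * np.eye(n_voxels)
W = np.linalg.solve(A, fmri.T @ clip_embeddings)   # shape (n_voxels, d_clip)

# Step 2: project fMRI trials into CLIP space. A real system would feed this
# embedding to a conditional generative model; here, cosine-similarity
# retrieval against the embedding bank stands in for that generation step.
pred = fmri @ W

def retrieve(query_embedding, bank):
    sims = bank @ query_embedding / (
        np.linalg.norm(bank, axis=1) * np.linalg.norm(query_embedding))
    return int(np.argmax(sims))

hits = sum(retrieve(pred[i], clip_embeddings) == i for i in range(n_trials))
print(f"top-1 retrieval accuracy: {hits / n_trials:.2f}")
```

In this simplified setting the linear map recovers the embeddings almost exactly; the hard part of the real system is that fMRI data is far noisier and the generative model must synthesize a photorealistic image from the predicted embedding.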
“One of the main gists of this paper is that visual processes are inherently semantic,” Lin said. According to the paper, “the brain is naturally multimodal”: we use multiple modes of information at different levels to gain meaning from a visual scene, such as what is salient or how the objects in the scene relate to one another.
“Using a visual representation only might make it more difficult to reconstruct the image,” Lin continued, “but using a semantic representation like CLIP, which incorporates text such as the image’s description, is more coherent with how the brain processes information.”
“The science in this is whether the structure of the models can tell you something about how the brain works,” Singh added. “And that’s what we are hoping to try to find.”
In another experiment, for instance, the researchers found that the fMRI brain signals encoded a great deal of redundant information — so much that even after masking more than 80% of the fMRI signal, the remaining 10–20% contained enough data to reconstruct an image within the same category as the original, even though no image information was fed into the signal reconstruction pipeline (the team worked solely from fMRI data).
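The masking finding rests on redundancy: when many voxels carry correlated copies of the same underlying signal, a small surviving fraction can still support category-level decoding. The toy sketch below illustrates that idea with synthetic data and a simple least-squares classifier; it is not the study's method, and all sizes and the "category" label are invented for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic illustration of redundancy (not the study's data): many voxels
# are noisy mixtures of a low-dimensional latent signal.
n_trials, n_voxels, n_latent = 300, 400, 5
latent = rng.standard_normal((n_trials, n_latent))
mixing = rng.standard_normal((n_latent, n_voxels))
fmri = latent @ mixing + 0.2 * rng.standard_normal((n_trials, n_voxels))
labels = (latent[:, 0] > 0).astype(int)   # a toy binary "image category"

def masked_accuracy(keep_fraction):
    # Keep only a random subset of voxels, mimicking masking most of
    # the fMRI signal before decoding.
    keep = rng.choice(n_voxels, int(keep_fraction * n_voxels), replace=False)
    X = fmri[:, keep]
    # Least-squares linear classifier trained on the surviving voxels.
    w, *_ = np.linalg.lstsq(X, 2 * labels - 1, rcond=None)
    return float(np.mean((X @ w > 0).astype(int) == labels))

full = masked_accuracy(1.0)
sparse = masked_accuracy(0.2)   # keep only 20% of the voxels
print(f"accuracy with all voxels: {full:.2f}")
print(f"accuracy with 20% of voxels: {sparse:.2f}")
```

Because the 400 voxels jointly encode only 5 latent dimensions, a random 20% subset still spans the signal, so category decoding barely degrades — the same qualitative behavior the researchers report for heavily masked fMRI data.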
“This work represents a true paradigm shift in the accuracy and clarity of image reconstruction methods,” Sprague said. “Previous work focused on extremely simplistic stimuli, because our modeling approaches were much simpler. Now, with these new image reconstruction methods in hand, we can advance our cognitive computational neuroscience experiments toward using naturalistic, realistic stimuli without sacrificing our ability to generate clear conclusions.”
At the moment, the reconstruction of brain data into “true” images remains labor intensive and out of reach of ordinary use, not to mention that each model is specific to the person whose brain generated the fMRI data. But that doesn’t stop the researchers from musing on the implications of being able to decode what a person is thinking, right down to the layers of meaning that are hyper-specific to each mind.
“What I find exciting about this project is whether it might be possible to preserve the cognitive state of a person, and see how these states so uniquely define them,” Singh said. According to Sprague, these methods would allow neuroscientists to conduct further studies measuring how brains change their representations of stimuli, including rich, complicated scenes, as tasks change.
“This is a critical development that will answer fundamental questions about how brains represent information during dynamic cognitive tasks, including those requiring attention, memory and decision-making,” he said.
One of the areas they are now exploring is how much is shared between brains, so that AI models can be constructed without having to start from zero for each new subject.
“The underlying idea is that the human brain across many subjects share some hidden latent commonalities,” said Christos Zangos, a doctoral student researcher in Singh’s lab. “And based on those, currently I’m working on the exact same framework, but I’m trying to train with a different partition of the data set to see to what extent, using small amounts of data, we could build a model for a new subject.”
Source: UC Santa Barbara