Can Stable Diffusion Make Mind-Reading a Reality?

New research proposes the reconstruction of high-resolution images from human brain activity; opens doors to future technology

Steven Spielberg’s 2002 sci-fi hit Minority Report showed a ‘pre-crime police department’ that prevented future crimes before they were committed, thanks to three clairvoyant humans (precogs). This was a concept beyond imagination. The precogs were genetically engineered to foresee future crimes, which the police could watch via a video projection from their minds.

We may still be far from that, but we are steadily advancing toward generating images from the human brain.

Two researchers in Japan, Yu Takagi and Shinji Nishimoto, recently submitted a paper in which diffusion models (DMs) such as Stable Diffusion were used to generate high-resolution images from human brain activity. The study reconstructs images from brain activity recorded with fMRI (functional magnetic resonance imaging). The goal was to interpret the connection between computer vision models and our visual system. By reconstructing visual experiences from human brain activity, researchers can better understand how the human brain processes visual information.

fMRI, used for image reconstruction, measures brain activity by detecting changes associated with blood flow. The technique relies on the coupling between cerebral blood flow and neuronal activation. In the paper, high-resolution images were reconstructed with high fidelity without any additional training or fine-tuning of complex deep-learning models.

Each component of the latent diffusion model (LDM) is mapped to specific brain regions.

There have been previous attempts to reconstruct visual images from fMRI; newer studies use deep generative models trained on a large number of naturalistic images. These methods have a limitation, though. Training and fine-tuning generative models such as GANs (generative adversarial networks), a type of neural network architecture, on the datasets used in fMRI experiments is challenging because sample sizes in neuroscience are small. DMs and LDMs (latent diffusion models), however, can generate high-resolution images with high semantic fidelity to their text conditioning and with high computational efficiency.

Latent Diffusion Model

An LDM is a type of generative model that learns to create images by transforming a simple noise pattern into a complex image. It is trained on a dataset of images, from which it learns to create new images that look similar to the training data. Once trained, the model creates an image by starting with a random noise pattern and gradually transforming it into one that looks like it belongs in the dataset.
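
The "start from noise and refine step by step" idea can be illustrated with a toy sketch. This is not Stable Diffusion or the method from the paper: the function names, the four-pixel "image" and the step size are all hypothetical, and the denoiser below cheats by knowing the target, whereas a real LDM learns its noise estimate from training data.

```python
import random


def toy_denoiser(x, target):
    """Stand-in for a trained network: estimate the noise still present in x.

    Assumption: a real LDM learns this estimate from data. Here we simply
    use the known target pattern, which a real model never has access to.
    """
    return [xi - ti for xi, ti in zip(x, target)]


def generate(target, steps=50, step_size=0.1, seed=0):
    """Start from random noise and remove a little estimated noise per step."""
    rng = random.Random(seed)
    # Begin with pure Gaussian noise, one value per "pixel".
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for _ in range(steps):
        noise_estimate = toy_denoiser(x, target)
        # Subtract a fraction of the estimated noise at every step.
        x = [xi - step_size * ni for xi, ni in zip(x, noise_estimate)]
    return x


target_pattern = [0.0, 1.0, 0.0, 1.0]  # hypothetical four-pixel "image"
sample = generate(target_pattern)
error = max(abs(s - t) for s, t in zip(sample, target_pattern))
print("largest remaining deviation from the pattern:", error)
```

Each pass shrinks the remaining noise by a fixed fraction, so the sample drifts from random values toward something that resembles the pattern, which is the gradual transformation described above in miniature.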

In the proposed paper, each component of an LDM (Stable Diffusion) is quantitatively interpreted from a neuroscience perspective by mapping specific components to distinct brain regions. 

The model uses an encoder-decoder architecture: one neural network (the encoder) transforms the input data into a fixed-length representation, and another neural network (the decoder) generates the output from this encoding.
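
A minimal sketch can make the encoder-decoder idea concrete. The two functions below are hypothetical stand-ins, not the networks used in Stable Diffusion: the "encoder" just averages neighbouring pixels into a short code and the "decoder" repeats each code value, whereas real encoders and decoders are learned neural networks.

```python
def encode(pixels, latent_size=4):
    """Compress pixel values into a fixed-length code by averaging chunks.

    Assumption: len(pixels) is a multiple of latent_size.
    """
    chunk = len(pixels) // latent_size
    return [
        sum(pixels[i * chunk:(i + 1) * chunk]) / chunk
        for i in range(latent_size)
    ]


def decode(code, output_size):
    """Expand the code back to output_size values by repeating each entry.

    Assumption: output_size is a multiple of len(code).
    """
    repeat = output_size // len(code)
    return [value for value in code for _ in range(repeat)]


image = [0.0, 0.0, 1.0, 1.0, 0.5, 0.5, 0.2, 0.2]  # hypothetical 8-pixel input
latent = encode(image)                      # fixed-length representation
reconstruction = decode(latent, len(image))  # output generated from the code
print(latent)
print(reconstruction)
```

The point of the design is that everything downstream works on the short code rather than the full input, which is part of why latent diffusion models are computationally efficient.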

Image Source: biorxiv.org 
Row 1: Presented images. Row 2: Images reconstructed from fMRI signals.

The research also examined the prediction accuracy of encoding models for three types of latent representations associated with the diffusion model. Latent representations are compressed, abstract features of the data that capture the information most relevant to a particular task. The three are z, a latent representation of the original image; c, a latent representation of the image's text annotation; and zc, a noise-added latent representation of z after the reverse diffusion process with cross-attention to c.
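
To show what "prediction accuracy of an encoding model" can mean in practice, here is a small sketch on synthetic data. The article does not say how the models were fitted or scored, so this assumes a plain linear fit and a Pearson correlation on held-out samples, a common choice in fMRI encoding work; the single latent feature, the simulated voxel and all names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Synthetic data (not from the paper): one latent feature per image and a
# simulated voxel whose response depends linearly on it, plus noise.
rng = random.Random(0)
latent = [rng.gauss(0.0, 1.0) for _ in range(200)]
voxel = [2.0 * z + rng.gauss(0.0, 0.5) for z in latent]

# Fit on the first 150 samples, evaluate on the remaining 50.
train_x, test_x = latent[:150], latent[150:]
train_y, test_y = voxel[:150], voxel[150:]
slope, intercept = fit_line(train_x, train_y)
predicted = [slope * x + intercept for x in test_x]
accuracy = pearson(predicted, test_y)
print("held-out prediction accuracy (Pearson r):", accuracy)
```

Repeating this kind of fit for each latent representation and each brain region would indicate which representation best predicts activity where, which is the spirit of the comparison described above.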

Mind Reading? 

Since the paper was announced, people have been quick to speculate that this model could become the next mind reader. However, the model is not trained to interpret thoughts or words. It is an AI extension of previous studies of brain mapping through fMRI or electroencephalography (EEG), in which the imaging machine can detect only broad patterns of activity. The proposed model is still at a nascent stage of interpreting brain activity.

Future Scope 

Brain mapping is already used in the medical sector to diagnose and understand a patient’s illnesses pertaining to triggers and tumors. With focussed brain readings, doctors are able to deliver targeted treatments. Integrating LDM-based image reconstruction from brain activity with the existing framework of brain mapping could bring further advancements in the medical field.

If the proposed model matures, future refinements could help in jobs where it can be applied extensively. For example, in criminal investigations, eyewitness testimony is influenced by the witness’s mental state and surroundings, which can often cloud the description of the suspect. With such a model, capturing an eyewitness’s or victim’s recollection of the suspect could become simpler. However, implementing such a technology will bring the ethics of mind reading into focus.


Vandana Nair

As a rare blend of engineering, MBA, and journalism degree, Vandana Nair brings a unique combination of technical know-how, business acumen, and storytelling skills to the table. Her insatiable curiosity for all things startups, businesses, and AI technologies ensures that there's always a fresh and insightful perspective to her reporting.