Can Stable Diffusion Make Mind-Reading a Reality?

New research reconstructs high-resolution images from human brain activity, opening doors to future technology

Steven Spielberg’s 2002 sci-fi hit Minority Report showed a ‘pre-crime police department’ that prevented future crimes before they were committed, thanks to three clairvoyant humans (precogs). At the time, the concept seemed beyond imagination. The precogs were genetically engineered to foresee future crimes, which the police could watch via a video projection from their minds.

We may still be far from that, but we are surely advancing on the path to generating images from the human brain.

Two researchers in Japan, Yu Takagi and Shinji Nishimoto, recently submitted a paper in which diffusion models (DMs) such as Stable Diffusion were used to generate high-resolution images from human brain activity. The study reconstructs images from data recorded with fMRI (functional magnetic resonance imaging). The goal was to interpret the connection between computer vision models and the human visual system: reconstructing visual experiences from brain activity offers a way to understand how the brain processes visual information.

fMRI, the technique used for image reconstruction, measures brain activity by detecting changes associated with blood flow. It relies on the fact that cerebral blood flow and neuronal activation are coupled. In the paper, high-resolution images were reconstructed with high fidelity without any additional training or fine-tuning of complex deep-learning models.

Each component of the latent diffusion model (LDM) is mapped to specific brain regions.

There have been previous attempts to reconstruct visual images from fMRI, and newer studies use deep generative models trained on a large number of naturalistic images. These methods have a limitation, though. Training or fine-tuning generative models such as GANs (Generative Adversarial Networks), a type of neural network architecture, on the datasets used in fMRI experiments is challenging because sample sizes in neuroscience are small. DMs and LDMs (latent diffusion models), by contrast, can generate high-resolution images with high semantic fidelity to the text they are conditioned on, and with high computational efficiency.

Latent Diffusion Model

An LDM is a type of generative model that learns to create images by transforming a simple noise pattern into a complex image. It is trained on a dataset of images, from which it learns to create new images that look similar to the training data. Once trained, the model creates an image by starting with a random noise pattern and gradually transforming it into one that looks like it belongs in the dataset.
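The "noise in, image out" idea can be illustrated with a toy sampler. The sketch below is not Stable Diffusion: it works on single numbers instead of image latents, and the trained neural network is replaced by a hypothetical `toy_noise_predictor` that is exact only because the "dataset" is assumed to be a simple bell curve. The schedule values and all function names are illustrative assumptions. What it does share with a real diffusion model is the loop itself: start from random noise and remove a little predicted noise at every step.

```python
import numpy as np


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed values, commonly used in DDPM-style demos)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_noise_predictor(x_t, alpha_bar, data_mean, data_std):
    """Stand-in for the trained network.

    When the data follow N(data_mean, data_std^2), the expected noise
    contained in a noisy sample x_t has this closed form.
    """
    var_t = alpha_bar * data_std**2 + (1.0 - alpha_bar)
    return np.sqrt(1.0 - alpha_bar) * (x_t - np.sqrt(alpha_bar) * data_mean) / var_t


def sample(num_samples, data_mean=3.0, data_std=0.5, seed=0):
    """Turn pure noise into samples that resemble the toy 'dataset'."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule()
    x = rng.standard_normal(num_samples)  # start from random noise
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = toy_noise_predictor(x, alpha_bars[t], data_mean, data_std)
        # Remove the predicted noise for this step.
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            # Every step except the last re-injects a small amount of fresh noise.
            x = x + np.sqrt(betas[t]) * rng.standard_normal(num_samples)
    return x


samples = sample(5000)
print(samples.mean(), samples.std())
```

The generated numbers should cluster around the assumed data mean and spread, even though the loop started from unrelated noise. An LDM runs the same kind of loop on compressed image representations rather than on raw pixels.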

In the paper, each component of an LDM (here, Stable Diffusion) is quantitatively interpreted from a neuroscience perspective by mapping specific components to distinct brain regions.

Image Source: sites.google

An encoder-decoder model is used, in which one neural network (the encoder) transforms the input data into a fixed-length representation and another neural network (the decoder) generates the output from this encoding.
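A minimal sketch of the encoder-decoder idea follows. It is a deliberate simplification: instead of neural networks it uses a linear encoder and decoder obtained from a singular value decomposition, and the "images" are synthetic 64-number vectors. The function name `fit_linear_autoencoder` and all sizes are assumptions made for illustration, not anything taken from the paper.

```python
import numpy as np


def fit_linear_autoencoder(data, code_size):
    """Fit a linear encoder/decoder pair (equivalent to PCA) on the rows of `data`."""
    mean = data.mean(axis=0)
    # Rows of vt are the directions that explain the most variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    components = vt[:code_size]

    def encode(x):
        # Input -> fixed-length code
        return (x - mean) @ components.T

    def decode(code):
        # Fixed-length code -> reconstructed input
        return code @ components + mean

    return encode, decode


# Synthetic "images": 64 values each, secretly driven by only 4 hidden factors.
rng = np.random.default_rng(0)
hidden = rng.standard_normal((200, 4))
mixing = rng.standard_normal((4, 64))
images = hidden @ mixing + 0.01 * rng.standard_normal((200, 64))

encode, decode = fit_linear_autoencoder(images, code_size=4)
codes = encode(images)          # shape (200, 4)
reconstruction = decode(codes)  # shape (200, 64)
relative_error = np.linalg.norm(images - reconstruction) / np.linalg.norm(images)
print(codes.shape, relative_error)
```

Each 64-value input is squeezed into a 4-value code and then expanded again with little loss, which is the same compress-then-rebuild pattern that a deep encoder-decoder applies to real images.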

Image Source: biorxiv.org 
Row 1: Presented images. Row 2: Images reconstructed from fMRI signals.

The research also examined the prediction accuracy of encoding models for three types of latent representations associated with the diffusion model. Latent representations are compressed, abstract features or variables that capture the most relevant and useful information inferred from raw data for a particular task. The three are z, a latent representation of the original image; c, a latent representation of the image’s text annotation; and zc, a noise-added latent representation of z after the reverse diffusion process with cross-attention to c.
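To make "prediction accuracy of an encoding model" concrete, here is a hedged sketch on synthetic data. It assumes the encoding model is a ridge (L2-regularized linear) regression from a latent representation to each voxel's response, and that accuracy is the Pearson correlation between predicted and measured responses on held-out samples. These are common choices for voxel-wise encoding models, but whether they match the paper's exact setup is an assumption here, and every array below is randomly generated rather than real fMRI data.

```python
import numpy as np


def fit_ridge(features, responses, penalty=1.0):
    """Closed-form ridge regression: one weight column per voxel."""
    n_features = features.shape[1]
    gram = features.T @ features + penalty * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ responses)


def voxelwise_correlation(predicted, measured):
    """Pearson correlation between prediction and measurement, per voxel."""
    p = predicted - predicted.mean(axis=0)
    m = measured - measured.mean(axis=0)
    return (p * m).sum(axis=0) / (np.linalg.norm(p, axis=0) * np.linalg.norm(m, axis=0))


# Synthetic stand-ins: latent features (playing the role of z, c, or zc)
# and noisy voxel responses that depend linearly on them.
rng = np.random.default_rng(0)
n_train, n_test, n_latent, n_voxels = 400, 100, 16, 50
latents = rng.standard_normal((n_train + n_test, n_latent))
true_weights = rng.standard_normal((n_latent, n_voxels))
voxels = latents @ true_weights + 2.0 * rng.standard_normal((n_train + n_test, n_voxels))

# Fit on the training split, then score on held-out samples.
weights = fit_ridge(latents[:n_train], voxels[:n_train], penalty=10.0)
accuracy = voxelwise_correlation(latents[n_train:] @ weights, voxels[n_train:])
print(accuracy.shape, accuracy.mean())
```

Repeating this with each of the three latent representations as the input features, and comparing the resulting per-voxel accuracies, is one way to see which representation best explains activity in which part of the brain.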

Image Source: biorxiv.org

Mind Reading? 

Since the paper was announced, people have been quick to ask whether this model could become the next mind reader. However, the model is not trained to interpret thoughts or words. It is an AI extension of previous brain-mapping studies using fMRI or electroencephalography (EEG), where the imaging machine can detect only broad patterns of activity. The proposed model is still at a nascent stage of interpreting brain activity.

Future Scope 

Brain mapping is already used in the medical sector to diagnose and understand patients’ illnesses pertaining to triggers and tumors. With focussed brain readings, doctors are able to deliver targeted treatments. Integrating LDM-based image reconstruction from brain activity with the existing framework of brain mapping could bring advancements in the medical field.

If the proposed model matures, future refinements could probably assist in jobs where such a model can be applied extensively. For example, in criminal investigations, eyewitness testimony is influenced by the mental state and surroundings of the witness, which can often cloud the description of the suspect. With this model, capturing an eyewitness’s or victim’s recollection of the suspect could become simpler. However, implementing such a technology will bring the ethics of mind reading into focus.

Vandana Nair
With a rare blend of engineering, MBA, and journalism degrees, Vandana Nair brings a unique combination of technical know-how, business acumen, and storytelling skills to the table. Her insatiable curiosity for all things startups, businesses, and AI technologies ensures that there's always a fresh and insightful perspective in her reporting.
