DeepMind’s AlphaFold 2 is half of the story

The idea was if I give you a sequence of amino acids, can you predict what will be the structure or the shape that it will take in the 3D space?

Illustration by Analytics India Magazine

Last year, DeepMind, the AI research lab owned by Google’s parent Alphabet, released an open-source version of its deep-learning neural network AlphaFold 2, which was widely hailed as solving the 50-year-old grand challenge of protein structure prediction. For many proteins, this computational method reaches accuracy comparable to that of far more expensive experimental methods like X-ray crystallography.

What’s the big deal?  

DeepMind’s AlphaFold 2 algorithm significantly outperformed other teams at the CASP14 protein-folding contest, and it also improved markedly on its previous version’s performance at the last CASP.

In 2018, the previous version achieved a score of 58 on the hardest class of proteins. The second generation of AlphaFold achieved a score of 87, a huge improvement and 26 points better than the closest competition. For those unaware, these scores are on CASP’s 100-point Global Distance Test (GDT) scale, where a score above 90 is considered roughly equivalent to the experimentally determined structure.

But, there is more

“Protein folding is just half of the story,” said Vikas K. Garg, co-founder and chief scientist at YaiYai. 

An alumnus of IISc and MIT (PhD), Garg is one of the lead researchers behind ‘Generative models for graph-based protein design,’ which served as one of the references that shaped DeepMind’s AlphaFold. His firm YaiYai, which he co-founded alongside Tamar Pichkhadze, provides AI-based solutions in the biopharma, energy, edtech, gaming, and fintech sectors to startups, governments, and leading companies across the globe.

Let’s get real 

Besides water and fat, the human body is made up largely of protein. The misfolding of proteins is an underlying cause of many diseases. Understanding protein folding and protein design helps pave the way for finding cures and designing new medicines and other pharmaceutical solutions. Over the last two years, because of the Covid-19 pandemic, techniques such as AlphaFold have been gaining renewed focus.

Here’s how it works

People get diseases, be it a cold, a fever or Covid-19. There are many different kinds of proteins throughout our bodies, in our cells, and some of them serve as drug targets. So, when we give a patient a tablet or a drug, we first identify the specific proteins in the human body that we want the drug molecules to reach and bind to.

“You should think of it like how you would open a lock. So, think of protein as a lock and your drug or a molecule as a key,” explained Garg. 

He said if you do not open the right lock, there is a risk that you will open something else, which will lead to side effects. To cure a patient while minimising side effects, you administer a molecule that matches the target protein, which takes a specific shape in 3D space.

Think of it like these balls (proteins) coming together. Once you find the right key to the lock, the activity is restored. In other words, the drug has the effect of restoring the normal functionality of the human body. 

Simply put, there are two parts to this process. The first is target identification: choosing the proteins we want to target for a particular disease. The second is finding the right drugs that can have the desired therapeutic effect, for example, the Remdesivir injection used in treating Covid-19.

“So, you want these drugs to technically open every lock. For that, you need just the right amount of the right key. You don’t want to open too many locks. You want to open only the lock you are looking for,” said Garg. 

Here’s how you do it

There are two ways: either you design new proteins (new locks) and find a cure (key), or you identify new proteins and inject molecules.

Why is AlphaFold half the story? 

“Now, when we talk about AlphaFold, for example, and protein fold, this is only the half of the story, probably a quarter of the story,” stressed Garg.

He said that proteins are very complex structures, with many properties that have evolved over evolutionary timescales. For example, some proteins are known to be more biologically stable than others. “The space where we are operating is humongous,” he added, saying that it is really hard to find the right drugs or generate new proteins.

“This is where the fun part lies,” said Garg, “This is also where I want to emphasise why quantum holds so much promise. Of course, AI has been making very rapid progress. But, quantum gives you this flexibility.” 

Garg said that, if done well, quantum methods have tremendous potential because they would be able to search through this humongous space very quickly. “Much more quickly than any classical methods today,” he said.

Throwing light on AlphaFold, Garg said: “Proteins are very compact structures.” Each protein can be viewed as a sequence of amino acids. There are 20 standard amino acids in our body, including histidine, isoleucine, methionine, leucine, lysine and phenylalanine. “Think of it as a chain, where at each location, and each position in that chain, you are labelling it as one of these 20 amino acids,” explained Garg, saying that this is a very complex problem.

“Now, imagine, 20 raised to the power of 1000 possibilities,” added Garg, referring to the number of possible sequences for a chain of 1,000 amino acids and underlining the complexity of solving this problem.
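
To get a feel for the number Garg mentions, here is a minimal Python sketch of the arithmetic. It assumes a chain in which every position can independently be any of the 20 standard amino acids; the function names are made up for illustration and do not come from the interview.

```python
import math

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_space_size(chain_length: int) -> int:
    """Number of distinct sequences for a chain of the given length,
    assuming each position can be any of the 20 amino acids."""
    return len(AMINO_ACIDS) ** chain_length


def order_of_magnitude(chain_length: int) -> int:
    """Power of ten of the sequence count, computed with logarithms
    so the huge number never has to be written out."""
    return math.floor(chain_length * math.log10(len(AMINO_ACIDS)))


print(sequence_space_size(3))    # 20 * 20 * 20 = 8000
print(order_of_magnitude(1000))  # 1301, i.e. 20**1000 is about 10**1301
```

Even a short chain of three amino acids already allows 8,000 sequences; at 1,000 positions the count has more than 1,300 digits, which is why exhaustively trying every sequence is not an option.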

Inverse protein folding 

“The idea was if I give you a sequence of amino acids, can you predict what will be the structure or the shape that it will take in the 3D space?” said Garg. He said that this is a problem AlphaFold solved using deep learning. 

“Of course, it is a useful thing, but it is not the end goal,” he added, saying that it is just a part of the process. “The end goal is that you want to design new proteins,” said Garg, proposing an inverse protein folding technique. “I will give you the shape of the lock, but I will not tell you which is the right key to open the lock. Now, the idea is can I map the 3D structure to the sequence,” said Garg.  

How does it work? “We created these 3D structures, where we used very advanced deep learning methods to predict which amino acids or new proteins can be generated. Because we already have the structure, you can fill in the amino acids at each position, and you get a new protein,” said Garg.
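
The "fill in the amino acids at each position" step can be pictured with a toy Python sketch. This is not Garg's model: the scoring function below is a random placeholder standing in for a trained deep-learning model that would compute its scores from the 3D structure, and all names are hypothetical. Choosing each position independently is also a simplification made only for illustration.

```python
import random

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def placeholder_scores(num_positions: int, seed: int = 0) -> list[list[float]]:
    """Hypothetical stand-in for a trained model.

    Returns one score per amino acid for every position in the structure.
    Here the scores are random; a real model would derive them from the
    3D coordinates of the target shape.
    """
    rng = random.Random(seed)
    return [[rng.random() for _ in AMINO_ACIDS] for _ in range(num_positions)]


def fill_in_sequence(scores: list[list[float]]) -> str:
    """Pick the highest-scoring amino acid at each position."""
    chosen = []
    for row in scores:
        best_index = max(range(len(AMINO_ACIDS)), key=row.__getitem__)
        chosen.append(AMINO_ACIDS[best_index])
    return "".join(chosen)


scores = placeholder_scores(num_positions=8)
sequence = fill_in_sequence(scores)
print(sequence)  # an 8-letter amino-acid string, one letter per position
```

The point of the sketch is the direction of the mapping: the input is a structure with a fixed number of positions, and the output is a sequence, the reverse of what AlphaFold does.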

He said that the potential of inverse protein folding is immense; you can design therapeutics, new materials, synthesise new batteries for electric vehicles, etc.

PS: The story was written using a keyboard.

Amit Raja Naik

Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.