DeepMind’s AlphaFold 2 is half of the story


Last year, Google’s research arm DeepMind released an open-source version of its deep-learning neural network AlphaFold 2, which is widely regarded as having essentially solved the 50-year-old grand challenge of protein folding. This computational method achieved accuracy competitive with much more expensive experimental methods like X-ray crystallography.

What’s the big deal?  

DeepMind’s AlphaFold 2 algorithm significantly outperformed the other teams at the CASP14 protein-folding contest, and it also far exceeded what its own predecessor achieved at the previous CASP.

In 2018, the first version of AlphaFold achieved a score of 58 on the hardest class of proteins. The second generation achieved a score of 87, which is a huge improvement and 26 points better than the closest competition. For those unaware, CASP scores predictions with the Global Distance Test (GDT), which runs from 0 to 100, and a score above 90 is considered roughly equivalent to the experimentally determined structure.
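To make the scores above concrete, here is a minimal sketch of how a GDT-style score can be computed. It assumes the predicted and experimental structures are already superimposed and reduced to one coordinate per residue; the official CASP calculation also searches over many superpositions, which this sketch leaves out. The function name `gdt_ts` and the toy coordinates are illustrative, not taken from CASP tooling.

```python
import math

def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two already-superimposed residue traces.

    predicted, reference: equal-length lists of (x, y, z) coordinates in angstroms.
    Returns a score between 0 and 100.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Distance between each predicted residue and its experimental position.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    # Fraction of residues that fall within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    # The score is the average of those fractions, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)

# Toy example (made-up coordinates): four residues, off by 0.5, 1.5, 3.0 and 9.0 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 9.0, 0.0)]
print(gdt_ts(predicted, reference))  # 56.25
```

A perfect prediction scores 100 under this measure, which is why a result in the high 80s on the hardest targets drew so much attention.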

But, there is more

“Protein folding is just half of the story,” said Vikas K. Garg, co-founder and chief scientist at YaiYai. 

An IISc and MIT (PhD) alumnus, Garg is one of the lead researchers who worked on developing ‘Generative models for graph-based protein design,’ which served as one of the references to shape DeepMind’s AlphaFold. His firm YaiYai, which he co-founded alongside Tamar Pichkhadze, provides AI-based solutions in biopharma, energy, edtech, gaming, and fintech sectors to startups and governments, as well as leading companies across the globe. 

Let’s get real 

Apart from water and fat, the human body is made up almost entirely of protein. The misfolding of proteins is the underlying cause of many diseases. Understanding protein folding and protein design helps pave the way for finding cures and designing new medicines. Over the last two years, because of the Covid-19 pandemic, techniques such as AlphaFold have been gaining renewed attention.

Here’s how it works

People fall ill, whether with a cold, a fever or Covid-19. There are many different kinds of proteins throughout our bodies, in our cells, and some of them serve as targets. So, when we give a patient a tablet or a drug, we identify specific proteins in the human body where we want the drug molecules to go and bind.

“You should think of it like how you would open a lock. So, think of protein as a lock and your drug or a molecule as a key,” explained Garg. 

He said if you do not open the right lock, there is a risk that you will open something else, which will lead to side effects. To treat the illness while minimising side effects, you administer a molecule that matches the target protein, which has a specific 3D shape.

Think of it like these balls (proteins) coming together. Once you find the right key to the lock, the activity is restored. In other words, the drug has the effect of restoring the normal functionality of the human body. 

Simply put, there are two parts to this process. The first is identifying the proteins we want to target for a particular disease. The second is finding the right drugs that can have the desired therapeutic effect, for example, Remdesivir injections in treating coronavirus.

“So, you want these drugs to technically open every lock. For that, you need just the right amount of the right key. You don’t want to open too many locks. You want to open only the lock you are looking for,” said Garg. 

Here’s how you do it

There are two ways: either you design new proteins (new locks) and find a cure (key), or you identify new proteins and inject molecules that bind to them.

Why is AlphaFold half the story? 

“Now, when we talk about AlphaFold, for example, and protein fold – this is only the half of the story, probably a quarter of the story,” stressed Garg.

He said that proteins are very complex structures, with many properties that have evolved over millions of years. For example, some proteins are known to be more biologically stable than others. “The space where we are operating is humongous,” he added, saying that it is really hard to find the right drugs or generate new proteins.

“This is where the fun part lies,” said Garg, “This is also where I want to emphasise why quantum holds so much promise. Of course, AI has been making very rapid progress. But, quantum gives you this flexibility.” 

If done well, quantum methods have tremendous potential because they could search through this humongous space very quickly. “Much more quickly than any classical methods today,” said Garg.

Throwing light on AlphaFold, Garg said: “Proteins are very compact structures.” Each protein can be viewed as a sequence of amino acids. There are 20 standard amino acids in our body, including histidine, isoleucine, methionine, leucine, lysine and phenylalanine. “Think of it as a chain, where at each location, and each position in that chain, you are labelling it as one of these 20 amino acids,” explained Garg, saying that this is a very complex problem.

“Now, imagine, 20 raised to the power of 1000 possibilities,” added Garg, underlining the complexity of solving this problem. 
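To put that number in perspective, here is a small sketch that treats a protein as a chain over a 20-letter alphabet and counts the possible chains of length 1,000. The one-letter codes are the standard ones for the 20 amino acids; the helper names are illustrative.

```python
# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def sequence_space_size(length, alphabet_size=len(AMINO_ACIDS)):
    """Number of distinct chains of the given length over the alphabet."""
    return alphabet_size ** length

def is_valid_protein(sequence):
    """True if every position is labelled with one of the 20 amino acids."""
    return len(sequence) > 0 and all(residue in AMINO_ACIDS for residue in sequence)

count = sequence_space_size(1000)  # 20 raised to the power of 1000
digits = len(str(count))
print(digits)                      # 1302
print(is_valid_protein("MKV"))     # True
```

The count has 1,302 digits, so it is roughly 10 to the power of 1,301, far too many candidates to check one by one.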

Inverse protein folding 

“The idea was if I give you a sequence of amino acids, can you predict what will be the structure or the shape that it will take in the 3D space?” said Garg. He said that this is a problem AlphaFold solved using deep learning. 

“Of course, it is a useful thing, but it is not the end goal,” he added, saying that it is just a part of the process. “The end goal is that you want to design new proteins,” said Garg, proposing an inverse protein folding technique. “I will give you the shape of the lock, but I will not tell you which is the right key to open the lock. Now, the idea is can I map the 3D structure to the sequence,” said Garg.  

How does it work? “We created these 3D structures, where we used very advanced deep learning methods to predict which amino acids or new proteins can be generated. Because we already have the structure, you can fill in the amino acids at each position, and you get a new protein,” said Garg.
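The "fill in the amino acids at each position" step can be pictured with a toy sketch. It assumes a model has already scored how well each amino acid fits each position of a fixed 3D structure; the scores below are made up, and `decode_sequence` is a hypothetical helper, not code from Garg's work. Choosing the top score independently at each position is the simplest possible stand-in; real models may condition each choice on the residues already chosen.

```python
# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def decode_sequence(position_scores):
    """Pick the highest-scoring amino acid at each position of the structure.

    position_scores: one dict per position, mapping one-letter codes to
    scores (hypothetical model output; higher means a better fit).
    """
    sequence = []
    for index, scores in enumerate(position_scores):
        unknown = set(scores) - set(AMINO_ACIDS)
        if not scores or unknown:
            raise ValueError(f"bad scores at position {index}")
        # Greedy choice: take the best-fitting amino acid for this position.
        sequence.append(max(scores, key=scores.get))
    return "".join(sequence)

# Made-up scores for a three-position structure.
toy_scores = [
    {"M": 0.9, "L": 0.1},
    {"K": 0.6, "R": 0.4},
    {"V": 0.5, "I": 0.3, "L": 0.2},
]
print(decode_sequence(toy_scores))  # MKV
```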

He said that the potential of inverse protein folding is immense: you can design therapeutics and new materials, synthesise new batteries for electric vehicles, and more.

Amit Raja Naik
Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.
