How this Amsterdam-based AI Startup is Building ‘ChatGPT for Protein Formulas’

Cradle brings generative AI to DNA and protein sequences.

Illustration by Nikhil Kumar

When NVIDIA chief Jensen Huang spoke about how he used ChatGPT to understand how generative AI could be used to solve real-world problems such as dissolving plastics and reducing carbon emissions, little did we know that a European AI startup would use LLMs for DNA and protein sequences to address these very problems, and that actual use cases would emerge.

“About 60% of the things that we consume today, whether they’re drugs or food or chemicals, you could be making through biological means. That just felt a lot more impactful than some of the other applications that people were working on,” said Stef van Grieken, co-founder and CEO of Cradle, in an exclusive interaction with AIM.

Engineering Biology with LLMs

Cradle is a European biotech startup that employs AI to help scientists design and engineer proteins faster and more cost-effectively. The startup focuses on engineering protein modalities such as enzymes, vaccines, peptides, and antibodies with the help of generative AI.

Akin to ChatGPT, where you give it an equation and get an answer, or a diffusion model, where you give it a prompt and get a picture, Cradle takes as input a DNA description or a description of how a molecule looks, along with what the molecule needs to do: for instance, bind to a particular target on the cell, be stable, or be soluble in water.

“What it does is it generates another set of sequences that you can bring into your laboratory that have a much higher probability of doing that,” said Grieken. “Instead of diffusing a picture, you’re diffusing a molecule.” 

Similar to how language models such as BERT are trained by infilling, in which words are removed from sentences and the model is asked to fill them in, Cradle's models are trained the same way, except that the infilling is done on DNA and protein sequences.
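To make the infilling idea concrete, here is a minimal Python sketch of how training examples for such an objective could be prepared. It is not Cradle's method: the mask symbol, the 15% mask fraction, the function names `mask_sequence` and `fill_in`, and the example protein fragment are all assumptions chosen for illustration. The sketch only shows how residues might be hidden and what a model would be asked to recover.

```python
import random

MASK = "#"  # hypothetical mask symbol; chosen because it is not an amino-acid letter


def mask_sequence(sequence, mask_fraction=0.15, seed=0):
    """Hide a random subset of residues in a non-empty sequence.

    Returns (masked_sequence, targets), where targets maps each hidden
    position to the residue the model would be asked to recover.
    """
    rng = random.Random(seed)
    n_masked = max(1, round(len(sequence) * mask_fraction))
    positions = sorted(rng.sample(range(len(sequence)), n_masked))
    residues = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = residues[pos]
        residues[pos] = MASK
    return "".join(residues), targets


def fill_in(masked_sequence, predictions):
    """Write predicted residues back into the masked positions."""
    residues = list(masked_sequence)
    for pos, residue in predictions.items():
        residues[pos] = residue
    return "".join(residues)


# Illustrative protein fragment in one-letter amino-acid codes (an assumption).
protein = "MKTAYIAKQRQISFVKSHFSRQ"
masked, targets = mask_sequence(protein)

# A perfect model would predict exactly the hidden residues,
# so filling in the targets restores the original sequence.
restored = fill_in(masked, targets)
```

In a real training setup, a model would see `masked` as input and be scored on how well its predictions match `targets`; here the targets themselves stand in for a perfect prediction.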

With these models, both the number of improvements that surpass previous benchmarks and the size of those improvements are approximately double those of previous methods. “This means that you reach your target twice as fast over the duration of an R&D project,” said Grieken.

“A lot of the work that companies like Google, Facebook and others are doing is more in machine learning research and development. They’re not trying to build tools that help biologists use these types of methods in a sort of easy fashion,” he said.

Cradle works on proprietary models that draw inspiration from open-source models such as the Transformer-based BERT. “In terms of technology capabilities in biology, such as molecular biology, we’re still much like GPT 0.5,” he said.

Data and Feedback Loops Remain Challenging

The scarcity of protein data slows the development of such models, especially when compared with training GPT models on all the information available on the internet. “Training these models on public data is really hard to do. It’s one of the reasons why we have our in-house laboratory to effectively build training sets for these machine learning models to learn faster,” said Grieken.

The slow feedback loops for these models also impede progress. Grieken compares the process to GPT models, where instant feedback on whether a generated result is right, wrong, or bad can be used to train the models immediately. “In our case, it takes three months between the thing being generated and results coming back,” he said. Furthermore, the cost of generating results is high and can range from $30 to thousands of dollars per data point.

To Make the World a Better Place

Cradle addresses a number of real-world problems associated with medical research, particularly around time, cost, and logistical accessibility. Many vaccines are hard to distribute in different parts of the world because they depend on cold storage and distribution networks.

“If you can develop certain drugs that work at room temperature, you can bring them to more places in the world, which is helpful, so you can end up with a better product,” said Grieken.  

Grieken also believes that if the time and money required to bring out solutions for curing diseases, or for moving away from petrochemical oil-based products to more bio-based products, are reduced, many more of these types of products will enter the market.

“I have two small daughters. Twenty years from now they will ask me what I did when the Earth caught fire, and the answer, ‘I was working for an advertising company’, is probably not the best one. So, try to make use of my time,” joked Grieken when asked about the inspiration to start Cradle. 

Drawing on his extensive experience at a big-tech company like Google, Grieken recommends that everyone work for a large tech company for a while and then go build something else once they have gathered the learnings.

“I’m incredibly grateful to Google. First of all, they teach you how to do engineering. Secondly, I was fortunate to be at Google around the time when language models started to emerge,” said Grieken, who considers himself to be incredibly lucky to be there in the early days. 

Cradle has raised a total of $29.7 million in funding and has two offices: one in Amsterdam, the Netherlands, and one in Zurich, Switzerland.


Vandana Nair

With a rare blend of engineering, MBA, and journalism degrees, Vandana Nair brings a unique combination of technical know-how, business acumen, and storytelling skills to the table. Her insatiable curiosity for all things startups, businesses, and AI technologies ensures that there's always a fresh and insightful perspective to her reporting.