2022 is going to be the year of Foundational Models in AI research

The foundational models are based on concepts that have been around for a while, like deep neural networks and self-supervised learning.

Last year, researchers at Stanford University published a paper detailing a ‘paradigm shift’ in AI driven by ‘foundation models.’ The term may be fresh, but the idea behind it is not. A foundational model is one trained on data at scale that can be adapted to perform multiple tasks: it is a single model underlying other downstream models, hence the ‘foundation’ prefix.

The tipping point for foundational models came in 2018 when Google developed BERT, a natural language processing model, followed by OpenAI’s GPT-3 and CLIP.


Education

Student loans are a crippling problem in the United States, with total debt hovering around $1.6 trillion, exceeding even the country’s total credit card debt. Foundational models can make teaching and learning processes more effective.

Existing work has focused on custom solutions to highly specific tasks, for which large amounts of training data have to be collected from scratch, the paper said. Due to the difficulty and cost of creating large datasets, using this approach to solve every educational task independently is fundamentally limited; instead, we need general-purpose approaches that are reusable across various tasks and subjects. For example, MathBERT is a model created to retrace the path of a student’s understanding of a subject based on their past responses.
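MathBERT itself adapts a transformer to students’ response histories. As a much simpler, self-contained illustration of the underlying knowledge-tracing task, the classic Bayesian Knowledge Tracing update can be sketched in a few lines (the probabilities below are illustrative defaults, not values from MathBERT):

```python
def bkt_update(p_know, correct, p_learn=0.2, p_guess=0.25, p_slip=0.1):
    """One Bayesian Knowledge Tracing step: update the probability that a
    student has mastered a skill, given one observed response."""
    if correct:
        # Posterior given a correct answer (student knew it, or guessed)
        evidence = p_know * (1 - p_slip) / (
            p_know * (1 - p_slip) + (1 - p_know) * p_guess)
    else:
        # Posterior given a wrong answer (student slipped, or didn't know)
        evidence = p_know * p_slip / (
            p_know * p_slip + (1 - p_know) * (1 - p_guess))
    # Chance the student learns the skill during this step
    return evidence + (1 - evidence) * p_learn

p = 0.1  # prior probability of mastery
for answer in [True, True, False, True, True]:
    p = bkt_update(p, answer)
```

After a run of mostly correct answers, the estimated mastery probability rises well above the prior, which is the signal a tutoring system would use to decide what to teach next.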

Foundational models, which run to millions of parameters, can be fine-tuned and deployed in more advanced ways for tasks like clearing students’ doubts.
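To make the fine-tuning idea concrete, here is a deliberately tiny sketch under stated assumptions: a frozen pretrained backbone (stood in for by a fixed feature function, not a real foundation model) and a small trainable head that is adapted to a new task with a little labelled data:

```python
import math

def frozen_backbone(x):
    """Stand-in for a frozen pretrained encoder (BERT, GPT-3 in practice):
    maps a raw input to a learned representation. Here, a fixed transform."""
    return x / 5.0

def train_head(data, lr=0.5, epochs=300):
    """Fine-tuning step: only the small task-specific head (w, b) is
    updated; the backbone's parameters stay frozen."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            f = frozen_backbone(x)
            p = 1.0 / (1.0 + math.exp(-(w * f + b)))  # sigmoid head
            g = p - y                                  # log-loss gradient
            w -= lr * g * f
            b -= lr * g
    return w, b

def predict(w, b, x):
    return 1 if w * frozen_backbone(x) + b > 0 else 0

# Toy downstream task: is the input "large" (greater than 2)?
data = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (5.0, 1)]
w, b = train_head(data)
preds = [predict(w, b, x) for x, _ in data]
```

The design choice this illustrates is why foundation models are economical: the expensive backbone is trained once, and each new task only pays for a small head.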


Healthcare

Healthcare and medical services in the United States rank poorly in terms of access and administrative efficiency, despite the country spending a disproportionate share of its budget on healthcare (17% of GDP). The strong adaptive ability of foundational models can fill gaps in healthcare infrastructure.

Especially with the ongoing pandemic, foundational models can be used to develop automated chatbots for urgent diagnoses, summarise patient records and pull relevant events from a patient’s history. Such models can also expedite the drug discovery process. In addition, foundation models can merge multiple data modalities in medicine, making it easier to investigate biomedical concepts across scales (molecular, patient and demographic data) and diverse knowledge sources (imaging, textual and chemical descriptions).

Biases and accuracy

Large models like BERT gained popularity because of their ease of use and adaptability. Codex, DALL-E, T5 and CLIP are used everywhere, from speech recognition and coding to computer vision. GPT-3 forms the basis for interactive gaming, search engines and software to create low-code applications on Microsoft Azure. However, the findings of the Stanford research highlighted the pitfalls of GPT-3.

An automated tweet generator based on the model went rogue. Another paper, by Stanford PhD candidate Abubakar Abid, found that GPT-3’s text tended to associate ‘Jews’ with ‘money.’ The dangers of large-scale use cases are obvious: racist and offensive stereotypes would trickle down to the downstream models and perpetuate the bias.

Additionally, a medical chatbot that used GPT-3 advised a suicidal patient to ‘kill themselves.’ And according to research from UC Berkeley titled ‘Measuring Mathematical Problem Solving with the MATH Dataset,’ the accuracy of large language models on the benchmark is low, between 3% and 6.9%.

Status quo

So far, the advantages of foundational models have been far too enticing to pass up. OpenAI has continued to allow developers access to GPT-3 despite the criticism. Facebook’s AI Research team has introduced FLAVA, a next-generation foundational model designed to work on vision, language, and vision-and-language tasks simultaneously. Google’s Cloud business has also built a system based on foundational models to detect defects in physical products.

Andrew Moore, the former dean of Carnegie Mellon University’s computer science school, who currently heads AI and industry solutions for Google’s Cloud segment, acknowledged the shift towards large language models. It is simply more convenient to have a basic algorithm that serves as a building block and can be altered in millions of ways to build increasingly sophisticated AI software. The potentially dangerous effects of the biases can be mitigated with cautious human review, he said. Google’s Cloud service has two AI ethics review committees for this explicit purpose, called ‘Iced Tea’ and ‘Lemonaid.’


Poulomi Chatterjee
Poulomi is a Technology Journalist with Analytics India Magazine. Her fascination with tech and eagerness to dive into new areas led her to the dynamic world of AI and data analytics.

