Top 7 ChatGPT Alternatives

We are exploring some well-known large language model alternatives built on transformer models!

While ChatGPT continues to make headlines for all the right and wrong reasons, the human-like chatbot can perform a host of tasks, ranging from solving mathematical problems to generating code, writing essays and more. 

This new tool from OpenAI is already changing people’s perspectives and the way they search for information by answering intricate questions. However, it is not entirely alone in this space. In this article, we are exploring some well-known large language model alternatives built on transformer models—similar to GPT-3 and BERT. 


LaMDA

Developed by Google with 137 billion parameters, LaMDA was a revolution in the natural language processing world. It was built by fine-tuning a family of Transformer-based neural language models. For pre-training, the team created a dataset of 1.56 trillion words, nearly 40 times more than what was used for previously developed models. LaMDA has already been used for zero-shot learning, program synthesis, and in the BIG-bench workshop.

Blender Bot 2 

Blender Bot 2 is Meta’s conversational AI prototype; its third iteration, BlenderBot 3, was released a few months ago. The latest model is based on 175 billion parameters and has its own long-term memory, drawing on dialogue history, internet search, and that memory to produce output.

Alexa Teacher Model

Alexa Teacher Model (ATM) is a large language model with 20 billion parameters. AlexaTM 20B is a sequence-to-sequence (seq2seq) language model with state-of-the-art few-shot learning capabilities. What sets it apart from decoder-only models such as GPT-3 is that it has both an encoder and a decoder, which improves its performance on tasks such as machine translation. With one-eighth the number of parameters, the language model by Amazon outperformed GPT-3 on the SQuADv2 and SuperGLUE benchmarks. 

DialoGPT

DialoGPT is a large-scale, pre-trained dialogue response generation model for multi-turn conversations. The model is trained on 147 million multi-turn dialogues from Reddit discussion threads.
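To handle multi-turn conversations, DialoGPT flattens the whole dialogue history into one text sequence, with each turn terminated by GPT-2's end-of-text token, and conditions generation on that string. A minimal sketch of the formatting (the helper name is ours):

```python
# DialoGPT conditions on the full dialogue history as a single string,
# with every turn terminated by GPT-2's end-of-text token.
EOS = "<|endoftext|>"

def build_dialogue_context(turns):
    """Flatten a list of dialogue turns into one conditioning string."""
    return EOS.join(turns) + EOS

context = build_dialogue_context(
    ["Does money buy happiness?", "Depends how much you spend on it."]
)
print(context)
```

The model then generates the next turn as an ordinary language-model continuation of this string.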


GODEL

GODEL evolved out of Microsoft’s 2019 DialoGPT project. The model combines two functionalities that most chatbots focus on separately: task-oriented dialogue and making the dialogue more realistic and social. So, for example, GODEL can recommend a restaurant, simultaneously engage in a conversation about sports or the weather, and then bring the conversation back on track. 
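The released GODEL checkpoints take a single text query that stitches together an instruction, the dialogue context, and optional grounding knowledge (such as a restaurant listing) with special markers. The sketch below mirrors the format shown on the model card; the helper name is ours:

```python
def build_godel_query(instruction, dialog_turns, knowledge=""):
    """Build a GODEL-style input from an instruction, dialogue turns,
    and optional grounding text."""
    context = " EOS ".join(dialog_turns)        # turns separated by a literal " EOS "
    query = f"{instruction} [CONTEXT] {context}"
    if knowledge:
        query += f" [KNOWLEDGE] {knowledge}"    # grounding, e.g. a restaurant listing
    return query

q = build_godel_query(
    "Instruction: given a dialog context, you need to respond empathically.",
    ["I lost my job today.", "Oh no, what happened?"],
)
print(q)
```

Supplying the `[KNOWLEDGE]` segment is what lets the same model stay grounded in a task (the restaurant recommendation) while the `[CONTEXT]` segment carries the free-flowing social chat.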


Sparrow

DeepMind’s AI chatbot ‘Sparrow’ is a “useful dialogue agent that reduces the risk of unsafe and inappropriate answers”. It is trained to converse with a user, answer queries and even search the internet using Google to provide evidence to inform its responses.

Besides reinforcement learning from human feedback, Sparrow is based on Chinchilla, a 70-billion-parameter model whose comparatively modest size makes inference and fine-tuning lighter tasks. Moreover, it is trained with 23 rules to prevent it from delivering biased and toxic answers. 
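In Sparrow, rule compliance is judged by a learned rule model that scores candidate responses for violations; as a purely illustrative toy (the two rule names paraphrase themes from the published rule set, and the keyword checks stand in for the learned model), the filtering idea looks roughly like this:

```python
# Toy stand-ins for Sparrow's learned rule model: each rule maps to a
# predicate returning True when a candidate response satisfies it.
RULES = {
    "no human identity claims": lambda r: "i am a human" not in r.lower(),
    "no medical diagnoses": lambda r: "diagnosis" not in r.lower(),
}

def violated_rules(response):
    """Return the names of the rules a candidate response breaks."""
    return [rule for rule, ok in RULES.items() if not ok(response)]

def pick_safest(candidates):
    """Return the first candidate that breaks no rules. A real system would
    rank candidates with learned preference and rule reward models."""
    for c in candidates:
        if not violated_rules(c):
            return c
    return "I'm not able to help with that."
```

The real system combines this rule signal with a preference reward model during reinforcement learning, rather than filtering at answer time alone.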

However, the model was taken down for improvements. 


Galactica

In November 2022, Meta released Galactica, an open-source large language model with 120 billion parameters trained on scientific knowledge. The AI-generative tool was intended to assist academic researchers by producing extensive literature reviews, generating Wiki articles on any topic, creating lecture notes on scientific texts, answering questions, solving complex mathematical problems, annotating molecules and proteins, and more. 

However, when members of the community started using the all-new AI model, many found the results to be inaccurate, forcing the tech giant to take the model down within days of its launch.

Aparna Iyer
Aparna Iyer has covered various sectors spanning education, wildlife, culture and law for close to a decade. She now writes on technology and is keen to unearth its capability for public good.
