ChatGPT Alternatives

We are exploring some well-known large language model alternatives built on transformer models!

ChatGPT continues to make headlines for all the right and wrong reasons. The human-like chatbot can perform a host of tasks, ranging from solving mathematical problems to generating code and writing essays.

This new tool from OpenAI is already changing people’s perspectives and the way they search for information by answering intricate questions. However, it is not entirely alone in this space. In this article, we are exploring some well-known large language model alternatives built on transformer models—similar to GPT-3 and BERT.

LaMDA

Developed by Google with 137 billion parameters, LaMDA was a revolution in the natural language processing world. It was built by fine-tuning a family of Transformer-based neural language models. For pre-training, the team created a dataset of 1.5 trillion words, about 40 times more than was used to train earlier dialogue models. LaMDA has already been used for zero-shot learning, program synthesis, and the BIG-bench workshop.

Blender Bot 2

Blender Bot 2 is Meta’s chatbot, the third iteration of which was released a few months ago. The latest conversational AI prototype has 175 billion parameters and its own long-term memory. The model uses dialogue history, internet search, and memory to produce output.
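As a rough illustration of how three context sources could feed a single model input, here is a minimal Python sketch. It is not Meta's implementation: the `ChatContext` class, the `build_model_input` function, and the bracketed labels are hypothetical names invented for this example, and a real system would retrieve and rank this content rather than just take the most recent items.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChatContext:
    """Hypothetical container for the three context sources named in the text."""
    dialogue_history: List[str] = field(default_factory=list)
    search_results: List[str] = field(default_factory=list)
    long_term_memory: List[str] = field(default_factory=list)


def build_model_input(ctx: ChatContext, user_message: str, max_items: int = 3) -> str:
    """Join the most recent items from each source into one text prompt."""
    sections = [
        ("memory", ctx.long_term_memory[-max_items:]),
        ("search", ctx.search_results[-max_items:]),
        ("history", ctx.dialogue_history[-max_items:]),
    ]
    # One labelled line per item, followed by the new user message.
    lines = [f"[{label}] {item}" for label, items in sections for item in items]
    lines.append(f"[user] {user_message}")
    return "\n".join(lines)


if __name__ == "__main__":
    ctx = ChatContext(
        dialogue_history=["Hi there!"],
        search_results=["Paris is the capital of France."],
        long_term_memory=["The user enjoys travelling."],
    )
    print(build_model_input(ctx, "Where should I go next?"))
```

The point of the sketch is only that memory, search results, and recent turns end up side by side in what the model reads before it generates a reply.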

Alexa Teacher Model

Alexa Teacher Model (ATM) is a large language model with 20 billion parameters. AlexaTM 20B is a sequence-to-sequence (seq2seq) language model with state-of-the-art (SOTA) capabilities for few-shot learning. What sets it apart from decoder-only models is that it has both an encoder and a decoder, which improves performance on machine translation. With about one-eighth as many parameters, the language model by Amazon outperformed GPT-3 on the SQuADv2 and SuperGLUE benchmarks.
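The "one-eighth" comparison can be checked with quick arithmetic. The short Python sketch below uses the 20 billion figure from the text and assumes GPT-3's published size of 175 billion parameters, which the article does not state in this section.

```python
# 20 billion comes from the text; 175 billion is GPT-3's published size (assumed here).
ALEXATM_PARAMS = 20e9
GPT3_PARAMS = 175e9


def size_ratio(small: float, large: float) -> float:
    """Return how many times larger `large` is than `small`."""
    return large / small


ratio = size_ratio(ALEXATM_PARAMS, GPT3_PARAMS)
print(f"GPT-3 is about {ratio:.2f}x the size of AlexaTM 20B")
# prints "GPT-3 is about 8.75x the size of AlexaTM 20B"
```

So "1/8" is a rounded figure: the exact ratio is closer to 1/8.75.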

DialoGPT

DialoGPT is a large-scale, pre-trained dialogue response generation model for multi-turn conversations. The model is trained on 147 million multi-turn dialogues from Reddit discussion threads.
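To make "multi-turn" concrete, the sketch below shows one way a dialogue session can be flattened into a single training string. It assumes each turn is terminated by the GPT-2 end-of-text token, as in the model's published usage examples. The function name `flatten_dialogue` is made up for illustration and is not part of any library.

```python
from typing import List

# GPT-2's end-of-text token, assumed here to mark the end of every turn.
EOS = "<|endoftext|>"


def flatten_dialogue(turns: List[str]) -> str:
    """Concatenate the turns of one dialogue session, ending each with EOS."""
    return "".join(turn.strip() + EOS for turn in turns)


if __name__ == "__main__":
    session = ["Does money buy happiness?", "Depends how much you spend on it."]
    print(flatten_dialogue(session))
```

A language model trained on strings like this learns to continue a conversation: given the earlier turns as context, it generates the next turn up to the end-of-text token.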

Godel

Godel evolved out of Microsoft’s 2019 DialoGPT project. It combines two capabilities in a single model: the first is completing tasks, and the second is making the dialogue more realistic and social. Most chatbots focus on one or the other. So, for example, Godel can recommend a restaurant, engage in a conversation about sports or the weather along the way, and then bring the conversation back on track.

Sparrow

DeepMind’s AI chatbot ‘Sparrow’ is a “useful dialogue agent that reduces the risk of unsafe and inappropriate answers”. It is trained to converse with a user, answer queries, and even search the internet using Google for evidence to inform its responses.

Sparrow is trained with reinforcement learning and is based on Chinchilla, a 70-billion-parameter model whose comparatively small size makes inference and fine-tuning lighter tasks. Moreover, it was created with 23 rules to prevent it from delivering biased and toxic answers.

However, the model was taken down for improvements. 

Galactica

In November 2022, Meta released Galactica as an open-source large language model trained on scientific knowledge, with 120 billion parameters. The generative AI tool was intended to assist academic researchers by producing extensive literature reviews, generating Wiki articles on any topic, accessing lecture notes on scientific texts, answering questions, solving complex mathematical problems, annotating molecules and proteins, and more.

However, when members of the community started using the all-new AI model by Meta, many found the results to be inaccurate, forcing the tech giant to take down the model within days of its launch.

Aparna Iyer

Aparna Iyer has covered various sectors spanning education, wildlife, culture and law for close to a decade. She now writes on technology and is keen to unearth its capability for public good.