ChatGPT continues to make headlines for all the right and wrong reasons. The human-like chatbot seems able to perform a host of tasks, ranging from solving mathematical problems to generating code, writing essays and more.
This new tool from OpenAI is already changing people’s perspectives and the way they search for information by answering intricate questions. However, it is not entirely alone in this space. In this article, we are exploring some well-known large language model alternatives built on transformer models—similar to GPT-3 and BERT.
LaMDA
Developed by Google with 137 billion parameters, LaMDA was a revolution in the natural language processing world. It was built by fine-tuning a family of Transformer-based neural language models. For pre-training, the team created a dataset of 1.5 trillion words, roughly 40 times more than the datasets used for previously developed models. LaMDA has already been used for zero-shot learning, program synthesis, and the BIG-bench workshop.
Blender Bot 2
Blender Bot 2 is Meta’s chatbot, the third iteration of which was released a few months ago. The conversational AI prototype is built on a 175-billion-parameter model and has its own long-term memory. The model draws on dialogue history, the internet, and memory to produce its output.
Alexa Teacher Model
Alexa Teacher Model (AlexaTM) is a large language model with 20 billion parameters. AlexaTM 20B is a sequence-to-sequence (seq2seq) language model with state-of-the-art capabilities for few-shot learning. What makes it different from others is that it has both an encoder and a decoder, which improves its performance on machine translation. With one-eighth the number of parameters, the language model by Amazon outperformed GPT-3 on the SQuADv2 and SuperGLUE benchmarks.
DialoGPT
DialoGPT is a large-scale, pre-trained dialogue response generation model for multi-turn conversations. The model is trained on 147 million multi-turn dialogues from Reddit discussion threads.
Godel
Godel evolved out of Microsoft’s 2019 DialoGPT project. It combines two functionalities in a single model: the first is completing tasks, and the second is making the dialogue more realistic and social. Most chatbots focus on one or the other. Godel, for example, can recommend a restaurant while also chatting about sports or the weather, and then bring the conversation back on track.
Sparrow
DeepMind’s AI chatbot ‘Sparrow’ is a “useful dialogue agent that reduces the risk of unsafe and inappropriate answers”. It is trained to converse with a user, answer queries and even search the internet using Google for evidence to inform its responses.
Sparrow is trained with reinforcement learning and is based on Chinchilla, a 70-billion-parameter model whose size makes inference and fine-tuning comparatively lighter tasks. Moreover, it is built with 23 rules meant to prevent it from delivering biased and toxic answers.
However, the model was taken down for improvements.
Galactica
In November 2022, Meta released Galactica as an open-source large language model trained on scientific knowledge, with 120 billion parameters. The generative AI tool was intended to assist academic researchers by producing extensive literature reviews, generating Wiki articles on any topic, accessing lecture notes on scientific texts, answering questions, solving complex mathematical problems, annotating molecules and proteins, and more.
However, when members of the community started using the all-new AI model by Meta, many found the results to be inaccurate, forcing the tech giant to take down the model within days of its launch.