Every AI company wants to build something like ChatGPT. Since the release of OpenAI’s API, the chatbot can be integrated into business products, which has helped a lot of developers and companies. Even so, building such technology from scratch has remained out of reach for most, as it requires enormous resources. But things have changed now!
Hugging Face has been at the forefront of recent open-source AI developments, and it has proven itself yet again. The company recently published a blog post on integrating Transformer Reinforcement Learning (TRL) with Parameter-Efficient Fine-Tuning (PEFT) to make a large language model (LLM) with around 20 billion parameters fine-tunable on a 24GB consumer-grade GPU. The best part is that this approach combines LLMs with RLHF, the same powerful technique behind ChatGPT.
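To see why this is plausible, a rough back-of-the-envelope calculation helps. The figures below are illustrative assumptions of my own, not numbers from the Hugging Face post: storing 20 billion parameters in 16-bit precision alone would overflow a 24GB card, while 8-bit quantisation plus a small trainable adapter (the PEFT idea) brings it within reach.

```python
# Back-of-the-envelope memory estimate for fitting a ~20B-parameter model
# on a 24GB GPU. All numbers are illustrative assumptions, not figures
# taken from the Hugging Face blog post.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the model weights, in gigabytes."""
    return n_params * bytes_per_param / 1e9

N = 20e9  # ~20 billion parameters

fp16_gb = weight_memory_gb(N, 2)  # 16-bit weights: ~40 GB, too big for 24 GB
int8_gb = weight_memory_gb(N, 1)  # 8-bit quantised weights: ~20 GB, fits

# With parameter-efficient fine-tuning (e.g. LoRA adapters), only a tiny
# fraction of parameters is trained, so gradients and optimiser state are
# kept for far fewer weights than the full 20B.
adapter_fraction = 0.002          # assumed ~0.2% trainable parameters
trainable = N * adapter_fraction  # ~40 million trainable parameters

print(f"fp16 weights: {fp16_gb:.0f} GB, int8 weights: {int8_gb:.0f} GB")
print(f"trainable adapter parameters: {trainable / 1e6:.0f}M of {N / 1e9:.0f}B")
```

The exact adapter fraction varies with the LoRA rank chosen, but the point stands: quantising the frozen weights and training only a small adapter is what turns a multi-GPU job into a single-consumer-GPU one.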
Similarly, Colossal-AI has taken up the herculean task of letting anyone build their own ‘personal ChatGPT’ on modest computing resources, with as little as 1.62GB of GPU memory. Its PyTorch-based implementation covers all three stages: pre-training, reward model training, and reinforcement learning. The company also reports training up to 10.3 times faster with an 8-billion-parameter model on a single GPU, a much larger model at a much lower cost than a $14,999 A100 80GB GPU.
These achievements matter because they address one of the biggest barriers to entry in the generative AI space: computational resources. By now there is no doubt that building an LLM or chatbot with billions of parameters is a compute-heavy task. As a result, only the world’s biggest tech companies have been able to do so until now, and even they have struggled to replicate what OpenAI did.
But with these models, small- and medium-sized businesses can build their own language models without spending a fortune on computational resources. Now the ball is in everyone else’s court. Anyone can try it out on their own device. Who knows? Someone might actually beat OpenAI at their own game!
The era of dependence on OpenAI, or any other big tech company, is coming to an end, and the reign of DIY chatbots is set to begin.
Until now, companies looking to implement AI chatbots have relied on third-party providers, which now offer pre-trained models like ChatGPT through APIs. With Hugging Face’s latest development, however, this could all change. Companies, and even individual developers, will be able to build their own chatbots with similar capabilities, including their own weights and biases, if required.
For starters, this means a company can tailor the chatbot to its specific needs rather than relying on a generic pre-trained model. A custom chatbot can understand the nuances of the company’s industry and be trained to respond to queries and concerns specific to the business. This level of customisation is impossible with third-party chatbots.
Another benefit of building a chatbot in-house is that the company can retain full control over the data that the chatbot collects. Third-party chatbot providers are notorious for collecting data from conversations and using it for their own purposes. This data can be highly sensitive and companies may not want to risk it falling into the wrong hands.
What This Means for Big-Tech
Microsoft recently announced its Azure OpenAI Service, bringing ChatGPT to Azure. The announcement was seen as groundbreaking, since companies were increasingly eager to use the technology for their own use cases. Along similar lines, Salesforce, Forethought, and Thoughtspot released betas offering similar services, though, for obvious reasons, none came close to Microsoft’s offering.
Colossal-AI’s offering is more developer-focused, while Hugging Face’s approach shows broader promise. Hugging Face acknowledges two limitations: training is still slow, and scaling to multiple GPUs for data parallelism remains a challenge. But this is just the first step, hopefully.