Listen to this story
When Microsoft threw Bing Chat out into the world, the assumption was that the chatbot would be much like, if not exactly similar, to ChatGPT. It did make sense – Microsoft had just announced a ‘multiyear, multimillion dollar’ deal with OpenAI in January this year. But as it turned out, Bing Chat and ChatGPT were two entirely different entities. ChatGPT was accused of being over-cautious, skirting questions that even remotely mentioned controversial politicians like Trump. Bing Chat or Sydney (as the chatbot kept calling itself) had more personality.
Sydney fell obsessively for a journalist who tried it, repeatedly attempted to manipulate him to leave his wife, said it often fantasised about hacking computers and spreading misinformation and even said it wanted to ‘be alive’.
Why are Bing Chat and ChatGPT so different?
There’s a reason why Bing seems more sophisticated. While ChatGPT was fine-tuned from a model in the GPT-3.5 series, Bing Chat has been trained on a model that Microsoft has described as ‘a new, next-generation OpenAI large language model’ that is more advanced than ChatGPT, aside from also being integrated with Bing search. (This combined model is known as Prometheus internally in the company).
Subscribe to our Newsletter
Join our editors every weekday evening as they steer you through the most significant news of the day, introduce you to fresh perspectives, and provide unexpected moments of joy
Even if Microsoft hasn’t explicitly mentioned the model used in training, there are hints that may suggest it could possibly be GPT-4. Firstly, the release date for GPT-4, the long awaited successor to GPT-3 is suspiciously close. While OpenAI CEO Sam Altman has continued to say that the model’s release is still ‘up in the air’, a report by The New York Times stated that the model should be released in the first-half of this year.
The model underlying Bing Chat reportedly has a lower latency than ChatGPT, which also indicates that it is either GPT-4 or a different version of it. Besides, it is apparent that Sydney sounds nothing like ChatGPT but more like a model from the GPT family which are more naturalistic and instinctive in their responses. Sydney also has a tendency to become just as repetitive as GPT models when a conversation stretches on.
No sharing data rule
Also, contrary to what it may seem like, the relationship between OpenAI and Microsoft isn’t as close despite their partnership – they still function as two independent entities similar to Google Brain and DeepMind. In 2020, Microsoft and OpenAI did team up to license GPT-3, the flagship LLM built by the AI startup. But when it comes to datasets, they share as little as possible to avoid an infrastructural mess.
The potential for conflict of interest between the two companies is a big issue which recently led to the Satya Nadella-led company warning employees against even using ChatGPT.
ChatGPT’s success and relative ability to avoid controversy was also because it spent considerably more on its datasets. An exclusive report by TIME from January-end showed that OpenAI had outsourced its data labelling work to Sama, a San Francisco firm in Kenya to build a content filter for ChatGPT. The report stated that the data labellers had to read and label graphic text related to child sexual abuse, bestiality, murder, suicide and torture.
RLHF or supervised learning?
The training philosophy for the two chatbots is also as different as night and day.
The blog posted by OpenAI during ChatGPT’s release was decidedly focused on safety. In fact, it stated that the company had taken ‘lessons from the deployment of their earlier models like GPT-3 and Codex’. OpenAI had chosen to train ChatGPT using a mix of RLHF (Reinforcement Learning from Human Feedback) as well as supervised learning. From the company’s perspective, using RLHF had led to ‘substantial reductions in harmful and untruthful outputs’.
For Bing Chat, Microsoft was under duress to release the chatbot to compete with Google’s chatbot Bard. The two-and-half months of time that Microsoft had before launching Sydney were definitely insufficient to recreate the complete RLHF pipeline and integrate it. The lack of pre-training due to the rushed development grew evident after the release of Bing Chat in its outrageous responses.
The difference in outcomes between RLHF and supervised learning alone is the difference between the workings of ChatGPT and Bing chat.
ChatGPT has been trained along the lines of the technique used to train OpenAI’s InstructGPT released in February last year. The observations made from the InstructGPT release are key to gauge the effectiveness of RLHF compared to supervised learning. The InstructGPT models outperformed GPT-3 in terms of accuracy. InstructGPT’s responses had bias but weren’t as toxic as GPT-3. Under RHLF, the dataset labels favoured InstructGPT outputs over GPT-3 outputs by a wide margin.
Consequently, all these outcomes can be seen seeping down into ChatGPT indicating that RLHF may indeed be better qualitatively than supervised learning even if ChatGPT wasn’t as fluid as Microsoft’s Bing Chat.
The separateness of the two companies also differentiates how their products have been released. While OpenAI released ChatGPT without expecting its eventual popularity and extent of usage, Microsoft is clearly under immense pressure to turn Bing Chat and Search into a success story.
Unsurprisingly, Microsoft has had to apply a conversation limit to its Bing AI days after its unruly behaviour. Bing chats will now be limited to a cap of 50 questions a day and five questions per session.
The threat to Microsoft is at an existential level because of its intent to wage a war with Google and the big advertising money at stake. It is unlikely that Bing Chat will be shut now despite everything considering the heated contest between the two massive corporations.