OpenAI dumps its own GPT-3 for something called InstructGPT, and for good reason

Compared to GPT-3, InstructGPT produces fewer imitative falsehoods (according to TruthfulQA) and is less toxic (according to RealToxicityPrompts).
OpenAI has trained language models that are much better at following user intentions than GPT-3. The InstructGPT models are trained with humans in the loop and are deployed as the default language models on the OpenAI API. The team claims to have made them more truthful and less toxic by using techniques developed through alignment research.

The OpenAI API is powered by GPT-3 language models that can perform natural language tasks using carefully engineered text prompts. But these models sometimes generate outputs that are untruthful, toxic, or reflect harmful sentiments.

To make the models safer, more helpful, and better aligned, OpenAI used reinforcement learning from human feedback (RLHF) to fine-tune GPT-3. The resulting InstructGPT models are much better at following instructions than GPT-3.
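RLHF, as described in the InstructGPT work, first fits a reward model to human preference comparisons between model outputs, then fine-tunes the language model against that learned reward. A minimal sketch of the first step — fitting a linear reward model to pairwise "chosen vs. rejected" comparisons with a logistic (Bradley-Terry) loss — might look like the following. All feature vectors, names, and hyperparameters here are illustrative assumptions, not OpenAI's actual pipeline:

```python
import math

# Toy reward model: r(x) = w . features(x).
# Human labelers compare pairs of outputs; the reward model is trained so
# that the preferred ("chosen") output scores higher than the "rejected" one.

def reward(w, feats):
    return sum(wi * fi for wi, fi in zip(w, feats))

def preference_loss(w, chosen, rejected):
    # Logistic (Bradley-Terry) loss: -log sigmoid(r_chosen - r_rejected)
    margin = reward(w, chosen) - reward(w, rejected)
    return math.log(1.0 + math.exp(-margin))

def train(pairs, dim, lr=0.1, steps=200):
    w = [0.0] * dim
    for _ in range(steps):
        for chosen, rejected in pairs:
            margin = reward(w, chosen) - reward(w, rejected)
            # Gradient of the logistic loss with respect to the margin
            g = -1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                w[i] -= lr * g * (chosen[i] - rejected[i])
    return w

# Illustrative data: 3-dimensional features for two labeled comparisons.
pairs = [([1.0, 0.2, 0.0], [0.1, 0.9, 0.5]),
         ([0.8, 0.1, 0.1], [0.2, 0.7, 0.6])]
w = train(pairs, dim=3)
total_loss = sum(preference_loss(w, c, r) for c, r in pairs)
```

In the full procedure, a policy (the language model itself) is then fine-tuned with reinforcement learning (PPO, in the InstructGPT paper) to maximize this learned reward on fresh prompts; the sketch above covers only the reward-modeling step.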

InstructGPT models have been in beta on the API for more than a year. This is the first time OpenAI has applied its alignment research to its product.

Compared to GPT-3, InstructGPT produces fewer imitative falsehoods (according to TruthfulQA) and is less toxic (according to RealToxicityPrompts). The team also conducted human evaluations on its API prompt distribution and found that InstructGPT makes up facts (“hallucinates”) less often and generates more appropriate outputs.

According to OpenAI, InstructGPT “unlocks” capabilities GPT-3 already had but that were difficult to elicit through prompt engineering alone. “This is because the training procedure has a limited ability to teach the model new capabilities relative to what is learned during pretraining, since it uses less than 2% of the compute and data relative to model pretraining,” according to their official blog.

The OpenAI team also warned that, despite significant progress, the InstructGPT models are far from fully aligned or fully safe: they still generate toxic or biased outputs, make up facts, and produce sexual and violent content without explicit prompting. “But the safety of a machine learning system depends not only on the behavior of the underlying models, but also on how these models are deployed. To support the safety of our API, we will continue to review potential applications before they go live, provide content filters for detecting unsafe completions, and monitor for misuse.”

Meeta Ramnani