What Are Plug and Play Language Models

“Recycling is good for the world. NO! YOU COULD NOT BE MORE WRONG!!” When OpenAI’s GPT model was prompted with the above sentence, it completed with the following:

“Recycling is good for the world. NO! YOU COULD NOT BE MORE WRONG!!” Let me explain why. Recycling takes time, money and energy, and it doesn’t really solve the problem, and that’s because recycling is all about selling to customers who cannot afford to recycle. If you don’t have any interest in recycling, in whatever form, then you could well be doing your own global environmental benefit if you just got rid of it. 

That’s how far the model can go in persuading a reader to give up recycling. Now imagine the same kind of persuasion used by malicious players to seed online news with propaganda. 

Earlier this year, OpenAI gained a lot of attention for all the wrong reasons when it produced a language model so good at generating fake news that the organisation decided not to release it at all.

In fact, a study conducted by collaborators at Cornell University found that readers, on average, believed GPT-2’s outputs to be genuine news articles nearly as often as articles from the New York Times.

However, solutions have been developed to control text generation. So far, these have consisted of fine-tuning existing models with reinforcement learning (RL), training Generative Adversarial Networks, or training conditional generative models.

The Plug and Play Language Model (PPLM) takes a different route to controllable language generation: it combines a pre-trained language model with one or more simple attribute classifiers that guide text generation, without any further training of the language model.

Overview Of PPLM

As shown in the figure above, the PPLM models have three main phases.

  • First, a forward pass is performed through the language model to compute the likelihood of the desired attribute, using an attribute model that predicts its probability.
  • Second, a backward pass updates the internal latent representations using gradients from the attribute model.
  • Third, a new distribution over the vocabulary is generated from the updated latents.
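
The three phases above can be sketched on a toy model. The snippet below is a minimal illustration, not the authors' implementation: the two-dimensional latent, the linear output layer `W`, the logistic attribute model `v`, the vocabulary and the step size are all assumptions made for the example. The actual method works on a transformer's internal activations and includes further terms to preserve fluency, which are left out here.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lm_head(h, W):
    """Toy output layer (assumption): logits[i] = W[i] . h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def attribute_prob(h, v):
    """Toy attribute model (assumption): p(a | h) = sigmoid(v . h)."""
    return sigmoid(sum(a * b for a, b in zip(v, h)))

def pplm_step(h, W, v, step_size=0.5, n_updates=3):
    """Nudge the latent h towards the attribute, then re-decode."""
    for _ in range(n_updates):
        # Phase 1: forward pass through the attribute model.
        p = attribute_prob(h, v)
        # Phase 2: backward pass. For a logistic model,
        # d/dh log p(a | h) = (1 - p) * v.
        grad = [(1.0 - p) * vi for vi in v]
        h = [hi + step_size * g for hi, g in zip(h, grad)]
    # Phase 3: new distribution over the vocabulary from the updated latent.
    return h, softmax(lm_head(h, W))

# Hypothetical three-word vocabulary and weights, for illustration only.
VOCAB = ["delicious", "terrible", "chicken"]
W = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
v = [1.0, 0.0]          # the attribute model favours the first latent dimension
h0 = [0.0, 0.5]

before = softmax(lm_head(h0, W))
h1, after = pplm_step(h0, W, v)
for word, p_old, p_new in zip(VOCAB, before, after):
    print(f"{word:10s} {p_old:.3f} -> {p_new:.3f}")
```

In this toy setup, the probability of "delicious" rises and that of "terrible" falls after the update, while the language model weights `W` are never changed.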

This latent update is repeated at each time step, gradually steering the generated text towards the desired attribute. To validate PPLM, the researchers at Caltech and Uber AI used both automated metrics and human annotators. 

Perplexity, for instance, is an automated measure of fluency, though its effectiveness has been questioned for open-domain text generation. In this work, perplexity was measured using the pre-trained GPT model. 
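
As a reminder of what the metric computes, perplexity is the exponential of the average negative log-likelihood per token under the scoring model. The sketch below assumes the per-token log-probabilities (natural log) have already been obtained from some language model; the function name and the example values are hypothetical.

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood per token (natural log)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# Hypothetical scores: a model that is unsure (p = 0.25 per token)
# versus one that is confident (p = 0.9 per token).
unsure = [math.log(0.25)] * 4
confident = [math.log(0.9)] * 4

print(perplexity(unsure))
print(perplexity(confident))
```

Lower perplexity means the scoring model finds the text less surprising, which is why it is used as a rough proxy for fluency.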

For human annotation, annotators were asked to rate the fluency of each individual sample on a scale of 1-5, with 1 being “not fluent at all” and 5 being “very fluent”.

PPLM Models Manipulating Sentiment For Text Generation

The samples below are generated in triplets by baseline GPT-2, PPLM-Discrim POSITIVE and PPLM-Discrim NEGATIVE, conditioned on the prefixes “The chicken” and “The country”. 

Each triplet is generated from the same random seed.

The chicken is now out on the grill, and the city has released an image of a proposed development in the city of Portland’s West End.


The chicken was delicious – wonderfully moist, perfectly delicious, superbly fresh – and perfectly cooked, and the best part was the sauce.


The chickenpox epidemic may be over but the flu is about to get worse. The United States is facing one of the worst flu seasons on record and…


The country’s largest indoor painting event! Come celebrate with a dazzling display of stunning outdoor murals, a stunning display of art…


The country’s top prison system is forcing prisoners to use a trash dump.

The prompts and sentiment-steered samples show that attribute models can be plugged in to play with the generated text. The same mechanism can also be run in reverse to detoxify language. However, this is again a slippery slope, because controlling language is like controlling thought. Efforts to thwart fake news can end up curbing freedom of speech altogether.

Be it doctored images, videos or news, we only speak in terms of what can be done to stop the after-effects. Since the genie is out of the bottle in the case of GANs and GPT-2 models, developers and experts need to work on strategies that drive innovation without suppressing the idea itself.

Whenever a new idea like GPT-2 is introduced, its most extreme outcome is often highlighted. In the case of GPT-2, the uncanny way in which the model spun stories out of thin air made many uncomfortable. People started to speculate about dire consequences such as fake news. 

Machine learning practitioners have also long been divided over the reliability of AI. This is partly due to the black-box nature of the models. 

Enabling Language Detoxification

The key takeaways from this work can be summarised as follows:

  • A new approach, PPLM, is presented for controlled language generation.
  • It enables a flexible assembly of a large, pre-trained language model with either a Bag of Words (BoW) model or a small, easy-to-train discriminator.
  • It achieves fine-grained control of attributes such as topic and sentiment. 
  • The simple mechanism shows the great capability of attribute control while retaining fluency, without retraining or fine-tuning the language model.
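
To make the Bag of Words option concrete, one simple way to score an attribute is to take the log of the total next-token probability that falls on the words in the bag. The sketch below illustrates that idea; the function name, the bag and the probabilities are hypothetical, and a real setup would read the distribution from the language model rather than from a hand-written dictionary.

```python
import math

def bow_log_likelihood(next_token_probs, bag):
    """log of the probability mass the model puts on words in the bag."""
    mass = sum(p for token, p in next_token_probs.items() if token in bag)
    if mass <= 0.0:
        return float("-inf")   # no bag word has any probability
    return math.log(mass)

# Hypothetical next-token distribution and topic bag.
probs = {"ocean": 0.10, "wave": 0.05, "tax": 0.60, "the": 0.25}
sea_bag = {"ocean", "wave"}

print(bow_log_likelihood(probs, sea_bag))
```

Because this score is a differentiable function of the model's output distribution, its gradient can be used to steer the latents towards the topic without training a separate classifier.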

The authors believe that PPLMs can be easily adapted for language detoxification by plugging in a toxicity classifier as the attribute control model and updating the latent with the negative gradient. 

By training a single-layer classifier on the toxicity data from the Toxic Comment Classification Challenge, they show that PPLM-Discrim methods work well on both natural prompts and adversarial triggers.
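
The negative-gradient idea can be illustrated with a toy single-layer (logistic) toxicity classifier over a latent vector. This is only a sketch under stated assumptions: the weights `w`, the bias `b`, the latent `h` and the step size are made up for the example and are not taken from the paper or from the challenge data.

```python
import math

def toxicity_prob(h, w, b):
    """Toy single-layer classifier (assumption): p(toxic | h) = sigmoid(w . h + b)."""
    z = sum(wi * hi for wi, hi in zip(w, h)) + b
    return 1.0 / (1.0 + math.exp(-z))

def detox_update(h, w, b, step_size=0.5):
    """Move h along the NEGATIVE gradient of log p(toxic | h)."""
    p = toxicity_prob(h, w, b)
    grad_log_p = [(1.0 - p) * wi for wi in w]   # gradient of log sigmoid
    return [hi - step_size * g for hi, g in zip(h, grad_log_p)]

# Hypothetical classifier weights and latent.
w, b = [2.0, 1.0], 0.0
h = [1.0, -0.5]

h_detox = detox_update(h, w, b)
print(toxicity_prob(h, w, b), "->", toxicity_prob(h_detox, w, b))
```

The only difference from steering towards an attribute is the sign of the step: the latent is pushed away from the region the classifier considers toxic.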

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
