OpenAI Junks Diffusion for Consistency Models

Consistency models generate samples in a single step, which is faster, and can trade off compute for sample quality when necessary.

In 2021, OpenAI chief Sam Altman wrote a blog post arguing that Moore’s Law (the observation that semiconductor chips become twice as powerful for the same price roughly every two years) should apply to everything. Altman has also tweeted about the leaps AI is making, saying, “A new version of Moore’s law that could start soon: the amount of intelligence in the universe doubles every 18 months”.

Introducing consistency models

To others, Altman’s optimism may seem unwarranted, but OpenAI’s pace of research seems to back up his claims. Last week, the startup published a paper titled ‘Consistency Models’, introducing a new class of generative models that outperformed diffusion models in one-step generation. Authored by Yang Song, Prafulla Dhariwal, Mark Chen and OpenAI co-founder Ilya Sutskever, the study was released on March 3, 2023.

Diffusion models have become the foundation of the revolution in generative AI since they overtook GANs as the most effective models for image synthesis. Some of the most prominent text-to-image AI generators, such as OpenAI’s DALL·E 2, Stability AI’s Stable Diffusion and Google’s Imagen, are all diffusion models.

Faster and less energy-intensive than diffusion models

However, the paper reports that consistency models can produce output of comparable quality in much less time. This is because a consistency model, like a GAN, generates a sample in a single step.

Diffusion models, in contrast, rely on an iterative sampling process that progressively removes noise from an image. This iterative generation eats up 10–2000 times more compute than single-step generation and slows down inference.

Consistency models can trade off compute for sample quality when necessary. Besides this, they are capable of zero-shot data editing tasks such as image inpainting, colorisation or stroke-guided image editing.
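
The trade-off can be pictured with a minimal Python sketch of the multistep sampling idea described in the paper: generate once in a single step, then optionally add back a smaller amount of noise and denoise again, spending one extra model call per refinement. Everything named here is an assumption made for illustration. `toy_consistency_fn` stands in for a trained network (it is the closed-form answer for a one-dimensional Gaussian toy dataset, not OpenAI's model), and `EPS`, `T_MAX`, `DATA_STD` and the extra time points are illustrative values.

```python
import math
import random

EPS = 0.002      # smallest noise level (illustrative value)
T_MAX = 80.0     # largest noise level (illustrative value)
DATA_STD = 0.5   # toy 1-D "dataset": samples drawn from N(0, DATA_STD^2)


def toy_consistency_fn(x, t):
    """Hypothetical stand-in for a trained network.

    Maps a noisy value x at noise level t to a clean value in one call.
    For Gaussian toy data this mapping is known in closed form.
    """
    return x * math.sqrt((DATA_STD**2 + EPS**2) / (DATA_STD**2 + t**2))


def sample(consistency_fn, rng, extra_times=()):
    """Draw one sample and report how many model calls it cost.

    extra_times lists noise levels in decreasing order; each entry
    costs one additional call to consistency_fn.
    """
    # Single-step generation: pure noise in, sample out.
    x = consistency_fn(rng.gauss(0.0, T_MAX), T_MAX)
    calls = 1
    for t in extra_times:
        # Add back a smaller amount of noise, then denoise in one call.
        noisy = x + math.sqrt(t**2 - EPS**2) * rng.gauss(0.0, 1.0)
        x = consistency_fn(noisy, t)
        calls += 1
    return x, calls


rng = random.Random(0)
one_step_sample, one_step_calls = sample(toy_consistency_fn, rng)
refined_sample, refined_calls = sample(toy_consistency_fn, rng, extra_times=(10.0, 1.0))
print(one_step_calls, refined_calls)
```

With this exact toy function the extra steps change little. With a learned, imperfect model, the additional calls are where extra compute would be spent in the hope of better samples.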

These models build on a mathematical equation that gradually transforms data into noise, known as a probability flow ordinary differential equation (ODE). A consistency model learns to map any point on one of the equation’s trajectories back to that trajectory’s starting point, so all points on the same trajectory produce the same output. The study named this class of models ‘consistency’ models because of this self-consistency property.
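
The self-consistency property can be checked by hand on a toy case. The sketch below is an illustration under stated assumptions, not the paper's code: it assumes one-dimensional data drawn from a Gaussian with standard deviation `DATA_STD`, for which the probability flow ODE and the ideal consistency function both have simple closed forms. The names `pf_ode_slope`, `consistency_fn` and `trajectory` are hypothetical.

```python
import math

EPS = 0.002      # smallest noise level, where a trajectory starts (illustrative)
DATA_STD = 0.5   # toy 1-D data: samples drawn from N(0, DATA_STD^2)


def pf_ode_slope(x, t):
    """Right-hand side of the toy ODE, dx/dt = -t * score(x, t).

    For Gaussian data, score(x, t) = -x / (DATA_STD^2 + t^2).
    """
    return t * x / (DATA_STD**2 + t**2)


def consistency_fn(x, t):
    """Closed-form map from a point at noise level t back to the start at EPS."""
    return x * math.sqrt((DATA_STD**2 + EPS**2) / (DATA_STD**2 + t**2))


def trajectory(x_start, t_end, steps=20000):
    """Integrate the ODE from EPS to t_end with Euler steps; return (t, x) pairs."""
    dt = (t_end - EPS) / steps
    t, x = EPS, x_start
    points = [(t, x)]
    for _ in range(steps):
        x += pf_ode_slope(x, t) * dt
        t += dt
        points.append((t, x))
    return points


# Follow one data point as noise grows, then map samples of the path back.
points = trajectory(x_start=0.3, t_end=5.0)
outputs = [consistency_fn(x, t) for t, x in points[::2000]]
print(outputs)  # every entry should sit close to the starting value 0.3
```

Every point along the trajectory maps back to approximately the same starting value, up to the small error of the Euler integration. A trained consistency model is meant to approximate this mapping for real data, where no closed form exists.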

These models can be trained in either distillation mode or isolation mode. In distillation mode, a consistency model distills the knowledge of a pre-trained diffusion model into a sampler that works in a single step. In isolation mode, consistency models don’t depend on diffusion models at all, making them an entirely independent type of model.

No adversarial training, no problem

Neither training method relies on adversarial training. Adversarial training, the approach behind GANs, pits a generator network against a discriminator network and is known to be unstable and hard to tune.

The term is also used for a related technique that retrains a classifier on deliberately misclassified adversarial examples with the correct labels. That results in a more robust neural network, but it has been found to slightly reduce the accuracy of predictions by deep learning models and can cause unexpected side effects in robotics applications.

The experiments showed that consistency models outperformed existing distillation techniques for diffusion models in one-step and few-step generation. They achieved a new state-of-the-art Fréchet Inception Distance (FID) score, a measure of the quality of AI-generated images where lower is better, of 3.55 on the CIFAR-10 image dataset and 6.20 on the ImageNet 64×64 dataset for one-step generation.
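
For readers unfamiliar with the metric, the sketch below shows the idea behind a Fréchet distance between two sets of feature vectors, under a loud simplifying assumption: each feature dimension is treated as independent (diagonal covariance). Real FID uses features from an Inception network and full covariance matrices, so this hypothetical `frechet_distance_diag` helper will not reproduce published scores.

```python
import math
from statistics import fmean, pvariance


def feature_stats(features):
    """Per-dimension mean and variance of a list of equal-length feature vectors."""
    columns = list(zip(*features))
    return [fmean(c) for c in columns], [pvariance(c) for c in columns]


def frechet_distance_diag(real_features, generated_features):
    """Fréchet distance between two Gaussians fitted with diagonal covariance.

    Simplified illustration only: real FID uses full covariance matrices.
    """
    mu_r, var_r = feature_stats(real_features)
    mu_g, var_g = feature_stats(generated_features)
    # Squared distance between the two mean vectors.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    # With diagonal covariances the trace term reduces to a sum over
    # squared differences of standard deviations.
    spread_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_r, var_g))
    return mean_term + spread_term


real = [[0.0, 0.0], [2.0, 2.0]]
shifted = [[1.0, 0.0], [3.0, 2.0]]  # same spread, first dimension shifted by 1
print(frechet_distance_diag(real, real), frechet_distance_diag(real, shifted))
```

Identical feature sets give a distance of zero, and the distance grows as the generated features drift away from the real ones in mean or spread, which is why a lower score indicates better images.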

It’s fair to say that OpenAI isn’t the only stakeholder here, but it is definitely one of the major ones. If the company wants its AI tools to sell more, the onus falls on it to ensure that they take less time and use less compute. In that sense, the potential impact of consistency models is huge, since diffusion models are popular not only in image generation but also in video and audio generation.

Just last month, Sutskever posted a tweet with a hint, saying, “Many believe that great AI advances must contain a new ‘idea’. But it is not so: many of AI’s greatest advances had the form huh, turns out this familiar unimportant idea, when done right, is downright incredible”. This paper shows exactly that: building on older concepts with a tweak can change everything.

Poulomi Chatterjee
Poulomi is a Technology Journalist with Analytics India Magazine. Her fascination with tech and eagerness to dive into new areas led her to the dynamic world of AI and data analytics.