GANs Learn To Do More, Can Now Put Together A Delicious Pizza

Machine learning has found applications across many industries. One of the toughest challenges for an intelligent system is to build something sensible out of raw inputs, as in the culinary arts. But what if an algorithm could compose and generate a recipe for you? A team of researchers from MIT and the Qatar Computing Research Institute set out to answer exactly this question.

A joint team from these institutes built a machine learning system that can follow a recipe to make a pizza. The researchers treat food preparation both as following a set of instructions and as a change in how the food looks after a key ingredient is added or the food is put through a process. To build a system that perceives food making as following a manual, the researchers compose operators that add or remove ingredients from a dish. Each operator is a Generative Adversarial Network (GAN) that predicts how the food looks after each step.



The researchers aim to build a model that will:

  1. Classify pizza toppings using supervised learning
  2. Remove a topping and show what is underneath it
  3. Infer the ordering of the pizza toppings

The researchers built a custom synthetic dataset of clip-art-style pizza images. They see two main advantages in using such images as training data. They say, “First, it allows us to generate an arbitrarily large set of pizza examples with zero human annotation cost. Second and more importantly, we have access to accurate ground-truth ordering information and multi-layer pixel segmentation of the toppings.”


The synthetic pizzas come with ground-truth annotations marking each topping. The researchers also downloaded about half a million pizza images from Instagram using the hashtag #pizza, and had human annotators label more than 9,000 images for the toppings found on the pizza.


Given image-level labels for the RGB training images, the team has a binary vector of labels for each pizza image. The goal is to learn what each topping looks like from the training data. For this purpose, they create small datasets of pizzas with and without a particular topping. In this architecture, one generator adds a topping to the pizza image, and a second generator checks how the topping matches the pizza and removes it. The discriminator judges the quality of the generated composite images.
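The label encoding and the with/without split described above can be illustrated with a minimal sketch. The topping vocabulary, file names, and helper names (`encode_labels`, `split_by_topping`) are hypothetical and chosen only for illustration; they are not taken from the researchers' code.

```python
# Hypothetical topping vocabulary; the real label set is not given in this article.
TOPPINGS = ["pepperoni", "mushrooms", "olives", "basil"]


def encode_labels(present, vocabulary=TOPPINGS):
    """Return a binary vector with a 1 for every topping present on the pizza."""
    return [1 if topping in present else 0 for topping in vocabulary]


def split_by_topping(dataset, topping, vocabulary=TOPPINGS):
    """Split (image_name, label_vector) pairs into images with and without a topping."""
    index = vocabulary.index(topping)
    with_topping = [name for name, labels in dataset if labels[index] == 1]
    without_topping = [name for name, labels in dataset if labels[index] == 0]
    return with_topping, without_topping


# Made-up image names, used only to show the shape of the data.
dataset = [
    ("pizza_a.jpg", encode_labels({"pepperoni", "olives"})),
    ("pizza_b.jpg", encode_labels({"mushrooms"})),
    ("pizza_c.jpg", encode_labels({"pepperoni", "basil"})),
]

with_pepperoni, without_pepperoni = split_by_topping(dataset, "pepperoni")
print(with_pepperoni)     # ['pizza_a.jpg', 'pizza_c.jpg']
print(without_pepperoni)  # ['pizza_b.jpg']
```

Each such pair of subsets would give the adding and removing generators examples of the same kind of image with and without one topping.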

GAN operators 

The two generators and the discriminator are trained jointly. At test time, the model can generate pizzas, which can be seen as assembling a pizza using its generator and discriminator architecture (GAN). This can also be seen as following a set of instructions. A reverse scenario can also be envisioned. As the researchers put it, “The reverse scenario is to predict the ordered set of instructions that were used to create an image.”

The inference procedure happens in the following manner:

Classification: Discriminator identifies pizza toppings.

Ordering: The model infers how the toppings are layered and which one to remove.
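One way to picture this procedure is as a loop that peels off one topping at a time. The sketch below is a toy illustration under strong assumptions: the pizza is a plain list of toppings from bottom to top, and the three helper functions are hypothetical stand-ins for the trained networks, not the researchers' implementation.

```python
def classify_toppings(pizza):
    """Stand-in for the classification step: which toppings are visible?"""
    return set(pizza)


def choose_topmost(pizza, visible):
    """Stand-in for the ordering step: which visible topping lies on top?"""
    # The toy pizza is stored bottom to top, so the last entry is the top layer.
    return pizza[-1]


def remove_topping(pizza, topping):
    """Stand-in for the removal generator: return the pizza without that topping."""
    return [layer for layer in pizza if layer != topping]


def infer_recipe(pizza):
    """Peel toppings off one by one, then reverse the removal order into a recipe."""
    removed = []
    current = list(pizza)
    while True:
        visible = classify_toppings(current)
        if not visible:
            break
        top = choose_topmost(current, visible)
        removed.append(top)
        current = remove_topping(current, top)
    # Toppings were removed top-first, so the recipe is the reverse order.
    return list(reversed(removed))


print(infer_recipe(["mushrooms", "pepperoni", "basil"]))
# ['mushrooms', 'pepperoni', 'basil']
```

In the real system each stand-in would operate on images rather than lists, so the ordering would have to be inferred from pixels instead of read off a data structure.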

Training Process and Results

The researchers trained with a learning rate of 0.0002 for the first 100 epochs, then decayed it to zero over the next 100 epochs. Real pizza images were centre-cropped and resized to 256 by 256 pixels. The researchers report 99.9% mAP on the classification of toppings. Furthermore, the average normalized Damerau–Levenshtein distance, an edit distance between sequences, is reported to be 0.33 for PizzaGAN.
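The schedule and the ordering metric mentioned above can be sketched in a few lines. Several details here are assumptions not confirmed by this article: the decay is taken to be linear, the distance is the restricted "optimal string alignment" variant of Damerau–Levenshtein, and normalization divides by the longer sequence length. The function names are illustrative only.

```python
def learning_rate(epoch, base_lr=0.0002, constant_epochs=100, decay_epochs=100):
    """Constant rate for the first phase, then an (assumed) linear decay to zero."""
    if epoch < constant_epochs:
        return base_lr
    progress = (epoch - constant_epochs) / decay_epochs
    return max(0.0, base_lr * (1.0 - progress))


def osa_distance(a, b):
    """Optimal string alignment distance: insert, delete, substitute, adjacent swap."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent transposition
    return d[n][m]


def normalized_distance(predicted, truth):
    """Divide by the longer sequence length (assumed normalization)."""
    longest = max(len(predicted), len(truth))
    return osa_distance(predicted, truth) / longest if longest else 0.0


predicted = ["pepperoni", "olives", "basil"]
truth = ["olives", "pepperoni", "basil"]
print(osa_distance(predicted, truth))  # 1 (one adjacent swap)
```

Under this normalization, 0 means the predicted topping order matches the ground truth exactly, and larger values mean more edits are needed.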


This is a good step towards understanding food science and an innovative way of looking at how AI can change food for humans. The approach could also be transferred to other layered food items. The researchers say, “Though we have evaluated our model only in the context of pizza, we believe that a similar approach is promising for other types of foods that are naturally layered such as burgers, sandwiches, and salads.”


Abhijeet Katte
As a thorough data geek, most of Abhijeet's day is spent in building and writing about intelligent systems. He also has deep interests in philosophy, economics and literature.
