GANs Learn To Do More, Can Now Put Together A Delicious Pizza

Machine learning has found applications across many industries. One of the toughest challenges for an intelligent system is to build something sensible out of raw inputs, as in the culinary arts. But what if a machine or an algorithm could compose and generate a recipe for you? A team of researchers from MIT and the Qatar Computing Research Institute set out to answer exactly this question.

A joint team from these institutes worked on a machine learning system that can follow a recipe and make a pizza. The researchers treat food preparation both as following a set of instructions and as a change in how the food looks after an ingredient is added or the food goes through a process. To get a system that perceives cooking as following a manual, the researchers composed operators that can add or remove ingredients from a dish. Each operator is a Generative Adversarial Network (GAN) that predicts how the food looks after the corresponding step.

The researchers aim to build a model that will:

  1. Classify pizza toppings by using supervised learning
  2. Remove the toppings and show what is underneath the topping
  3. Infer the ordering of the pizza topping 

The researchers built a custom synthetic dataset of clip-art-style pizza images. They see two main advantages in using such images as training data. They say, “First, it allows us to generate an arbitrarily large set of pizza examples with zero human annotation cost. Second and more importantly, we have access to accurate ground-truth ordering information and multi-layer pixel segmentation of the toppings.”

Each synthetic pizza also came with ground-truth annotations marking its toppings. In addition, the researchers downloaded about half a million pizza images from Instagram using the hashtag #pizza, and had more than 9,000 of them labelled by human annotators for the toppings found on each pizza.

The PizzaGAN

Given image-level labels for RGB training images, the team has a binary vector representing the labels of each pizza image. The goal is to learn from the training data what each topping looks like. For this purpose, they create small datasets of images with and without a particular topping. In this architecture, one generator adds a topping to the pizza image, and a second generator removes the topping to show the pizza without it. The discriminator judges the quality of the generated composite images.
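The with/without split described above can be illustrated with a minimal sketch. The topping vocabulary, the function name `split_by_topping` and the toy label vectors are all assumptions made for illustration; they are not taken from the researchers' code.

```python
# Hypothetical topping vocabulary; the names are illustrative only.
TOPPINGS = ["pepperoni", "mushroom", "olive"]

def split_by_topping(labels, topping):
    """Return (with_idx, without_idx): indices of images whose binary
    label vector marks the given topping as present or absent."""
    k = TOPPINGS.index(topping)
    with_idx = [i for i, y in enumerate(labels) if y[k] == 1]
    without_idx = [i for i, y in enumerate(labels) if y[k] == 0]
    return with_idx, without_idx

# One binary vector per image: 1 means the topping is present.
labels = [
    [1, 0, 0],  # image 0: pepperoni only
    [0, 1, 1],  # image 1: mushroom and olive
    [1, 1, 0],  # image 2: pepperoni and mushroom
    [0, 0, 0],  # image 3: plain pizza
]

with_pep, without_pep = split_by_topping(labels, "pepperoni")
```

In such a setup, the two index lists would feed the adding and removing generators for that topping: one learns to map "without" images to "with" images, the other the reverse.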

GAN operators 

The two generators and the discriminator are learned jointly. At test time, the model can generate pizzas, which can be seen as assembling a pizza using its generator and discriminator architecture (GAN). This can also be seen as following a set of instructions. A reverse scenario can also be envisioned. The researchers put it in the following way, “The reverse scenario is to predict the ordered set of instructions that were used to create an image.”
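Following a recipe can be pictured as applying a sequence of operators one after another. The sketch below is a toy under stated assumptions: a pizza is a plain list of layer names rather than an image, and `add_op`, `remove_op` and `follow_recipe` are hypothetical stand-ins for the trained generators, which in the real system operate on pixels.

```python
from functools import reduce

def add_op(topping):
    """Stand-in for an 'add' generator: puts one topping on top."""
    def apply(pizza):
        return pizza + [topping]
    return apply

def remove_op(topping):
    """Stand-in for a 'remove' generator: takes one topping away."""
    def apply(pizza):
        return [layer for layer in pizza if layer != topping]
    return apply

def follow_recipe(pizza, operators):
    """Apply the operators in order, feeding each result to the next."""
    return reduce(lambda state, op: op(state), operators, pizza)

recipe = [
    add_op("pepperoni"),
    add_op("olive"),
    remove_op("pepperoni"),
    add_op("basil"),
]
result = follow_recipe(["dough", "sauce", "cheese"], recipe)
print(result)
```

Because every operator maps a pizza to a pizza, any sequence of additions and removals can be chained, which is what makes the "instructions" view possible.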

The inference procedure happens in the following manner:

Classification: The discriminator identifies the pizza toppings.

Ordering: The model also works out how the toppings are layered and which one to remove first.
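The two steps can be read as a loop: find the toppings, peel off the one on top, and repeat. The sketch below shows only that control flow. The function `infer_recipe` and the three toy stand-ins passed to it are assumptions for illustration; the article does not say how the model decides which topping lies on top, so `pick_top` is a placeholder.

```python
def infer_recipe(pizza, classify, pick_top, remove):
    """Greedy sketch: repeatedly remove the top-most topping and
    return the toppings in the order they were presumably added."""
    removed = []
    while True:
        present = classify(pizza)        # which toppings are on the pizza?
        if not present:
            break
        top = pick_top(pizza, present)   # which one lies on top?
        pizza = remove(pizza, top)       # reveal what is underneath
        removed.append(top)
    return removed[::-1]                 # reverse removal order

# Toy stand-ins: a pizza is a list of unique toppings, bottom to top.
def toy_classify(pizza):
    return set(pizza)

def toy_pick_top(pizza, present):
    return max(present, key=pizza.index)

def toy_remove(pizza, topping):
    return [t for t in pizza if t != topping]

order = infer_recipe(["mushroom", "pepperoni", "olive"],
                     toy_classify, toy_pick_top, toy_remove)
```

Each pass removes exactly one topping, so the loop ends after as many passes as there are toppings.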

Training Process and Results

The researchers trained with a learning rate of 0.0002 for the first 100 epochs and then decayed it to zero over the next 100 epochs. Real pizza images were centre-cropped and resized to 256 by 256 pixels. The researchers report 99.9% mAP on topping classification. Furthermore, the average normalized Damerau–Levenshtein distance for PizzaGAN, an edit distance between sequences that here scores the predicted topping order, is reported to be 0.33.
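The schedule and the ordering metric can both be sketched in a few lines. Two assumptions are made here and are not confirmed by the article: the decay is taken to be linear, and the distance is normalized by the length of the longer sequence. The distance itself uses the restricted "optimal string alignment" variant of Damerau–Levenshtein, and all function names are illustrative.

```python
def learning_rate(epoch, base_lr=0.0002, constant_epochs=100, decay_epochs=100):
    """Constant rate first, then a decay to zero (assumed linear)."""
    if epoch < constant_epochs:
        return base_lr
    progress = (epoch - constant_epochs) / decay_epochs
    return base_lr * max(0.0, 1.0 - progress)

def osa_distance(a, b):
    """Edit distance with insertions, deletions, substitutions and
    swaps of adjacent items (optimal string alignment variant)."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[n][m]

def normalized_distance(predicted, truth):
    """Assumed normalization: divide by the longer sequence length."""
    longest = max(len(predicted), len(truth))
    return osa_distance(predicted, truth) / longest if longest else 0.0

truth = ["pepperoni", "mushroom", "olive", "basil"]
predicted = ["pepperoni", "olive", "mushroom", "basil"]
score = normalized_distance(predicted, truth)  # one swap over four toppings
```

A score of 0 means the predicted order matches the true order exactly, and larger values mean more edits are needed to turn one into the other.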

Outlook

This is a good step towards understanding food science and an innovative way of looking at how AI can change food for humans. The approach could also be transferred to other layered food items. The researchers say, “Though we have evaluated our model only in the context of pizza, we believe that a similar approach is promising for other types of foods that are naturally layered such as burgers, sandwiches, and salads.”

Abhijeet Katte

As a thorough data geek, most of Abhijeet's day is spent in building and writing about intelligent systems. He also has deep interests in philosophy, economics and literature.