GANs Learn To Do More, Can Now Put Together A Delicious Pizza

Machine learning has found applications across many parts of industry. One of the toughest challenges for an intelligent system is to build something sensible out of raw inputs, as in the culinary arts. But what if an algorithm could compose and generate a recipe for you? A team of researchers from MIT and the Qatar Computing Research Institute set out to answer exactly this question.

A joint team from these institutes worked on a machine learning system that can follow a recipe and make a pizza. The researchers treated food preparation both as following a set of instructions and as a series of changes in how the food looks after an ingredient is added or the food is put through a process. To build a system that perceives food making as following a manual, the researchers composed operators that can add or remove ingredients from a dish. Each operator is a Generative Adversarial Network (GAN) that predicts how the food looks after that step.

The researchers aim to build a model that will:

  1. Classify pizza toppings by using supervised learning
  2. Remove the toppings and show what is underneath the topping
  3. Infer the ordering of the pizza toppings

The researchers built a custom synthetic dataset of clip-art-style pizza images. They see two main advantages in using such images as training data. They say, “First, it allows us to generate an arbitrarily large set of pizza examples with zero human annotation cost. Second and more importantly, we have access to accurate ground-truth ordering information and multi-layer pixel segmentation of the toppings.”

Each synthetic pizza came with ground-truth annotations marking its toppings. The researchers also downloaded about half a million pizza images from Instagram using the hashtag #pizza, and had human annotators label more than 9,000 of them for the toppings found on each pizza.

The PIZZAGAN

Given image-level labels for RGB training images, the team has a binary vector representing the topping labels for each pizza image. The goal is to learn what each topping looks like from the training data. For this purpose, they create small datasets of pizzas with and without a particular topping. In this architecture, one generator adds a topping to the pizza image, and another generator works out how the topping sits on the pizza and removes it. The discriminator judges the quality of the generated composite images.
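The labelling scheme described above can be sketched in a few lines of Python. This is only an illustration: the topping vocabulary, the image ids, and the helper names `encode_labels` and `split_by_topping` are assumptions made for the example, not names from the researchers' code.

```python
from typing import Dict, List, Tuple

# Hypothetical topping vocabulary; the article does not list the real one.
TOPPINGS = ["pepperoni", "mushrooms", "olives", "basil"]


def encode_labels(present: List[str]) -> List[int]:
    """Return a binary vector with 1 wherever a topping is present."""
    unknown = set(present) - set(TOPPINGS)
    if unknown:
        raise ValueError(f"unknown toppings: {sorted(unknown)}")
    return [1 if t in present else 0 for t in TOPPINGS]


def split_by_topping(
    labels: Dict[str, List[int]], topping: str
) -> Tuple[List[str], List[str]]:
    """Split image ids into (with topping, without topping)."""
    idx = TOPPINGS.index(topping)
    with_t = sorted(i for i, v in labels.items() if v[idx] == 1)
    without_t = sorted(i for i, v in labels.items() if v[idx] == 0)
    return with_t, without_t


# Made-up image ids and labels, for illustration only.
labels = {
    "img_001": encode_labels(["pepperoni", "olives"]),
    "img_002": encode_labels(["mushrooms"]),
    "img_003": encode_labels(["pepperoni"]),
}
with_pep, without_pep = split_by_topping(labels, "pepperoni")
```

The two lists play the role of the "with" and "without" datasets for a single topping; a generator pair for that topping would be trained on images drawn from each side.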

GAN operators 

The two generators and the discriminator are learned jointly. At test time, the model can generate pizzas, which can be seen as assembling a pizza with its generator and discriminator architecture (GAN). This can also be seen as following a set of instructions. A reverse scenario can be envisioned as well. The researchers put it in the following way, “The reverse scenario is to predict the ordered set of instructions that were used to create an image.”

The inference procedure happens in the following manner:

Classification: Discriminator identifies pizza toppings.

Ordering: The model infers how the toppings are layered and which one to remove next.
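One way to picture the ordering step is as a greedy loop that peels off whichever topping looks uppermost, then reverses the removal order to recover the recipe. The sketch below is an assumption about how such a loop could look, not the researchers' procedure: `infer_order` and the `top_score` callable are hypothetical, and a fixed height table stands in for a trained model.

```python
from typing import Callable, FrozenSet, List


def infer_order(
    detected: FrozenSet[str],
    top_score: Callable[[FrozenSet[str], str], float],
) -> List[str]:
    """Greedily peel toppings off, topmost first, and return the add order.

    `top_score(remaining, topping)` is a hypothetical scorer that says how
    likely `topping` is to be the uppermost layer among `remaining`.
    """
    remaining = set(detected)
    removed: List[str] = []
    while remaining:
        state = frozenset(remaining)
        # sorted() makes tie-breaking deterministic.
        top = max(sorted(remaining), key=lambda t: top_score(state, t))
        removed.append(top)
        remaining.remove(top)
    # Removal runs top to bottom; a recipe adds toppings bottom to top.
    return removed[::-1]


# Toy scorer standing in for a trained model: a fixed layer height per topping.
HEIGHT = {"pepperoni": 1.0, "olives": 2.0, "basil": 3.0}
order = infer_order(frozenset(HEIGHT), lambda remaining, t: HEIGHT[t])
```

With the toy heights, basil is removed first and pepperoni last, so the recovered recipe adds pepperoni, then olives, then basil.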

Training Process and Results

The researchers trained the model with a learning rate of 0.0002 for the first 100 epochs, then decayed it to zero over the next 100 epochs. Real pizza images were centre-cropped and resized to 256 by 256 pixels. The researchers report 99.9% mAP on the classification of toppings. Furthermore, the average normalized Damerau–Levenshtein distance, an edit distance between sequences, is reported as 0.33 for PizzaGAN.
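The schedule and the metric mentioned above can be sketched as follows. Several details are assumptions, since the article does not give them: the decay is taken to be linear, the distance is the optimal string alignment variant of Damerau–Levenshtein (adjacent transpositions only), and normalization divides by the longer sequence length. The function names are illustrative.

```python
from typing import Sequence


def lr_at(epoch: int, base_lr: float = 0.0002,
          hold: int = 100, decay: int = 100) -> float:
    """Constant rate for `hold` epochs, then an assumed linear decay to zero."""
    if epoch < hold:
        return base_lr
    frac = min(epoch - hold, decay) / decay
    return base_lr * (1.0 - frac)


def osa_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete everything from a
    for j in range(cols):
        d[0][j] = j  # insert everything from b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def normalized_distance(pred: Sequence[str], truth: Sequence[str]) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(pred), len(truth))
    return 0.0 if longest == 0 else osa_distance(pred, truth) / longest
```

For example, swapping the last two of three toppings costs a single transposition, so the normalized distance is one third; lower values mean the predicted order is closer to the reference.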

Outlook

This is a good step towards understanding food science and an innovative way of looking at how AI can change food for humans. The approach could be transferred to other layered food items. The researchers say, “Though we have evaluated our model only in the context of pizza, we believe that a similar approach is promising for other types of foods that are naturally layered such as burgers, sandwiches, and salads.”

Abhijeet Katte
As a thorough data geek, most of Abhijeet's day is spent in building and writing about intelligent systems. He also has deep interests in philosophy, economics and literature.
