
Unboxing GANs with Milind Jadhav, Lead Data Scientist at Fractal

What we are doing at Fractal is first to solve a basic problem of recreating one data dimension for one time period with the highest possible accuracy.

Milind Jadhav is a Lead Data Scientist in the AIML team at Fractal. He has a decade of experience solving business problems for Fortune 500 clients across domains such as insurance, banking and technology. Milind leads data science delivery on client engagements, building automated decisioning solutions with AI and machine learning. He also drives the development of domain-agnostic capabilities that address needs common to businesses across industries. In addition, Milind acts as a subject matter expert in business development initiatives and is actively involved in guiding the careers of data scientists at Fractal.

A few months ago, Milind started leading Fractal's research on what can arguably be considered the hottest field in AI: generative adversarial networks (GANs). Analytics India Magazine caught up with him to understand the present state of the field and the potential it holds.

Edited excerpts:

AIM: How have GANs grown over the years?

Milind: The concept of GANs is relatively new. It started in 2014, when Ian Goodfellow and his co-authors first introduced it in a paper. Since then, many other researchers have come up with their own versions, and we now have several popular types of GANs, such as conditional GANs and deep convolutional GANs (DCGANs). The work started with image generation in particular, but its scope has since widened to include generating tabular data and audio signals, transferring styles from one image to another, semantic segmentation for autonomous driving, training classifiers to identify adversarial samples and resist attacks, and so on.

AIM: What are the real-world use cases of GANs?

Milind: There are numerous use cases across industries. One that interests me is using GANs to synthesise new, stable chemical compounds. I have also read about GANs being used to convert user photos into personal emojis. In photography, SRGANs (super-resolution GANs) are being used to generate high-resolution versions of old photos. Text-to-image conversion is also being achieved through GANs. We at Fractal have started with the use case of generating high-quality structured enterprise data using GANs.

AIM: How are GANs used in Fractal?

Milind: Having been in the field for 20+ years, Fractal has become a leader in the AI space. We have a lot of experience in dealing with enterprise data, and most of it is in tabular form. However, we have also seen that some clients are plagued by data scarcity, data validity, and data security issues. So we have started our research towards generating tabular enterprise data to address these issues.

To generate this complex, multi-dimensional enterprise data with maximum accuracy, understanding the relationships between the dimensions is very important. What we are doing at Fractal is first to solve a basic problem of recreating one data dimension for one time period with the highest possible accuracy. We have then built a utility around it in two versions. In the first, clients input the actual data they have, and the utility automatically iterates over multiple GAN architectures, tunes dozens of hyperparameters, and selects the best configuration based on validation metrics before generating the synthetic data. The utility does this without exposing our clients to the complicated training process. As an outcome, the client gets the output data and a validation report on the quality of the synthetic data.

In the second version, we are trying to make the process of GAN training easy for data scientists. We offer code accelerators that help data scientists quickly train the best GAN architectures for generating synthetic data. In this version, the data scientist has full flexibility over the training process, including choosing their own ensembles and specifying grids for the hyperparameters they want to tune, without having to worry about setting up the underlying Python code.
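As a rough illustration of what "specifying a grid and selecting on a validation metric" can look like, here is a minimal Python sketch. It is not Fractal's utility: the grid keys and values, `expand_grid`, `select_best` and `toy_score` are all hypothetical names made up for this example, and `toy_score` is a stand-in for the expensive step of actually training a GAN and scoring its synthetic output.

```python
import itertools


def expand_grid(grid):
    """Yield one config dict per combination of values in a {name: [values]} grid."""
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def select_best(grid, score_fn):
    """Return (best_config, best_score); a lower score is treated as better."""
    best_config, best_score = None, float("inf")
    for config in expand_grid(grid):
        score = score_fn(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical grid: names and values are illustrative assumptions only.
grid = {
    "architecture": ["vanilla", "conditional"],
    "learning_rate": [1e-4, 2e-4],
    "batch_size": [64, 128],
}


def toy_score(config):
    # Placeholder for "train a GAN with this config, then compute a
    # validation metric on the synthetic data". No model is trained here.
    penalty = 0.0 if config["architecture"] == "conditional" else 0.5
    penalty += abs(config["learning_rate"] - 2e-4) * 1e4
    penalty += abs(config["batch_size"] - 128) / 128
    return penalty


best_config, best_score = select_best(grid, toy_score)
print(best_config, best_score)
```

In a real setting the scoring function dominates the cost, so the exhaustive loop above would usually be replaced by a budgeted or parallel search.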

AIM: What are the loopholes in GANs?

Milind: I would not call them loopholes, but by construction, GANs are very difficult to train. Two models, the generator and the discriminator, are always competing against each other, so improvements made to one model come at the expense of the other. As a matter of fact, many data scientists struggle to achieve a balance in training.
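The competition described here can be made concrete with the value function from the original 2014 formulation, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximise and the generator tries to minimise. The sketch below only evaluates that quantity on hand-picked discriminator outputs; the function name and the sample probabilities are assumptions for illustration, and no network is trained.

```python
import math


def value_function(d_real, d_fake):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs (probability of "real") on real samples.
    d_fake: discriminator outputs on generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Discriminator winning: it is confident on both real and fake samples.
confident = value_function([0.99, 0.98], [0.01, 0.02])

# Generator winning: the discriminator cannot tell the two apart (D = 0.5).
fooled = value_function([0.5, 0.5], [0.5, 0.5])

print(confident, fooled)
```

The same number moves in opposite directions for the two players: it sits close to 0 when the discriminator is confident and drops to -log 4 when the discriminator is reduced to guessing, which is why progress by one model shows up as a loss for the other.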

Several pitfalls can arise in GAN training. Two of them are:

  1. The most challenging one is the case where many different inputs to the generator result in the same output, or in only a limited variety of outputs. This is referred to as mode collapse.
  2. There is also no universally accepted validation metric that can be used to tell whether a GAN is performing well during training.
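Both pitfalls can be illustrated on a single categorical column. The sketch below is an assumption-laden example, not the metric Fractal uses: it compares real and synthetic category shares with total variation distance and reports what fraction of the real categories the synthetic data reproduces at all. The function names and the sample values are invented for the example.

```python
from collections import Counter


def category_shares(values):
    """Map each category to its share of the column."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def total_variation_distance(real, synthetic):
    """0.0 means identical category shares, 1.0 means no overlap at all."""
    p, q = category_shares(real), category_shares(synthetic)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def mode_coverage(real, synthetic):
    """Fraction of real categories that appear at least once in the synthetic data."""
    real_modes = set(real)
    return len(real_modes & set(synthetic)) / len(real_modes)


# Made-up sample column; the values are purely illustrative.
real = ["auto", "home", "life", "auto", "home", "life", "auto", "home"]
collapsed = ["auto"] * 8  # generator emits one category only
healthy = ["auto", "home", "life", "home", "auto", "life", "home", "auto"]

print(total_variation_distance(real, collapsed), mode_coverage(real, collapsed))
print(total_variation_distance(real, healthy), mode_coverage(real, healthy))
```

A check like this only looks at one column in isolation. It says nothing about relationships between columns, which is part of why no single metric is universally accepted.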

Through rigorous experimentation on several data sets, we are trying to automate this entire training process while mitigating issues such as those above as far as possible. For validation specifically, we have come up with our own metric, which we have observed to work quite well in our experiments.

AIM: What lies ahead when it comes to GANs? 

Milind: Today, enterprises are spending a lot of money on collecting and storing huge amounts of data. Once GANs for structured data mature, we could be looking at companies generating highly accurate samples in a secure manner whenever needed. For unstructured data generation, too, the applications are enormous and growing all the time.

Overall, the use of GANs is growing, but one hitch that remains is the huge amount of compute power required to train them. GANs typically need GPU resources to learn the underlying joint distribution accurately and replicate it as closely as possible. In the future, as infrastructure costs fall further and bring down the cost of training, I think we could see several organisations investing in GAN research for a variety of applications.


Shraddha Goled

I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.