Deep learning has made tremendous progress over the last few years. Researchers have designed neural network architectures by hand and have succeeded in several complex tasks such as speech recognition, emotion detection, image and video classification, object detection and machine translation. However, designing models manually is time-consuming and prone to error.
This shortfall has led researchers to take the next step and automate architecture design in the form of neural architecture search (NAS). NAS is a subfield of AutoML that automates the design of deep neural networks, and the architectures it discovers can outperform human-designed models.
Recently, researchers from Texas A&M University and MIT-IBM Watson AI Lab developed a framework known as AutoGAN by introducing neural architecture search to Generative Adversarial Networks (GANs). The resulting model is said to outperform existing state-of-the-art, manually designed GAN models. Human-designed GANs are often unstable to train and prone to collapse, which is why the researchers brought NAS into the training process.
Behind the Model
AutoGAN is based on a multi-level architecture search strategy, in which the generator is composed of several cells. A search space is defined to capture GAN architectural variations, and a recurrent neural network (RNN) controller guides the search by choosing blocks from that space. The framework addresses three key aspects of neural architecture search (NAS): the search space, the proxy task and the optimisation algorithm.
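The controller idea can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the three-entry search space, the `Controller` class and the `toy_reward` function are hypothetical, the controller is simplified to one independent set of logits per decision instead of an RNN, and the policy-gradient (REINFORCE) update is a common choice for NAS controllers that the description above does not spell out.

```python
import math
import random

# Hypothetical search space: one categorical choice per design decision.
SEARCH_SPACE = {
    "upsample": ["nearest", "bilinear", "deconv"],
    "normalization": ["none", "batch", "instance"],
    "shortcut": [False, True],
}


def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class Controller:
    """Simplified stand-in for an RNN controller: independent logits per decision."""

    def __init__(self, space, lr=0.1, seed=0):
        self.space = space
        self.lr = lr
        self.rng = random.Random(seed)
        self.logits = {key: [0.0] * len(opts) for key, opts in space.items()}
        self.baseline = 0.0  # moving average of rewards, used to reduce variance

    def sample(self):
        """Sample one option index per decision from the current policy."""
        choice = {}
        for key, opts in self.space.items():
            probs = softmax(self.logits[key])
            choice[key] = self.rng.choices(range(len(opts)), weights=probs)[0]
        return choice

    def decode(self, choice):
        """Map sampled indices back to the options they stand for."""
        return {key: self.space[key][idx] for key, idx in choice.items()}

    def update(self, choice, reward):
        """REINFORCE step: raise the log-probability of choices that beat the baseline."""
        advantage = reward - self.baseline
        for key, idx in choice.items():
            probs = softmax(self.logits[key])
            for j, p in enumerate(probs):
                grad = (1.0 if j == idx else 0.0) - p  # d log pi(idx) / d logit_j
                self.logits[key][j] += self.lr * advantage * grad
        self.baseline = 0.9 * self.baseline + 0.1 * reward


def toy_reward(arch):
    """Placeholder for a proxy-task score; the preferences here are arbitrary."""
    score = 0.0
    if arch["upsample"] == "bilinear":
        score += 1.0
    if arch["normalization"] == "batch":
        score += 1.0
    if arch["shortcut"]:
        score += 1.0
    return score


def search(steps=200, seed=0):
    """Run the sample, evaluate, update loop and return the trained controller."""
    controller = Controller(SEARCH_SPACE, seed=seed)
    for _ in range(steps):
        choice = controller.sample()
        controller.update(choice, toy_reward(controller.decode(choice)))
    return controller
```

In a real search, `toy_reward` would be replaced by briefly training the sampled GAN on the proxy task and scoring its samples, which is by far the most expensive part of the loop.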
Dataset Used
To compare image generation results against current state-of-the-art hand-crafted GAN models, the researchers used two datasets, CIFAR-10 and STL-10. CIFAR-10 consists of 50,000 training images and 10,000 test images, each of 32 × 32 resolution, and the training set is used to train the AutoGAN model without any data augmentation. The STL-10 dataset is used to show the transferability of the architectures discovered by AutoGAN.
On the CIFAR-10 dataset, AutoGAN obtained an Inception score of 8.55 and a Fréchet Inception Distance (FID) of 12.42. On both datasets, AutoGAN established new state-of-the-art FID scores.
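For readers unfamiliar with the Inception score, the sketch below shows how the number is derived from classifier outputs: it is the exponential of the average KL divergence between each image's predicted class distribution and the marginal distribution over all generated images. The function name `inception_score` and the plain-list input are assumptions for illustration; in practice the probabilities come from a pretrained Inception network and the score is usually averaged over several splits of the generated set.

```python
import math


def inception_score(class_probs):
    """Inception score from per-image class probabilities.

    class_probs: list of probability vectors p(y|x), one per generated image.
    Returns exp(mean over x of KL(p(y|x) || p(y))). Higher is better.
    """
    n = len(class_probs)
    num_classes = len(class_probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[c] for p in class_probs) / n for c in range(num_classes)]
    total_kl = 0.0
    for p in class_probs:
        # Terms with p(y|x) = 0 contribute nothing to the KL divergence.
        total_kl += sum(
            pc * math.log(pc / marginal[c]) for c, pc in enumerate(p) if pc > 0.0
        )
    return math.exp(total_kl / n)
```

The score is lowest, at 1, when every image gets the same prediction, and it approaches the number of classes when each prediction is confident and the predictions are spread evenly across classes.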
How Is It Better
The AutoGAN framework employs a multi-level architecture search (MLAS) strategy by default. It can identify highly effective architectures on both the CIFAR-10 and STL-10 datasets, achieving competitive image generation results against current state-of-the-art, hand-crafted GAN models. In terms of Inception score, AutoGAN is slightly behind Progressive GAN and surpasses many recent strong competitors such as SN-GAN, improving MMD-GAN, Dist-GAN, MGAN and WGAN-GP. Under other GAN metrics such as Fréchet Inception Distance (FID), the model outperforms all current state-of-the-art models.
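As a reminder of what FID measures, the standard definition compares Gaussian fits to Inception features of real and generated images. The symbols below are not from the article: μ_r, Σ_r denote the mean and covariance of the real-image features, and μ_g, Σ_g those of the generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one, which is why a lower FID, unlike a lower Inception score, is the better result.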
Limitations of AutoGAN
- According to researchers, the key challenge lies in how to further improve the search algorithm efficiency
- Due to the high instability and hyperparameter sensitivity of GAN training itself, architecture search for GANs appears to be more challenging than NAS for image classification
- Finding an appropriate metric to evaluate and guide the search process is another difficult task encountered by the researchers
- Despite the preliminary success, the researchers stated that the model leaves room for improvement in future work
- The current search space of the AutoGAN is limited
- The model has not been tested on higher-resolution image synthesis tasks such as ImageNet
Similar Research
Last year, researchers from Rutgers University and Perspecta Labs developed a different model, also named AutoGAN, which counters adversarial attacks by enhancing the lower-dimensional manifold defined by the training data and by projecting perturbed data points onto it. The approach used a Generative Adversarial Network (GAN) with an autoencoder generator and a discriminator.
Outlook
Introduced by Ian Goodfellow in 2014, the GAN, or Generative Adversarial Network, is one of the most popular approaches in deep learning. GANs have certain advantages: they do not require labelled data to learn the internal representations of the data, and they can generate data that closely resembles real data. They have proved to be very good at reconstructing manifolds of natural image datasets in the original high-dimensional input spaces. To improve the quality of generated images, many researchers have proposed sophisticated neural network architectures.