“The art of knowing is knowing what to ignore.” – Rumi, Persian poet
The researchers approach the supervised GAN problem from a different, poetic perspective: that of Rumi, the Persian poet. “The art of knowing is knowing what to ignore,” wrote Rumi, and the IISc researchers tapped this notion to make it work for GANs in a way that surpasses existing methods. They divide the training data into positive and negative samples and make their GAN framework ignore the negative ones. They call this the Rumi framework.
The Rumi Framework
Rumi’s saying, in the context of machine learning, can be interpreted as empowering models to learn by ignoring, and to learn from counterexamples as well as from examples, wrote the authors. In Rumi-GAN, as the authors have named their variant of GAN, the discriminator sorts its inputs into one of three categories:
- Positives, samples from the target distribution;
- Negatives, samples that must be avoided; and
- Fakes, samples drawn from the generator.

The generator is tasked with learning the distribution of the positive samples while simultaneously learning to avoid the negative ones.
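The three-way split can be sketched with LSGAN-style squared-error losses. A minimal numpy sketch follows; the target values (1 for positives, 0 for negatives and fakes) and the equal weighting of the three terms are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def rumi_lsgan_d_loss(d_pos, d_neg, d_fake):
    """Discriminator loss: push scores on positives toward 1,
    and scores on negatives and fakes toward 0 (illustrative targets)."""
    return (np.mean((d_pos - 1.0) ** 2)
            + np.mean(d_neg ** 2)
            + np.mean(d_fake ** 2))

def rumi_lsgan_g_loss(d_fake):
    """Generator loss: make fakes score like positives."""
    return np.mean((d_fake - 1.0) ** 2)

# Toy discriminator outputs for each of the three sample groups
d_pos = np.array([0.9, 0.8])   # scores on positive (target) samples
d_neg = np.array([0.1, 0.2])   # scores on negative (to-be-avoided) samples
d_fake = np.array([0.3, 0.4])  # scores on generated samples

print(rumi_lsgan_d_loss(d_pos, d_neg, d_fake))  # 0.175
print(rumi_lsgan_g_loss(d_fake))                # 0.425
```

The key difference from a plain LSGAN is the extra negative-sample term in the discriminator loss: the generator, trained against this discriminator, is steered away from the region the negatives occupy.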
For the experiments, the authors considered the MNIST, Fashion-MNIST, CelebA and CIFAR-10 datasets. The GAN models are coded in TensorFlow 2.0, and the generator and discriminator architectures are based on the deep convolutional GAN (DCGAN).
To evaluate the results across different GANs, the authors used the Fréchet inception distance (FID) and precision-recall (PR). FID is a metric for comparing the quality of the images generated by GANs. For the experiments, an InceptionV3 model loaded with ImageNet pre-trained weights was used to generate the embeddings, over which the FID scores are computed. The researchers used both FID scores and PR curves to analyse the performance of Rumi-LSGAN against the baselines. FID scores, wrote the authors, provide an objective assessment of image quality but not of the diversity of the learnt distribution; hence, PR behaviour is considered too.
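The FID itself is just the Fréchet distance between two Gaussians fitted to the embedding sets. A minimal pure-numpy sketch follows; the random arrays are stand-ins for InceptionV3 embeddings, which the real metric would use.

```python
import numpy as np

def _sqrtm_psd(mat):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def fid(emb_a, emb_b):
    """Frechet distance between Gaussians fitted to two embedding sets."""
    mu_a, mu_b = emb_a.mean(axis=0), emb_b.mean(axis=0)
    cov_a = np.cov(emb_a, rowvar=False)
    cov_b = np.cov(emb_b, rowvar=False)
    # Tr((cov_a cov_b)^1/2) computed through a symmetric PSD product
    sqrt_b = _sqrtm_psd(cov_b)
    cross = np.trace(_sqrtm_psd(sqrt_b @ cov_a @ sqrt_b))
    return float(np.sum((mu_a - mu_b) ** 2)
                 + np.trace(cov_a) + np.trace(cov_b) - 2.0 * cross)

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))  # stand-in for real embeddings
gen = rng.normal(0.5, 1.0, size=(500, 8))   # stand-in for generated ones
print(fid(real, real))  # identical sets give ~0
print(fid(real, gen))   # a shifted distribution gives a larger score
```

Lower FID means the generated distribution sits closer to the real one; a score near zero indicates nearly matching statistics.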
For the first experiment, the five even digit classes of MNIST are pooled into the positive class and the five odd digit classes into the negative class. Rumi-LSGAN is then pitted against LSGAN and ACGAN on MNIST. Here, the LSGAN is trained only on the positive-class data, whereas the other two models are trained using both positive- and negative-class data. The authors also consider splits in which the two classes overlap, for instance:

Positive: 1, 2, 4, 5, 7, and 9, and
Negative: 0, 2, 3, 6, 8, and 9.
According to the authors, the results indicate that Rumi-LSGAN consistently generates sharper images from the positive class, unlike the other two models. Since the Rumi formulation can be applied to almost all varieties of GANs, it has the potential not only to make GANs more efficient but also to open up more research into how GANs, or neural networks in general, can benefit from ignorance. GANs have mostly been associated with infamous deepfake incidents; the consequences are dire, and the networks are prone to adversarial attacks. So, insights like Rumi-GAN will only help us understand these duelling networks better.
That said, the authors posited that one significant application of their work is generating samples that are usually under-represented and, in a way, making machine learning tasks more inclusive.
Neural network-based image classification and supervised image generation are data-intensive tasks. These models, when trained on unbalanced data with insufficient racial diversity, tend to inherit the implicit biases present in the data.
They also admit that while the Rumi formulation could be used to overcome the biases in the dataset, it cannot overcome the biases of the data scientist.
Read the original paper here.