Jason Antic, the creator of DeOldify, one of the most popular applications of Generative Adversarial Networks (GANs), recently tweeted that the new versions of DeOldify no longer use GANs, relying instead on something related to the perceptual loss applied in super-resolution.
DeOldify used GANs to colourize both images and video, with an emphasis on producing stable colourized video. It relies on a GAN training variant called NoGAN, developed for DeOldify. In the next section, we briefly discuss how GANs were used previously and what the new alternative might be, which the creator has so far kept under wraps.
Overview Of NoGAN
According to the DeOldify GitHub repo, NoGAN keeps the benefits of GAN training while spending far less time on GAN training itself. Most of the training time goes into pretraining the generator and the discriminator separately, using more straightforward, fast and reliable conventional methods. This lets the generator gain full realistic colourization capabilities within a short amount of time, where progressively resized GAN training would otherwise have taken days.
What’s Wrong With GANs
A discussion on one of the most popular ML forums argued that GANs work well because standard image-processing losses rely on pixel-to-pixel comparisons, which are not an effective way to compare the information in two images. GANs, by contrast, learn their own loss from features extracted by a discriminator.
Regarding NoGAN, Jason said that they originally used perceptual loss along with GAN loss. However, GANs tend to go haywire with glitches and introduce undesirable constraints, and finding a stopping point based purely on visual inspection makes repeatable experiments extremely hard.
The loss during NoGAN training has two parts:
- Perceptual loss (or feature loss) based on VGG16
- A loss score from the critic
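The two-part loss above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration rather than DeOldify's actual code: the function names, the `critic_weight` parameter, the use of mean squared error on feature vectors, and the negative-log form of the critic term are all hypothetical stand-ins.

```python
import math


def feature_loss(generated_features, target_features):
    """Mean squared difference between two feature vectors.

    In a real setup the vectors would come from a pre-trained network
    such as VGG16; here they are plain lists of floats (an assumption).
    """
    squared = [(g - t) ** 2 for g, t in zip(generated_features, target_features)]
    return sum(squared) / len(squared)


def critic_loss(critic_score):
    """Penalty derived from the critic's opinion of the generated image.

    critic_score is assumed to be a probability in (0, 1] that the image
    is real; the penalty grows as the critic becomes less convinced.
    """
    return -math.log(critic_score)


def generator_loss(generated_features, target_features, critic_score,
                   critic_weight=1.0):
    """Combine both parts; critic_weight is a hypothetical balancing factor."""
    return (feature_loss(generated_features, target_features)
            + critic_weight * critic_loss(critic_score))


# Same features, but the critic is less convinced in the second call,
# so the second loss is larger.
convincing = generator_loss([0.2, 0.8], [0.2, 0.8], critic_score=0.9)
unconvincing = generator_loss([0.2, 0.8], [0.2, 0.8], critic_score=0.5)
print(convincing, unconvincing)
```

The point of the sketch is only that the generator is pulled in two directions at once: towards matching the target's features and towards convincing the critic.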
Perceptual loss functions and per-pixel loss functions are both widely used to measure the similarity between two images. The difference lies in what they compare: a per-pixel loss sums the errors (absolute or squared) between corresponding pixels, whereas a perceptual loss compares the high-level features that a pre-trained network, such as VGG16, extracts from each image. Networks trained with perceptual losses are claimed to generate higher-quality images, and in the original style-transfer experiments they ran as much as three orders of magnitude faster than optimization-based methods.
Using perceptual loss alongside a GAN reduces the role of the discriminator, because the perceptual loss takes over part of the discriminator’s job of judging the output. Without a discriminator at all, a Generative Adversarial Network is no longer adversarial.
Is There A Better Alternative?
DeOldify creator Jason hinted that their latest experimental model uses something similar to perceptual loss in super-resolution. The closest work in this area was published by Stanford researchers under the title “Perceptual Losses for Real-Time Style Transfer and Super-Resolution.”
The comparison picture from the paper sets the results of perceptual loss against other methods, and the images produced with perceptual loss are clearly sharper. To achieve this, the Stanford researchers trained feedforward transformation networks using perceptual loss functions instead of per-pixel loss functions. Perceptual loss functions depend on high-level features from a pre-trained loss network and measure image similarity more robustly than per-pixel losses.
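A toy sketch can make the contrast between the two kinds of loss concrete. It is not the Stanford method: the hand-written `toy_features` function (mean brightness and mean edge strength) is a hypothetical stand-in for the pre-trained loss network, and the one-dimensional "images" are invented for illustration.

```python
def per_pixel_loss(a, b):
    """Mean squared error between corresponding pixels."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def toy_features(image):
    """Hypothetical stand-in for a pre-trained loss network.

    Returns two coarse features: mean brightness and mean edge strength
    (average absolute difference between neighbouring pixels).
    """
    brightness = sum(image) / len(image)
    edges = [abs(image[i + 1] - image[i]) for i in range(len(image) - 1)]
    return [brightness, sum(edges) / len(edges)]


def perceptual_loss(a, b):
    """Mean squared error between features instead of raw pixels."""
    return per_pixel_loss(toy_features(a), toy_features(b))


target = [0, 0, 0, 1, 1, 1, 0, 0]   # a bright bar on a dark background
shifted = [0, 0, 1, 1, 1, 0, 0, 0]  # the same bar, moved one pixel left
flat = [0.375] * 8                  # a featureless grey with the same mean

print("per-pixel :", per_pixel_loss(target, shifted), per_pixel_loss(target, flat))
print("perceptual:", perceptual_loss(target, shifted), perceptual_loss(target, flat))
```

In this toy setup the per-pixel loss rates the featureless grey image as closer to the target than the shifted bar, while the feature-based loss prefers the shifted bar, which still contains the same structure. That is the kind of blur-favouring behaviour that perceptual losses are meant to avoid.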
However, the DeOldify creator wrote in the NoGAN documentation that perceptual loss isn’t sufficient by itself to produce good results.
This approach with perceptual loss and super-resolution might be only one component of the new DeOldify software, and we will have to wait for the creator to spill some secrets. GANs have garnered great attention ever since a GAN-generated portrait sold for $432,500 at a Christie’s auction in 2018. From generating faces that never existed to style transfer and deepfake videos, GANs have had a terrific run so far. But with DeOldify’s announcement, it looks like GANs might take a back seat, at least in colourization and super-resolution imagery.