Last updated April 25, 2024

Adobe Unveils World’s First Large-Scale GAN-based Model for Video Super-Resolution

VideoGigaGAN is capable of upscaling videos up to 8x, i.e. from 128x128 to 1024x1024 resolution.


Published on April 25, 2024

by Donna Eva


In a recent breakthrough, Adobe unveiled VideoGigaGAN, a new generative AI model capable of upscaling videos while reducing temporal flickering and blurriness.

In what could be a game-changer for video upscaling, the researchers addressed a key weakness of current video super-resolution (VSR) models: they preserve temporal consistency but produce blurrier outputs.

They proposed using a GAN-based approach as opposed to a regression-based approach, in turn creating the world’s first large-scale GAN-based model for VSR.

“VSR has two main challenges. The first is to maintain temporal consistency across output frames. The second challenge is to generate high-frequency details in the upsampled frames,” said Yiran Xu, one of the researchers.

However, according to Xu, most VSR models have only addressed maintaining temporal consistency.

Several big tech companies, including Microsoft and Intel, have released VSR models over the past couple of years. More notably, NVIDIA and AMD recently released DLSS 3 and FSR 3, respectively, both promising video upscaling capabilities using AI.

While the two compete in the gaming segment, Adobe has reportedly delivered a model capable of upscaling videos up to 8x, i.e. from 128×128 to 1024×1024 resolution.

Problems With Current VSR Models

According to the researchers, a significant problem with current VSR models is the quality of the upscaled video. Although all major VSR models address temporal consistency, their outputs tend to be on the blurrier side.

Users reported a similar problem with NVIDIA’s DLSS, which struggles with real-life videos.

“It’s a bit of a hit or miss, with the most noticeable artefact being smearing or a Vaseline-like filter on human subjects kinda like those touch-up camera filters that Chinese Android OEMs like to include in their camera app,” a user reported on Reddit.

This is understandable. NVIDIA’s DLSS and AMD’s FSR primarily focus on improving frame rates and image quality for games. DLSS can take inputs such as motion vectors directly from the game engine, but a regular video carries no such data, so those inputs are unavailable.

Building on its DLSS work, NVIDIA also released its own VSR model last year, specifically for upscaling videos in Google Chrome and Microsoft Edge.

However, it came with its own set of criticisms, the most notable being that while NVIDIA’s VSR model generally did a good job, it struggled with fast-moving videos.

“What we can say is that slow-moving videos (like NVIDIA’s samples) provide the best results, while faster-paced stuff like sports is more difficult, as the frame-to-frame changes can be quite significant,” said Jarred Walton of Tom’s Hardware.

Similarly, AMD released its own video upscaling algorithm to compete with NVIDIA’s VSR model. However, it is unclear whether it actually uses AI, and a general consensus has yet to form, as it was only released in January this year.

What’s the Catch?

Coming back to VideoGigaGAN, initial reactions have been enthusiastic, especially as it promises much more than Topaz Video AI delivers.

“Topaz is really weak, it’s made for like clean 720p video that can upscale to max 2x before it’s ruined by artefacts (sic),” another Reddit user said. This is something to look forward to when Adobe makes VideoGigaGAN available to its users.

Adobe promises 8x upscaling, but whether the model actually delivers on this front remains to be seen, since we only have sample videos to go by. Real-world use of the model could paint a completely different picture of its capabilities.

https://twitter.com/dreamingtulpa/status/1782423003743002692

Some have even likened it to popular sci-fi tech, such as the near-impossible “enhance” capabilities seen in CSI or Deckard’s Photo Inspector in Blade Runner.

The real-life applications are obvious as well, from upscaling historical videos to aiding investigations reliant on blurry CCTV footage. However, the debate over whether AI-upscaled videos should be admissible as evidence in a court of law is still ongoing.

As for Adobe itself, the model could be integrated into Premiere Pro, letting users upscale their own footage or ensure that all their raw video is of a consistent quality.

However, that is still a while away, as the researchers admit that the model struggles with longer videos. In particular, they specify that videos of over 200 frames are difficult to upscale using the current model.

“VideoGigaGAN: Towards Detail-rich Video Super-Resolution. Pretty impressive results!” Ömer Karışman (@okarisman) wrote on Twitter on April 23, 2024. https://t.co/9GGqfDB8ew

“Additionally, our model does not perform well in handling small objects, such as text and characters, as the information pertaining to these objects is significantly lost in the LR video input,” they said.

However, if these limitations are resolved, Adobe could be leagues ahead of competitors in achieving near-perfect VSR capabilities.

In the meantime, Adobe has released Firefly Image 3, the latest of its creative generative AI models.
