Last month, Adobe partnered with NVIDIA to enter the generative AI space with Adobe Firefly. But now that the community has got its hands on the new product from one of the world’s largest creative software companies, it seems there’s very little fire here.
Dr Jim Fan, an AI scientist at NVIDIA, recently compared some outputs from Midjourney and Firefly, and noticed an interesting trend. Firefly appeared to be at roughly the level Midjourney was at a year ago, when it launched its V1 model. Moreover, Adobe’s approach to avoiding copyrighted content may have resulted in an inferior product compared to Midjourney.
Firefly is a mile behind
Firefly leverages NVIDIA’s foundational models, known as NVIDIA Picasso, to generate images. Under this program, NVIDIA allows enterprise customers to train one of its ‘Edify’ models on custom datasets and then call it through an API for integration into applications.
The likely reason Adobe went with NVIDIA’s models was the ability to train them on its own data. Adobe also owns and operates a stock image platform known as Adobe Stock, which hosts over 200 million images, ranging from photos to vector graphics. The entirety of Adobe Stock, along with exclusive fully-licensed images, makes up the dataset used to train the model behind Firefly, absolving it of any legal or ethical issues.
However, it seems this approach has handicapped Firefly’s image generation capabilities and resulted in an inferior product. A cursory look at Dr Fan’s tweet thread shows miles of difference between images generated by Midjourney and Firefly.
The researcher tested eight prompts (which were optimised for Midjourney) on both Midjourney v5 and the beta version of Adobe Firefly. While the latter performed well on landscapes and abstract art, it fell short when it came to reproducing copyrighted characters such as Deadpool, Super Mario, and Pikachu.
This seems to be a problem stemming from Adobe’s dataset, as Adobe Stock contains a relatively low number of images of copyrighted content. In addition to this, Adobe has also made sure to source the data responsibly, as evidenced in a tweet by Valentin Deschaintre, a research scientist at Adobe, where he stated, “I am particularly happy about the data sourcing policy, which is extremely strict (and this is true internally too, not just a facade) to avoid using data which carries the risk of harming artists.”
Midjourney, on the other hand, has scraped the internet for over 5 billion images, which means there is no lack of copyrighted content for the model to ‘learn’ what certain characters look like.
When it comes to non-copyrighted content, Firefly still seems to struggle with many of the common drawbacks of diffusion models. The model does not understand spatial concepts and fails to generate hands or faces accurately. However, lower quality might not be a dealbreaker for Adobe’s customers, who will likely prefer protection from copyright claims over the better output of Midjourney.
Beyond image generation
Image generation only reflects one of the capabilities of Adobe Firefly, with the illustrated use-case only showing its text-to-image capabilities. In reality, Firefly is a portal for testing features that will eventually be integrated into Adobe products like Photoshop and Premiere Pro.
Apart from generating images, some Firefly features include generating text effects, outpainting images, upscaling, depth- and 3D-to-image capabilities, text-to-template, text-to-vector graphics, and a feature termed ‘conversational editing’. Using this feature, users can describe changes to graphics in natural language, with Firefly taking care of the rest.
Given the bigger picture of what Adobe is trying to achieve with Firefly, it seems that its technical capabilities should be the last topic of discussion. Integrating generative AI into creators’ workflows with one click is a huge step forward. Moreover, considering the professional nature of Adobe’s users, it is clear that the origin of Firefly’s datasets should be unquestionably legal. Another facet to this is that Adobe would risk alienating a large portion of its users if it went the Midjourney way, i.e. training the model on web scrapes. Both Midjourney and Stable Diffusion have come under fire from artists for doing so, with lawsuits on the horizon for web-scraped datasets.
Even though Firefly is legally clean, there is an interesting dichotomy to be found between Firefly and Midjourney. While Firefly has taken the approach of legally cleansing itself at the expense of the product, Midjourney has openly accepted that it steals the works of artists, all to create a good product. Firefly’s approach is a huge plus for companies that want no IP or copyright issues, but it seems that a middle ground must be found between the two approaches for the future of generative AI.