Last updated February 28, 2024

The Curious Fate of Image Generation Tools

Compared to text-generation tools, image-generation platforms like Midjourney, Stability AI and DALL.E have seen slower adoption


Published on February 23, 2023

by Vandana Nair


Generative AI models that complement creators in the artistic space have seen rapid adoption. DALL.E 2, for instance, whose API OpenAI released in November 2022, has been widely adopted by developers and companies building image-editing applications and tools. Microsoft has also built DALL.E 2 into its ‘Designer’ app and into ‘Image Creator’, which is part of Bing and Microsoft Edge. Stable Diffusion, released by Stability AI in August 2022, has also picked up. The list goes on.

Amid the hype around generative AI, one question creeps in: are image-generating platforms still growing and being adopted into real work ecosystems, or was the initial surge simply early adopters joining the buzz without any real purpose?

The Rise & Fall of Generative AI

While scepticism remains about commercial use cases, the data says otherwise. As per SimilarWeb, monthly visits to a few of the generative AI platforms are on an upward trajectory. For instance, Midjourney's monthly visits have jumped by over 55% in the past two months. See the infographic below for the numbers.

Midjourney, which had a first-mover advantage among text-to-image platforms, saw a spike in usage, with 29.1 million visits in January 2023. Adoption of the platform has been on the rise. For instance, architects have experimented with the platform and are exploring ways to apply it in real-world projects. At the OU Gibbs College of Architecture, students used Midjourney in a semester-long Native American studies assignment to generate indigenous architectural designs.

https://twitter.com/business/status/1620619221066547202

As of November 2022, DALL.E 2, a text-to-image platform by OpenAI, had over three million users creating over four million images daily. Its beta phase launched at around the same time as Midjourney's.

Last month, Shutterstock launched an AI text-to-image generator powered by OpenAI’s DALL.E and LG’s EXAONE technology. The model is trained on datasets licensed from Shutterstock. Shutterstock also developed a revenue-sharing model under which contributors whose content is used to train the model receive a share of the earnings from downloads of AI-generated content on the platform.

Image Generation Problems

Compared to text-generation tools, image-generation platforms like Midjourney, Stability AI and DALL.E have seen slower adoption. There are various reasons for that, one being concerns over copyright infringement.

These machine learning platforms are trained on vast numbers of images, usually scraped from the internet without proper attribution. This has led artists to sue companies for using their images without permission. Illustrators Sarah Andersen and Kelly McKernan and visual artist Karla Ortiz filed a class action suit in California against Stability AI, Midjourney and DeviantArt for “direct and vicarious copyright infringement”. Getty Images, which provides stock images, has also sued Stability AI for copyright infringement.

Emad Mostaque, the founder of Stability AI, claimed that AI image-generating platforms do not replace artists. He believes that AI creates “brand new forms of expression” and that artists will not be replaced, just as photographers and digital artists didn’t replace conventional artists. The debate over whether this technology would eliminate specific job roles, such as photo retouchers or daily illustrators, is still on.

On the brighter side, the technology is also bringing in new opportunities: an AI consultancy company in Illinois, for instance, has posted a job opening for a ‘prompt engineer’!

New Breed of Image Generation Tools

Meanwhile, efforts to tackle infringements caused by text-to-image apps are on. Researchers at the University of Chicago created software called Glaze that aims to prevent AI models from learning an artist’s style. The researchers used ‘style transfer’ algorithms, which re-render an image in a different style without changing its content, to identify the specific features that change when an image is transformed into another style. The software then perturbs those features to trick AI models into “recognizing a different style from what the art is”.
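
To make the idea of perturbing style features concrete, here is a toy sketch in Python. It is not Glaze's actual algorithm: the linear `style_features` extractor, the `cloak` function and all parameter values are hypothetical stand-ins chosen for illustration. It only shows the general pattern of nudging pixels, within a small per-pixel budget, so that a feature extractor sees something closer to a different target style.

```python
from typing import List

Vector = List[float]


def style_features(image: Vector, weights: List[Vector]) -> Vector:
    """Hypothetical linear 'style' extractor: one dot product per feature."""
    return [sum(w * x for w, x in zip(row, image)) for row in weights]


def distance_sq(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cloak(image: Vector, target_style: Vector, weights: List[Vector],
          epsilon: float = 0.05, steps: int = 200, lr: float = 0.01) -> Vector:
    """Shift the image's features toward target_style while changing
    each pixel by at most epsilon (assumed budget, not a Glaze setting)."""
    delta = [0.0] * len(image)
    for _ in range(steps):
        perturbed = [x + d for x, d in zip(image, delta)]
        feats = style_features(perturbed, weights)
        residual = [f - t for f, t in zip(feats, target_style)]
        # Gradient of ||W(x + delta) - target||^2 with respect to delta.
        grad = [2 * sum(weights[k][i] * residual[k] for k in range(len(weights)))
                for i in range(len(image))]
        # Gradient step, then clip so the change stays within the budget.
        delta = [max(-epsilon, min(epsilon, d - lr * g))
                 for d, g in zip(delta, grad)]
    return [x + d for x, d in zip(image, delta)]


if __name__ == "__main__":
    image = [0.2, 0.4, 0.6, 0.8]
    weights = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
    target = [0.5, 0.5]
    cloaked = cloak(image, target, weights)
    print(distance_sq(style_features(image, weights), target))
    print(distance_sq(style_features(cloaked, weights), target))
```

The small `epsilon` budget is the design point: the picture changes little for a human viewer, while the extracted features drift toward the decoy style.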

Versatile Implementation

A key limitation of AI-generated images has been the lack of creative control over the output, since each image is created from a single text prompt. That is now being addressed with ControlNet, a neural network structure that adds conditioning to Stable Diffusion. It tackles the problem of spatial consistency and accepts additional input conditions to get the desired output image. Sketches, outlines, and depth maps can be used to control the diffusion model.
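
The way an extra condition can steer a frozen model may be easier to see in a toy sketch. The Python below is an assumption-laden illustration, not the real ControlNet code: `ControlledBlock`, `scale_layer` and the single-number "weights" are hypothetical. It mirrors one idea from the ControlNet design: a locked base layer, a trainable copy that also sees the condition, and a zero-initialized connection between them, so the combined block starts out behaving exactly like the base model.

```python
from typing import List

Vector = List[float]


def scale_layer(x: Vector, weight: float) -> Vector:
    """Stand-in for a neural network layer: multiply every value by a weight."""
    return [weight * v for v in x]


class ControlledBlock:
    """Toy block: frozen base layer plus a condition-aware trainable copy."""

    def __init__(self, base_weight: float) -> None:
        self.base_weight = base_weight  # locked: never updated in training
        self.copy_weight = base_weight  # trainable copy of the base layer
        self.gate = 0.0                 # zero-initialized connection

    def forward(self, x: Vector, condition: Vector) -> Vector:
        base_out = scale_layer(x, self.base_weight)
        # The copy sees the input plus the condition (e.g. a sketch or depth map).
        control_in = [v + c for v, c in zip(x, condition)]
        control_out = scale_layer(control_in, self.copy_weight)
        # With gate == 0 the condition has no effect on the output.
        return [b + self.gate * c for b, c in zip(base_out, control_out)]


if __name__ == "__main__":
    block = ControlledBlock(base_weight=2.0)
    x, condition = [1.0, 2.0], [0.5, -0.5]
    print(block.forward(x, condition))  # same as the base layer alone
    block.gate = 0.5                    # pretend training has opened the gate
    print(block.forward(x, condition))  # now the condition shifts the output
```

Starting the connection at zero means the condition can only be "learned in" gradually, which leaves the original model's behaviour intact at the outset.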

ControlNet. Source: bootcamp.uxdesign

A Much-Needed Push

With a combination of ChatGPT and DALL.E, companies are now finding ways to use these tools in both consumer and enterprise segments. Coca-Cola is the first major consumer goods company to announce a partnership with OpenAI and Bain & Company, covering marketing, ad copy, messaging and other consumer experiences.

New York-based food tech startup Lunchbox, which offers digital marketing and ordering systems for restaurants, has partnered with OpenAI to use DALL.E 2 for restaurateurs who need food photography. Lunchbox has built a separate AI food image generator where anyone can type out a description of a menu item and get up to four images for their online menu. The goal is to help restaurants save on marketing costs.
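
As a rough illustration of what a "describe a dish, get up to four images" request could look like, the sketch below builds a request body for OpenAI's image generation endpoint (`POST /v1/images/generations`, with its `model`, `prompt`, `n` and `size` fields). This is not Lunchbox's implementation: `build_image_request`, the four-image cap taken from the article, and the default size are assumptions for illustration, and the code only prepares the payload without sending it.

```python
import json

MAX_IMAGES = 4  # "up to four images", as described in the article
ALLOWED_SIZES = {"256x256", "512x512", "1024x1024"}  # sizes DALL.E 2 accepts


def build_image_request(description: str, count: int = MAX_IMAGES,
                        size: str = "512x512") -> dict:
    """Build (but do not send) a request body for the images endpoint."""
    description = description.strip()
    if not description:
        raise ValueError("description must not be empty")
    if not 1 <= count <= MAX_IMAGES:
        raise ValueError(f"count must be between 1 and {MAX_IMAGES}")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {sorted(ALLOWED_SIZES)}")
    return {"model": "dall-e-2", "prompt": description, "n": count, "size": size}


if __name__ == "__main__":
    payload = build_image_request(
        "Margherita pizza on a wooden board, overhead shot")
    print(json.dumps(payload, indent=2))
```

Sending the payload would additionally require an API key and an HTTP client, which are left out here on purpose.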

Overall, image generation platforms, both new and early entrants, are witnessing adoption across industries. But to accelerate this growth, companies need to focus on integrating them with other generative AI tools, such as those for text and audio.


Vandana Nair

With a rare blend of engineering, MBA, and journalism degrees, Vandana Nair brings a unique combination of technical know-how, business acumen, and storytelling skills to the table. Her insatiable curiosity for all things startups, businesses, and AI technologies ensures that there's always a fresh and insightful perspective to her reporting.