The Curious Fate of Image Generation Tools

When compared to text-generation tools, image-generation platforms like Midjourney, StabilityAI and DALL.E have seen slower adoption

Generative AI models that complement creators in the artistic space have seen rapid adoption. DALL.E 2, for instance, whose API OpenAI opened to developers in November 2022, has been widely adopted by developers and companies building image-editing applications and tools. Microsoft has also implemented DALL.E 2 in its ‘Designer’ app and in ‘Image Creator’, available in Bing and Microsoft Edge. Stable Diffusion, released by StabilityAI in August 2022, has also picked up. The list goes on.

Amid the hype around generative AI, one question that creeps in is whether image-generation platforms are still growing and being adopted in real workflows, or whether the early uptake was simply a way to be part of the buzz without any real purpose.

The Rise & Fall of Generative AI 

While scepticism remains about commercial use cases, the data says otherwise. As per SimilarWeb, monthly visits to a few generative AI platforms are on an upward trajectory. For instance, Midjourney’s monthly visits have jumped by over 55% in the past two months. See the infographic below for the numbers.


Midjourney, which had a first-mover advantage among text-to-image platforms, saw a spike in its user base, with 29.1 million visits in January 2023. Adoption of the platform has been on the rise. For instance, architects have experimented with the platform and are exploring ways to apply it in the real world. At the OU Gibbs College of Architecture, students have used Midjourney in a semester-long Native American studies assignment to generate indigenous architectural designs.

​​https://twitter.com/business/status/1620619221066547202





As of November 2022, DALL.E 2, OpenAI’s text-to-image platform, had over three million users creating over four million images daily. Its beta phase was launched at around the same time as Midjourney’s.

Last month, Shutterstock launched an AI text-to-image generator powered by OpenAI’s DALL.E and LG’s EXAONE technology. The model is trained on datasets licensed from Shutterstock. Shutterstock has also developed a revenue-sharing model: contributors whose content is used to train the model receive a share of the earnings from downloads of AI-generated content on the platform.

Image Generation Problems 

When compared to text-generation tools, image-generation platforms like Midjourney, StabilityAI and DALL.E have seen slower adoption. There are various reasons for that, one being copyright infringement issues. 

These machine learning platforms are trained on a plethora of images, usually scraped from the internet without proper attribution. This has led artists to sue companies for using their images without permission. Illustrators Sarah Andersen and Kelly McKernan and visual artist Karla Ortiz have filed a class action suit in California against StabilityAI, Midjourney and DeviantArt for “direct and vicarious copyright infringement”. Getty Images, which provides stock images, has also sued StabilityAI for copyright infringement.

Emad Mostaque, the founder of Stability AI, claims that AI image-generating platforms do not replace artists. He believes that AI creates “brand new forms of expression” and that artists will not be replaced, just as photographers and digital artists did not replace conventional artists. The debate over whether these tools would eliminate specific job roles, such as photo retouchers or daily illustrators, is still on.

On the brighter side, it will also bring in new opportunities, like an AI consultancy company in Illinois posting a job opening for a ‘prompt engineer’!

New Breed of Image Generation Tools 

Meanwhile, efforts to tackle infringement by text-to-image apps are under way. Researchers at the University of Chicago have created software called Glaze that aims to prevent AI models from learning an artist’s style. It employs ‘style transfer’ algorithms, which recreate an image in another style without changing its content, to identify the specific features that change when an image is transformed into that style. The software then perturbs those features to trick AI models into “recognizing a different style from what the art is”.
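The idea of nudging an image so that a model reads its style differently can be illustrated with a toy sketch. This is not Glaze's actual algorithm: the linear `style_score`, the `weights`, the `budget` and the sample pixel values are all hypothetical stand-ins, and the perturbation is a plain sign-of-gradient step on that linear score.

```python
def style_score(pixels, weights):
    """Toy 'style feature': a weighted sum of pixel values (hypothetical)."""
    return sum(p * w for p, w in zip(pixels, weights))


def cloak(pixels, weights, budget=2):
    """Shift each pixel by at most `budget` in the direction that raises
    the toy style score, keeping values in the 0..255 range."""
    cloaked = []
    for p, w in zip(pixels, weights):
        # For a linear score, the gradient with respect to a pixel is its weight,
        # so the sign of the weight gives the direction of the step.
        step = budget if w > 0 else -budget if w < 0 else 0
        cloaked.append(min(255, max(0, p + step)))
    return cloaked


# Made-up four-pixel "artwork" and made-up feature weights.
artwork = [120, 64, 200, 33]
weights = [0.5, -0.25, 0.0, 1.0]

protected = cloak(artwork, weights)
print(protected)
print(style_score(artwork, weights), style_score(protected, weights))
```

Each pixel moves by no more than two intensity levels, yet the score the toy "model" reads shifts. Real cloaking tools work against deep feature extractors rather than a weighted sum.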

Versatile Implementation

AI-generated images have been limited by a lack of creative control over the output, as each image is created from a single text prompt. That is now being addressed with ControlNet, a neural network structure that adds conditional control to Stable Diffusion. It tackles the problem of spatial consistency and accepts additional input conditions to steer the output image. Sketches, outlines and depth maps can all be used to control the diffusion model.

ControlNet. Source: bootcamp.uxdesign
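To make the notion of an outline used as a control input concrete, here is a minimal sketch of how such a map might be derived from a picture. It is an illustration under stated assumptions, not ControlNet's own preprocessing: the function name `edge_map`, the `threshold` value and the tiny sample image are hypothetical, and a simple Sobel gradient stands in for whatever edge detector a real pipeline would use.

```python
def edge_map(gray, threshold=100):
    """Return a binary outline (0 or 255) for a grayscale image
    given as a list of rows of integer intensities."""
    height, width = len(gray), len(gray[0])
    # Sobel kernels for horizontal and vertical intensity changes.
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    out = [[0] * width for _ in range(height)]
    # Border pixels are left at 0 because the 3x3 window does not fit there.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = gray[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * pixel
                    gy += ky[dy][dx] * pixel
            magnitude = (gx * gx + gy * gy) ** 0.5
            out[y][x] = 255 if magnitude >= threshold else 0
    return out


# Toy 6x6 image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
control = edge_map(image)
for row in control:
    print(row)
```

The resulting map marks only the vertical boundary between the two halves. In a conditioned pipeline, a map like this would be supplied alongside the text prompt so that the generated image follows the same outline.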

A Much-Needed Push 

With a combination of ChatGPT and DALL.E, companies are now finding ways to apply these tools in both consumer and enterprise segments. Coca-Cola is the first major consumer goods company to announce a partnership with OpenAI and Bain & Company, covering marketing, ad copy, messaging and other consumer experiences.

New York-based food tech startup Lunchbox, which offers digital marketing and ordering systems for restaurants, has partnered with OpenAI to use DALL.E 2 for restaurateurs who need food photography. Lunchbox has built a separate AI food image generator, where anyone can type out a description of a menu item and get up to four images for their online menu. The goal is to help restaurants save on marketing costs.

Overall, image generation platforms, both new and early entrants, are witnessing adoption across industries. But to accelerate this growth, companies need to focus on integrating them with other generative AI tools, such as those for text and audio.


Vandana Nair
As a rare breed of engineering, MBA, and journalism graduate, I bring a unique combination of technical know-how, business acumen, and storytelling skills to the table. My insatiable curiosity for all things startups, businesses, and AI technologies ensures that I'll always bring a fresh and insightful perspective to my reporting.
