Stability AI, Midjourney Face Legal Woes As Shutterstock Settles with Artists

The lawsuit might force the generative AI industry to do as Shutterstock did.

Over the weekend, news emerged that a class action lawsuit had been filed against generative AI companies Stability AI and Midjourney, and image hosting platform DeviantArt. The lawsuit, filed by three artists, seeks compensation for damages caused by these companies along with an injunction to prevent ‘further harms’. It is just the latest in a long line of protests by artists against the use of uncredited art to train generative AI algorithms.

Notably, the legal team behind the lawsuit is the same one that filed a class action lawsuit against GitHub Copilot along the same lines. Led by lawyer Matthew Butterick, the lawyers are on a mission to prevent the misuse of people’s hard work as training data for AI. The primary point of contention is the use of the LAION-5B dataset, which contains billions of uncredited, copyrighted images.

On the other hand, a hopeful trend has started to emerge of companies compensating people for the use of their images in AI datasets. Shutterstock, the stock image marketplace, has taken a stance of supporting artists, with plans to pay them for their contributions to training machine learning models.

In a world where a company like Shutterstock has shown that it is possible to compensate artists adequately for using their art in an AI dataset, there is no reason for Stability AI, Midjourney, and DeviantArt not to do so. Let’s delve deeper into the intricacies of AI copyright laws.

Breaking down the lawsuit

To understand why this lawsuit was filed in the first place, we must first look at how modern generative AI creates its images. All the parties targeted in the lawsuit use the diffusion method to generate artwork from noise. Diffusion is also how the algorithms interpret the source images in a dataset, as it offers a more ‘comprehensible’ way of storing the insights from a large dataset such as LAION-5B. Butterick wrote on his blog:

“These resulting images may or may not outwardly resemble the training images. Nevertheless, they are derived from copies of the training images, and compete with them in the marketplace.”

Simply put, generative AI uses diffusion to create a ‘noise cloud’ of data from its training images, the knowledge of which it uses to create images out of another ‘noise cloud’. The lawsuit compares this method to other methods of storing compressed data such as MP3 or JPEG files, thus making generative algorithms a ‘collage system’ of their training dataset. 
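The ‘noise cloud’ half of that description can be made concrete with a small sketch. What follows is a minimal illustration of the forward (noising) step as it is usually described in the diffusion research literature. It is not code from Stable Diffusion or Midjourney, and it does not reproduce the lawsuit’s characterisation. The function names, the four-value toy ‘image’ and the linear schedule values are assumptions chosen for illustration, and the learned reverse (denoising) model is not shown.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after each step of a linear noise schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta  # each step keeps (1 - beta) of the remaining signal
        alpha_bars.append(running)
    return alpha_bars

def add_noise(pixels, alpha_bar, rng):
    """Forward step: blend the original pixels with fresh Gaussian noise."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * p + noise * rng.gauss(0.0, 1.0) for p in pixels]

rng = random.Random(0)
image = [0.9, -0.5, 0.1, 0.7]      # toy "image": four pixel values (hypothetical)
schedule = alpha_bar_schedule(1000)

slightly_noisy = add_noise(image, schedule[0], rng)   # early step: mostly image
almost_noise = add_noise(image, schedule[-1], rng)    # final step: mostly noise
```

During training, a model sees such noised versions and learns to predict the noise that was added; generation then starts from pure noise and removes it step by step. Whether what the model retains amounts to stored copies of the training images is precisely the point in dispute.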

Even Stable Diffusion’s latent space visualisation techniques aren’t enough to save its generations from being derivative works, as the lawsuit states that this is simply a more complex way of interpolating source images without adding anything to them. Moreover, the lawyers argue that conditioning the process on text is simply a ‘layer of magical misdirection’ that makes it harder for users to generate obvious copies of the training data.

However, many have dismissed this argument as reductionist, as seen in a post by Twitter user Daniel, who goes by ‘KeyTryer’ on the platform. He carefully breaks down how diffusion models ‘learn’ the concepts associated with images, instead of storing a representation of these images in latent space and using it to interpolate between different images.

While the specifics of how diffusion models work are a bit different from what Butterick describes in his blog post, the argument still stands. Whether a company pays organisations like LAION to create datasets, as Stability AI does, or scrapes artwork hosted on its own website, as DeviantArt does, the fact is that artists should be compensated when their art is used to train AI.

Compensating artists for their work

Even as Stable Diffusion, Midjourney, and DeviantArt are developing projects based on the work of millions of artists, Shutterstock has found a way to balance the scales for both parties. While other companies like Getty Images have banned the sale of AI art on their platforms, Shutterstock is looking to incentivise the creation of such services with due compensation.

When it partnered with Meta, Shutterstock announced that it would develop a system to compensate artists whose work has been used to train an AI algorithm. In their words:

“Shutterstock is one of the first companies to pay artists for their contributions to training machine learning models, and it has proven to be a trusted partner to those entering the space by ensuring the responsible creation and licensing of content with a transparent IP transfer.”

With the rise of computer vision and generative AI, Shutterstock saw a business opportunity in providing datasets to companies. Known as ‘data deals’, this product aims to provide high-quality labelled data to companies building CV models. In July 2021, the company announced the launch of Shutterstock.AI, a website focused solely on providing these datasets to AI researchers and companies.

The product was launched with an opt-out feature following closely behind, allowing contributors to exclude their content from the datasets if they prefer. Shutterstock also established the Shutterstock Contributor Fund, which identifies users whose content was included in the dataset and compensates them.

Shutterstock’s responsible approach towards including and supporting the rise of AI is something other companies could take inspiration from. While the lawsuit currently stands on shaky ground from a scientific perspective, the precedent set by it will prove important to future generations of artists. If the artists win this lawsuit, it will establish an unequivocal legal precedent for generative AI companies in the future to adopt a measured approach towards art—including unlicensed and uncredited art—used in their datasets. 

Anirudh VK
I am an AI enthusiast and love keeping up with the latest events in the space. I love video games and pizza.
