
OpenAI Gets Slapped With Another Class-Action Lawsuit 

The latest case, filed in a federal court in the Northern District of California, aims to test a novel legal theory claiming that OpenAI infringed upon the rights of millions of internet users

A prominent law firm based in California has initiated a class-action lawsuit against OpenAI. The lawsuit alleges that OpenAI extensively violated the copyrights and privacy of numerous individuals by utilising scraped data from the internet to train its technology. 

This is not the first such case. Previously, Anthony Trupia filed a lawsuit alleging that OpenAI, while running a "non-profit" entity supposedly for 'the benefit of all humanity', had perpetrated a massive fraud on donors, beneficiaries, and the public at large, and had exposed 'all of humanity' to massive risks for personal gain. 

In November last year, a class-action lawsuit was filed against Microsoft, OpenAI and GitHub for scraping licensed code to build the AI-powered Copilot. That case has become a worsening nightmare for the companies, and it keeps getting bigger; they are now looking for an escape, asking the court to dismiss the proposed class-action complaint. 

A charming Sam Altman, at the US Senate hearing, acknowledged that the company had been sued many times before. When asked "What for" by US Senator Lindsey Graham, he said: "Um, I mean, they've mostly been like pretty frivolous things, like I think happens to any company," without going into detail. 

This time it gets real 

AIM had previously predicted that OpenAI might attract more legal trouble. The latest case, filed in a federal court in the Northern District of California, is one of many. It aims to test a novel legal theory claiming that OpenAI infringed upon the rights of millions of internet users when it used their social media comments, blog posts, Wikipedia articles, and even family recipes. The law firm behind the lawsuit, Clarkson, specialises in large-scale class-action suits covering issues such as data breaches and false advertising.

Ryan Clarkson, the managing partner of the law firm, stated that their intention is to represent “real people whose information was unlawfully obtained and exploited for the development of this highly influential technology.” OpenAI has not responded to requests for comment regarding the lawsuit.

This legal action addresses a significant, unresolved question surrounding the proliferation of “generative” AI tools like chatbots and image generators. These technologies operate by processing vast amounts of textual data from the internet and learning to establish connections between them. 

After extensive data ingestion, these “large language models” gain the ability to predict appropriate responses to prompts, enabling them to engage in complex conversations, compose poetry, and even pass professional exams. However, the individuals who originally created the billions of words never granted permission for their data to be utilised by companies like OpenAI for profit.

Clarkson argued, “All of that information is being taken at scale when it was never intended to be utilised by a large language model.” He expressed his hope that the lawsuit would result in the establishment of guidelines for training AI algorithms and fair compensation for individuals whose data is used.

The law firm has already assembled a group of plaintiffs and actively seeks additional participants for the lawsuit.

The legality of employing data scraped from the publicly available internet to train lucrative AI tools remains uncertain. Some AI developers contend that using internet data falls under the concept of “fair use” in copyright law, which allows for exceptions if the material is significantly transformed. Katherine Gardner, an intellectual-property lawyer from Gunderson Dettmer, a firm primarily representing tech start-ups, suggested that artists and other creators who can demonstrate that their copyrighted works were used in training AI models may have a case against companies. However, individuals who merely posted or commented on websites are less likely to succeed in claiming damages.

The new class-action lawsuit against OpenAI expands its accusations by claiming that the company lacks sufficient transparency when it comes to informing users of its tools that their data may be utilised to train new products, which OpenAI can monetise, such as its Plugins tool. Additionally, the lawsuit alleges that OpenAI fails to take adequate measures to prevent children under the age of 13 from using its tools, a concern that has been raised against other tech companies like Facebook and YouTube in the past.

This lawsuit adds to the growing list of legal challenges faced by companies involved in the development and potential profitability of AI technology.

Getty Images also filed a lawsuit against the AI start-up Stability AI in February, accusing the company of illegally employing Getty’s photos to train its image-generating bot. Furthermore, OpenAI faced a defamation lawsuit this month from a radio host in Georgia, who claimed that ChatGPT produced text falsely accusing him of fraud.

Although OpenAI is not the sole company utilising scraped data from the internet to train its AI models, the law firm decided to target OpenAI because of its role in prompting larger competitors to develop their own AI technologies after ChatGPT captured the public's imagination last year.

New regulations under discussion aim to enforce greater transparency from companies regarding the data used in their AI models. Alternatively, a court case might compel companies like OpenAI to disclose information about the data they employed, according to Gardner.

Certain companies have attempted to prevent AI firms from scraping their data. In April, Universal Music Group, a music distributor, requested that Apple and Spotify block scrapers, according to the Financial Times. 

Social media site Reddit is demanding payment for further content after shutting off access to its data stream, citing how Big Tech companies have for years scraped the comments and conversations on its site. Elon Musk threatened to sue Microsoft for using Twitter data it had gotten from the company to train its AI. Musk is building his own AI company.

K L Krithika

K L Krithika is a tech journalist at AIM. Apart from writing tech news, she enjoys reading sci-fi and pondering impossible technologies, trying not to confuse them with reality.