Copyright infringement is increasingly becoming the talk of the AI world, with governments drafting laws to rein in AI makers and their systems. Recently, during his appearance before Congress, OpenAI CEO Sam Altman agreed on the need for government regulation to build responsible AI systems, including giving proper attribution and rights to original content creators.
“We have been talking with the artists and content owners about what they think about this [copyright and attribution],” said Altman. “I think people totally deserve control over how their content and likenesses are used in this technology.”
Altman said that AI models like these were created to help and benefit creators and artists, not to steal their ownership. “That is exactly what the economic model is,” said Altman. But the same cannot be said about the government.
On the other hand, it seems that the very artists and creators the company wants to assist with its products are actually against its adoption. Much as the Writers Guild of America protested against the use of ChatGPT in scriptwriting, major music labels are sending notices to streaming services to take down “AI soundalikes”.
While Altman and other AI makers are concerned with aligning their models to economic and ethical objectives, the government may be in this to gain more control over the developing technology and shape its future.
During the same hearing, Gary Marcus, the AI sceptic, pointed out that no existing laws address the copyright issues raised by these generative AI models. To this, Senator Josh Hawley suggested that Section 230 — which is in fact part of the Communications Decency Act, not copyright law — could simply be applied to models like this.
If the government gains the power to control AI development under the banner of “protecting” copyright owners, the rapid pace of progress in these AI models could grind to a halt. Moreover, Hawley’s statement makes it clear that the government wants to impose an all-out copyright regime on generative AI models, which is tricky: it would not address the underlying concerns but merely put the government in charge of AI. Do we really want that?
David Holz, the founder of Midjourney, does not seem troubled by copyright infringement. In an interview with Forbes, Holz said that he uses images without seeking permission from their owners, explaining that doing so is impossible given the enormous size of the dataset.
Regulations with Scepticism
Under the newly proposed draft of the AI Act by the European Parliament, any content generated by AI models like Midjourney or Stable Diffusion will have to disclose that copyrighted material was used for training, and appropriately attribute the original creators. This sounds like fair practice — until you look closely.
Article 28b (4) of the draft states that foundation models used in AI systems to generate images, text, audio, or video must comply with the transparency obligations of Article 52 (1) and make publicly available a sufficiently detailed summary of the use of training data protected under copyright law.
If these generative AI companies had to go back and account for the entire corpus of data their models were trained on — which is essentially scraped from the internet — under copyright law, they would not be able to create any more AI models. Furthermore, which specific generated output draws on which specific piece of training data is untraceable even for the companies that built the models; and even where tracing is possible, it is so difficult that attribution becomes impractical.
For example, Shutterstock took a step toward properly licensing the stock images used to train image-generation models. To compensate artists, the company would pay a “fair share” through royalties each time an image was used to generate art. Though the company gave no precise explanation of how this model would work, it did launch a contributor fund to compensate artists. The move drew a lot of criticism, as it was seen as treating artists’ work as tokens.
Moreover, this also puts the makers of these models, like OpenAI and Stability AI, under a lot of pressure, hindering further development of the technology and risking its potential benefits. When someone copies an image or reuses another’s work manually or through Photoshop-like software, the copyright holder can know for certain that their work was used. But with generative models, the sheer volume of training data means the model makers’ offices could fill up with lawsuits.
Under the new act, foundation model providers that breach the obligations set out in Articles 28 and 52 are liable for a fine of “€10 million or 2% annual turnover, whichever is higher.”
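The “whichever is higher” rule means the penalty scales with company size. As a rough sketch (the threshold and rate are taken from the article; how turnover is actually assessed under the draft is an assumption here), the calculation looks like this:

```python
def eu_ai_act_fine(annual_turnover_eur: float) -> float:
    """Illustrative fine under the draft's 'whichever is higher' rule.

    Assumes the fine is simply the larger of a flat EUR 10 million
    and 2% of annual turnover, per the figures quoted in the article.
    """
    FLAT_FINE = 10_000_000      # EUR 10 million
    TURNOVER_RATE = 0.02        # 2% of annual worldwide turnover

    return max(FLAT_FINE, TURNOVER_RATE * annual_turnover_eur)


# A small provider pays the flat fine; a large one pays 2% of turnover.
print(eu_ai_act_fine(100_000_000))    # 2% = EUR 2M, so the flat EUR 10M applies
print(eu_ai_act_fine(2_000_000_000))  # 2% = EUR 40M, which exceeds the flat fine
```

In other words, for any provider with turnover above €500 million, the percentage-based fine dominates.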
On the other hand, it is true that these technologies raise serious ethical questions about training on unauthorised and private data. But there needs to be a fair balance between protecting copyright holders and enabling the development of generative AI models.
Is there a balance?
Amid the concerns around the European Parliament’s proposed AI Act and the US Senate discussion, the US Copyright Office has issued guidelines on registering works created solely by machines. For instance, if an AI technology generates intricate written, visual, or musical works from a human’s prompt, the traditional elements of authorship in such works will not be registered.
The reason behind this is that AI technology determines the expressive elements, rather than the human user, making the generated content ineligible for copyright protection.
Nonetheless, a work that incorporates AI-generated material can still be eligible for copyright protection if it includes a sufficient amount of human authorship. For instance, if a human creatively selects or arranges AI-generated content or modifies it to the point where the modifications meet the standard for copyright protection, then copyright protection applies only to the aspects that were contributed by the human.
In a notable precedent, a macaque monkey named Naruto captured selfies using a photographer’s camera. The photographer was subsequently sued by People for the Ethical Treatment of Animals (PETA), which contended that Naruto was the rightful owner of the photographs and that the photographer was therefore infringing on Naruto’s copyright.
However, the Court of Appeals for the 9th Circuit ruled that nonhuman entities are not eligible for copyright protection. This decision aligned with the US Copyright Office’s definition of an “original work”, which explicitly requires a “human author” to be involved.
This might be the way forward for generative AI as well. Instead of following the European path, the US should stick with its current copyright laws. Generative AI companies should adopt techniques and develop ways to incentivise artists and original authors. That would encourage AI innovation with sufficient regulation, rather than excessive government control.