Tech giants are facing mounting consequences for disregarding data protection laws, with governments levying hefty fines on many of them. The latest to get a rap on the knuckles is Meta, which was penalised a whopping $410 million by the Irish Data Protection Commission for failing to adhere to the EU's General Data Protection Regulation.
Adding to the privacy woes is ChatGPT. While OpenAI's widely acclaimed conversational chatbot has garnered plenty of publicity for its use across different domains, little has been said about whether it protects user privacy. This is particularly important because for a consumer-facing product like ChatGPT to improve, it needs to keep collecting user data to train its model. It is a self-perpetuating cycle: new data trains the model to produce better AI, which in turn draws in more users and more data. ChatGPT's methods of data collection have left companies visibly spooked.
A case in point is the warning Amazon issued to its employees against sharing company information with ChatGPT. Business Insider's examination of Amazon's internal communications revealed that a company lawyer warned employees against sharing confidential information or code with the AI chatbot. This precautionary measure was taken after ChatGPT generated responses that replicated Amazon's internal data.
As a recent report showed, the chatbot was able to correctly answer Amazon interview questions, including exclusive questions known only to the company's recruiting team. It won't be long before ChatGPT is also able to reproduce (or find patterns among) the technical questions generally asked at individual organisations. In such a case, every organisation would need to issue guidelines for its usage. Policies governing the fair use of AI have therefore become even more critical — and not just for enterprises: regulators, too, must intervene and establish standards for building safe AI systems.
AI Act in the Making
The EU AI Act (AIA) – the first of its kind – sorts AI applications into three risk groups. The first comprises AI systems that pose an ‘unacceptable risk’ and are therefore banned, such as government-run social scoring systems. The second comprises ‘high-risk’ AI systems, such as CV-scanning tools for job applicants, which are subject to specific legal requirements. The third comprises AI systems that are neither high-risk nor banned, and these remain largely unregulated.
At the moment, the categorisation and enforcement of such laws seem vague and without a clear end goal. Until now, be it the GDPR, the AIA, or India's DPDP (Digital Personal Data Protection) Bill, policies have primarily focused on protecting the interests of consumers. However, given the harm that AI systems are causing businesses, it is imperative that regulatory bodies develop standards dictating how AI systems should be built.
Setting AI Standards – An Impossible Task?
Writing for Lawfare, Hadrien Pouget explains that we currently lack the knowledge to build state-of-the-art AI systems that consistently adhere to established principles — and also lack methods to test whether they do. Although simpler AI techniques may be more manageable, recent advances in AI, particularly neural networks, remain largely mysterious.
“Their performance improves by the day, but they continue to behave in unpredictable ways and resist attempts at remedy,” Pouget writes. Echoing the point above, he stresses that setting reliability standards for neural networks is difficult precisely because these models are guided by data, and they can learn from that data in unintuitive and unexpected ways.
The nature of neural networks thus makes it almost impossible to set standards for how AI systems should be built and tested. On the other hand, it is unavoidable for chatbots and other AI systems to collect data in order to produce better outputs. It is the same argument we have visited numerous times before with big tech companies: even when the means of collecting data were unlawful, the practice was deemed “inevitable” to ensure a personalised user experience.
The issue of what input data is used to train AI has emerged in many other contexts as well. For instance, Clearview AI scraped people's images from the web and used them to train its facial surveillance AI without consent; its database comprises approximately 20 billion images. Despite facing numerous lawsuits, fines, and cease-and-desist orders for violating people's privacy, Clearview has managed to evade paying some fines and has refused to delete data even when ordered to by regulators.
This is just one example of how unclear regulations can impact enterprises and consumers alike at an unprecedented scale.
There was also the case of Matthew Butterick, who filed a lawsuit against GitHub Copilot for violating open-source licences. Butterick claimed that GitHub makes “suggestions” for code based on someone else's intellectual property while failing to credit or compensate them for it. Microsoft, for its part, puts the onus on the end user to scan the suggested code for IP violations before using it. Perhaps the lack of AI standards is what led Microsoft and OpenAI to ask the court to throw out the AI copyright lawsuit against them.
We are already concerned about the data that big tech companies collect for advertising and user-experience purposes. With the advancement of AI and chatbots, however, the volume of data collected is expected to increase dramatically, raising even greater concerns about privacy and the proper use of personal information.