
LLMs are an Ethical Nightmare 

Since ChatGPT became an internet celebrity, differentiating between human- and AI-generated content has become next to impossible



LLMs are an ethical nightmare, and even band-aid solutions are hard to come by. As users struggle with the problematic outputs of language models, researchers have been striving to solve these issues one by one.

A collectively authored research paper from Stability AI, Meta AI Research, and others has established a set of open problems so that ML researchers can grasp the field's current state more quickly and become more productive. The paper discusses the design, behaviour, and science behind the models rather than their political, philosophical, or moral aspects.

Furthermore, the authors have identified 11 domains where LLMs have successfully been applied, and across these they provide an overview of existing work as well as the constraints they identify in the literature. The research aims to provide a map for future work to focus on.

Issues Raised 

“People often think the machine learning algorithms introduce bias. Fifty years ago, everybody knew ‘garbage in garbage out’. In this particular case, it is ‘bias in, bias out’,” veteran data scientist and Turing Award laureate Jeffrey Ullman told AIM. Along similar lines, the first challenge the research paper addresses is ‘unfathomable datasets’.

The next issue the paper addresses is tokenisation, the process of breaking a sequence of words or characters into smaller units. The number of tokens needed to convey the same information varies significantly across languages, making the pricing policy of API-based language models unfair. For instance, generating 800 words with OpenAI's Ada model in Hindi requires nearly 7x as many tokens, and therefore 7x the price, as the same text in English. For a language like Kannada, the price is 11x that of English.
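The disparity is easy to verify with OpenAI's open-source tiktoken tokeniser. A minimal sketch follows; note that the cl100k_base encoding used here is newer than the one behind Ada, and the sample sentences are rough equivalents rather than exact translations:

    # Sketch: comparing token counts across languages with tiktoken.
    # The encoding and sample strings are illustrative; actual ratios
    # depend on the model's tokeniser and the text itself.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    samples = {
        "English": "Large language models are changing how we write.",
        "Hindi": "बड़े भाषा मॉडल हमारे लिखने के तरीके को बदल रहे हैं।",
    }

    base = len(enc.encode(samples["English"]))
    for lang, text in samples.items():
        n = len(enc.encode(text))
        print(f"{lang}: {n} tokens ({n / base:.1f}x English)")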

The cost factor is not restricted to tokens either, as a hefty price is paid for training these models. A few months ago, the cost of training a language model like GPT-3 was estimated at $5 million. The researchers suggest that when selecting a model size, the compute required for later usage (inference) should be taken into consideration, not just the one-time training cost.
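The trade-off can be made concrete with the standard back-of-envelope approximations from the scaling-law literature: roughly 6ND FLOPs to train a model with N parameters on D tokens, and roughly 2N FLOPs per generated token at inference. The serving volume below is purely an assumption for illustration:

    # Sketch: one-time training compute vs cumulative inference compute,
    # using the common approximations of ~6*N*D training FLOPs and
    # ~2*N FLOPs per generated token. All numbers are illustrative.
    params = 175e9        # a GPT-3-scale model
    train_tokens = 300e9  # tokens seen during training
    daily_tokens = 10e9   # assumed serving volume per day

    train_flops = 6 * params * train_tokens
    flops_per_token = 2 * params

    days_to_parity = train_flops / (flops_per_token * daily_tokens)
    print(f"Training compute: {train_flops:.2e} FLOPs")
    print(f"Inference matches it after ~{days_to_parity:.0f} days of serving")

At that assumed volume, inference overtakes training compute within about three months, which is why the authors caution against optimising for training cost alone.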

Next is the issue of context length, which prevents models from handling long inputs well enough to facilitate applications like writing or summarising novels and textbooks. Very recently, AI researchers stopped obsessing over model size and set their eyes on context size. The model-size debate has been settled for now: smaller LLMs trained on much more data have proven to be better than anything else we know of. But then the painful task of fine-tuning models on individual downstream tasks (e.g., text classification or sequence labelling) gets in the way.
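Until longer contexts are handled natively, practitioners typically work around the limit by chunking. A minimal map-reduce-style sketch, where the summarise argument is a hypothetical stand-in for any LLM call:

    # Sketch: naive chunked summarisation for documents that exceed the
    # model's context window. The summarise argument is a hypothetical
    # stand-in for an LLM call; the token budget is an assumption.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    MAX_CHUNK_TOKENS = 3000  # leave headroom for the prompt and the output

    def chunk(text, limit=MAX_CHUNK_TOKENS):
        tokens = enc.encode(text)
        return [enc.decode(tokens[i:i + limit])
                for i in range(0, len(tokens), limit)]

    def summarise_long(text, summarise):
        # Summarise each chunk, then summarise the combined summaries.
        partials = [summarise(c) for c in chunk(text)]
        return summarise("\n".join(partials))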

The paper then talks about other issues like prompt brittleness (variations in prompt syntax, often in ways unintuitive to humans, can result in dramatic output changes), misaligned behaviour, and hallucinations. The researchers also take into account the shortcomings of current methods for evaluating and benchmarking language models.

Since ChatGPT became an internet celebrity, differentiating between human-generated and AI-generated content has become close to impossible. As a probable solution, AI detection tools are available all over the internet, and companies like Google have announced plans to label AI-generated content on their platforms with metadata and watermarks.
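One concrete watermarking scheme from the recent literature (Kirchenbauer et al., 2023) biases generation towards a pseudorandom 'green list' of tokens and later tests whether a text contains suspiciously many of them. The toy sketch below illustrates only the detection side of that idea; it is not Google's own, unpublished mechanism, and real schemes operate on the model's token IDs rather than words:

    # Sketch: the statistical test behind green-list watermarking,
    # heavily simplified. A watermarked text should contain far more
    # "green" tokens than chance predicts, yielding a large z-score.
    import hashlib
    import math

    GAMMA = 0.5  # fraction of the vocabulary treated as green

    def is_green(prev_token, token):
        # Seed the green list on the preceding token, as in the paper.
        h = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
        return h[0] < 256 * GAMMA

    def watermark_zscore(tokens):
        hits = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
        n = len(tokens) - 1
        return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

    # A large positive z-score suggests the text was watermarked.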

Anyone who has used ChatGPT or any AI-powered chatbot knows that the same prompt can generate different outputs just by moving a word here or there. Developing LLMs that are robust to a prompt's style and format remains unsolved, leaving practitioners to design prompts ad hoc rather than systematically.
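Absent a principled fix, the brittleness can at least be measured empirically by running semantically equivalent prompts through the same model. A minimal sketch, where generate is a hypothetical stand-in for whichever chat or completion API is under test:

    # Sketch: a simple prompt-brittleness probe. The generate function
    # is hypothetical; plug in any model API with the same shape.
    variants = [
        "Classify the sentiment of this review: {review}",
        "What is the sentiment of the following review? {review}",
        "Review: {review}\nSentiment:",
    ]

    def probe(generate, review):
        return {v: generate(v.format(review=review)) for v in variants}

    # Identical answers across variants suggest robustness for this
    # input; divergent answers are a symptom of prompt brittleness.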

Solutions Offered

Everyone from startups to big tech companies is trying to solve the outstanding issues in language models. The most common problem users have pointed out since day one is the hallucinatory nature of these models, which leads them to generate factually incorrect information. Open-source messiah Hugging Face has also raised red flags, as the hallucination problem can snowball: a model that commits to an early mistake tends to produce further falsehoods to justify it.
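One mitigation explored in the research community is sampling-based consistency checking, in the spirit of SelfCheckGPT: if a model is hallucinating, independent samples of the same answer tend to disagree. A crude sketch, with generate again standing in for a hypothetical sampling API and word overlap as a rough proxy for agreement:

    # Sketch: flag possible hallucinations by sampling several answers
    # and measuring how much they agree. The generate function is a
    # hypothetical sampling API; the overlap metric is a crude proxy.
    def consistency(generate, prompt, n=5):
        samples = [generate(prompt, temperature=1.0) for _ in range(n)]
        first = set(samples[0].lower().split())
        overlaps = [len(first & set(s.lower().split())) / max(len(first), 1)
                    for s in samples[1:]]
        return sum(overlaps) / len(overlaps)

    # A low score means the samples diverge, hinting at fabrication.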

Furthermore, on the subject of alignment research, OpenAI says its models are trained to follow human intent along with human values. “At the time [2014], this problem was almost completely neglected, but it is now becoming increasingly recognised by more mainstream AI researchers,” philosopher Nick Bostrom said in an interview with AIM. Today, even Google has an elaborate 34-page document on how the tech giant is tackling the issue of AI governance.

The research also states that the capability gap between fine-tuned closed-source and open-source models persists. With models like Vicuna, Stanford's Alpaca, and Meta's (leaked) LLaMA, the gap has definitely narrowed, but no model has proven to be an equal competitor to OpenAI's GPT-4.

The authors of ‘Challenges and Applications of Large Language Models’ conclude that the problems pinpointed in the research remain unsolved. Apart from serving as a guideline for further research on language models, the paper also highlights the lack of training regulation and the need for stakeholders to step in.


Tasmia Ansari

Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.