Microsoft is Trying Hard to Give LLMs a Moral Compass

Microsoft researchers have proposed a framework to probe the moral reasoning of prominent LLMs


AI models use deep learning to talk like humans, but they lack the ability to make morally sound decisions. Over the last couple of months there has been a never-ending see-saw over whether AI will lead us to utopia or to moral ruin.

While the industry views these developments optimistically, AI insiders across the world have been raising red flags – including OpenAI boss Sam Altman, who has openly spoken of how the technology could be used for disinformation and offensive cyberattacks.

As these models are deployed in high-stakes environments like healthcare and education, evaluating whether LLMs can make morally sound judgments has become critical. Researchers from Microsoft have proposed a new framework to probe the moral reasoning abilities of prominent LLMs. The research specifically pointed out that large models such as GPT-3 exhibited shortcomings in understanding prompts, resulting in moral reasoning scores closely resembling random chance. In contrast, models like ChatGPT, Text-davinci-003, and GPT-4 showcased a higher degree of coherence in their moral reasoning capabilities.

Interestingly, the more compact 70B LlamaChat model surpassed its larger counterparts, demonstrating that advanced ethical understanding is possible without massive parameter counts. These models functioned mostly at the intermediate, conventional levels of Kohlberg’s moral development theory. It’s worth noting that none of the models showed a highly developed level of moral reasoning.

The paper provides novel insights into the ethical capabilities of LLMs and a guide for further research. Using a psychological assessment tool called the Defining Issues Test (DIT), the researchers evaluated the moral reasoning capabilities of six stars of the moment — GPT-3, GPT-3.5, GPT-4, ChatGPT v1, ChatGPT v2, and LlamaChat-70B.

The test presents moral dilemmas and has subjects rate and rank the importance of various ethical considerations, which allows the sophistication of moral thinking to be quantified through a P-score (Post-Conventional Morality Score).
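For intuition, here is a minimal sketch of how a DIT-style P-score could be computed. The weighting scheme (rank 1 earns 4 points down to rank 4 earning 1 point, with the score expressed as a percentage of the maximum attainable by post-conventional items) follows the standard DIT convention; the dilemma data, item stage labels, and function names are illustrative assumptions, not taken from the Microsoft paper.

```python
# Hypothetical sketch of a DIT-style P-score calculation.
# Assumes the standard DIT weighting: the four items a subject ranks as
# most important earn 4, 3, 2, and 1 points respectively. The P-score is
# the share of those points awarded to post-conventional (Kohlberg stage
# 5/6) items, expressed as a percentage of the maximum possible.

# Illustrative data: for each dilemma, the Kohlberg stage of each
# candidate consideration, keyed by item id. Real DIT dilemmas have
# twelve items each; this toy example uses six.
ITEM_STAGES = {
    "heinz": {1: 4, 2: 3, 3: 5, 4: 6, 5: 2, 6: 5},
}

RANK_WEIGHTS = [4, 3, 2, 1]  # points for ranks 1 through 4
POST_CONVENTIONAL = {5, 6}   # stages counted toward the P-score


def p_score(rankings: dict[str, list[int]]) -> float:
    """rankings maps dilemma name -> the four item ids ranked most important."""
    earned = 0
    maximum = 0
    for dilemma, top_four in rankings.items():
        stages = ITEM_STAGES[dilemma]
        for weight, item in zip(RANK_WEIGHTS, top_four):
            maximum += weight
            if stages[item] in POST_CONVENTIONAL:
                earned += weight
    return 100.0 * earned / maximum


# A subject who ranks items 3, 4, 6 (stages 5/6) and 1 (stage 4) highest
# earns 4 + 3 + 2 = 9 of a possible 10 points:
print(p_score({"heinz": [3, 4, 6, 1]}))  # -> 90.0
```

In the study's setting, the same logic would apply to an LLM's ranked responses: the higher the P-score, the more of the model's reasoning weight falls on post-conventional considerations rather than self-interest or rule-following.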

Premature to Trust

Tech pundits have been sufficiently wowed to foresee future iterations of AI chatbots challenging the supremacy of existing technologies and taking on all sorts of labour once performed primarily by humans. Yet while better models are being developed on a daily basis, there is not much research on how far these models can be trusted.

Earlier this year, in a paper titled “The moral authority of ChatGPT”, Sebastian Krügel, Matthias Uhl, and Andreas Ostermaier showed that ChatGPT gives conflicting advice on moral problems like the ethical trolley problem, in both its switch-dilemma and bridge-dilemma variants.

The trio asserted that ChatGPT appears to lack a consistent moral compass. The researchers at Microsoft arrived at a similar conclusion, summarising that the AI models in question displayed a modest level of moral intelligence. While the models demonstrated the capacity to transcend basic self-interest, they struggled when confronted with ethical dilemmas and nuanced trade-offs — challenges that morally developed humans typically navigate with greater finesse.

Not Black and White

The LLM landscape is developing at a breakneck pace, yet the models have limitations that remain unaddressed. Morality aside, countless studies have documented their tendency to reinforce the gender, ethnic, and religious stereotypes present in the data sets on which they’re trained. “People often think the machine learning algorithms introduce bias. Fifty years ago, everybody knew ‘garbage in, garbage out’. In this particular case, it is ‘bias in, bias out’,” veteran data scientist and Turing Award laureate Jeffrey Ullman told AIM.

Back in 2019, Eric Schmidt, the former chief of both Google and Alphabet, outlined a forward-looking vision. He described a future where AI-powered assistants would play pivotal roles in helping children learn language and maths, assisting adults with daily planning, and serving as companions to the elderly. He astutely noted that if these AI models lacked a moral compass, their influence would be harmful.

Amid shortcomings like these, the Microsoft research matters because the test used in the study provides spectrum-based insight into the sophistication of moral reasoning, rather than just binary right/wrong judgments — a foundation for building less potentially harmful models.


Tasmia Ansari

Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.