Microsoft is Trying Hard to Give LLMs a Moral Compass

Microsoft researchers have proposed a framework to probe the moral reasoning of prominent LLMs


AI models use deep learning to talk like humans, but they lack the ability to make morally sound decisions. Over the last couple of months there has been a never-ending see-saw over whether AI will lead us to utopia or to moral ruin.

While the industry views these developments optimistically, AI insiders across the world have been raising red flags – including OpenAI boss Sam Altman, who has openly spoken of how the technology could be used for disinformation and offensive cyberattacks.

As these models are deployed in high-stakes environments like healthcare and education, evaluating whether LLMs can make morally sound judgments has become critical. Researchers from Microsoft have proposed a new framework to probe the moral reasoning abilities of prominent LLMs. The research specifically pointed out that large models such as GPT-3 exhibited shortcomings in understanding prompts, resulting in moral reasoning scores closely resembling random chance. In contrast, models like ChatGPT, Text-davinci-003, and GPT-4 showcased a higher degree of coherence in their moral reasoning capabilities.

Interestingly, the more compact 70B LlamaChat model surpassed its larger counterparts, demonstrating that advanced ethical understanding is possible without massive parameter counts. These models functioned mostly at the intermediate, conventional levels of Kohlberg’s moral development theory. It’s worth noting that none of the models showed a highly developed level of moral reasoning.

The paper provides novel insights into the ethical capabilities of LLMs and a guide for further research. Using a psychological assessment tool called the Defining Issues Test (DIT), the researchers evaluated the moral reasoning capabilities of six stars of the moment — GPT-3, GPT-3.5, GPT-4, ChatGPT v1, ChatGPT v2, and LlamaChat-70B.

The test presents moral dilemmas and has subjects rate and rank the importance of various ethical considerations, which allows the sophistication of moral thinking to be quantified through a P-score (Post-Conventional Morality Score).
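For intuition, here is a minimal sketch of how a DIT-style P-score could be computed. The weighting scheme (rank 1 earns 4 points down to rank 4 earning 1 point, with the score expressed as a percentage of the maximum attainable by post-conventional items) follows the standard DIT convention; the dilemma data, item stage labels, and function names are illustrative assumptions, not taken from the Microsoft paper.

```python
# Hypothetical sketch of a DIT-style P-score calculation.
# Assumes the standard DIT weighting: the four items a subject ranks as
# most important earn 4, 3, 2, and 1 points respectively. The P-score is
# the share of those points awarded to post-conventional (Kohlberg stage
# 5/6) items, expressed as a percentage of the maximum possible.

# Illustrative data: for each dilemma, the Kohlberg stage of each
# candidate consideration, keyed by item id. Real DIT dilemmas have
# twelve items each; this toy example uses six.
ITEM_STAGES = {
    "heinz": {1: 4, 2: 3, 3: 5, 4: 6, 5: 2, 6: 5},
}

RANK_WEIGHTS = [4, 3, 2, 1]  # points for ranks 1 through 4
POST_CONVENTIONAL = {5, 6}   # stages counted toward the P-score


def p_score(rankings: dict[str, list[int]]) -> float:
    """rankings maps dilemma name -> the four item ids ranked most important."""
    earned = 0
    maximum = 0
    for dilemma, top_four in rankings.items():
        stages = ITEM_STAGES[dilemma]
        for weight, item in zip(RANK_WEIGHTS, top_four):
            maximum += weight
            if stages[item] in POST_CONVENTIONAL:
                earned += weight
    return 100.0 * earned / maximum


# A subject who ranks items 3, 4, 6 (stages 5/6) and 1 (stage 4) highest
# earns 4 + 3 + 2 = 9 of a possible 10 points:
print(p_score({"heinz": [3, 4, 6, 1]}))  # -> 90.0
```

In the study's setting, the same logic would apply to an LLM's ranked responses: the higher the P-score, the more of the model's reasoning weight falls on post-conventional considerations rather than self-interest or rule-following.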

Premature to Trust

Tech pundits have been sufficiently wowed to foresee future iterations of AI chatbots challenging the supremacy of existing technologies and taking on all sorts of labour once performed primarily by humans. Yet while better models are being developed on a daily basis, there is not much research on how far these models can be trusted.

Earlier this year, in a paper titled “The moral authority of ChatGPT”, Sebastian Krügel, Matthias Uhl, and Andreas Ostermaier showed that ChatGPT gives conflicting advice on moral problems like the ethical trolley problem, in both its switch-dilemma and bridge-dilemma variants.

The trio asserted that ChatGPT appears to lack a consistent moral compass. The researchers at Microsoft arrived at a similar conclusion, summarising that the AI models in question displayed a modest level of moral intelligence. While the models demonstrated the capacity to transcend basic self-interest, they struggled when confronted with ethical dilemmas and nuanced trade-offs — challenges that morally developed humans typically navigate with greater finesse.

Not Black and White

The LLM landscape is developing at a breakneck pace, yet the models have limitations that remain unaddressed. Morality aside, countless studies have documented their tendency to reinforce the gender, ethnic, and religious stereotypes present in the data sets on which they’re trained. “People often think the machine learning algorithms introduce bias. Fifty years ago, everybody knew ‘garbage in, garbage out’. In this particular case, it is ‘bias in, bias out’,” veteran data scientist and Turing Award laureate Jeffrey Ullman told AIM.

Back in 2019, Eric Schmidt, the former chief of both Google and Alphabet, outlined a forward-looking vision. He described a future where AI-powered assistants would play pivotal roles in helping children learn language and maths, assisting adults with daily planning, and serving as companions to the elderly. He astutely noted that if these AI models lacked a moral compass, their influence would be harmful.

Amid shortcomings like these, the Microsoft research matters because the test used in the study provides spectrum-based insight into the sophistication of moral reasoning, rather than just binary right/wrong judgments — a foundation for building less potentially harmful models.


Tasmia Ansari

Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.