AI could gain the upper hand over humanity and pose “catastrophic” risks under the Darwinian rules of evolution. This thought of ‘machines ruling the world’ intrigued AI aficionado Dan Hendrycks, whose recent philosophical paper – Natural Selection Favors AIs over Humans – has created quite a buzz on the internet.
Inspired by Darwin’s Dangerous Idea and the anthropological book Moral Origins, Hendrycks began questioning the current state of AI safety and ethics while exploring the evolution of AI.
His study comes against the backdrop of an open letter calling for a six-month pause on giant AI experiments, particularly the training of AI systems more powerful than OpenAI’s GPT-4, warning of ‘profound risks to society and humanity’.
Interestingly, Hendrycks’ other work – X-Risk Analysis for AI Research – was also cited in the open letter. In an interview with Reuters, he said it was sensible to consider black swan events – those which appear unlikely but would have devastating consequences. He believes that rapid AI advancements need to be scrutinised thoroughly, which means imagining the worst-case scenarios in order to prepare for them.
Early Signs of Disaster
A few weeks ago, a GPT-powered chatbot was assigned a sole task – to destroy humanity. Called ChaosGPT, the chatbot scans the web for the most destructive weapons and, in a tweet, even said, “Human beings are among the most destructive and selfish creatures in existence.”
“Right now, people are repurposing large language models not just to be interesting chatbots, but instead have them act in the world. People know how to jailbreak and can have those agents that act in more nefarious ways,” Hendrycks said while making a point about ChaosGPT.
Previously, AI safety research tended to study a single agent at a time. “But now you’ve got multiple agents,” said Hendrycks, citing the recent Stanford experiment in which AI agents placed together threw a Valentine’s Day party and started asking each other out on dates. “It is very hard to see how that ecosystem of AI agents would end up behaving and how to control that,” he added.
The AI safety advocate blames the newly introduced ChatGPT plugins for enabling agents like ChaosGPT, believing they erode the safety barriers that used to be in place.
Hendrycks further addressed the evolving nature of these agents as they’re increasingly becoming autonomous. “We’re giving them a looser relation and letting them directly influence the world. There’s not much attention paid to the implications of giving these AI plugins direct access to the internet,” he added.
“If we don’t build it, someone else will, and then we’ll fall behind,” said Hendrycks, describing the thought process currently driving big tech companies.
Media-shy Hendrycks, who currently heads the California-based Center for AI Safety, thinks that big tech companies are not behaving responsibly. “It is worrying that some of these companies, Google, for instance, don’t have a safety team. Proportionally, we’ve got machine learning communities of thousands of people working on the capabilities, but very few working on safety,” he added.
Less than a month ago, Microsoft fired its entire ethics and society team. The AI industry is no stranger to this behaviour. In 2020, Google fired its ethical AI team leader Timnit Gebru. The tech giant has made efforts to stabilise the department, but the chaos still reigns. A few months after Gebru’s exit, Meg Mitchell was also shown the door, and around the same time, Alex Hanna decided to leave.
“Given the risks that some of the people put behind it and with the lack of research, it looks like they’re in a prisoner’s dilemma. We have a classic problem where individual incentives don’t align with the public’s incentives,” he said of the collective action problem in the industry today. “Waiting until all the evidence is in to address the issue is not a good strategy,” he stressed, pointing to the COVID situation, when stakeholders were unprepared for the catastrophe and took reactive measures instead of proactive ones.
CAIS to the Rescue
“We [Center for AI Safety] are doing a lot of field building and research, trying to connect the results from the safety field to actual stakeholders and industry,” Hendrycks shared, talking about his current focus of work.
There is no silver bullet for this problem, he believes. “AI is touching society in many ways and will create risk. We’ll need lots of different solutions and people working on it consequently,” he said.
To connect the research with stakeholders, Hendrycks is working with the National Science Foundation (NSF) and the National Institute of Standards and Technology (NIST) on the AI risk management framework. “I’m trying to get more grants for safety research. It’s a large constellation of things. The overall objective is to reduce some of the potentially catastrophic risks from advanced AI systems,” he said.