Meet the Meta AI Researcher Who Helped Build CICERO

Diplomacy is an art that even the most powerful AI was not able to master till CICERO made an appearance

Published on November 29, 2022

by Tasmia Ansari

Centuries ago, the future of diplomacy was questioned due to technological progress – the invention of the radio, the telegraph and the public’s intervention in the foreign policy domain. Meta’s recent AI model has revived the conversation. CICERO has become the talk of the tech town and international chatbot diplomacy may not be far behind.

Diplomacy is an art that even the most powerful AI was not able to master till CICERO made an appearance. Meta AI showed off a model, named after the famous Roman orator Cicero, that can beat several humans at the board game Diplomacy, which requires strategic planning and verbal negotiations with several other players. The work, researchers say, could pave the way for virtual exercise coaches and dispute mediators.

Speaking with Analytics India Magazine, one of the researchers, Athul Paul Jacob, shared his story about how Meta AI succeeded in building a model for Diplomacy with human-level performance.

Jacob got involved in AI through his undergrad mentor, Yoshua Bengio, the 2018 Turing Award laureate. Describing the foundation of CICERO, Jacob said, “I started my PhD at MIT in 2019, and Noam Brown (the project’s main lead) spoke to my advisor about the grand challenge of Diplomacy. He had a successful contribution to previous science articles on solving poker. And so everyone was wondering what his next thing was? And the conclusion was Diplomacy.”

Why Diplomacy?

Over the decades, AI agents have racked up victories, finding ways to beat humans at all sorts of games. Jacob said, “All the games AI has tackled have been much simpler. That’s when Brown originated this project and got some of the most brilliant engineers and scientists from Meta and outside involved in the project.” The objective of this research was to figure out a way to build systems that can cooperate with humans in different settings.

Jacob got involved in the summer of 2020, when the team focused on a Diplomacy variant where communication was not allowed. “It is called no-press Diplomacy that focuses on the purely strategic aspects of the game. There is communication but only through moves on the board, so it’s very different. My background in language systems was of interest to the team.”

Building dialogue systems which negotiate and coordinate with other human players was Jacob’s core interest. Initially, he worked on the model’s two major components: the language component and the dialogue component.

During his first year, he spent much time on the dialogue component. The team built a prototype that could communicate and do things but it didn’t do that well. So, a year later, in 2021, Jacob joined a different part of the team that worked on the strategic elements. One of the things that the team needed was to account for human rationality.

Diplomacy vs Dialogue

The game of Diplomacy has tonnes of challenges. According to Jacob, the first was to figure out how to communicate using language with six other players, as no one had studied that setting before. The second was to understand how to cooperate with humans of different skill levels. The last was to devise a system that does not spew nonsense like the other conversational agents in the domain today.

According to Jacob, the main difference is that none of the systems today can handle modeling humans, and that’s where the strategic component comes into the picture. Most of these systems repeat what humans say. In essence, they look human but cannot do much reasoning about how humans would think or what a human would say in response.

Bringing the theory of mind into the scene, he said, “Our strategic component can identify what humans will do in response, and so most of these chatbots don’t rely on that. And that is a weakness of current systems in deployment.”

What’s next?

Jacob thinks there’s still much to be done, particularly regarding the dialogue component. The model still says nonsensical things that must be clarified to fit the setting. “We did a lot of work trying to fix that, but I think there’s still more work to be done there,” he added.

When it was released, AlphaGo was very specific to the game of Go. Its developers then extended it to other board games, and people thought that was the end. But it turns out that the developments made in Monte Carlo Tree Search, which was used in AlphaGo, have since been applied to several use cases outside of board games.

“Our final goal was to solve Diplomacy, but that’s not the end of the story. We knew that there were going to be many techniques that we would need to develop. These techniques that we developed towards solving the game have a lot of applications to self-driving cars, automatic contract negotiations, AI personal assistants that can hold long-form conversations, AI tutors that can teach you new skills, among others,” he concluded.

Tasmia Ansari

Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.