Of late, we have seen AI agents like CICERO and DeepNash take on games like Diplomacy and Stratego, respectively. Why did big tech companies like Meta and DeepMind pick these old strategy games to test their AI chops?
There are multiple reasons why researchers use games to train AI. Games are usually built on a rule-based methodology, which is fairly easy to code for. However, the secret sauce lies in their open-ended nature. The very thing that drives people to play games and create their own experiences is also what makes them an ideal place to train AI algorithms.
Researchers build a model around the rules of the game, then let it learn to solve the various problems that arise in actual play. This not only improves the agent but also generates more information it can use to get better at its tasks. If the model is trained using reinforcement learning, for instance, it improves itself based on the outcomes of the actions it takes in the game.
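As a toy illustration of that loop (not any specific system mentioned here), the sketch below trains a tabular Q-learning agent on a tiny rule-based "game": reach the rightmost cell of a five-cell board. The environment, rewards, and hyperparameters are all invented for the example; the point is that the agent improves purely from the experience its own play generates.

```python
import random

def train_agent(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D board: start at cell 0, goal at cell 4."""
    rng = random.Random(seed)
    n_states, goal = 5, 4
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore occasionally; otherwise exploit current knowledge.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[state][a])
            next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
            reward = 1.0 if next_state == goal else -0.01  # small step cost
            # Q-update: the agent refines its policy from its own play.
            q[state][action] += alpha * (reward + gamma * max(q[next_state]) - q[state][action])
            state = next_state
    return q

q = train_agent()
# After training, the learned policy should prefer moving right in every
# non-goal state, i.e. the agent has "figured out" the game from experience.
```

The same structure scales, conceptually, to far richer games: replace the table with a neural network and the 1-D board with a game engine, and the improve-from-own-play loop is unchanged.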
All these reasons and more make video games the perfect training gauntlet for AI agents. However, this method also has its own downsides.
Generating training data
First and foremost, collecting data and building a dataset free of bias is one of the most overlooked parts of creating a new AI model. Many models today still behave in biased ways because of inferences drawn from their datasets. AI trained in a simulated environment, like a game, is less prone to such problems.
The process of data collection can be skipped completely when training an agent in a simulated environment. We only need to look at DeepMind’s approach to training its algorithms, where the neural networks are put through hundreds of hours’ worth of simulation, playing against themselves. Through this process, known as multi-agent reinforcement learning, AlphaStar was able to surpass 99.8% of human players in StarCraft II. Similar strategic problem-solving algorithms could, in principle, derive insights from real-world military standoffs and help humans make important decisions.
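A minimal sketch of the self-play idea (illustrative only, not DeepMind's actual method): two players repeatedly best-respond to each other's observed history in the game of matching pennies. Neither player ever sees human data, yet their empirical strategies converge toward the 50/50 equilibrium purely from playing each other.

```python
def self_play(rounds=10_000):
    """Fictitious play in matching pennies: each player best-responds to
    the opponent's empirical action frequencies so far."""
    counts = [[0, 0], [0, 0]]  # counts[player][action]; actions: 0=heads, 1=tails

    for _ in range(rounds):
        # Player 0 wins on a match, so it plays the opponent's most common action;
        # player 1 wins on a mismatch, so it plays the opposite of player 0's.
        a0 = 0 if counts[1][0] >= counts[1][1] else 1
        a1 = 1 if counts[0][0] >= counts[0][1] else 0
        counts[0][a0] += 1
        counts[1][a1] += 1

    # Empirical frequency of "heads" for each player.
    return [c[0] / rounds for c in counts]

freqs = self_play()
# Both frequencies drift toward 0.5 -- the mixed-strategy equilibrium --
# with no external training data at all.
```

Systems like AlphaStar replace this toy best-response rule with deep networks and a vastly richer game, but the core trick is the same: the opponents generate each other's training data.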
This approach of letting the algorithm play the game by itself saved the researchers the hard work of collecting data on how the game is actually played in the real world. Extending the approach to more realistic games, like Grand Theft Auto V, researchers can collect data on how roads look, how vehicles behave, and the decisions an AI agent would make in a similar real-world situation. This can serve as a dataset for an agent in charge of a self-driving car, reducing the need to collect high-quality data from real roads and traffic.
Training an agent on a real-world simulation also highlights the issues the model needs to solve. CICERO, a model that learned to play the strategy game Diplomacy, had to combine principles of strategic reasoning with a potent natural language processing model to solve the complex problems the game presented. Using this approach, such a model could be deployed in the real world as a diplomatic assistant in human-to-human interactions.
Secondly, games have a rule-based skeleton that allows neural networks to adapt themselves to various scenarios. Clear rules are a developer’s best friend, as they clearly demarcate the boundaries between various mechanics of the game.
This then enables programmers to create clear and concise algorithms that solve the problems the rules create in the game. However, when the rules become less concrete and the gameplay takes on a more abstract tone, AI agents begin to fall short. A go-to example is DeepMind’s AI trained to play 57 Atari games, which consistently failed at Pitfall! due to its trial-and-error approach.
It is easy for a human playing the game to think logically and extend the principles learned from the rules to solve the problems the game presents. It’s a different story for AI agents, which either have to be built to actively learn from prior knowledge, or must rely on trial and error to beat the game through brute force.
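A back-of-the-envelope sketch of why pure trial and error breaks down in games like Pitfall!: if an agent must string together many precise inputs before seeing its first reward, the odds of a uniformly random explorer stumbling onto that reward shrink exponentially with the length of the required sequence. The numbers below are hypothetical, chosen only to show the scale of the problem.

```python
def chance_of_first_reward(actions_per_step: int, correct_moves_needed: int) -> float:
    """Probability that a uniformly random agent happens to pick the one
    correct action at every step of a reward-free stretch of the game."""
    return (1.0 / actions_per_step) ** correct_moves_needed

# Suppose (hypothetically) a Pitfall!-like screen demands ~40 precise inputs
# before any score change. A random explorer almost never sees a reward signal:
print(chance_of_first_reward(2, 40))  # ~9.1e-13
```

With no reward signal to learn from, the trial-and-error loop has nothing to improve on, which is why sparse-reward games punish brute-force exploration so severely while barely slowing down a human who can reason about the rules.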
An ideal game offers an open environment that extends beyond the rules to support various tasks, goals, and ways to solve these problems. These environments can create unique problems for AI researchers to solve. NVIDIA’s MineDojo is a prime example of this, as researchers fused together three internet-scale datasets with a natural language model to create a neural network that can execute a variety of tasks in Minecraft.
The open-ended nature of Minecraft makes this a behemoth challenge, as the game is infinite not only in size but also in possibility. While the rules are clearly set, the scope of what the player can do is endless. Solving task execution in Minecraft lays the groundwork for future AI agents that can interact with humans and clearly understand what their prompts mean, even when the prompt is an open-ended statement.
Games function as worlds that we cannot reach, but can experience. For AI agents, they represent the ultimate training ground and one of the cornerstones of future AI research. Learning from close-to-real-world situations might be just what agents need to bridge the gap between today’s narrow AI and future generalised AI.