Video games are usually set in perfect (or stable) worlds where the player takes on a character who is detached from reality, possesses superhuman capabilities and is practically invincible. It comes as no surprise, then, that these highly complex environments are the breeding ground for some of the most interesting work in artificial intelligence. In fact, many AI researchers admit to being obsessed with building smart agents that cannot be put to work in real-world scenarios. This makes games a better testbed for AI than real-world problems. In this article, we look at some examples and results of implementing reinforcement learning in games.
Try, Fail, Learn, Repeat
Reinforcement learning is a technique for training an AI agent in which the agent is allowed to play or run by itself, correcting its actions every time it makes a mistake. The computing power and training time required depend largely on the type of problem the model is meant to solve.
For example, if you are building a model for sentiment analysis of Twitter data, the data can be obtained from the internet: it can be scraped or it can come from Twitter itself. A model built with this data can predict with an accuracy above 90%. For video game scenarios, however, play-style data is not readily available, because no single style suits every situation. Gathering data until every situation is covered would take practically forever in a game like Dota 2. That is where reinforcement learning is useful. Here, agents are let loose inside an environment such as a game. Every game has rewards and losses, for example completing the objectives in a given time or collecting bonuses before they disappear. Completing these tasks counts as a reward, whereas failing to complete the objective counts as a loss or punishment.
Let us say that an agent is driving a car, a Koenigsegg Regera in Need For Speed Payback. The goal of the agent is to win the race. When the agent is first let inside the game, it does not know how to move forward, left or right. It learns how to navigate by trial and error: it is rewarded for reaching the checkpoints on time, and if it doesn't, a negative value is assigned. This way the agent learns to navigate the course and dodge the obstacles, and finally masters the track. This is a single-agent game, where the agent makes decisions to earn rewards for itself alone.
In the real world, trial-and-error behaviour like driving fast through a checkpoint for rewards can earn you a speeding ticket, and failure to dodge obstacles on the road might be fatal. Such mistakes cannot be made over and over in the real world. That is why games are perfect for experimenting with one's logic when building agents that make human-like decisions.
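The trial-and-error loop described above can be sketched with tabular Q-learning, one common reinforcement learning algorithm. This is only an illustration under stated assumptions: the article does not say which algorithm a racing agent would use, and the one-dimensional track, the reward values and the names step, train and greedy_policy are all hypothetical.

```python
import random

# Hypothetical 1-D "track": positions 0..5, with the checkpoint at position 5.
TRACK_LENGTH = 6
ACTIONS = (-1, +1)  # index 0 = move back, index 1 = move forward


def step(position, action):
    """Apply an action and return (new_position, reward, done)."""
    new_position = min(max(position + action, 0), TRACK_LENGTH - 1)
    if new_position == TRACK_LENGTH - 1:
        return new_position, 10.0, True   # checkpoint reached: reward
    return new_position, -1.0, False      # time wasted: small penalty


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values purely by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(TRACK_LENGTH)]
    for _ in range(episodes):
        position = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: try a random action
            else:
                a = max(range(2), key=lambda i: q[position][i])  # exploit
            new_position, reward, done = step(position, ACTIONS[a])
            # Q-learning update: move the estimate toward the observed target.
            target = reward if done else reward + gamma * max(q[new_position])
            q[position][a] += alpha * (target - q[position][a])
            position = new_position
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action for every position before the checkpoint."""
    return [ACTIONS[max(range(2), key=lambda i: row[i])] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

At the start the agent knows nothing and wanders back and forth, collecting penalties. As the reward for the checkpoint propagates back through the table, the learned policy settles on moving forward from every position.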
Team Spirit: Duos Or Squads
Building an AI agent for a single scenario like driving a car or walking through a maze is relatively easy, but when AI researchers tried to use the same algorithm on a team of five agents, it did not produce convincing results. The researchers at OpenAI run their training algorithms on a system called Rapid, where reinforcement learning can take place in parallel.
Consider environments that involve teamwork, for example games like Counter-Strike or Dota 2. Both games are played by 10 players, split into two teams of 5 that battle against each other. Counter-Strike is a first-person shooter where the objectives include eliminating the opponents or defusing the bomb. Dota 2, on the other hand, is a multiplayer online battle arena game with role-playing elements, where each team fields a set of unique heroes. Every hero has its own abilities and can buy items with gold, which is earned by eliminating opponents or killing creeps. Teamwork is the strategy behind both games, and some plays are highly complex to pull off.
Using the same technique of rewarding each agent separately would result in selfish acts, such as stealing a kill to attain a reward or taking bounty runes in Dota 2 (applicable in previous patches) without helping the carry to farm. This definitely reminds us of 1k MMR games of Dota, which never involve teamwork. To counter the selfish acts of agents, the researchers awarded points for team plays, like 5 vs 5 team fights. If the team wins, the agents are awarded more points than they would get for solo objectives. This way the agents learned to play together and selflessly as a team, which resulted in game-winning scenarios.
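One simple way to picture the shift from selfish to team-oriented rewards is to blend each agent's own reward with the team average. The sketch below is an assumption made for illustration, not the researchers' actual reward scheme: the linear blend, the function name blend_rewards and the team_spirit parameter are hypothetical.

```python
def blend_rewards(individual_rewards, team_spirit):
    """Mix each agent's own reward with the team average.

    team_spirit = 0.0 -> every agent keeps only its own reward (selfish).
    team_spirit = 1.0 -> every agent receives the team average (fully shared).
    """
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be between 0 and 1")
    team_average = sum(individual_rewards) / len(individual_rewards)
    return [
        (1.0 - team_spirit) * own + team_spirit * team_average
        for own in individual_rewards
    ]


if __name__ == "__main__":
    # Hypothetical situation: one agent takes a kill worth 5 points.
    rewards = [5.0, 0.0, 0.0, 0.0, 0.0]
    for spirit in (0.0, 0.5, 1.0):
        print(spirit, blend_rewards(rewards, spirit))
```

With team_spirit at 0, stealing the kill pays off only for the agent who took it. As the value moves towards 1, the same kill is worth the same to every teammate, so there is nothing to gain from taking it away from the carry.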
With reinforcement learning being applied to tasks that come without labelled training data, we believe that some of the work humans consider impossible can be accomplished by AI agents.