Machine learning has changed the way video games are made and played. We have all seen how DeepMind’s AlphaGo beat world champion Lee Sedol at Go back in 2016. Since then, ML has also ventured into other games such as StarCraft II and Dota 2. DeepMind’s StarCraft II agent, AlphaStar, was rated above 99.8% of the StarCraft II players active on Battle.net.
Furthermore, ML-powered games are also a way to test whether agents can behave more like humans and find creative strategies that could transfer to the real world. Interestingly, ML has succeeded not just at problems like manual data entry, spam detection and product recommendations, but also at finding solutions that one would expect only humans to come up with.
This year, OpenAI revealed a hide-and-seek game in which ML agents learnt to behave more like humans and come up with new tactics. While many claim that this is just like previous ML-powered games, the video released by OpenAI showed that it was much more than just hide and seek.
Training these ML bots to play hide and seek resulted in distinct strategies for both hiders and seekers. The intelligent behaviour that these bots have shown is commendable.
Here are the strategies:
Chase and Run
The first strategy the agents learnt was chase and run. After a few games, the blue bots understood that they had to hide from the red bot, and the red one understood that it had to find and chase them. However, in that small arena, it was tough for the hiders to stay away from the seekers.
The game takes place inside an arena that contains a few elements and tools, such as boxes, which the researchers expected the bots to use. And the bots did learn to use the boxes to block the doors. At first, the bots managed to close only one door, but with time they used two boxes to close both doors and created a separate compartment to hide in.
It was not only the hiders who learnt to use tools and props to win the game; over time, the seekers also learnt to make the best use of tools like a ramp. When the blue bots hid in their separate compartment and blocked the doors, the seekers used a ramp to climb over the wall and make their way into the compartment.
Stealing Seeker’s Tools
While all these strategies seemed really effective, and the bots were doing things that were least expected, the hiders came up with another strategy that was nothing less than a big surprise: they stole the ramp and kept it inside the compartment before blocking the doors. As an add-on, the hiders kept pushing the boxes against the doors to hold the seekers out.
To test the intelligence of the bots, the OpenAI team deployed them in an open environment with several other tools and elements, and the bots soon learnt how to use the tools to build a shelter and hide. It was a complete surprise for the entire team.
As more and more games passed, the intelligence of the bots reached a whole new level. The seekers soon realised that they could use the ramp to climb on top of a box and then surf it to the nearest hiding shelter.
The Role Of Reinforcement Learning
These bots managed to become so smart because of reinforcement learning. In reinforcement learning, an agent learns by trial and error: it receives a reward for outcomes that help it win, and it gradually prioritises the decisions that give it an advantage and make it difficult for the opponent to make a counter move.
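The idea of prioritising decisions that pay off can be sketched with a very small example. This is not OpenAI's training code; it is a toy epsilon-greedy learner, and the function name `train_bandit`, the two made-up actions and their win probabilities are all assumptions chosen for illustration.

```python
import random


def train_bandit(reward_probs, episodes=5000, epsilon=0.1, seed=0):
    """Toy trial-and-error learner (hypothetical, for illustration only).

    reward_probs[i] is the assumed chance that action i earns a reward of 1.
    Returns the learnt value estimate and the pick count for each action.
    """
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)  # running average reward per action
    counts = [0] * len(reward_probs)    # how often each action was tried

    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(values)), key=values.__getitem__)

        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental average: nudge the estimate towards the new reward.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


if __name__ == "__main__":
    # Assumed toy setting: action 0 = "run in the open", action 1 = "block the door".
    values, counts = train_bandit([0.2, 0.8])
    print("estimated values:", values)
    print("times chosen:", counts)
```

In this sketch the learner is never told which action is better. It only sees rewards, yet over many episodes it ends up choosing the better-paying action far more often, which is the same basic principle at a much smaller scale.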
OpenAI’s hide-and-seek agents played thousands and thousands of rounds in parallel for many days. They trained against each other as well as against their past versions, a technique called self-play, which helped the bots become smarter after each game.
With this simple hide and seek game by OpenAI, the capabilities of AI have started becoming clearer to people. The moves and strategies these bots have shown in the game are nothing less than incredible for any AI agent. Furthermore, this also shows that AI is not just about the set of data on which it gets trained; it is also about solving open-ended, real-world problems by self-learning. And if advancements in this domain take another step forward, we are likely to see AI and ML solving problems that the human race has failed to solve for years.
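Training against past versions can be sketched as follows. This is a minimal toy, not the setup OpenAI used: the three-spot game, the multiplicative-weights update, and every name in the code (`N_SPOTS`, `update`, `self_play`) are assumptions made for illustration.

```python
import math
import random

N_SPOTS = 3  # hypothetical number of hiding spots in this toy game


def update(policy, payoffs, lr=0.1):
    """Multiplicative-weights step: shift probability towards better-paying spots."""
    weights = [p * math.exp(lr * u) for p, u in zip(policy, payoffs)]
    total = sum(weights)
    return [w / total for w in weights]


def self_play(rounds=2000, snapshot_every=100, seed=0):
    """Train a hider and a seeker against snapshots of each other's past versions."""
    rng = random.Random(seed)
    hider = [0.6, 0.3, 0.1]   # arbitrary starting policies over the N_SPOTS spots
    seeker = [0.1, 0.3, 0.6]
    hider_pool = [hider]      # past versions of the hider
    seeker_pool = [seeker]    # past versions of the seeker

    for step in range(1, rounds + 1):
        # Each side faces a randomly chosen past version of its opponent.
        past_seeker = rng.choice(seeker_pool)
        past_hider = rng.choice(hider_pool)

        # Expected payoff per spot: the hider gets +1 if not found, -1 if found;
        # the seeker gets the opposite.
        hider = update(hider, [1 - 2 * q for q in past_seeker])
        seeker = update(seeker, [2 * p - 1 for p in past_hider])

        # Periodically store the current versions as future opponents.
        if step % snapshot_every == 0:
            hider_pool.append(hider)
            seeker_pool.append(seeker)

    return hider, seeker, hider_pool, seeker_pool


if __name__ == "__main__":
    hider, seeker, hider_pool, seeker_pool = self_play()
    print("hider policy:", hider)
    print("seeker policy:", seeker)
    print("stored past versions:", len(hider_pool))
```

Sampling opponents from a pool of older snapshots, rather than only the latest one, is one common way to stop an agent from forgetting how to beat strategies it has already seen.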
Harshajit is a writer / blogger / vlogger. A passionate music lover whose talents range from dance to video making to cooking. Football runs in his blood. Like literally! He is also a self-proclaimed technician and likes repairing and fixing stuff. When he is not writing or making videos, you can find him reading books/blogs or watching videos that motivate him or teach him new things.