Here’s Why Tencent’s New AI Beat Humans In 99.81% Of MOBA Matches

Tencent, the owner of China’s largest messaging app, WeChat, is one of the biggest tech players in the country. The company has carried out a range of AI research in its video game business and has become the second-largest cloud platform in China. It has applied AI in games in several ways, such as real-time identification and playtime limits for children.

Last year, Fine Art, a Go-playing computer built by Tencent, defeated a human Go champion. The International Go Federation reported that Fine Art played 34 games against professionals given a two-stone handicap, and won 30.

Recently, researchers at Tencent AI Lab developed an AI system capable of defeating human champions in the smash-hit mobile game Arena of Valor. Arena of Valor, also known as Honor of Kings, is a multiplayer online battle arena (MOBA) game. The researchers have now revealed the techniques used to master it.


AI Techniques Behind the System

For this system, the researchers studied the deep reinforcement learning problem of complex action control in 1v1 multiplayer online battle arena (MOBA) games. They claim that the system has low coupling and high scalability, which enables efficient exploration at large scale. The algorithm combines several strategies: decoupling of control dependencies, an attention mechanism for target selection, a game-knowledge-based pruning method called action mask for efficient exploration, an LSTM for learning skill combos, and an improved version of the proximal policy optimization (PPO) objective called dual-clip PPO.
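
To make the dual-clip PPO idea mentioned above more concrete, here is a minimal per-sample sketch in Python. It assumes the usual form of the technique: the standard clipped PPO surrogate, plus an extra lower bound of c times the advantage (with c > 1) that applies only when the advantage is negative. The function name and the default values for eps and c are illustrative assumptions, not values confirmed by this article.

```python
def dual_clip_ppo_objective(ratio, advantage, eps=0.2, c=3.0):
    """Per-sample dual-clip PPO surrogate (a quantity to be maximised).

    ratio     -- pi_new(a|s) / pi_old(a|s), the probability ratio
    advantage -- estimated advantage of the chosen action
    eps       -- standard PPO clip range (assumed default)
    c         -- lower-bound constant, must be > 1 (assumed default)
    """
    # Standard PPO: clip the ratio to [1 - eps, 1 + eps] and keep the
    # more pessimistic of the clipped and unclipped surrogates.
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    standard = min(ratio * advantage, clipped_ratio * advantage)

    if advantage < 0:
        # Dual clip: with a negative advantage and a very large ratio,
        # ratio * advantage is unbounded below. Bounding it by
        # c * advantage keeps the update from exploding.
        return max(standard, c * advantage)
    return standard


# A large ratio with a negative advantage is capped at c * advantage.
print(dual_clip_ppo_objective(ratio=10.0, advantage=-1.0))  # -3.0
# With a positive advantage the result matches standard PPO clipping.
print(dual_clip_ppo_objective(ratio=2.0, advantage=1.0))    # 1.2
```

In a real training loop this value would be averaged over a batch and negated to form a loss; that part is omitted here.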

The researchers designed a scalable, loosely coupled system architecture to exploit data parallelism. The architecture consists of four modules that provide high throughput and smooth data storage and transmission while avoiding communication bottlenecks:

  • Reinforcement Learning (RL) Learner: The RL Learner is a distributed training environment. To accelerate policy updates with large batch sizes, multiple RL Learners fetch data in parallel from the same number of Memory Pools.
  • AI Server: The AI Server covers the interaction logic between the game environment and the AI model. It generates episodes via self-play with mirrored policies.
  • Dispatch Module: The Dispatch Module collects data samples from the AI Servers. Each sample consists of rewards, features, action probabilities and so on.
  • Memory Pool: The Memory Pool is a server whose internals are implemented as a memory-efficient circular queue for data storage.
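
The circular-queue idea behind the Memory Pool can be sketched as follows. This is only an illustration of the data structure, assuming a fixed capacity where the newest sample overwrites the oldest once the pool is full. The class name, method names and the uniform random sampling are hypothetical and are not taken from Tencent's implementation.

```python
import random


class MemoryPool:
    """Fixed-size circular queue for training samples (illustrative)."""

    def __init__(self, capacity):
        self._slots = [None] * capacity  # pre-allocated storage
        self._capacity = capacity
        self._next = 0                   # index of the next slot to write
        self._size = 0                   # number of filled slots

    def push(self, sample):
        # Write into the next slot; once full, this overwrites the
        # oldest sample, so memory use stays constant.
        self._slots[self._next] = sample
        self._next = (self._next + 1) % self._capacity
        self._size = min(self._size + 1, self._capacity)

    def sample_batch(self, batch_size, rng=random):
        # Uniform sampling without replacement is an assumption here.
        filled = self._slots[:self._size]
        return rng.sample(filled, min(batch_size, self._size))

    def __len__(self):
        return self._size


pool = MemoryPool(capacity=3)
for step in range(5):
    pool.push(step)                      # samples 0 and 1 get overwritten
print(len(pool))                         # 3
print(sorted(pool.sample_batch(3)))      # [2, 3, 4]
```

In the architecture described above, a Dispatch Module would call something like `push` and an RL Learner would call something like `sample_batch`.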

How It Works

The researchers designed a deep reinforcement learning framework, together with a set of algorithms, that enables efficient exploration at massive scale in multi-agent competitive environments such as MOBA 1v1 games. The neural network architecture combines encoding of multi-modal inputs, decoupling of inter-correlations in controls, an exploration pruning mechanism, and attack attention, so that it can account for the ever-changing game situations in MOBA 1v1 games.
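
One common way to implement an exploration pruning mechanism of this kind is to mask out illegal actions before the softmax, so the policy never assigns them any probability. The sketch below shows that general technique in plain Python. The function name, the example logits and the legality rules are hypothetical, and the article does not confirm that the system applies its action mask in exactly this way.

```python
import math


def masked_action_probs(logits, legal):
    """Softmax over logits, with illegal actions forced to probability 0.

    logits -- raw policy scores, one per action
    legal  -- booleans from game knowledge (e.g. skill on cooldown -> False)
    """
    if not any(legal):
        raise ValueError("at least one action must be legal")

    # Replace illegal logits with -inf so exp() maps them to exactly 0.
    masked = [x if ok else float("-inf") for x, ok in zip(logits, legal)]

    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(masked)
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# The second action is pruned, so it gets probability 0 and the rest
# are renormalised over the legal actions only.
probs = masked_action_probs([1.0, 2.0, 3.0], [True, False, True])
print(probs[1])  # 0.0
```

Because pruned actions are never sampled, the agent does not spend exploration on moves that game knowledge already rules out.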

Wrapping Up

To evaluate the trained AI system’s capability in the real world, the researchers deployed the AI model in Honor of Kings to play against professional human players. The AI model beat the professionals across heroes of different types, averaging 5 kills per game while being killed only 1.33 times.

The researchers further evaluated whether the policies learned by the AI model could counter a diverse set of top human players. Here, the model achieved a 99.81% win rate over 2,100 matches, losing only 4 games. As a next step, the researchers said that the framework and algorithm will be open-sourced, and that the game core of Honor of Kings will be made accessible to the community to facilitate further research on complex games.

Ambika Choudhury
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
