Here’s Why Tencent’s New AI Beat 99.81% Of Humans At MOBA

Tencent, the owner of China’s largest messaging app WeChat, is one of the biggest tech players in the country. The company has been conducting AI research in its video game business and has become the second-largest cloud platform in China. It has been applying AI to games in several ways, including real-time identification and playtime limits for children.

Last year, Fine Art, a Go-playing computer built by Tencent, defeated a human Go champion. The International Go Federation reported that Fine Art played 34 games against professionals given a two-stone handicap, and won 30.

Recently, researchers at Tencent AI Lab developed an AI system capable of defeating human champions in the smash-hit mobile game Arena of Valor. Arena of Valor, also known as Honor of Kings, is a multiplayer online battle arena (MOBA) game. The researchers have now revealed the techniques they used to master it.

AI Techniques Behind the System

For this system, the researchers studied the deep reinforcement learning problem of complex action control in multiplayer online battle arena (MOBA) 1v1 games. They claim the system is loosely coupled and highly scalable, which enables efficient exploration at large scale. The algorithm combines several strategies: decoupling of control dependencies, an attention mechanism for target selection, a game-knowledge-based pruning method called action mask for efficient exploration, an LSTM for learning skill combos, and an improved version of the proximal policy optimization (PPO) objective called dual-clip PPO.
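As a rough illustration of the dual-clip idea, the sketch below computes a per-sample surrogate objective in plain Python. It assumes the commonly described form: the standard PPO clipped term, plus a second bound of c times the advantage that applies only when the advantage is negative. The function name and the default values of `epsilon` and `c` are illustrative assumptions, not figures from the article.

```python
def dual_clip_ppo_objective(ratio, advantage, epsilon=0.2, c=3.0):
    """Per-sample surrogate objective to be maximised (illustrative sketch).

    ratio     -- pi_new(a|s) / pi_old(a|s)
    advantage -- estimated advantage of the sampled action
    epsilon   -- standard PPO clip range (assumed value)
    c         -- dual-clip bound, must be > 1 (assumed value)
    """
    if c <= 1.0:
        raise ValueError("c must be greater than 1")

    # Standard PPO: clip the probability ratio to [1 - epsilon, 1 + epsilon]
    # and take the more pessimistic of the clipped and unclipped terms.
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    standard = min(ratio * advantage, clipped_ratio * advantage)

    if advantage < 0:
        # Second clip: with a negative advantage and a very large ratio,
        # ratio * advantage is unbounded below. Bound it at c * advantage.
        return max(standard, c * advantage)
    return standard


if __name__ == "__main__":
    # A large ratio with a negative advantage hits the second clip.
    print(dual_clip_ppo_objective(ratio=10.0, advantage=-1.0))
    # A large ratio with a positive advantage hits the usual upper clip.
    print(dual_clip_ppo_objective(ratio=2.0, advantage=1.0))
```

Without the second bound, a single off-policy sample with a very large ratio and a negative advantage could dominate a large-batch update, which is the situation this variant is meant to guard against.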

The researchers designed a scalable, loosely coupled system architecture to exploit data parallelism. The architecture consists of four modules that together provide high throughput and smooth data storage and transmission while avoiding communication bottlenecks:

  • Reinforcement Learning (RL) Learner: The RL Learner is a distributed training environment. To accelerate policy updates with large batch sizes, multiple RL Learners fetch data in parallel from an equal number of Memory Pools.
  • AI Server: The AI Server handles the interaction logic between the game environment and the AI model. It generates episodes via self-play with mirrored policies.
  • Dispatch Module: The Dispatch Module collects data samples from the AI Servers, consisting of rewards, features, action probabilities and so on.
  • Memory Pool: The Memory Pool is a server whose internals are implemented as a memory-efficient circular queue for data storage.
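The circular queue behind the Memory Pool can be pictured with a small in-process sketch. This is not Tencent's implementation: the class name, method names and uniform sampling are assumptions made for illustration, and the real module runs as a server rather than a local object.

```python
import random


class MemoryPool:
    """Fixed-capacity circular queue: new samples overwrite the oldest."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._buffer = [None] * capacity
        self._capacity = capacity
        self._next = 0   # index where the next sample will be written
        self._size = 0   # number of valid samples currently stored

    def __len__(self):
        return self._size

    def push(self, sample):
        """Store a sample, overwriting the oldest one once the pool is full."""
        self._buffer[self._next] = sample
        self._next = (self._next + 1) % self._capacity
        self._size = min(self._size + 1, self._capacity)

    def snapshot(self):
        """Return the stored samples ordered from oldest to newest."""
        if self._size < self._capacity:
            return self._buffer[:self._size]
        return self._buffer[self._next:] + self._buffer[:self._next]

    def sample(self, batch_size, rng=random):
        """Draw a batch uniformly at random, with replacement (assumed policy)."""
        if self._size == 0:
            raise ValueError("cannot sample from an empty pool")
        return [self._buffer[rng.randrange(self._size)] for _ in range(batch_size)]


if __name__ == "__main__":
    pool = MemoryPool(capacity=3)
    for step in range(5):
        pool.push({"step": step})
    print(pool.snapshot())  # only the three most recent samples remain
```

Because the buffer is allocated once and never grows, memory use stays constant no matter how many samples the AI Servers produce, and stale samples fall out without any explicit eviction step.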

How It Works

The researchers designed a deep reinforcement learning framework, together with a set of algorithms, that enables efficient exploration at massive scale in multi-agent competitive environments such as MOBA 1v1 games. The neural network architecture encodes multi-modal inputs, decouples inter-correlations in the controls, prunes exploration and applies attention to attack targets, all designed to cope with the ever-changing game situations in MOBA 1v1 play.
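One way to picture the exploration pruning mechanism is as a mask applied to the policy's action scores before they are turned into probabilities, so that actions known to be pointless are never sampled. The sketch below is a generic masked softmax under that assumption. The function name and the three-action example are hypothetical and are not taken from the researchers' code.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over the legal actions only; masked actions get probability 0.

    logits -- raw action scores from the policy network
    mask   -- same length as logits; truthy means the action is allowed
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    legal = [score for score, allowed in zip(logits, mask) if allowed]
    if not legal:
        raise ValueError("at least one action must be allowed")

    # Subtract the largest legal logit for numerical stability.
    peak = max(legal)
    weights = [
        math.exp(score - peak) if allowed else 0.0
        for score, allowed in zip(logits, mask)
    ]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == "__main__":
    # Hypothetical actions: move, basic attack, skill (here assumed unavailable).
    logits = [0.5, 1.0, 3.0]
    mask = [True, True, False]
    print(masked_softmax(logits, mask))
```

Even though the masked action has the highest raw score in this example, it receives zero probability, so the agent spends no exploration budget on it.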

Wrapping Up

To evaluate the trained system’s real-world capability, the researchers deployed the AI model in Honor of Kings to play against professional human players. The model beat the professionals on heroes of different types, averaging 5 kills per game while being killed only 1.33 times.

The researchers further evaluated whether the policies learned by the AI model could counter a diverse range of top human players. Here, the model achieved a 99.81% win rate over 2,100 matches, losing only 4 games. As a next step, the researchers said the framework and algorithm will be open-sourced, and the game core of Honor of Kings will be made accessible to the community to facilitate further research on complex games.

Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.