Intel AI Proposes Novel RL For Teaching Robots Teamwork

Intel AI has proposed and developed MERL (Multiagent Evolutionary Reinforcement Learning), a scalable, data-efficient method for training a team of agents to solve a coordination task jointly. In short, agents learn not only to maximise their own rewards through self-interested strategies, but also to make decisions that benefit the team as a whole.

Advances in computer vision and reinforcement learning (RL) have improved perception and decision-making, the two key aspects of autonomous systems. Recent RL methods leverage these capabilities to let agents interact with their environment more efficiently and make better decisions.

However, the complexity increases when multiple agents must be trained in the same environment. Take soccer: forward agents are trained to score goals, yet the game sometimes requires even a forward to set aside its goal-scoring prerogative and help defend the team's lead in the final minutes.

In the proposed method, a team of agents is represented as a multi-headed neural network with a common trunk. The researchers split the learning objective into two optimisation processes that run simultaneously: a policy gradient method optimises each agent's dense local rewards, while the sparser team objective is optimised with an evolutionary method similar to the team's earlier approach in CERL.
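The shared-trunk idea can be illustrated with a minimal sketch. This is not Intel's actual architecture; the class name, layer sizes, and activations are illustrative assumptions. The point is that every agent's head reads from one common feature extractor, so trunk parameters are shared across the team while heads stay agent-specific.

```python
import numpy as np

# Hypothetical sketch of a team as one multi-headed network with a
# shared trunk (shapes and activations are assumptions for illustration).
class MultiHeadTeam:
    def __init__(self, obs_dim, hidden_dim, act_dim, n_agents, seed=0):
        rng = np.random.default_rng(seed)
        # Shared trunk: one weight matrix used by every agent.
        self.trunk = rng.standard_normal((obs_dim, hidden_dim)) * 0.1
        # One output head per agent, on top of the shared features.
        self.heads = [rng.standard_normal((hidden_dim, act_dim)) * 0.1
                      for _ in range(n_agents)]

    def act(self, agent_idx, obs):
        """Action for one agent: shared trunk -> agent-specific head."""
        features = np.tanh(obs @ self.trunk)
        return np.tanh(features @ self.heads[agent_idx])

team = MultiHeadTeam(obs_dim=8, hidden_dim=16, act_dim=2, n_agents=3)
obs = np.zeros(8)
action = team.act(0, obs)
print(action.shape)  # (2,)
```

In a full implementation, policy gradients would update both the trunk and the acting agent's head on that agent's dense local reward.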

Earlier this year, Intel AI researchers presented CERL — a novel framework that allowed agents to learn challenging continuous control problems – e.g., training a 3D humanoid model to walk from scratch.

This enables the team to optimise both objectives simultaneously without explicitly mixing them. The researchers construct a population of teams, each evaluated on its performance on the actual task. After each evaluation, strong teams are retained, weak teams are eliminated, and new teams are formed through genetic operations such as mutation and crossover on the elite survivors. Periodically, agents trained using policy gradients are inserted into the evolutionary population to provide building blocks for the search. At any given time, the team with the highest task score is considered the champion team.
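The population loop described above can be sketched as a generic evolutionary step: evaluate, keep the elites, and refill the population via crossover and mutation. This is a toy illustration under stated assumptions, not Intel's implementation: teams are flat parameter vectors, the fitness function is invented, and the periodic migration of policy-gradient agents is omitted.

```python
import random

# Hedged sketch of one evolutionary generation: selection of elites,
# then crossover + Gaussian mutation to refill the population.
def evolve(population, fitness_fn, elite_frac=0.5, mutation_std=0.1, seed=0):
    rng = random.Random(seed)
    scored = sorted(population, key=fitness_fn, reverse=True)
    n_elite = max(2, int(len(population) * elite_frac))
    elites = scored[:n_elite]  # strong teams are retained
    children = []
    while len(elites) + len(children) < len(population):
        # Crossover: average the parameters of two elite parents.
        a, b = rng.sample(elites, 2)
        child = [(x + y) / 2 for x, y in zip(a, b)]
        # Mutation: perturb each parameter with Gaussian noise.
        child = [x + rng.gauss(0, mutation_std) for x in child]
        children.append(child)
    return elites + children

# Toy usage: a "team" is a 4-parameter vector; fitness favours values near 1.
pop = [[random.Random(i).uniform(-1, 1) for _ in range(4)] for i in range(8)]
fit = lambda team: -sum((x - 1.0) ** 2 for x in team)
for _ in range(20):
    pop = evolve(pop, fit)
champion = max(pop, key=fit)  # highest-scoring team is the champion
```

Because elites survive each generation unchanged, the champion's fitness never decreases, which mirrors how the best-performing team is tracked across generations.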

One can read the entire methodology here.

Additionally, Intel recently announced Intel Arc, its upcoming consumer graphics brand, covering hardware, software, and services. Under Arc, Intel introduced its first generation of GPUs based on the Xe HPG microarchitecture, formerly called DG2 and now code-named Alchemist. Intel also revealed that future hardware generations under Arc would carry the code names Battlemage, Celestial, and Druid.

Kumar Gandharv
Kumar Gandharv, PGD in English Journalism (IIMC, Delhi), is setting out on a journey as a tech journalist at AIM. A keen observer of national and IR-related news.
