What Happened in Reinforcement Learning in 2022

From robots playing football to learning how to walk on the moon!
Just as we learn from our environment, with our actions determining whether we are rewarded or punished, reinforcement learning agents learn by acting in an environment, with the ultimate aim of maximising their rewards.

This article rounds up eight reinforcement learning innovations that shaped AI across several industries in 2022.

  1. Ithaca – AI model to decipher ancient text

Alphabet’s DeepMind collaborated with the University of Venice, the University of Oxford and the Athens University of Economics and Business to build a deep neural network called ‘Ithaca’, which can restore missing passages in damaged ancient texts.

In a paper published in Nature, DeepMind stated that Ithaca, a natural language processing (NLP) model, was trained not only to recover ancient text that has been damaged or lost over time, but also to identify where the text was originally written and to estimate when it was made.

  2. AlphaTensor – Fastest method to multiply matrices

With AlphaTensor, an AI system that treats the problem as a 3D board game, DeepMind researchers shed light on a 50-year-old fundamental question in mathematics: finding the fastest way to multiply two matrices.

To play the game, the researchers trained a new version of AlphaZero, called ‘AlphaTensor’. Instead of learning the best moves to make in Go or chess, the system learned the best steps to take when multiplying matrices. It was trained with reinforcement learning and rewarded for winning the game in as few moves as possible.
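
To make "fewer steps" concrete, here is a minimal Python sketch of Strassen's classic 1969 scheme, which multiplies two 2x2 matrices with seven scalar multiplications instead of the usual eight. This is the textbook algorithm, not AlphaTensor's output. It is included only to illustrate the kind of saving such a search looks for, and the function names are our own.

```python
def naive_2x2(A, B):
    """Schoolbook product of two 2x2 matrices: 8 scalar multiplications."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def strassen_2x2(A, B):
    """Strassen's scheme for 2x2 matrices: 7 scalar multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # Seven products of sums and differences of the entries.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine the products using additions and subtractions only.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(strassen_2x2(A, B) == naive_2x2(A, B))
```

Applied recursively to matrix blocks, saving one multiplication per 2x2 step is what makes the approach faster than the schoolbook method for large matrices.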

  3. Architecture for tokamak magnetic controller design

Google’s DeepMind AI team collaborated with physicists from the Swiss Plasma Centre at EPFL in Ecublens, Switzerland, to develop an AI method for controlling the plasma inside a nuclear fusion reactor.

The study furthers nuclear fusion research and could help hasten the arrival of a cheaper, cleaner and unlimited source of energy.

  4. Human-level Atari 200x Faster

In the paper ‘Human-level Atari 200x Faster’, a DeepMind research team applies a diverse set of strategies to Agent57. The resulting MEME (Efficient Memory-based Exploration) agent surpasses the human baseline on all 57 Atari games in just 390 million frames, roughly two orders of magnitude (about 200 times) fewer than Agent57 required.

  5. LEAP (Legged Exploration of the Aristarchus Plateau)

Just like Apollo astronauts, a four-legged robot trained through AI learned that jumping is the best way to move around on the moon’s surface.

An update on LEAP, a mission concept study to explore some of the most challenging lunar terrains, was presented in September at the Europlanet Science Congress (EPSC) 2022.

The robot was trained using reinforcement learning in a virtual environment that simulates the lunar terrain, its dust properties and its gravity.

  6. InstructGPT

OpenAI used reinforcement learning from human feedback (RLHF) to fine-tune GPT-3. As a result, the new model, ‘InstructGPT’, is extremely good at generating text from single-sentence prompts.
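
In RLHF, human rankings of model outputs are first used to train a reward model, which then provides the reward signal for the reinforcement learning step. The sketch below shows a pairwise preference loss of the kind used for such reward models: it is small when the preferred response scores higher than the rejected one. This is a simplified illustration, not OpenAI's code. The function name and the plain scalar rewards are assumptions made for the example.

```python
import math


def pairwise_preference_loss(reward_preferred, reward_rejected):
    """Return -log(sigmoid(reward_preferred - reward_rejected)).

    Both arguments are hypothetical scalar scores that a reward model
    assigned to two responses to the same prompt.
    """
    margin = reward_preferred - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    # Ranking the human-preferred response higher gives a lower loss.
    print(pairwise_preference_loss(2.0, 0.0) < pairwise_preference_loss(0.0, 2.0))
```

Minimising this loss over many human comparisons pushes the reward model to score preferred responses above rejected ones.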

  7. MIT’s mini cheetah robot

MIT researchers detailed how they used reinforcement learning to teach a mini cheetah robot to play goalkeeper in a soccer match.

According to the researchers, the proposed framework can be extended to other scenarios. The authors explained, “Soccer goalkeeping using quadrupeds combines highly dynamic locomotion with precise and fast non-prehensile object manipulation. The robot needs to react and intercept a flying ball using dynamic locomotion manoeuvres in a very short amount of time, usually less than one second”.

  8. Sparrow – DeepMind’s Chatbot

To bridge the communication gap between humans and machines, DeepMind released its new AI chatbot ‘Sparrow’, a “useful dialogue agent that reduces the risk of unsafe and inappropriate answers”.

According to DeepMind, a subsidiary of Google’s parent company Alphabet, the chatbot is designed to “talk, answer questions and look up evidence using Google when it’s helpful to inform its responses”.

Tasmia Ansari
Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.
