
Reinforcement Learning With Augmented Data Is So Superior It Beats Google & DeepMind Hands-down


From teaching robots to drive themselves off-road to adapting to never-before-seen tasks and grasping occluded objects, the University of California, Berkeley, has invested in a wide range of research on self-learning techniques.

Recently, researchers at UC Berkeley open-sourced Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can be used to enhance any reinforcement learning algorithm. The researchers claimed that RAD is faster and more compute-efficient by noticeable margins than state-of-the-art model-based algorithms such as Google AI’s PlaNet, DeepMind’s Dreamer, and SLAC, in terms of both data and wall-clock efficiency.

They stated, “We hope that the performance gains, ease of implementation along with wall clock efficiency of this new technique make it a useful module for future research in data-efficient and generalisable RL methods as well as a useful tool for facilitating real-world applications of reinforcement learning.”

Behind RAD

Reinforcement Learning with Augmented Data, or RAD, is a technique for applying data augmentations to image-based observations in reinforcement learning pipelines. It can be combined with any on-policy or off-policy reinforcement learning algorithm and can be used for both discrete and continuous control tasks without any additional losses.

The technique does not make any changes to the underlying RL method and ensures that the trained policy, as well as the value function neural networks, are consistent across augmented views of the image-based observations.

With the help of RAD, the researchers ensure that an agent learns on multiple views; in other words, the model is trained on augmented versions of the same input. According to the researchers, this allows the agent to improve on two main capabilities:

  • Data Efficiency: Agents learn to quickly master the task at hand with drastically fewer experience rollouts
  • Generalisation: Agents improve transfer to unseen tasks or levels simply by training on more diversely augmented samples  
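The idea of training on multiple views can be illustrated with RAD's flagship augmentation, random crop: each observation in a batch is cropped at a randomly sampled location, so the agent repeatedly sees slightly different views of the same underlying state. Below is a minimal numpy sketch of this idea; the function name and shapes are illustrative, not taken from the RAD codebase.

```python
import numpy as np

def random_crop(imgs, out_size):
    """Vectorised random crop over a batch of image observations.

    imgs: array of shape (B, C, H, W); out_size: side of the square crop.
    Each image gets its own random crop location, producing a different
    "view" of the same state every time it is sampled.
    """
    b, c, h, w = imgs.shape
    ys = np.random.randint(0, h - out_size + 1, size=b)
    xs = np.random.randint(0, w - out_size + 1, size=b)
    out = np.empty((b, c, out_size, out_size), dtype=imgs.dtype)
    for i, (y, x) in enumerate(zip(ys, xs)):
        out[i] = imgs[i, :, y:y + out_size, x:x + out_size]
    return out

# Example: a batch of 8 RGB observations, cropped from 100x100 to 84x84.
batch = np.random.rand(8, 3, 100, 100).astype(np.float32)
cropped = random_crop(batch, 84)
```

Because the augmentation happens before the policy and value networks see the observation, the underlying RL update itself is untouched, which is what makes the module plug-and-play.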

Why RAD?

According to the researchers, supervised learning in computer vision has addressed the problems of data-efficiency and generalisation by injecting useful priors, and one such often-overlooked prior in reinforcement learning is data augmentation.

Although advances in RL algorithms combined with convolutional neural networks (CNNs) have been groundbreaking in various respects, current methods still lack sample efficiency during learning and generalisation to new environments. RAD was developed to mitigate these issues by incorporating data augmentations on input observations in reinforcement learning pipelines.

Contributions Of This Research

The researchers highlighted some of the crucial contributions of this work, mentioned below:

  • The researchers showed that across 15 environments from the DeepMind Control Suite, a simple RL algorithm coupled with augmented data either matches or beats every state-of-the-art baseline in terms of performance and data-efficiency
  • This technique improves test-time generalisation in several environments in the OpenAI ProcGen benchmark suite that are widely used for generalisation in RL
  • This method is faster as well as more compute-efficient by noticeable margins compared to state-of-the-art model-based algorithms such as SLAC, PlaNet and Dreamer for data and wall-clock efficiency
  • The custom implementations of random data augmentations enabled researchers to apply augmentation in the RL setting, where observations consist of stacked frames inputs, without breaking the temporal information present in the stack
  • The vectorised and GPU-accelerated augmentations in RAD are competitive and on average faster than state-of-the-art framework APIs such as PyTorch.
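The point about stacked-frame observations can be sketched as follows: RL agents typically consume several consecutive frames stacked along the channel axis, so sampling one crop offset per observation and applying it to every frame in that stack keeps the frames spatially aligned, preserving the temporal information. This is a minimal numpy illustration of that principle, not the authors' implementation.

```python
import numpy as np

def random_crop_stack(obs, out_size):
    """Crop stacked-frame observations with ONE crop location per stack.

    obs: shape (B, F*C, H, W), where F consecutive frames are stacked along
    the channel axis. A single (y, x) offset is drawn per sample and shared
    by all frames in the stack, so motion cues between frames stay intact.
    """
    b, fc, h, w = obs.shape
    ys = np.random.randint(0, h - out_size + 1, size=b)
    xs = np.random.randint(0, w - out_size + 1, size=b)
    out = np.empty((b, fc, out_size, out_size), dtype=obs.dtype)
    for i in range(b):
        # Every stacked frame of sample i shares the offset (ys[i], xs[i]).
        out[i] = obs[i, :, ys[i]:ys[i] + out_size, xs[i]:xs[i] + out_size]
    return out

# 4 samples, each a stack of 3 RGB frames (9 channels), 100x100 -> 84x84.
stack = np.random.rand(4, 9, 100, 100).astype(np.float32)
aligned = random_crop_stack(stack, 84)
```

Cropping each frame independently would shift objects between frames by different amounts, corrupting the apparent motion the agent relies on; sharing the offset avoids that.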

Wrapping Up

The researchers open-sourced the RAD module, which is available on GitHub. The researchers showed that data augmentations such as random crop, colour jitter, patch cutout, and random convolutions could enable simple RL algorithms to match or outperform complex state-of-the-art methods across common benchmarks in terms of data-efficiency, generalisation, and wall-clock speed.
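Of the augmentations listed above, patch cutout is perhaps the simplest to picture: a random square region of each observation is zeroed out, forcing the agent not to rely on any single patch of the image. A minimal numpy sketch under assumed shapes and parameter names (not from the RAD codebase):

```python
import numpy as np

def random_cutout(imgs, min_size=4, max_size=24):
    """Zero out one randomly placed square patch in each image of the batch.

    imgs: shape (B, C, H, W). Patch size and location are sampled
    independently per image, across all channels at once.
    """
    b, c, h, w = imgs.shape
    out = imgs.copy()
    for i in range(b):
        s = np.random.randint(min_size, max_size + 1)
        y = np.random.randint(0, h - s + 1)
        x = np.random.randint(0, w - s + 1)
        out[i, :, y:y + s, x:x + s] = 0.0
    return out

# Strictly positive pixel values, so the zeroed patch is easy to spot.
batch = np.random.rand(8, 3, 84, 84).astype(np.float32) + 0.1
cut = random_cutout(batch)
```

Random crop, colour jitter, and random convolutions follow the same pattern: a cheap, stateless transform applied to observations in a batch before the RL update.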

Read the paper here.


Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.