
Does Deep Reinforcement Learning Really Work For Robotics?


“Beyond the cost of a robot, there are many design choices in choosing how to set up the algorithm and the robot.”

– Levine et al.

From Atari and chess to poker and a robotic arm solving a Rubik’s Cube, deep reinforcement learning has demonstrated remarkable progress on a wide variety of challenging tasks.

Like humans, DeepRL agents adopt strategies to generate long-term rewards. The reward-driven paradigm of learning by trial-and-error is known as reinforcement learning (RL). DeepRL has emerged at the confluence of deep learning and RL, geared to achieve human-level performance across challenging domains.
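The trial-and-error loop described above can be sketched with tabular Q-learning on a toy problem. This is an illustrative example, not from the survey: the environment is a hypothetical five-cell corridor where the agent earns a reward only at the far end, so it must learn a long-term strategy rather than chase immediate payoff.

```python
import random

# A toy 1-D corridor: the agent starts at cell 0 and earns a reward
# of +1 only when it reaches cell 4, so it must plan for long-term reward.
N_STATES, GOAL = 5, 4
ACTIONS = (+1, -1)  # move right / move left

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # epsilon-greedy: mostly exploit the best-known action,
            # occasionally explore at random
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            s2 = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s2 == GOAL else 0.0
            # temporal-difference update toward reward + discounted future value
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in ACTIONS) - q[(s, a)])
            s = s2
    return q

q = train()
# After training, the greedy action in every non-goal state is "move right" (+1).
policy = [max(ACTIONS, key=lambda act: q[(s, act)]) for s in range(GOAL)]
```

Deep RL replaces the Q-table with a neural network, but the reward-driven update loop is the same in spirit.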

Applying reinforcement learning calls for setting up an environment, modelling reward functions and more. You might even have to start every task from scratch. RL methods can be data-hungry, and starting from scratch for every new problem makes them impractical in real-world situations. For instance, RL algorithms can require millions of stochastic gradient descent (SGD) steps to train policies that accomplish complex tasks, and the number of training steps grows with model size. It is also well known that the usefulness of the captured knowledge depends on the quality of the data provided.

Overview Of DeepRL

Deep RL algorithms leverage the representational power of deep learning to tackle the reinforcement learning problem through careful design of rewards. Reward functions are mathematically crafted to guide the agent in the desired direction. Consider, for example, teaching a robotic arm to reach a target on its own, or an AI playing a strategic game like Go or chess.
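A crafted reward function for the robotic-arm example might look like the sketch below. The task, threshold and bonus values are hypothetical, chosen only to illustrate the idea: a dense distance-based term gives the agent a signal to follow at every step, while a sparse bonus marks actual success.

```python
import math

def shaped_reward(gripper_xy, target_xy, reached_bonus=10.0, threshold=0.05):
    """Dense reward for a toy 2-D reaching task (illustrative values).

    Rewarding only task completion gives a sparse signal that is hard to
    explore toward; the negative distance to the target instead guides the
    agent in the desired direction at every step.
    """
    dist = math.dist(gripper_xy, target_xy)
    reward = -dist                 # dense shaping term: closer is better
    if dist < threshold:
        reward += reached_bonus    # sparse bonus for actually reaching
    return reward

r_far = shaped_reward((0.0, 0.0), (0.3, 0.4))    # 0.5 away -> reward -0.5
r_near = shaped_reward((0.0, 0.0), (0.0, 0.01))  # within threshold -> large bonus
```

How such terms are weighted is exactly the kind of design choice the opening quote refers to.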

Key Concepts in DeepRL

  • On-policy vs off-policy learning
  • Exploration strategies
  • Generalization
  • Reward Shaping

Exploration algorithms in deep RL can be based on randomized value functions, unsupervised policy learning or intrinsic motivation. Memory-based exploration strategies, meanwhile, offset the disadvantages of purely reward-driven reinforcement learning, since rewards in varying environments can be inadequate in real-time scenarios.
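One simple form of intrinsic motivation is a count-based novelty bonus, sketched below. This is a minimal illustration, not a method from the survey; the `beta` scale and square-root decay are assumed values that make rarely visited states worth extra reward on top of whatever the environment pays out.

```python
import math
from collections import Counter

class CountBonus:
    """Count-based exploration bonus: novel states earn extra intrinsic reward.

    An illustrative sketch of one intrinsic-motivation strategy; `beta`
    scales how strongly novelty is rewarded relative to the task reward.
    """
    def __init__(self, beta=0.5):
        self.beta = beta
        self.counts = Counter()

    def __call__(self, state, extrinsic_reward):
        self.counts[state] += 1
        # Bonus decays as the state becomes familiar
        bonus = self.beta / math.sqrt(self.counts[state])
        return extrinsic_reward + bonus

bonus = CountBonus()
first = bonus("s0", 0.0)   # novel state: full bonus of 0.5
later = bonus("s0", 0.0)   # seen before: bonus shrinks to 0.5 / sqrt(2)
```

Such a bonus keeps the agent exploring even when the environment's own rewards are sparse or uninformative.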

When it comes to deploying DeepRL on real-world robots, collecting high-quality data becomes challenging, which in turn makes generalization difficult. In RL, generalization typically refers to transfer between tasks. Unlike computer vision, where humans can label vast datasets, achieving generalization in robotics requires reinforcement learning algorithms that can take advantage of large amounts of prior data. DeepRL agents struggle to transfer their experience to new environments; according to OpenAI researchers, generalizing between tasks remains difficult for state-of-the-art DeepRL algorithms.

Also Read: Generalization In Reinforcement Learning

In a recent survey, renowned researcher Sergey Levine and his peers examined how deep RL fares in a robotics context. They addressed many key challenges in RL and offered a new perspective on the major challenges that remain to be solved.

Addressing The Challenges


The researchers took various robotic activities such as locomotion, grasping and others into account and explored the current solutions and outstanding challenges crippling these applications.

For example, the researchers observed that grasping remains one of the most significant open problems in robotics. Teaching a robot to grasp requires complex interaction with previously unseen objects, closed-loop vision-based control to react to unforeseen dynamics or situations, and, in some cases, pre-manipulation to isolate the object to be grasped.

The researchers concluded: 

  • To learn generalizable grasping, we need autonomous, unattended data collection and a scalable RL pipeline.
  • To get large and varied datasets, we need to leverage all previously collected (offline) data, and a framework that makes this easy.
  • To achieve maximal performance, offline data should be combined with a small amount of online data; this improves grasp success from 86% to 96%.

Another bottleneck in robotic learning is the autonomous and safe collection of a large amount of data. The learning algorithms that perform well in the popular “Gym” environments may not work well on real robots. This is where simulation comes into picture. The researchers suggest simulation can run orders of magnitude faster than real-time, and can start many instances simultaneously. “Combining with sim-to-real transfer techniques, simulators allow us to learn policies that can be deployed in the real world with a minimal amount of real world interaction,” the authors explained.
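The "many instances simultaneously" idea can be sketched as a vectorized rollout loop. The `ToySim` class below is a hypothetical stand-in for a physics simulator (not any particular library's API); the point is the structure, where one policy serves a whole batch of environment instances in lockstep.

```python
import random

class ToySim:
    """Hypothetical stand-in for one physics-simulator instance."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.t = 0

    def step(self, action):
        self.t += 1
        obs = self.rng.random()          # fake observation
        reward = obs * action            # fake reward signal
        done = self.t >= 10              # fixed-length episodes
        return obs, reward, done

def rollout_batch(n_envs=8, horizon=10):
    """Step many simulator instances in lockstep, the way vectorized
    environments amortize policy inference across instances."""
    envs = [ToySim(seed=i) for i in range(n_envs)]
    transitions = []
    for _ in range(horizon):
        actions = [1.0] * n_envs         # placeholder for a learned policy
        for env, a in zip(envs, actions):
            obs, r, done = env.step(a)
            transitions.append((obs, r, done))
    return transitions

batch = rollout_batch()  # n_envs * horizon transitions per call
```

With a real simulator, each instance would run in its own process or on an accelerator, multiplying the data-collection rate far beyond what a single physical robot can achieve.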

Deep RL algorithms are notoriously difficult to use in practice. Their performance depends on careful settings of the hyperparameters, and often varies substantially between runs. According to researchers at Berkeley, any effective data-driven method for deep RL should be able to use data to pre-train offline while improving with online fine-tuning. This helps the agent learn about the dynamics of the world and the task being solved.
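The offline-pretraining-then-online-fine-tuning recipe can be sketched with a replay buffer seeded from logged data. The buffer size, batch size and transition format below are illustrative assumptions, not details from the Berkeley work.

```python
import random
from collections import deque

class ReplayBuffer:
    """Replay buffer seeded with offline data, then grown with online
    experience -- a sketch of pre-training offline and fine-tuning online
    (sizes and transition format are illustrative)."""
    def __init__(self, offline_transitions, capacity=10_000):
        self.buffer = deque(offline_transitions, maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self.buffer), batch_size)

# Stand-in for previously logged robot data: (state, action, reward, next_state)
offline = [("s", "a", 0.0, "s2")] * 100
buf = ReplayBuffer(offline)

# Phase 1: pre-train purely from the offline buffer ...
pretrain_batch = buf.sample(32)

# Phase 2: ... then interleave fresh online transitions while fine-tuning.
buf.add(("s_new", "a_new", 1.0, "s3"))
```

The key property is that the fine-tuning phase never discards the offline data; new experience simply mixes into the same buffer the learner samples from.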

Key Takeaways

The researchers covered all the bases of deep RL from a robotics perspective. Here are a few takeaways:

  • Current deep RL methods are not as inefficient as often believed.
  • Of the many challenges, training without persistent human oversight is itself a significant engineering challenge.
  • A suitable goal for robotic deep reinforcement learning research would be to make robotic RL as natural and scalable as the learning performed by humans and animals.


Ram Sagar

I have a master's degree in Robotics and I write about machine learning advancements.