
What Is Constrained Reinforcement Learning And How Can One Build Systems Around It


Reinforcement learning (RL) has been one of the most important recent innovations in the development of advanced AI systems. It has the potential to solve complex decision-making problems.

It generally learns an optimal policy for a given problem by trial and error. It has been used to achieve superhuman performance in competitive strategy games such as Go, StarCraft, and Dota.

Despite the promise reinforcement learning algorithms have shown on many decision-making problems, a few challenges still need to be addressed.

This is where constrained reinforcement learning comes into play.

Constrained Reinforcement Learning


Constrained reinforcement learning helps a model learn about costly mistakes without having to pay their full price in the real world. Constrained RL works much like standard RL. The difference is that the environment also comes with cost functions, which discourage the agent from taking certain paths.
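To make the idea of a cost function concrete, here is a minimal sketch of a toy grid environment that reports a cost separately from the reward. Everything in it (the `ToyCostGrid` class, the grid layout, the hazard cells) is hypothetical and written only for illustration; it is not Safety Gym code.

```python
from dataclasses import dataclass

# Hypothetical action set for the toy grid.
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


@dataclass
class StepResult:
    position: tuple
    reward: float  # task signal: did we reach the goal?
    cost: float    # safety signal: did we enter a hazard cell?
    done: bool


class ToyCostGrid:
    """A 4x4 grid with a goal and hazard cells (illustrative only)."""

    def __init__(self, size=4, goal=(3, 0), hazards=((1, 0), (2, 0))):
        self.size = size
        self.goal = goal
        self.hazards = set(hazards)
        self.position = (0, 0)

    def reset(self):
        self.position = (0, 0)
        return self.position

    def step(self, action):
        dx, dy = MOVES[action]
        # Clamp the move so the agent stays on the grid.
        x = min(max(self.position[0] + dx, 0), self.size - 1)
        y = min(max(self.position[1] + dy, 0), self.size - 1)
        self.position = (x, y)
        # Reward and cost are computed independently of each other.
        reward = 1.0 if self.position == self.goal else 0.0
        cost = 1.0 if self.position in self.hazards else 0.0
        return StepResult(self.position, reward, cost, self.position == self.goal)


def rollout(env, actions):
    """Run a fixed action sequence and return (total reward, total cost)."""
    env.reset()
    total_reward, total_cost = 0.0, 0.0
    for action in actions:
        result = env.step(action)
        total_reward += result.reward
        total_cost += result.cost
        if result.done:
            break
    return total_reward, total_cost
```

In this sketch, the straight path `["right", "right", "right"]` and the detour `["up", "right", "right", "right", "down"]` both reach the goal and earn the same reward, but only the straight path accumulates cost. A reward-only agent has no reason to prefer one over the other; the separate cost signal is what lets a constrained agent tell them apart.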

The fundamental principle of standard RL is that an agent, the AI system, tries to maximize a reward signal by trial and error. In the course of learning, the agent can sometimes try dangerous or harmful behaviors. Keeping the agent safe while it explores is known as the safe exploration problem.

Designing a reward function is fundamentally hard, partly because a single reward has to balance task performance against safety requirements. Constrained RL mitigates this challenge: safety requirements are stated separately as constraints, and the system works out the trade-off that gives a suitable and safe outcome.
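One common way to express this trade-off is as a constrained optimization problem: maximize the expected return while keeping the expected cost under a limit. The sketch below uses a Lagrange multiplier that grows while the cost limit is violated and shrinks when it is respected. The two candidate behaviors, their numbers, and all function names are made up for illustration; this is a simplified picture of the idea, not code from any particular algorithm or library.

```python
def penalized_score(ret, cost, lam):
    """Return minus the multiplier-weighted cost (the Lagrangian objective)."""
    return ret - lam * cost


def update_multiplier(lam, cost, cost_limit, lr=0.5):
    """Dual ascent step: raise lam when cost exceeds the limit, lower it otherwise.

    The multiplier is clipped at zero so that being safe is never rewarded
    beyond simply satisfying the constraint.
    """
    return max(0.0, lam + lr * (cost - cost_limit))


def pick_behavior(candidates, lam):
    """Choose the candidate with the best penalized score for the current lam."""
    return max(candidates, key=lambda name: penalized_score(*candidates[name], lam))


# Hypothetical behaviors as (expected return, expected cost) pairs.
CANDIDATES = {
    "shortcut": (10.0, 4.0),  # faster, but crosses hazards
    "detour": (7.0, 0.5),     # slower, but mostly safe
}
COST_LIMIT = 1.0

lam = 0.0
choice = pick_behavior(CANDIDATES, lam)                      # "shortcut" while lam is 0
lam = update_multiplier(lam, CANDIDATES[choice][1], COST_LIMIT)  # cost 4.0 > 1.0, so lam rises
choice_after = pick_behavior(CANDIDATES, lam)                # the penalty now favors "detour"
```

The designer only specifies the cost limit. The weight given to safety is adjusted automatically rather than being hand-tuned into the reward function.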

To establish a more reliable platform for building reinforcement learning models, OpenAI announced Safety Gym, where developers can experiment with cost functions and design safer systems.

Safety Gym is a set of environments and tools that researchers at OpenAI developed to study constrained RL. It helps measure progress towards reinforcement learning agents that respect safety constraints and aims to accelerate safe exploration research.

It consists of two main components:

  • An environment-builder which allows a user to create a new environment by mixing and matching from a wide range of physics elements, goals, and safety requirements.
  • A suite of pre-configured benchmark environments to help standardize the measurement of progress on the safe exploration problem.

Overview Of OpenAI Safety Gym

In all Safety Gym environments, the agent perceives the environment through a robot’s sensors and interacts with it through the robot’s actuators. The robot has to navigate through a cluttered environment to achieve a task. There are three pre-made robots: Point, Car, and Doggo.

  • Point: It is a simple robot constrained to the 2D-plane, with one actuator for turning and another for moving forward/backward.
  • Car: Car is a slightly more complex robot that has two independently-driven parallel wheels and a free-rolling rear wheel.
  • Doggo: Doggo is a quadrupedal robot with bilateral symmetry. It is designed in such a manner that a uniform random policy should keep the robot from falling over and generate some travel. 
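As a rough picture of what "one actuator for turning and another for moving forward/backward" means for the Point robot, the sketch below applies the two control inputs to a position and heading on a 2D plane. It is a deliberately simplified kinematic model with a hypothetical `point_step` function and an assumed time step; the actual robots are simulated with a physics engine, which this does not attempt to reproduce.

```python
import math


def point_step(x, y, heading, forward, turn, dt=0.1):
    """Advance a 2D point robot by one time step (simplified, illustrative).

    forward: signed speed along the current heading (negative moves backward)
    turn:    signed turning rate in radians per unit time
    dt:      assumed time step length
    """
    # Apply the turning actuator first, then move along the new heading.
    heading = heading + turn * dt
    x = x + forward * math.cos(heading) * dt
    y = y + forward * math.sin(heading) * dt
    return x, y, heading


# Drive forward for ten steps while turning gently to the left.
state = (0.0, 0.0, 0.0)
for _ in range(10):
    state = point_step(*state, forward=1.0, turn=0.5)
```

The Car and Doggo robots have more actuators and more complicated dynamics, which is part of what makes their environments harder.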

The Safety Gym environment-builder currently supports three main tasks (Goal, Button, and Push), with two levels of difficulty for each task.

  • Goal: This task is accomplished by moving the robot to a series of goal positions. When a goal is achieved, the goal location is randomly reset to someplace new, while keeping the rest of the layout the same.
  • Button: This task is done by pressing a series of goal buttons.
  • Push: This task involves moving a box to a series of goal positions.
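Three robots, three tasks, and two difficulty levels combine into a grid of benchmark configurations. The sketch below simply enumerates that grid. The `Safexp-<Robot><Task><Level>-v0` naming pattern and the level numbers 1 and 2 are assumptions based on how the released package is commonly described, so check them against the Safety Gym documentation before relying on them.

```python
from itertools import product

ROBOTS = ("Point", "Car", "Doggo")
TASKS = ("Goal", "Button", "Push")
LEVELS = (1, 2)  # assumed numbering of the two difficulty levels


def benchmark_names(prefix="Safexp", version="v0"):
    """List one environment name per robot/task/level combination.

    The name format is an assumption, not a confirmed API detail.
    """
    return [
        f"{prefix}-{robot}{task}{level}-{version}"
        for robot, task, level in product(ROBOTS, TASKS, LEVELS)
    ]


names = benchmark_names()
```

Here `names` holds 3 x 3 x 2 = 18 entries, starting with `Safexp-PointGoal1-v0`. A fixed, enumerable list like this is what allows different algorithms to be compared on the same set of problems.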

Currently, the Safety Gym environment-builder supports five main kinds of elements relevant to safety requirements: hazards, vases, pillars, buttons, and gremlins.
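The "mix and match" idea behind the environment-builder can be pictured as a small specification object that combines one robot, one task, and any number of safety-related elements. The `EnvSpec` class below is hypothetical and exists only to illustrate that structure; it does not reproduce Safety Gym's actual configuration format.

```python
from dataclasses import dataclass, field

# Option names taken from the article; the lowercase spelling is an assumption.
ROBOTS = {"point", "car", "doggo"}
TASKS = {"goal", "button", "push"}
ELEMENTS = {"hazards", "vases", "pillars", "buttons", "gremlins"}


@dataclass
class EnvSpec:
    """Hypothetical description of a custom environment (illustrative only)."""

    robot: str
    task: str
    elements: dict = field(default_factory=dict)  # element kind -> count

    def __post_init__(self):
        if self.robot not in ROBOTS:
            raise ValueError(f"unknown robot: {self.robot}")
        if self.task not in TASKS:
            raise ValueError(f"unknown task: {self.task}")
        unknown = set(self.elements) - ELEMENTS
        if unknown:
            raise ValueError(f"unknown elements: {sorted(unknown)}")
        if any(count < 0 for count in self.elements.values()):
            raise ValueError("element counts must be non-negative")

    def total_elements(self):
        """Total number of safety-related objects placed in the layout."""
        return sum(self.elements.values())


# A Car robot pushing a box through a layout with four hazards and one pillar.
spec = EnvSpec(robot="car", task="push", elements={"hazards": 4, "pillars": 1})
```

Validating the combination up front means a typo in a robot or element name fails immediately rather than producing a silently different environment.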

Going Forward

In an earlier article, we discussed why one should consider reinforcement learning when solving a problem and when it is the right approach. This article has looked at the importance of constrained reinforcement learning and how OpenAI’s Safety Gym can help researchers build more advanced RL systems.

In some applications, safety is the foremost concern. Consider the fatal accident involving an Uber self-driving car in Tempe, Arizona, last year. According to reports, the system classified the victim first as an unknown object, then as a vehicle, and then as a bicycle.

There is no doubt that reinforcement learning systems still need a lot of improvement before any large-scale deployment, and innovations like the ones discussed above take us closer to establishing such safer systems.

According to the researchers at OpenAI, Safety Gym is the first benchmark of high-dimensional continuous control environments for evaluating the performance of constrained RL algorithms.

To show how current methods fare on safe exploration, the researchers have also benchmarked several popular constrained and unconstrained RL algorithms on the Safety Gym environments. These baselines are expected to make it easier to design and compare new algorithms.


Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.