Top 10 Frameworks For Reinforcement Learning An ML Enthusiast Must Know

Reinforcement learning, one of the most popular machine learning techniques, is used across industry and academia to tackle large, complex problems. Researchers have applied it extensively to automate machines and systems efficiently.

Below, in alphabetical order, are ten reinforcement learning frameworks every ML enthusiast should know.


Acme

About: Acme is a framework for distributed reinforcement learning introduced by DeepMind. It is used to build readable, efficient, research-oriented RL algorithms. At its core, Acme is designed to enable simple descriptions of RL agents that can be run at various scales of execution, including as distributed agents. The framework aims to make the results of RL algorithms developed in academia and industrial labs easier for the wider machine learning community to reproduce and extend.


DeeR

About: DeeR is a Python library for deep reinforcement learning. The framework is built with modularity in mind so that it can easily be adapted to different needs. It provides methods such as Double Q-learning, prioritised experience replay, deep deterministic policy gradient (DDPG) and Combined Reinforcement via Abstract Representations (CRAR).
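
To make the Double Q-learning idea concrete, here is a minimal tabular sketch in plain Python. It does not use DeeR's API, and DeeR itself works with neural networks rather than tables; the function name `double_q_update` and its parameters are assumptions made for illustration only.

```python
import random


def double_q_update(q_a, q_b, state, action, reward, next_state, done,
                    alpha=0.1, gamma=0.99, rng=random):
    """One tabular Double Q-learning step on two value tables.

    q_a and q_b are lists of per-state action-value lists (hypothetical layout).
    """
    # Randomly choose which table learns; the other one evaluates.
    if rng.random() < 0.5:
        learn, evaluate = q_a, q_b
    else:
        learn, evaluate = q_b, q_a

    if done:
        target = reward
    else:
        # Select the greedy next action with the table being updated...
        n_actions = len(learn[next_state])
        best = max(range(n_actions), key=lambda a: learn[next_state][a])
        # ...but score that action with the other table.
        target = reward + gamma * evaluate[next_state][best]

    # Move the learning table's estimate toward the target.
    learn[state][action] += alpha * (target - learn[state][action])
    return target
```

Splitting action selection and action evaluation across two estimators is what distinguishes this from ordinary Q-learning, where a single table does both and tends to overestimate values.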


Dopamine

About: Dopamine is a popular research framework for fast prototyping of reinforcement learning algorithms. It aims to fill the need for a small, easily grokked codebase in which users can freely experiment with research ideas. Its design principles include flexible development, reproducibility and easy experimentation.


Framework for Reinforcement learning And Planning (FRAP)

About: FRAP, or Framework for Reinforcement learning And Planning, is a unifying framework that identifies the underlying dimensions on which any planning or learning algorithm has to decide. It provides deeper insight into the algorithmic space of planning and reinforcement learning and suggests new approaches for integrating the two fields. The aim is to provide a common language for categorising algorithms and to identify new research directions.

Learned Policy Gradient (LPG)

About: Introduced by DeepMind, Learned Policy Gradient (LPG) is a meta-learning approach that generates a reinforcement learning algorithm. By interacting with a set of environments, it discovers an entire update rule covering both "what to predict", such as value functions, and "how to learn from it", such as bootstrapping. In other words, the reinforcement learning algorithm is discovered automatically from the data those interactions generate.


RLgraph

About: RLgraph is a framework for quickly prototyping, defining and executing reinforcement learning algorithms in both research and practice. It supports static-graph execution (TensorFlow) as well as eager, define-by-run execution (PyTorch) through a single component interface. Using RLgraph, developers can combine high-level components in a space-independent manner and define input spaces.


Surreal

About: Surreal, short for Scalable Robotic REinforcement-learning ALgorithms, is an open-source, scalable framework that supports state-of-the-art distributed reinforcement learning algorithms. The framework decomposes a distributed RL algorithm into four components: generation of experience (actors), storage of experience (buffer), updating parameters from experience (learner), and storage of parameters (parameter server). It is a principled distributed learning formulation that accommodates both on-policy and off-policy learning.
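
The four-component decomposition can be sketched in a single process as follows. This is a toy illustration under stated assumptions, not Surreal's code or API: every class name, the one-parameter "policy", the made-up reward and the crude update rule are hypothetical, and a real system would run these components as separate processes.

```python
import random
from collections import deque


class ParameterServer:
    """Storage of parameters: holds the latest values published by the learner."""
    def __init__(self, params):
        self._params = dict(params)

    def push(self, params):
        self._params = dict(params)

    def pull(self):
        return dict(self._params)  # copy, so callers cannot mutate the store


class Buffer:
    """Storage of experience: bounded queue that drops the oldest items."""
    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def add(self, item):
        self._items.append(item)

    def sample(self, n, rng):
        items = list(self._items)
        return rng.sample(items, min(n, len(items)))

    def __len__(self):
        return len(self._items)


class Actor:
    """Generation of experience: acts with the latest pulled parameters."""
    def __init__(self, server, buffer, rng):
        self.server, self.buffer, self.rng = server, buffer, rng

    def step(self):
        mean = self.server.pull()["mean"]
        action = mean + self.rng.gauss(0.0, 0.5)
        reward = -(action - 1.0) ** 2  # toy reward, highest at action == 1.0
        self.buffer.add((action, reward))


class Learner:
    """Updating parameters from experience, then publishing them."""
    def __init__(self, server, buffer, rng, lr=0.1):
        self.server, self.buffer, self.rng, self.lr = server, buffer, rng, lr

    def step(self, batch_size=8):
        batch = self.buffer.sample(batch_size, self.rng)
        if not batch:
            return
        params = self.server.pull()
        # Crude illustrative update: nudge the mean toward the
        # highest-reward action in the sampled batch.
        best_action, _ = max(batch, key=lambda item: item[1])
        params["mean"] += self.lr * (best_action - params["mean"])
        self.server.push(params)


def run(steps=200, seed=0):
    rng = random.Random(seed)
    server = ParameterServer({"mean": 0.0})
    buffer = Buffer(capacity=64)
    actors = [Actor(server, buffer, rng) for _ in range(2)]
    learner = Learner(server, buffer, rng)
    for _ in range(steps):
        for actor in actors:
            actor.step()
        learner.step()
    return server.pull()["mean"], len(buffer)
```

Because the buffer here retains experience produced under older parameters, the sketch behaves like an off-policy setup. An on-policy variant would, as one option, discard stored experience each time the learner publishes new parameters.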


SLM Lab

About: SLM Lab is a software framework for reproducible reinforcement learning (RL) research. It implements several popular RL algorithms and provides synchronous and asynchronous parallel experiment execution, hyperparameter search and result analysis. Reinforcement learning algorithms in SLM Lab are built around three base classes: algorithm, deep network and memory.
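
One way to picture the three-base-class split is the sketch below. It is not SLM Lab's actual code: the method names (`add`, `sample`, `predict`, `fit`, `act`, `train`) and the concrete subclasses are assumptions chosen for illustration, and a lookup table stands in for the deep network.

```python
from abc import ABC, abstractmethod


class Memory(ABC):
    """Stores transitions and hands back batches for training."""
    @abstractmethod
    def add(self, transition): ...

    @abstractmethod
    def sample(self): ...


class Net(ABC):
    """Function approximator; a deep network in a real setup."""
    @abstractmethod
    def predict(self, state): ...

    @abstractmethod
    def fit(self, batch): ...


class Algorithm(ABC):
    """Owns a net and a memory, and decides how to act and train."""
    def __init__(self, net, memory):
        self.net = net
        self.memory = memory

    @abstractmethod
    def act(self, state): ...

    @abstractmethod
    def train(self): ...


class ListMemory(Memory):
    def __init__(self):
        self.items = []

    def add(self, transition):
        self.items.append(transition)

    def sample(self):
        # Hand over everything once, then start from an empty list.
        batch, self.items = self.items, []
        return batch


class TableNet(Net):
    """Lookup table standing in for a neural network."""
    def __init__(self, n_actions, lr=0.5):
        self.values = {}
        self.n_actions = n_actions
        self.lr = lr

    def predict(self, state):
        return self.values.setdefault(state, [0.0] * self.n_actions)

    def fit(self, batch):
        for state, action, reward in batch:
            row = self.predict(state)
            row[action] += self.lr * (reward - row[action])


class GreedyBandit(Algorithm):
    def act(self, state):
        row = self.net.predict(state)
        return max(range(len(row)), key=lambda a: row[a])

    def train(self):
        self.net.fit(self.memory.sample())
```

The appeal of such a split is that any of the three parts can be swapped independently, for example replacing `ListMemory` with a replay buffer without touching the algorithm or the network.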


TayPO

About: TayPO, or Taylor expansion Policy Optimisation, is a policy optimisation framework that generalises methods like trust region policy optimisation (TRPO) and improves the performance of several state-of-the-art distributed algorithms. It is a general framework in which Taylor expansions share high-level similarities with both trust-region policy search and off-policy corrections. A Taylor expansion approximates a mathematical function near a chosen point by a truncated Taylor series, that is, a polynomial built from the function's derivatives at that point.
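
As a generic illustration of what a Taylor expansion does, the sketch below approximates the exponential function around zero. It is not TayPO itself, which applies the idea to a policy optimisation objective; the function name `taylor_exp` is an assumption made for this example.

```python
import math


def taylor_exp(x, order):
    """Taylor polynomial of exp around 0: sum of x**k / k! for k = 0..order."""
    return sum(x ** k / math.factorial(k) for k in range(order + 1))


# Higher orders track the true function more closely near the expansion point.
for order in (1, 2, 4):
    approx = taylor_exp(0.5, order)
    print(order, approx, abs(approx - math.exp(0.5)))
```

The first-order polynomial is just a straight line through the expansion point, and each additional term adds a correction that shrinks the error for inputs close to that point.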


Tensorforce

About: Tensorforce is an open-source deep reinforcement learning framework. It emphasises a modular, flexible library design and straightforward usability for applications in research and practice. Built on top of Google's TensorFlow framework and requiring Python 3, Tensorforce follows a set of high-level design choices, including a modular component-based design and separation of the RL algorithm from the application.

Ambika Choudhury
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
