On 1st June, DeepMind released Acme, a framework for building reliable, efficient, research-oriented reinforcement learning (RL) algorithms. According to the researchers, the idea behind Acme is to reduce the complexity of building ML-based solutions and to help researchers and firms scale them with less effort.
While deep learning and computational power have advanced significantly, the complexity of developing robust solutions has also increased rapidly. According to the authors of the paper, this complexity has made it harder for researchers to rapidly prototype ideas and has caused serious reproducibility issues.
Poor reproducibility has drawn considerable criticism of AI-based models and has eroded users' trust in them. DeepMind's researchers believe that Acme will mitigate these reproducibility challenges and make it simpler for researchers to develop novel and creative algorithms. With Acme, one will be able to scale up while ensuring that RL agents deliver the desired results.
Alongside the framework, DeepMind benchmarked agents built with Acme on several environments: the DeepMind Control Suite, Atari, and bsuite.
With Acme, DeepMind aims to meet the following goals:
- Enhance the reproducibility of methods and results
- Simplify the design of new algorithms
- Improve the readability of RL agents
DeepMind’s researchers followed design principles that enable developers to easily create, test and debug RL agents in small-scale scenarios before scaling them up. Acme also leverages Reverb, an efficient data storage system designed specifically for machine learning workflows. Reverb also supports other data structure representations, such as FIFO and priority queues, which simplifies its use for on- and off-policy algorithms.
DeepMind open-sourced Reverb on 26th May 2020 to streamline data storage and transport for ML-based products. It is a highly flexible system but is primarily used as an experience replay system for distributed reinforcement learning algorithms.
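To make the idea of experience replay more concrete, here is a minimal sketch in plain Python. It does not use Reverb's API. The `ReplayBuffer` class, its method names, and the made-up transitions are all hypothetical and chosen only for illustration. The sketch shows a fixed-size store that evicts the oldest items first and can be read either by uniform sampling (as an off-policy learner might) or in FIFO order (as an on-policy learner might).

```python
import random
from collections import deque


class ReplayBuffer:
    """Toy in-memory replay buffer (hypothetical; not Reverb's API)."""

    def __init__(self, max_size, seed=None):
        # A deque with maxlen drops the oldest item when full (FIFO eviction).
        self._items = deque(maxlen=max_size)
        self._rng = random.Random(seed)

    def add(self, transition):
        # Store one transition, e.g. an (observation, action, reward) tuple.
        self._items.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement; items stay in the buffer.
        return self._rng.sample(list(self._items), batch_size)

    def pop_oldest(self):
        # Queue-style read: remove and return the oldest stored item.
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


buffer = ReplayBuffer(max_size=3, seed=0)
for step in range(5):
    # Made-up (observation, action, reward) values for illustration.
    buffer.add((step, step % 2, float(step)))
```

With a capacity of three, only the three most recent transitions remain after five insertions. `buffer.sample(2)` draws a random batch from them, while `buffer.pop_oldest()` consumes them in arrival order.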
Check the research paper here.