Recently, researchers at DeepMind open-sourced the dm_control software package, a collection of Python libraries and task suites for reinforcement learning agents in an articulated-body simulation. Alongside the tools and libraries, the company also released a public Colab notebook with a tutorial for dm_control.
One of the most critical prerequisites of artificial general intelligence (AGI) is the ability to control the physical world. The scientists and engineers at DeepMind designed the dm_control package to meet their own continuous control and robotics needs, and it is therefore well suited to research.
Behind dm_control Package
The dm_control infrastructure includes a MuJoCo wrapper that provides convenient bindings to functions and data structures along with full access to the underlying engine; the PyMJCF and Composer libraries, which enable procedural model manipulation and task authoring; the Control Suite; the Locomotion framework; and a set of manipulation tasks.
The core components of this package are:
- dm_control.mujoco: Libraries that provide Python bindings to the MuJoCo physics engine.
- dm_control.suite: A set of Python reinforcement learning environments powered by the MuJoCo physics engine.
- dm_control.viewer: An interactive environment viewer.
MuJoCo Python Interface
The dm_control package is written in Python and relies on the C-based MuJoCo physics library. MuJoCo, short for Multi-Joint dynamics with Contact, is a fast, reduced-coordinate, continuous-time physics engine. It is a general-purpose simulator and a popular choice for robotics and reinforcement learning research.
One advantage of this physics engine is that it supports names for all model elements, and referring to model elements by name is often more convenient and less error-prone than referring to them by index. This makes the package easier for developers to use and modify. Further, the wrapper bindings provide easy access to all MuJoCo library functions and enums, automatically converting NumPy arrays to data pointers where appropriate.
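The benefit of name-based access can be illustrated without MuJoCo at all. The following is a minimal pure-Python sketch, not dm_control's implementation: the `NamedRows` helper and the geom names are hypothetical, and the example only shows why a name lookup keeps working when a model changes while a hard-coded index does not.

```python
class NamedRows:
    """Hypothetical helper: rows of a table addressable by element name."""

    def __init__(self, names, rows):
        if len(names) != len(set(names)):
            raise ValueError("element names must be unique")
        if len(names) != len(rows):
            raise ValueError("need exactly one row per name")
        self._index = {name: i for i, name in enumerate(names)}
        self._rows = rows

    def __getitem__(self, key):
        # Accept either an element name or a plain integer index.
        if isinstance(key, str):
            key = self._index[key]
        return self._rows[key]


# Positions of three geoms in a made-up model.
geom_xpos = NamedRows(
    ["floor", "box", "ball"],
    [[0.0, 0.0, 0.0], [0.2, 0.0, 0.1], [-0.1, 0.3, 0.5]],
)
by_index = geom_xpos[2]
by_name = geom_xpos["ball"]

# Inserting a new geom shifts every later index, but names stay valid.
geom_xpos_v2 = NamedRows(
    ["floor", "wall", "box", "ball"],
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.5], [0.2, 0.0, 0.1], [-0.1, 0.3, 0.5]],
)
stale_index = geom_xpos_v2[2]      # now refers to the box, not the ball
still_ball = geom_xpos_v2["ball"]  # same element as before
```

In the second model, index 2 silently points at a different geom, while the lookup by name still returns the ball.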
The DeepMind Control Suite has been around for a few years now. It is a set of continuous control tasks with a standardised structure and interpretable rewards, intended to serve as performance benchmarks for continuous control learning agents or reinforcement learning agents.
In the Control Suite, the researchers added new quadruped and dog environments. The quadruped has 56 state dimensions, and each of its four legs has 3 actuators, for a total of 12 actions. For the dog environment, leo3Dmodels created a realistic model of a Pharaoh Dog for DeepMind researchers and made it available to the wider research community.
PyMJCF is essentially a document object model: the library provides a Python object model for MuJoCo’s XML-based MJCF physics modelling language. Its goal is to let users easily interact with and modify MJCF models in Python.
One key feature of this library is the ability to easily compose multiple separate MJCF models into a larger one, while automatically maintaining a consistent, collision-free namespace. According to the researchers, a typical use case is building robots that include a variable number of joints. Additionally, the library provides Pythonic access to the underlying C data structures through the bind() method of mjcf.Physics.
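The idea of a collision-free namespace can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not PyMJCF code: the `scoped_copy` and `attach` helpers are hypothetical, the arm model is made up, and only `joint` references are rewritten. It imitates the approach of scoping each attached model's names with the model name and a slash, so that two copies of the same model can coexist in one scene.

```python
import copy
import xml.etree.ElementTree as ET

ARM_XML = """
<mujoco model="{model}">
  <worldbody>
    <body name="link">
      <joint name="hinge" type="hinge"/>
      <geom name="rod" type="capsule" size="0.02" fromto="0 0 0 0 0 0.3"/>
    </body>
  </worldbody>
  <actuator>
    <motor name="drive" joint="hinge"/>
  </actuator>
</mujoco>
"""


def scoped_copy(model_root):
    """Returns a copy of the model whose names are prefixed with 'model/'."""
    prefix = model_root.get("model")
    scoped = copy.deepcopy(model_root)
    for element in scoped.iter():
        # Rename the element itself and any reference it makes to a joint.
        for attr in ("name", "joint"):
            if attr in element.attrib:
                element.set(attr, f"{prefix}/{element.get(attr)}")
    return scoped


def attach(parent_root, child_root):
    """Merges the child's bodies and actuators into the parent model."""
    scoped = scoped_copy(child_root)
    for section in ("worldbody", "actuator"):
        source = scoped.find(section)
        if source is None:
            continue
        target = parent_root.find(section)
        if target is None:
            target = ET.SubElement(parent_root, section)
        target.extend(list(source))


# Attach two copies of the same arm to an otherwise empty scene.
scene = ET.fromstring('<mujoco model="scene"><worldbody/></mujoco>')
for arm_name in ("left_arm", "right_arm"):
    attach(scene, ET.fromstring(ARM_XML.format(model=arm_name).strip()))

joint_names = [j.get("name") for j in scene.iter("joint")]
motor_targets = [m.get("joint") for m in scene.iter("motor")]
all_names = [e.get("name") for e in scene.iter() if e.get("name")]
```

Because every name carries its model's prefix, both hinges survive the merge as distinct joints and each motor still points at the hinge of its own arm.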
The Composer Library
Composer is the high-level “game engine” that streamlines composing Entities into scenes and defining observations, rewards, terminations and general game logic.
According to the researchers, the Composer framework organises reinforcement learning environments into a common structure and endows scene elements with optional event handlers. At a high level, Composer defines three main abstractions for task design: composer.Entity, composer.Task and composer.Environment.
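The division of labour between the three abstractions can be pictured with a toy, pure-Python analogue. This is a sketch of the general pattern only and does not use or reproduce the Composer API: the `ToyEntity`, `ToyTask` and `ToyEnvironment` classes, their method names and the one-variable "physics" are all hypothetical.

```python
class ToyEntity:
    """Stand-in for a scene element with optional event handlers."""

    def before_step(self, state):
        pass  # optional hook: the default does nothing

    def after_step(self, state):
        pass  # optional hook: the default does nothing


class Counter(ToyEntity):
    """Entity that overrides only the hook it cares about."""

    def __init__(self):
        self.steps_seen = 0

    def after_step(self, state):
        self.steps_seen += 1


class ToyTask:
    """Stand-in for a task: owns entities, defines reward and termination."""

    def __init__(self, entities, target):
        self.entities = entities
        self.target = target

    def get_reward(self, state):
        return 1.0 if state["x"] >= self.target else 0.0

    def should_terminate(self, state):
        return state["x"] >= self.target


class ToyEnvironment:
    """Stand-in for an environment: drives the loop and fires the hooks."""

    def __init__(self, task):
        self.task = task
        self.state = {"x": 0.0}

    def step(self, action):
        for entity in self.task.entities:
            entity.before_step(self.state)
        self.state["x"] += action  # trivial one-variable "physics"
        for entity in self.task.entities:
            entity.after_step(self.state)
        reward = self.task.get_reward(self.state)
        return reward, self.task.should_terminate(self.state)


counter = Counter()
env = ToyEnvironment(ToyTask([counter], target=1.0))
rewards = []
done = False
while not done:
    reward, done = env.step(0.5)
    rewards.append(reward)
```

The entity knows nothing about rewards, the task knows nothing about the stepping loop, and the environment only orchestrates the two, which is the kind of separation the common structure is meant to provide.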
The Locomotion framework is designed to facilitate the implementation of a wide range of locomotion tasks for RL algorithms by introducing self-contained, reusable components which compose into different task variants. The Locomotion framework introduces several abstract Composer entities, such as the Arena (a self-scaling randomised scene) and Walker (a controllable entity with common locomotion-related methods), facilitating locomotion-like tasks.