AI research lab DeepMind has acquired and open-sourced MuJoCo, a physics simulator with a rich and effective contact model. By open-sourcing MuJoCo (Multi-Joint Dynamics with Contact), DeepMind has given a major push to its robotics ambitions.
Deep reinforcement learning to train robots
In 2016, DeepMind researchers demonstrated how deep reinforcement learning can train real physical robots. The paper showed that reinforcement learning algorithms based on deep Q-functions can scale to complex 3D manipulation tasks and efficiently learn deep neural network policies. The authors also showed that training time can be reduced further by parallelising the algorithm across multiple robots that asynchronously pool their policy updates. The proposed methodology can learn a variety of 3D manipulation skills in simulation, as well as a door-opening skill (often considered a complex task for robots to train on), without manually designed representations.
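The idea of several robots pooling their updates into one shared value function can be illustrated with a minimal sketch. It is not the method from the paper: it swaps in tabular Q-learning on a hypothetical five-state chain in place of deep Q-functions, and a round-robin loop in place of truly asynchronous robots. The names `step`, `train` and `greedy_policy`, along with all parameter values, are assumptions made for illustration.

```python
import random

N_STATES, ACTIONS, GOAL = 5, (0, 1), 4  # toy chain; action 1 moves right


def step(state, action):
    """Hypothetical toy dynamics: move left or right, reward 1 on reaching GOAL."""
    nxt = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def train(n_workers=4, steps_per_worker=2000, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    """Several simulated 'robots' update one shared Q-table with their experience."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # shared Q-table, pooled across workers
    states = [0] * n_workers                   # each worker keeps its own episode state
    for _ in range(steps_per_worker):
        for w in range(n_workers):             # round-robin stands in for asynchrony
            s = states[w]
            if rng.random() < eps:
                a = rng.choice(ACTIONS)        # explore
            else:
                # exploit, breaking ties between equal Q-values at random
                a = max(ACTIONS, key=lambda x: (q[s][x], rng.random()))
            s2, r, done = step(s, a)
            # standard Q-learning target; no bootstrapping from a terminal state
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            states[w] = 0 if done else s2      # start a new episode after the goal
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the shared Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]
```

Because every worker writes into the same table, experience gathered by one worker immediately benefits the others, which is the intuition behind the reduced training time reported for multiple robots.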
Producing Flexible Behaviours
In 2017, DeepMind published three papers demonstrating flexible and natural behaviours that can be reused and adapted to solve tasks. In the first, the scientists trained agents with a variety of simulated bodies to perform activities like jumping, turning, and crouching across diverse terrains. The results showed that the agents develop these skills without receiving specific instructions.
Another paper demonstrated a method to train a policy network that imitates motion capture data of human behaviours to pre-learn skills like walking, getting up from the ground, turning, and running. These behaviours can then be tuned and repurposed to solve other tasks like climbing stairs and navigating through walled corridors.
The third paper produced a neural network architecture based on state-of-the-art generative models. The research showed that this architecture is capable of learning relationships between different behaviours and imitating specific actions that are shown to it. After training, the system could encode a single observed action and create a novel movement from it.
Scaling data-driven robotics
DeepMind demonstrated a framework for data-driven robotics that uses a large dataset of recorded robot experience and scales to several tasks using a learned reward function. The researchers applied this framework to accomplish three different object manipulation tasks on a real robot platform.
Given demonstrations of a task along with task-agnostic recorded experience, the scientists used a special form of human annotation as supervision to learn a reward function. This helps in dealing with real-world tasks where the reward signal cannot be acquired directly.
The learned rewards and the large dataset of experience derived from different tasks are then used to learn a robot policy offline using batch reinforcement learning. This approach makes it possible to train agents to perform challenging manipulation tasks like stacking rigid objects.
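The pipeline described above (learn a reward from annotations, then train offline on logged experience) can be sketched in a few lines. This is only a toy under stated assumptions: the reward model simply averages hypothetical human scores per state, the logged dataset is a hand-written four-state chain, and tabular Q-learning swept over a fixed batch stands in for the batch reinforcement learning used on real robots. All names and numbers here are illustrative and do not come from DeepMind's work.

```python
from collections import defaultdict


def learn_reward(annotations):
    """Fit a toy reward model: average the human-annotated scores per state."""
    totals, counts = defaultdict(float), defaultdict(int)
    for state, score in annotations:
        totals[state] += score
        counts[state] += 1
    # states that were never annotated get a reward of 0.0
    return lambda state: totals[state] / counts[state] if counts[state] else 0.0


def batch_q_learning(dataset, reward_fn, actions, gamma=0.9, sweeps=100, alpha=0.5):
    """Sweep repeatedly over a fixed dataset; no new environment interaction."""
    q = defaultdict(float)
    for _ in range(sweeps):
        for s, a, s2, done in dataset:
            r = reward_fn(s2)  # reward comes from the learned model, not the environment
            target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
            q[(s, a)] += alpha * (target - q[(s, a)])
    return q


# Hypothetical logged experience: (state, action, next_state, done)
ACTIONS = ("left", "right")
dataset = [
    (0, "right", 1, False), (1, "right", 2, False), (2, "right", 3, True),
    (1, "left", 0, False), (2, "left", 1, False), (0, "left", 0, False),
]
# Hypothetical human annotations: (state, score)
annotations = [(3, 1.0), (3, 0.9), (2, 0.1), (0, 0.0), (1, 0.0)]

reward_fn = learn_reward(annotations)
q = batch_q_learning(dataset, reward_fn, ACTIONS)
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in (0, 1, 2)}
```

The key design point is that the logged transitions carry no rewards of their own, so the same recorded experience could be relabelled with a different learned reward function and reused for another task.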
New Benchmark for Stacking
DeepMind recently introduced RGB-Stacking as a new benchmark for vision-based robotic manipulation tasks. Here, the robot has to learn how to grasp different objects and balance them on top of one another. The benchmark differs from previous work in the diversity of the objects used and the variety of empirical evaluations performed to verify the results.
The results demonstrated that complex multi-object manipulation can be learnt using a combination of simulation and real-world data. They also suggest a strong baseline for generalisation to novel objects.
This experiment is considered a major advancement in DeepMind’s endeavour towards making generalisable and useful robots. The authors will now work to make robots better understand the interaction with objects of different geometries. The RGB-Stacking benchmark has been open-sourced along with the designs for building real-robot RGB-stacking environments, RGB-object models and information for 3D printing.
MuJoCo is a physics engine that facilitates research and development in fields that require fast and accurate simulation, such as robotics, biomechanics, graphics, and animation. Developed by Emo Todorov for Roboti, MuJoCo is one of the first full-featured simulators designed from scratch for model-based optimisation through contacts. It was available as a commercial product from 2015 until DeepMind's acquisition in 2021.
MuJoCo helps in scaling up computationally intensive techniques like optimal control, system identification, physically consistent state estimation, and automated mechanism design, and in applying them to complex dynamic systems with contact-rich behaviours. It also has applications such as testing and validating control schemes before deploying them on physical robots, gaming, and interactive scientific visualisation.
This is probably a slow phase for research and development work in robotics. DeepMind rival OpenAI, after investing many years of research, resources and effort into robotics, finally decided to disband its robotics research team and shift focus to domains where data is more readily available. On the industry side, too, several robotics companies have shut shop or are suffering major losses. Given the circumstances, robotics, despite being such a lucrative industry, has few to no buyers.
Backed by Alphabet, DeepMind’s progress has helped it hold the flag high in this field over the past few years.