DeepMind’s Progress Over The Years In Robotics

Backed by Alphabet, DeepMind’s progress has helped it hold the flag high in robotics over the past few years.

AI research lab DeepMind acquired and open-sourced MuJoCo (Multi-Joint Dynamics with Contact), a physics engine known for its rich and effective contact model. The move has given a major push to DeepMind’s robotics ambitions.

This article traces DeepMind’s consistent efforts to push the envelope in robotics.

Deep reinforcement learning to train robots

In 2016, DeepMind researchers demonstrated how deep reinforcement learning can train real physical robots. The paper showed that reinforcement learning algorithms based on deep Q-functions can scale to complex 3D manipulation tasks and efficiently learn deep neural network policies. The authors also showed that training time can be reduced by parallelising the algorithm across multiple robots that asynchronously pool their policy updates. The proposed method learnt a variety of 3D manipulation skills in simulation, as well as a door-opening skill on real robots (often considered a complex task for robots to learn), without manually designed representations.
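The idea of several robots pooling their experience into one learner can be illustrated with a minimal sketch. It is not the paper's method: it swaps the deep Q-function for a tabular one, runs the workers in a deterministic round-robin loop rather than truly asynchronously, and uses a hypothetical five-state chain in place of a manipulation task. All names (`step`, `train`, `greedy_policy`) and constants are assumptions made for illustration.

```python
import random

N_STATES = 5          # hypothetical chain of states 0..4; state 4 is the goal
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right
GAMMA = 0.9           # discount factor (assumed)
ALPHA = 0.5           # learning rate (assumed)


def step(state, action_idx):
    """Toy environment: reward 1.0 only when the goal state is reached."""
    next_state = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(n_workers=4, n_rounds=300, batch_size=16, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-table shared by all workers
    replay = []                                  # pooled experience
    states = [0] * n_workers                     # current state of each worker

    for _ in range(n_rounds):
        # Every worker ("robot") acts with the shared Q-table and adds
        # its transition to the common pool.
        for w in range(n_workers):
            s = states[w]
            if rng.random() < 0.3:               # occasional exploration
                a = rng.randrange(2)
            else:                                # greedy, ties go right
                a = 0 if q[s][0] > q[s][1] else 1
            s2, r, done = step(s, a)
            replay.append((s, a, r, s2, done))
            states[w] = 0 if done else s2

        # The learner samples from the pooled data, regardless of which
        # worker produced it, and applies Q-learning updates.
        for s, a, r, s2, done in rng.choices(replay, k=batch_size):
            target = r if done else r + GAMMA * max(q[s2])
            q[s][a] += ALPHA * (target - q[s][a])
    return q


def greedy_policy(q):
    """Best action index for each non-goal state."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]
```

Calling `greedy_policy(train())` returns the action chosen in each non-goal state. The design point the sketch tries to capture is that adding workers increases the rate at which experience reaches a single shared learner, without changing the update rule itself.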


Producing Flexible Behaviours

In 2017, DeepMind published three papers demonstrating how agents can learn flexible, natural behaviours that can be reused and adapted to solve tasks. In the first, the scientists trained agents with a variety of simulated bodies to perform activities like jumping, turning, and crouching across diverse terrains. The results showed that the agents developed these skills without receiving specific instructions.



Another paper demonstrated a method to train a policy network that imitates motion capture data of human behaviours to pre-learn skills like walking, getting up from the ground, turning, and running. These behaviours can then be tuned and repurposed to solve other tasks like climbing stairs and navigating through walled corridors.

The third paper proposed a neural network architecture based on state-of-the-art generative models. The research showed that this architecture can learn the relationships between different behaviours and imitate specific actions that are shown to it. After training, the system could encode a single observed action and create a novel movement based on it.

Scaling data-driven robotics

DeepMind demonstrated a framework for data-driven robotics that uses a large dataset of recorded robot experience and scales to several tasks through a learned reward function. The researchers applied the framework to three different object manipulation tasks on a real robot platform.

The scientists used a special form of human annotations as supervision to learn a reward function and demonstrate tasks with task-agnostic recorded experience. This helps in dealing with real-world tasks where the reward signal cannot be acquired directly.

The learned rewards and the large dataset of experience gathered from different tasks are then used to learn robot policies offline with batch reinforcement learning. This approach makes it possible to train agents to perform challenging manipulation tasks like stacking rigid objects.
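A minimal sketch of this pipeline, under heavy simplification, might look as follows. Human scores are turned into a reward function, a fixed log of transitions is relabelled with it, and a policy is derived offline without any further interaction. The per-state averaging, the tabular Q-iteration, the three-state task and every name here (`ANNOTATIONS`, `LOGGED`, `learn_reward`, `batch_q_iteration`) are hypothetical stand-ins, not the models or data DeepMind used.

```python
from collections import defaultdict

GAMMA = 0.9  # discount factor (assumed)

# Hypothetical human annotations: (state, score given by an annotator).
ANNOTATIONS = [("stacked", 1.0), ("stacked", 0.8),
               ("grasped", 0.0), ("apart", 0.0)]

# Hypothetical reward-free log: (state, action, next_state, done).
# Action 0 = wait, action 1 = advance.
LOGGED = [
    ("apart", 0, "apart", False),
    ("apart", 1, "grasped", False),
    ("grasped", 0, "grasped", False),
    ("grasped", 1, "stacked", True),
]


def learn_reward(annotations):
    """Average the scores per state: a toy stand-in for a learned reward model."""
    totals, counts = {}, {}
    for state, score in annotations:
        totals[state] = totals.get(state, 0.0) + score
        counts[state] = counts.get(state, 0) + 1

    def reward(state):
        # States nobody annotated default to zero reward in this sketch.
        return totals[state] / counts[state] if state in counts else 0.0

    return reward


def batch_q_iteration(dataset, reward, n_actions, sweeps=50):
    """Offline Q-iteration over a fixed dataset; no new data is collected."""
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(sweeps):
        for s, a, s2, done in dataset:
            r = reward(s2)  # relabel the logged transition with the learned reward
            q[s][a] = r if done else r + GAMMA * max(q[s2])
    return q


reward_fn = learn_reward(ANNOTATIONS)
q_values = batch_q_iteration(LOGGED, reward_fn, n_actions=2)
policy = {s: max(range(2), key=lambda a: q_values[s][a])
          for s in ("apart", "grasped")}
```

In this toy log, `policy` picks the advance action in both non-terminal states. The point of the structure is that the reward never has to be measured on the robot: it is inferred from annotations and applied retrospectively to experience that was recorded for other purposes.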

New Benchmark for Stacking

DeepMind recently introduced RGB-Stacking as a new benchmark for vision-based robotic manipulation. Here, the robot has to learn how to grasp different objects and balance them on top of one another. The benchmark differs from previous work in the diversity of the objects used and the variety of empirical evaluations performed to validate the results.


The results demonstrated that complex multi-object manipulation can be learnt using a combination of simulation and real-world data. The experiment could also suggest a strong baseline for generalisation to novel objects. 

This experiment is considered a major advancement in DeepMind’s endeavour towards making generalisable and useful robots. The authors will now work to make robots better understand the interaction with objects of different geometries. The RGB-Stacking benchmark has been open-sourced along with the designs for building real-robot RGB-stacking environments, RGB-object models and information for 3D printing. 


MuJoCo is a physics engine that facilitates research and development in fields that require fast and accurate simulation, such as robotics, biomechanics, graphics, and animation. Developed by Emo Todorov for Roboti, MuJoCo is one of the first full-featured simulators designed from scratch for model-based optimisation through contacts. It was sold as a commercial product from 2015 until DeepMind’s acquisition in 2021.

MuJoCo helps in scaling up computationally intensive techniques like optimal control, system identification, physically consistent state estimation, and automated mechanism design before applying them to complex dynamic systems in contact-rich behaviours. It also has applications like testing and validating control schemes before deploying on physical robots, gaming, and interactive scientific visualisation.

Wrapping up

Robotics research and development is probably going through a slow phase. DeepMind rival OpenAI, after investing years of research, resources and effort into robotics, decided to disband its robotics research team and shift focus to domains where data is more readily available. On the industry side, too, several robotics companies have shut shop or are running major losses. Under these circumstances, robotics, despite being a potentially lucrative industry, has few to no buyers.

Backed by Alphabet, DeepMind’s progress has helped it hold the flag high in this field over the past few years.

Shraddha Goled
I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at
