Breathing Life Into Robots Through Simulators

Simulation enables engineers to prototype rapidly and with minimal human effort. In robotics, physics simulations provide a safe and low-cost virtual playground in which robots can acquire physical skills through Deep Reinforcement Learning (DRL). However, simulators rely on hand-derived physics models, so policies trained in them often struggle when deployed on real hardware. This challenge is termed the “sim-to-real gap” or the domain adaptation problem. GAN-based approaches (such as RL-CycleGAN and RetinaGAN) have been utilised to bridge the sim-to-real gap for perception-driven tasks, such as grasping. However, a gap remains because of the dynamics of robotic systems. This prompts researchers to ask whether they can learn a more accurate physics simulator from a few real robot trajectories. If so, the improved simulator could be used to give the robot controller a higher chance of succeeding in the real world.

In a paper published at ICRA 2021, titled SimGAN: Hybrid Simulator Identification for Domain Adaptation via Adversarial Reinforcement Learning, researchers proposed treating the physics simulator as a learnable component. It is trained by DRL with a reward function that penalises differences between the trajectories (that is, the robot's movement over time) generated in simulation and a small number of trajectories collected on real robots.


According to the researchers, reinforcement learning (RL) policies trained on simulation data can equip robots with more diverse behaviours. While learning-based methods have made controller design in simulation far more automatic, moving a trained policy from simulation to real hardware still typically involves considerable manual work. A common workaround is to vary simulation parameters over a range during training. That range must be wide enough to cover all of the unmodelled differences between simulation and the real world, but not so wide that it impedes performance.

The researchers therefore focused their contributions on:

  • A unique simulation identification formulation that is posed as an adversarial RL problem
  • A learnt GAN loss that alleviates manual loss design and sensitive excitation trajectories by providing limited set-level supervision
  • Reducing the necessity for a properly defined parameter set through an expressive hybrid simulator parameterisation.

Hybrid simulator

A conventional physics simulator models the movement and interaction of objects in a virtual world by solving differential equations. However, given the complexity of the situations that robots could encounter in the real world, modelling every environment by hand would be arduous (or possibly impossible), so it is helpful to employ a learning-based approach instead. While a simulator can be learnt entirely from data, a learnt simulator may violate the laws of physics when it has to model scenarios that its training data did not cover. Hence, a robot trained in such a limited simulator is more likely to fail in the real world.

To address these complications, the researchers built a hybrid simulator that combines neural networks with physics equations. In particular, they replace parameters that are often defined manually in a simulator (contact parameters and motor parameters) with a learnable simulation parameter function, because unmodelled details of contact and motor dynamics are major causes of the sim-to-real gap.

The remaining components of the hybrid simulator are physics equations that ensure the simulation obeys fundamental physical principles, such as conservation of energy, bringing it closer to the real world and narrowing the sim-to-real gap.
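A minimal sketch of the hybrid idea, not the researchers' implementation: a one-dimensional bouncing ball in which the equations of motion are fixed, while the contact parameter (restitution) comes from a small state-dependent function. The names `ParamFunction`, `step` and `simulate`, the linear form of the function and its weights are all illustrative assumptions standing in for a learned neural network.

```python
from dataclasses import dataclass

GRAVITY = 9.81  # m/s^2, part of the fixed physics
DT = 0.001      # integration step in seconds


@dataclass
class ParamFunction:
    """Hypothetical stand-in for the learned network: state -> contact parameter."""
    w: float = -0.05  # illustrative weight on impact speed
    b: float = 0.9    # illustrative bias

    def restitution(self, impact_speed: float) -> float:
        raw = self.w * impact_speed + self.b
        # Clamp to the physically meaningful range [0, 1].
        return min(1.0, max(0.0, raw))


def step(height, velocity, params):
    """One semi-implicit Euler step of a ball falling onto the ground (height 0)."""
    velocity -= GRAVITY * DT          # fixed physics: gravity
    height += velocity * DT
    if height < 0.0 and velocity < 0.0:
        # Contact: the learnable function supplies the restitution coefficient.
        e = params.restitution(abs(velocity))
        height = 0.0
        velocity = -e * velocity
    return height, velocity


def simulate(height, velocity, params, n_steps):
    """Roll the hybrid simulator forward and return the (height, velocity) trajectory."""
    trajectory = [(height, velocity)]
    for _ in range(n_steps):
        height, velocity = step(height, velocity, params)
        trajectory.append((height, velocity))
    return trajectory


def energy(height, velocity):
    """Mechanical energy per unit mass."""
    return GRAVITY * height + 0.5 * velocity ** 2


trajectory = simulate(1.0, 0.0, ParamFunction(), 2000)  # drop from 1 m, 2 s
```

Only `ParamFunction` would be adjusted to match real data in this sketch. The integration and contact logic stay fixed, which is what keeps the learned part from producing physically implausible motion.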

(Source: ResearchGate – SimGAN: Hybrid Simulator Identification for Domain Adaptation via Adversarial Reinforcement Learning)

The researchers therefore designed experiments to test whether their method could:

  • Enhance domain adaptability for robots with varying morphologies
  • Deal with dynamical mismatches that occur during sim-to-real transfer 
  • Manage dynamical disparities that do not map intuitively onto the list of parameters the model identifies but can be absorbed by the model’s state-dependence.


(Source: Google AI Blog)

Learning the simulation parameter functions succeeds if the hybrid simulator can generate trajectories similar to those collected on the real robot. The key enabler is a metric for trajectory similarity. GANs, originally designed to generate synthetic images that share the same distribution, or “style”, as a limited number of real images, can be used to generate synthetic trajectories that are indistinguishable from real ones.
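To illustrate how a discriminator can serve as a learnt similarity metric, here is a small sketch under heavy simplifying assumptions. It covers only the discriminator half of a GAN, uses logistic regression instead of a neural network, and summarises each trajectory with a single hand-picked feature. The toy trajectories, the feature and every function name are hypothetical and not taken from the paper.

```python
import math


def features(trajectory):
    """Hypothetical summary feature: mean ratio of consecutive values, plus a bias input."""
    ratios = [b / a for a, b in zip(trajectory, trajectory[1:])]
    return [sum(ratios) / len(ratios), 1.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def discriminator(weights, trajectory):
    """Score in (0, 1): higher means the trajectory looks more like real data."""
    x = features(trajectory)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def train_discriminator(real, simulated, steps=5000, lr=0.5):
    """Fit logistic regression with plain gradient descent (real = 1, simulated = 0)."""
    data = [(features(t), 1.0) for t in real] + [(features(t), 0.0) for t in simulated]
    weights = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x, label in data:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for i, xi in enumerate(x):
                grad[i] += (p - label) * xi / len(data)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def decaying(start, decay, n=6):
    """Toy trajectory: a quantity that shrinks by a constant factor each step."""
    return [start * decay ** k for k in range(n)]


real = [decaying(h, 0.8) for h in (1.0, 1.5, 2.0)]       # stand-in for robot data
simulated = [decaying(h, 0.5) for h in (1.0, 1.5, 2.0)]  # mismatched simulator
weights = train_discriminator(real, simulated)

score_real = discriminator(weights, decaying(1.2, 0.8))
score_sim = discriminator(weights, decaying(1.2, 0.5))
```

In an adversarial setup the simulator would then be adjusted to raise its own score, so no one has to hand-design the trajectory loss.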

Reinforcement learning


Putting these pieces together, the researchers formulate simulation learning as an RL problem. A neural network learns state-dependent contact and motor parameters from only a small number of real-world trajectories. To do this, the network is optimised to minimise the error between the simulated and the real trajectories. Minimising this error over an extended period of time, rather than a single step, improves the accuracy of the simulator that will ultimately be used to train the control policy.
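The framing of identification as return maximisation can be sketched as follows. This is a toy under stated assumptions: the simulator has one constant parameter rather than a state-dependent network, the per-step reward is a direct tracking error rather than a learnt GAN loss, and exhaustive search over candidates stands in for an RL algorithm. All names and values are hypothetical.

```python
def rollout(decay, start=1.0, n_steps=20):
    """Toy simulator: the state shrinks by `decay` at every step."""
    states = [start]
    for _ in range(n_steps):
        states.append(states[-1] * decay)
    return states


def episode_return(decay, real_trajectory):
    """Sum of per-step rewards over the whole rollout.

    Each reward is the negative gap between the simulated and the real state,
    so errors that accumulate over time are penalised, not just one-step errors.
    """
    simulated = rollout(decay, real_trajectory[0], len(real_trajectory) - 1)
    return -sum(abs(s - r) for s, r in zip(simulated, real_trajectory))


def identify(real_trajectory, candidates):
    """Pick the candidate with the highest return (stand-in for RL training)."""
    return max(candidates, key=lambda d: episode_return(d, real_trajectory))


TRUE_DECAY = 0.8  # hidden "real-world" parameter, used only to fake the data
real_trajectory = rollout(TRUE_DECAY)
candidates = [i / 100 for i in range(50, 100)]
best = identify(real_trajectory, candidates)
```

Summing the reward over the full rollout matters because a parameter that looks fine for one step can drift badly over twenty.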

Summing up 

One of the significant impediments preventing robots from harnessing the power of reinforcement learning is the sim-to-real gap. Researchers addressed this problem by developing a simulator that can more accurately replicate real-world dynamics while requiring only a modest quantity of real-world data. The researchers plan to expand on this basic framework by extending it to other robot learning tasks, including navigation and manipulation.

Ritika Sagar
Ritika Sagar is currently pursuing PDG in Journalism from St. Xavier's, Mumbai. She is a journalist in the making who spends her time playing video games and analyzing the developments in the tech world.
