
New research on enabling a vision-based robotic manipulation system

The Google AI study concluded that robots could use the BC-Z system to complete 24 new tasks with an average success rate of 44%.


Robots that can interact with the real world and handle multiple novel tasks from arbitrary user commands remain the holy grail of robotics. While research in general-purpose robots has made great strides, machines with the human-like ability to learn something new on their own are still a distant dream.

Recently, the robotics team at Google AI published a paper demonstrating how robots can understand new instructions and figure out how to complete a novel task. The research tackled the problem of getting robots to generalise by pairing a vision-based control system with generalisable language models.

The paper, titled “BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning”, set out to show that a broader, scaled-up dataset strengthens the robot’s generalisation abilities.

The BC-Z system consists of two parts: 

  • a large-scale demonstration dataset covering 100 different tasks
  • a neural network policy trained on that data

Using this system, the robots were able to complete 24 new tasks with an average success rate of 44%.

Data collection

The researchers collected data by remote-controlling the robot with a virtual reality headset and recording it demonstrating each task. Once the robot had learned a policy from these demonstrations, the policy was deployed under tight supervision: as soon as the robot got stuck or made a mistake, the operator intervened and course-corrected it.
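The sketch below illustrates this kind of shared-autonomy collection loop in a toy setting. Everything in it is a stand-in for illustration: the scripted “expert” plays the role of the human teleoperator, the “policy” is random, and the one-dimensional dynamics are invented, none of it comes from the BC-Z codebase. The real system also records full human demonstrations, not only corrections.

```python
import random

# Toy stand-ins (hypothetical, for illustration only): a scripted "expert"
# plays the role of the human teleoperator, and the learned policy is random.
expert_action = lambda obs: -obs                   # corrective action: push the state back towards 0
policy_action = lambda obs: random.uniform(-1, 1)  # placeholder for the learned policy
needs_help    = lambda obs: abs(obs) > 0.5         # crude "the robot is drifting" check

dataset = []                                       # (observation, command, action) tuples

def collect_episode(command, steps=50):
    obs = 0.0
    for _ in range(steps):
        if needs_help(obs):                        # the operator takes over and corrects
            action = expert_action(obs)
            dataset.append((obs, command, action)) # the corrective data is stored for training
        else:                                      # otherwise the current policy keeps running
            action = policy_action(obs)
        obs += 0.1 * action                        # simplistic one-dimensional dynamics

collect_episode("place the bottle in the tray")
print(len(dataset), "intervention samples collected")
```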

Berkeley Artificial Intelligence Research (BAIR) had earlier developed a vision-based training method called one-shot imitation learning, which combined model-agnostic meta-learning (MAML) and imitation learning. MAML trains a model so that it can adapt to a new problem from only a small sample of data, and it applies to a range of learning problems such as regression and reinforcement learning.
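As a rough illustration of the meta-learning idea, the toy sketch below runs first-order MAML on a family of simple linear-regression tasks. The linear model, the task family and the learning rates are all assumptions made up for this example; BAIR’s one-shot imitation work applied the same principle to a vision-based manipulation policy rather than a toy regressor.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    # Each "task" is a different linear function y = a*x + b.
    return rng.uniform(-2, 2, size=2)

def sample_batch(a, b, n=10):
    x = rng.uniform(-1, 1, size=n)
    return x, a * x + b

def grad(params, x, y):
    w, c = params
    err = (w * x + c) - y                            # residuals of the linear model
    return np.array([np.mean(err * x), np.mean(err)])  # gradient of 0.5 * MSE

theta = np.zeros(2)                                  # meta-parameters shared across tasks
alpha, beta = 0.1, 0.01                              # inner- and outer-loop learning rates

for step in range(2000):
    meta_grad = np.zeros(2)
    for _ in range(4):                               # tasks per meta-batch
        a, b = sample_task()
        xs, ys = sample_batch(a, b)                  # small "support" set for adaptation
        adapted = theta - alpha * grad(theta, xs, ys)  # one gradient step adapts to the task
        xq, yq = sample_batch(a, b)                  # "query" set evaluates the adapted model
        meta_grad += grad(adapted, xq, yq)           # first-order MAML meta-gradient
    theta -= beta * meta_grad / 4                    # learn an initialisation that adapts
                                                     # quickly from only a few samples
```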

Google AI used this method of visual training along with periodic human intervention.

This mixed approach, combining demonstrations with interventions, led to a notable improvement in the robot’s performance. In sequential problems like imitation learning, the policy’s observations depend on its own past actions, so small mistakes can compound over an episode; intervention data shows the robot how to recover from exactly those situations. As a result, this data collection strategy produced better results than experiments that used human demonstrations alone.

Training 

This data, covering all 100 tasks, was used to train a neural network policy that maps camera images to the position and orientation of the robot’s arm. The task itself is described to the policy either as a language command or as a video in which a person shows how to do it.
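A minimal sketch of what such a conditioned policy might look like is shown below, assuming PyTorch. The layer sizes, the 7-dimensional action (arm pose plus gripper) and the 512-dimensional task embedding are illustrative assumptions, not the actual BC-Z architecture.

```python
import torch
import torch.nn as nn

# Hypothetical, simplified stand-in for the kind of conditioned policy the
# article describes: an image and a task embedding in, an arm command out.
class ConditionedPolicy(nn.Module):
    def __init__(self, task_dim=512, action_dim=7):      # e.g. xyz + orientation + gripper
        super().__init__()
        self.vision = nn.Sequential(                      # tiny CNN over RGB frames
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(                        # fuse image and task features
            nn.Linear(32 + task_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, image, task_embedding):
        feats = self.vision(image)
        return self.head(torch.cat([feats, task_embedding], dim=-1))

policy = ConditionedPolicy()
image = torch.rand(1, 3, 96, 96)      # camera frame
task = torch.rand(1, 512)             # language- or video-derived task embedding
action = policy(image, task)          # predicted end-effector command
```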

Once the policy has been trained and conditioned on such instructions, there is a chance that the neural network can interpret a new instruction and carry out a task it has never been trained on. The robot still faces the challenge of identifying the relevant objects while ignoring the clutter in its environment.
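Pre-trained language embeddings are what make interpreting an unseen instruction plausible: a new command tends to land close to related training commands in embedding space, so the policy has something meaningful to condition on. The snippet below illustrates the idea with the sentence-transformers library; the model name and the example commands are hypothetical choices for this sketch, not the encoder or tasks used in BC-Z.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

# Illustrative pre-trained sentence encoder (not the one used in BC-Z).
model = SentenceTransformer("all-MiniLM-L6-v2")

seen   = "place the grapes in the red bowl"   # hypothetical training command
unseen = "place the banana in the red bowl"   # object pairing never seen together
other  = "wipe the table with the sponge"     # unrelated command

emb = model.encode([seen, unseen, other])
cosine = lambda a, b: float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# The unseen-but-related command should embed much closer to the seen one
# than the unrelated command does.
print(cosine(emb[0], emb[1]), cosine(emb[0], emb[2]))
```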

Results 

Out of 28 held-out tasks, the robot succeeded in completing 24, suggesting the experiment was successful to a certain degree. The results also show that pre-trained language embeddings give robots useful flexibility: language models generalise concepts from their training data, and this compositional generalisation carries over to the robot, helping it follow instructions involving pairs of objects it had never seen together before.

The study also shows that human intervention at critical moments can speed up the rate at which a robot adapts to new tasks. The grand problem of robots performing entirely new tasks on their own may still be a far-off dream, but this work marks gradual progress in that direction.
