Stanford Brings Out BEHAVIOR Benchmark For 100 Everyday Household Tasks

BEHAVIOR is a benchmark for embodied AI with 100 everyday activities

A multidisciplinary team of researchers at Stanford University has released BEHAVIOR (Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments), a benchmark for embodied AI built around 100 everyday activities, such as washing dishes, picking up toys, and cleaning floors, performed in simulation. The current version of BEHAVIOR is publicly available at

In creating this benchmark, the team, led by computer scientist and Stanford Institute for Human-Centered AI co-director Fei-Fei Li and including experts from computer science, psychology, and neuroscience, has established a “North Star”: a reference point against which to gauge the success of future AI solutions. The benchmark could also be used to develop and train robotic assistants in virtual environments before they are transferred to real ones, a paradigm known as “sim to real.”


What is Embodied AI?

Scientists have long wanted to reach a stage in technological advancement where robots help humans with daily (yet complex) tasks. The researchers say that to do these tasks, a robot must be able to perceive, reason, and operate with full awareness of its own physical dimensions and capabilities, as well as of the objects surrounding it. This combination of physical and situational awareness is called embodied AI.

According to the paper, titled “BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments”, embodied AI has already made progress in areas such as visual navigation, interactive Q&A, and instruction following. But developing artificial agents that can eventually perform and assist in daily tasks with human-level flexibility requires a comprehensive benchmark with more realistic, diverse, and complex activities.

Complex for Robots

On the surface, this might not seem complicated, since the robots only have to be trained to do basic tasks that humans perform with ease. In reality, these tasks are complex for a robot.

The researchers give an example of cleaning a countertop.

  • The robot has to perceive and understand what a countertop is
  • It has to know where to find it
  • It has to understand that the countertop needs cleaning and assess its physical dimensions
  • It has to know which products are best for cleaning the countertop
  • It has to coordinate its motions to reach the countertop
  • It then has to determine the best course of action for cleaning the counter. This may be a minor procedure for humans, but it is complex for a robot, which has to understand which materials are soakable and then judge whether the countertop is actually clean.

Although much progress has been made, the paper says that three major challenges have prevented existing benchmarks from offering activities that are realistic, diverse, and complex enough. These are:

  • Identifying and defining meaningful activities for benchmarking
  • Developing simulated environments that support such activities
  • Defining success and objective metrics to evaluate performance.

How is BEHAVIOR different?

The paper says that BEHAVIOR addresses these three challenges by:

  • Introducing the BEHAVIOR Domain Definition Language (BDDL), a representation adapted from predicate logic that maps simulated states to semantic symbols. It allows the team to define the 100 activities as initial and goal conditions, which in turn enables the generation of potentially infinite initial states and solutions for reaching the goal states.
  • Supporting the realization of these activities by listing environment-agnostic functional requirements for realistic simulation.
  • Providing a comprehensive set of metrics to evaluate agent performance in terms of success and efficiency. To make evaluation comparable across diverse activities, scenes, and instances, the team proposes metrics relative to demonstrated human performance on each activity and provides a large-scale dataset of 500 human demonstrations (758.5 min) in virtual reality.
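
To make the first and third points more concrete, here is a minimal Python sketch of the general idea: an activity described by symbolic initial and goal conditions, a success check against a simulated state, and an efficiency score relative to a human demonstration. This is not actual BDDL syntax or the benchmark's API. The names `Activity`, `goal_fraction`, `is_success`, and `relative_efficiency`, the example facts, the partial-credit score, and the ratio form of the efficiency metric are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# A symbolic fact such as ("inside", "rag", "cabinet").
# The predicate and object names are hypothetical, not taken from BDDL.
Fact = Tuple[str, ...]
State = FrozenSet[Fact]


@dataclass(frozen=True)
class Activity:
    """An activity described only by its initial and goal conditions."""
    name: str
    initial: State  # facts that hold when the episode starts
    goal: State     # facts that must all hold for the activity to succeed


def goal_fraction(activity: Activity, state: State) -> float:
    """Fraction of goal facts that hold in the given state (assumed partial-credit score)."""
    if not activity.goal:
        return 1.0
    satisfied = sum(1 for fact in activity.goal if fact in state)
    return satisfied / len(activity.goal)


def is_success(activity: Activity, state: State) -> bool:
    """True when every goal fact holds in the given state."""
    return activity.goal <= state


def relative_efficiency(agent_cost: float, human_cost: float) -> float:
    """Human cost divided by agent cost (assumed form of a human-relative metric).

    The cost could be time or distance travelled; 1.0 means the agent
    matched the human demonstration, lower values mean it was less efficient.
    """
    if agent_cost <= 0:
        raise ValueError("agent_cost must be positive")
    return human_cost / agent_cost


clean_countertop = Activity(
    name="clean_countertop",
    initial=frozenset({("stained", "countertop"), ("inside", "rag", "cabinet")}),
    goal=frozenset({("clean", "countertop"), ("inside", "rag", "cabinet")}),
)

# State partway through: the rag is out of the cabinet and the counter is clean.
partial_state: State = frozenset({("clean", "countertop"), ("on_top", "rag", "countertop")})
# State at the end: the counter is clean and the rag is back in the cabinet.
final_state: State = frozenset({("clean", "countertop"), ("inside", "rag", "cabinet")})

print(goal_fraction(clean_countertop, partial_state))
print(is_success(clean_countertop, final_state))
print(relative_efficiency(agent_cost=120.0, human_cost=60.0))
```

Because the goal is a set of conditions rather than a script of steps, any sequence of actions that ends in a state satisfying them counts as a solution, which is what lets a single definition cover many different initial states and strategies.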

Future Moves

The research team aims to provide initial solutions to the benchmark and plans to extend it to tasks that are not yet benchmarked. It says that this will require contributions from diverse domains: robotics, computer vision, computer graphics, and cognitive science.

Sreejani Bhattacharyya
I am a technology journalist at AIM. What gets me excited is deep-diving into new-age technologies and analysing how they impact us for the greater good. Reach me at
