Facebook Crowdsources This Computer Vision Dataset, Partners With An Indian University

It's critical to understand that poor representation in computer vision datasets can be harmful, especially as the AI industry lacks unambiguous explanations of bias.

Facebook AI has announced Ego4D, an ambitious long-term project to solve challenges in egocentric perception: the ability of AI to understand and interact with the world the way humans do, from a first-person perspective. Today's computer vision systems are typically trained on millions of photos and videos captured from a third-person viewpoint. Next-generation AI systems, however, need data that shows the world from the first-person perspective.

The Facebook AI team has collaborated with 13 universities and labs across nine countries, including India. The International Institute of Information Technology (IIIT), Hyderabad, is the only Indian university to join the Ego4D project. Founded in 1998, IIIT-H has built strong research programmes over the years in several areas, with an emphasis on science, technology and applied research for both industry and society.

This consortium of universities and labs collected more than 2,200 hours of first-person video in the wild, featuring over 700 participants going about their daily lives. The collaboration dramatically scales up the amount of egocentric data publicly available to the research community: in hours of footage, it is more than 20 times larger than any other data set. Facebook supported and funded the project through academic gifts to each of the participating labs and universities.

The Project

Facebook AI also created five benchmark challenges based on first-person visual experience, which will help future AI assistants progress toward real-world applications. These include:

  • Episodic memory: What happened when? (e.g., “Where did I leave my purse?”)
  • Forecasting: What am I likely to do next? (e.g., “You have to add two spoons of sugar now.”)
  • Hand and object manipulation: What am I doing? (e.g., “Let me know how to play the guitar.”)
  • Audio-visual diarization: Who said what when? (e.g., “What was the time to reach the cinema?”)
  • Social interaction: Who is interacting with whom? (e.g., “Help me better hear the person talking to me at this noisy restaurant.”)

Ego4D aims to address problems in embodied AI, a discipline that seeks to develop AI systems with a physical or virtual embodiment, such as robots. Embodied AI builds on the theory of embodied cognition, which holds that many features of cognition, human or otherwise, are shaped by aspects of an organism’s entire body. By applying this idea to AI, researchers intend to improve the performance of systems such as chatbots, autonomous vehicles, robots, and even smart glasses, which must interact continuously with their environments, humans, and other AI.

Facebook distributed head-mounted cameras and wearable sensors to the participants so that they could capture first-person, unscripted videos of their daily lives. The participants recorded day-to-day routines such as cooking, grocery shopping, talking while playing games, and engaging in activities with family and friends. As a result, everything was captured from the centre of the action rather than by someone filming from the sidelines.

Moreover, the Facebook AI team said it will make this data publicly available in November 2021.

In addition, researchers from Facebook Reality Labs used Vuzix Blade smart glasses to collect a further 400 hours of first-person video in staged situations in their own research labs. This data will also be made public.

Wrapping up

It’s critical to understand that poor representation in computer vision datasets can be harmful, especially as the AI industry lacks unambiguous explanations of bias. For example, ImageNet and OpenImages, two large, publicly available datasets, have previously been found to be US- and Euro-centric, embodying humanlike biases regarding race, gender, colour, ethnicity, weight, and more. The datasets from the Ego4D project could go a long way toward addressing these concerns.

Additionally, we can hope that assistants trained on the data set and built on the capabilities enabled by Ego4D’s benchmarks will deliver value in unique and meaningful ways.

Kumar Gandharv
Kumar Gandharv, PGD in English Journalism (IIMC, Delhi), is setting out on a journey as a tech journalist at AIM. He is a keen observer of national and IR-related news.
