Over the last six months, the world has hurriedly adopted virtual alternatives to dodge the pandemic blues. Virtual meetings, conferences and digital twins have almost become the norm. With AR/VR poised to ride the next big wave of innovation, it is essential to revisit how well these technologies currently function in the physical world. Closing that gap calls for heavy innovation in a field called embodied AI.
AI assistants of the future must navigate effectively, look around their physical environment, listen, and build memories of their 3D space. The field of embodied AI studies intelligent systems with a physical or virtual embodiment, such as robots and egocentric personal assistants. The idea is that intelligence emerges from an agent's interaction with its environment, as a result of sensorimotor activity.
To equip the AI agents to navigate smartly, the researchers at Facebook AI have released a handful of frameworks:
For Super Realistic Acoustics
The team has released an audio-visual platform that researchers can use to train AI agents in 3D environments with highly realistic acoustics. The team believes the platform can support more embodied AI tasks, such as navigating to a sound-emitting target, learning from echolocation, or exploring with multimodal sensors.
Adding sound, the researchers wrote, can yield faster training and more accurate navigation at inference. It also enables the agent to discover the goal on its own from afar. With help from Facebook Reality Labs, they have released SoundSpaces, a dataset of audio renderings for the 3D environments.
Source: Paper by Changan Chen et al.
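To make the idea of steering toward a sound-emitting target concrete, here is a minimal sketch. It is not SoundSpaces and not the researchers' method. It assumes a toy free-field model in which loudness falls off with the square of distance, with no walls or echoes, and every name in it (`intensity`, `ear_positions`, `turn_hint`, the 0.1 ear offset) is hypothetical.

```python
import math

def intensity(source, listener, power=1.0):
    """Free-field inverse-square intensity (assumed toy model, no reverberation)."""
    dx = source[0] - listener[0]
    dy = source[1] - listener[1]
    dist_sq = max(dx * dx + dy * dy, 1e-6)  # avoid division by zero at the source
    return power / (4.0 * math.pi * dist_sq)

def ear_positions(agent_xy, heading, ear_offset=0.1):
    """Return (left, right) ear positions for an agent facing `heading` radians."""
    # The left ear sits 90 degrees counter-clockwise from the heading,
    # the right ear 90 degrees clockwise.
    left = (agent_xy[0] + ear_offset * math.cos(heading + math.pi / 2),
            agent_xy[1] + ear_offset * math.sin(heading + math.pi / 2))
    right = (agent_xy[0] + ear_offset * math.cos(heading - math.pi / 2),
             agent_xy[1] + ear_offset * math.sin(heading - math.pi / 2))
    return left, right

def turn_hint(agent_xy, heading, source):
    """Suggest a turn direction by comparing loudness at the two ears."""
    left, right = ear_positions(agent_xy, heading)
    loud_left = intensity(source, left)
    loud_right = intensity(source, right)
    # Equal loudness means the source is straight ahead or directly behind;
    # this toy cue cannot tell those two cases apart.
    if math.isclose(loud_left, loud_right, rel_tol=1e-9):
        return "straight"
    return "left" if loud_left > loud_right else "right"

if __name__ == "__main__":
    # Agent at the origin facing along +x, source ahead and to its left.
    print(turn_hint((0.0, 0.0), 0.0, (1.0, 1.0)))
```

A learned agent would consume the rendered audio directly rather than a hand-written rule like this; the sketch only shows why a two-channel signal carries directional information that a camera alone may not.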
Navigate Through Your Hall
The team at FAIR developed Semantic MapNet, an end-to-end learnable framework for building top-down semantic maps. From egocentric observations, the framework shows where objects are located, which lets agents learn and reason about how to navigate. For instance, in a future where virtual real estate agents are a thing, buyers might need a tour that mimics reality. A smart home assistant should be able to answer questions such as how many rooms a house has or how many chairs can fit into a hall.
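The geometric idea behind a top-down semantic map can be illustrated with a small sketch. This is not Semantic MapNet, which is a learned, end-to-end model; it is a hand-written projection under assumed inputs: each observation is a known agent pose plus a few already-labelled points in the agent's own frame. The function names, the grid size and the "last observation wins" rule are all hypothetical.

```python
import math

def project_to_map(observations, grid_size=5, cell_size=1.0):
    """Accumulate labelled egocentric points into a top-down grid.

    observations: list of (pose, points), where pose = (x, y, heading_radians)
    and points = [(forward, left, label), ...] in the agent's own frame.
    """
    grid = [[None] * grid_size for _ in range(grid_size)]
    for (ax, ay, heading), points in observations:
        cos_h, sin_h = math.cos(heading), math.sin(heading)
        for forward, left, label in points:
            # Rotate from the agent frame into the world frame, then translate.
            wx = ax + forward * cos_h - left * sin_h
            wy = ay + forward * sin_h + left * cos_h
            col = int(math.floor(wx / cell_size))
            row = int(math.floor(wy / cell_size))
            if 0 <= row < grid_size and 0 <= col < grid_size:
                grid[row][col] = label  # last observation wins in this toy version
    return grid

def count_label(grid, label):
    """Count map cells carrying a given label, e.g. to answer 'how many chairs?'."""
    return sum(cell == label for row in grid for cell in row)

if __name__ == "__main__":
    observations = [
        ((0.5, 0.5, 0.0), [(2.0, 0.0, "chair")]),
        ((2.5, 2.5, math.pi / 2), [(1.0, 0.0, "table"), (0.0, 1.0, "chair")]),
    ]
    top_down = project_to_map(observations)
    print(count_label(top_down, "chair"))
```

Once observations from several viewpoints land in one shared grid, questions about the whole space become simple lookups over the map, which is the kind of reasoning the paragraph above describes.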
For Embodied AI Simulations
Habitat-Lab is a product of the AI Habitat simulation environment for embodied AI research. It is a modular, high-level library for end-to-end development in embodied AI. The library helps in defining embodied AI tasks such as navigation and question answering, configuring embodied agents (physical form, sensors, capabilities), training the agents (via imitation or reinforcement learning), and benchmarking their performance on the defined tasks using standard metrics.
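As an illustration of the benchmarking step, one metric commonly reported for navigation tasks is Success weighted by Path Length (SPL), which rewards agents that reach the goal along a route close to the shortest one. The sketch below does not use Habitat-Lab's API; the episode dictionaries and their keys are hypothetical, and it assumes every episode has a shortest path longer than zero.

```python
def spl(episodes):
    """Success weighted by Path Length, averaged over episodes.

    Each episode is a dict (hypothetical format) with:
      success        - True if the agent reached the goal
      shortest_path  - length of the shortest start-to-goal path (assumed > 0)
      agent_path     - length of the path the agent actually took
    """
    if not episodes:
        raise ValueError("need at least one episode")
    total = 0.0
    for ep in episodes:
        if ep["success"]:
            # A failed episode contributes 0; a successful one contributes
            # the ratio of the shortest path to the path actually taken.
            total += ep["shortest_path"] / max(ep["agent_path"], ep["shortest_path"])
    return total / len(episodes)

if __name__ == "__main__":
    results = [
        {"success": True, "shortest_path": 10.0, "agent_path": 10.0},   # optimal route
        {"success": True, "shortest_path": 5.0, "agent_path": 10.0},    # twice as long
        {"success": False, "shortest_path": 8.0, "agent_path": 20.0},   # never arrived
    ]
    print(spl(results))
```

Because a failed episode scores zero and a roundabout success scores less than one, the average penalises both getting lost and wandering.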
For a long time now, Facebook AI has invested heavily in building intelligent AI systems that can think, plan, and reason about the real world. By combining embodied AI systems with its in-house, state-of-the-art 3D deep learning tools, the team aims to further improve its understanding of objects and places. The team's contributions towards ushering in next-gen embodied AI systems can be summarised as follows:
- Developed a new algorithm for the room navigation task in Habitat.
- Built systems that construct maps in a top-down style, as humans do; think of agents that need to navigate to a room first to fetch something.
- Improved the way virtual robots follow instructions in simulation, by utilising disembodied large-scale image captioning data sets.
- Created a new benchmark for faster mapping in unfamiliar environments.
- Introduced a reinforcement learning method so that an embodied agent autonomously discovers the affordance landscape of a new unmapped 3D environment.