Google AI Introduces Pathdreamer, A World Model for Indoor Navigation

Google AI recently introduced a world model for reinforcement learning that captures rich and meaningful information about an agent's surroundings, enabling the agent to predict the outcomes of its actions within the environment.

Known as Pathdreamer, it is a world model for indoor navigation that generates high-resolution 360º visual observations of areas of a building the agent has not yet seen, using only limited seed observations and a proposed navigation trajectory.

The Pathdreamer model can synthesize an immersive scene from a single viewpoint, predicting what an agent might see if it moved to a new viewpoint or even a completely unseen area, such as around a corner. Beyond potential applications in video editing and bringing photos to life, solving this task promises to codify knowledge about human environments to benefit robotic agents navigating in the real world. 

World models such as Pathdreamer can also increase the amount of training data available to agents, since agents can be trained inside the model itself.

The inputs and predictions both consist of RGB, semantic segmentation, and depth images. Internally, Pathdreamer uses a 3D point cloud to represent surfaces in the environment. Points in the cloud are labelled with both their RGB colour value and their semantic segmentation class, such as wall, chair or table.
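
To make the labelled point cloud concrete, here is a minimal sketch in plain Python. It is not Pathdreamer's code: the names LabelledPoint and unproject_panorama are hypothetical, and the sketch assumes an equirectangular 360º panorama in which heading 0 points along +y and z points up. It lifts each pixel of a depth image into a 3D point carrying that pixel's RGB colour and semantic class.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelledPoint:
    """One surface point, tagged with its colour and semantic class."""
    x: float
    y: float
    z: float
    rgb: tuple            # (r, g, b)
    semantic_class: str   # e.g. "wall", "chair", "table"


def unproject_panorama(depth, rgb, semantics, camera_position=(0.0, 0.0, 0.0)):
    """Lift equirectangular depth/RGB/semantic images into labelled 3D points.

    All three inputs are nested lists of shape [height][width].
    Assumed convention (illustrative only): heading 0 looks along +y, z is up.
    """
    height, width = len(depth), len(depth[0])
    cx, cy, cz = camera_position
    points = []
    for row in range(height):
        # Elevation of the pixel centre: +pi/2 at the top, -pi/2 at the bottom.
        elevation = math.pi / 2 - (row + 0.5) / height * math.pi
        for col in range(width):
            # Heading of the pixel centre, in [-pi, pi).
            heading = (col + 0.5) / width * 2 * math.pi - math.pi
            d = depth[row][col]
            if d <= 0:
                continue  # skip pixels without a valid depth reading
            points.append(LabelledPoint(
                x=cx + d * math.cos(elevation) * math.sin(heading),
                y=cy + d * math.cos(elevation) * math.cos(heading),
                z=cz + d * math.sin(elevation),
                rgb=rgb[row][col],
                semantic_class=semantics[row][col],
            ))
    return points
```

Every valid pixel becomes one point whose distance from the camera equals its depth value, so a single seed panorama already yields a coarse labelled model of the visible surfaces.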

To predict visual observations in a new location, the point cloud is first re-projected into 2D at the new location to provide ‘guidance’ images, from which Pathdreamer generates realistic high-resolution RGB, semantic segmentation and depth images. As the model ‘moves’, new observations (either real or predicted) are accumulated in the point cloud.
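
The re-projection step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Pathdreamer's implementation: the function reproject_to_panorama is hypothetical, points are plain (x, y, z, rgb, semantic_class) tuples, and the panorama is assumed to be equirectangular with heading 0 along +y and z up. Each point is projected into the panorama seen from the new position, and a simple depth test keeps only the nearest surface per pixel.

```python
import math


def reproject_to_panorama(points, new_position, height, width):
    """Render sparse guidance images of a labelled point cloud from a new position.

    points: iterable of (x, y, z, rgb, semantic_class) tuples.
    Returns (depth, rgb, semantics) as [height][width] nested lists.
    Pixels that no point lands on stay None.
    """
    px, py, pz = new_position
    depth = [[None] * width for _ in range(height)]
    rgb = [[None] * width for _ in range(height)]
    semantics = [[None] * width for _ in range(height)]

    for x, y, z, colour, label in points:
        dx, dy, dz = x - px, y - py, z - pz
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        if dist == 0.0:
            continue  # a point at the camera centre has no direction
        heading = math.atan2(dx, dy)                           # in [-pi, pi]
        elevation = math.asin(max(-1.0, min(1.0, dz / dist)))  # in [-pi/2, pi/2]
        col = int((heading + math.pi) / (2 * math.pi) * width) % width
        row = min(int((math.pi / 2 - elevation) / math.pi * height), height - 1)
        # Depth test: keep only the closest surface along each viewing direction.
        if depth[row][col] is None or dist < depth[row][col]:
            depth[row][col] = dist
            rgb[row][col] = colour
            semantics[row][col] = label
    return depth, rgb, semantics
```

The None pixels are the regions the point cloud cannot explain, such as the area around a corner, and those are the parts a generative model would have to fill in. Under the same assumptions, accumulating new observations amounts to appending their points to the list before the next re-projection.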

Pathdreamer is trained with images and 3D environment reconstructions from Matterport3D and can synthesize realistic images as well as continuous video sequences. For regions of high uncertainty, it can generate multiple diverse and plausible images.

Google aims to apply Pathdreamer to several embodied navigation tasks such as Object-Nav, continuous VLN, and street-level navigation. For further details, you can try out Pathdreamer yourself using its open-source code.

Copyright Analytics India Magazine Pvt Ltd