Navigation has long been a challenge in robotics. This covers not only coordinated navigation but also pathfinding, a tough task given that autonomous machines require a whole range of computation to build and use maps of their surroundings.
However, a team from Facebook seems to have come up with a potential solution. In a recent paper, the tech giant, in association with Carnegie Mellon University and the University of Illinois at Urbana-Champaign, has proposed an Active Neural Simultaneous Localization and Mapping (Active Neural SLAM) module, which teaches AI agents to navigate different environments hierarchically. According to the team, the approach is robust because it avoids errors by combining the strong points of both classical and learning-based path- and goal-planning methods.
Active Neural SLAM processes raw sensor input in the form of camera images, exploiting the regularities present in the layouts of environments. Thanks to this, the module achieves better performance than existing models while using minimal training data. The neural SLAM module consists of a Mapper and a Pose Estimator: the Mapper creates a spatial map of a given environment and detects obstacles, while the Pose Estimator predicts the agent's pose based on past pose estimates.
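To make the Mapper/Pose Estimator split concrete, here is a minimal sketch of how an egocentric obstacle prediction could be written into a global top-down map at the cell reported by a pose estimate. The function name, array shapes and map layout are assumptions for illustration, not the paper's actual code:

```python
import numpy as np

def update_global_map(global_map, local_obstacles, pose_cell):
    """Write a Mapper-style egocentric obstacle prediction into the
    global top-down map at the cell given by the pose estimate.
    Illustrative sketch only; names and shapes are assumptions."""
    h, w = local_obstacles.shape
    r, c = pose_cell
    # Place the local window so the agent sits at its centre.
    r0, c0 = r - h // 2, c - w // 2
    # Keep the strongest obstacle evidence seen so far in each cell.
    global_map[r0:r0 + h, c0:c0 + w] = np.maximum(
        global_map[r0:r0 + h, c0:c0 + w], local_obstacles)
    return global_map
```

Accumulating evidence with a maximum (rather than overwriting) is one simple way such a map can retain obstacles observed in earlier frames.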
Each element of the spatial map corresponds to a cell of 25 square centimetres in the physical world. The global policy takes this map, together with the agent's pose, and generates long-term goals. A Planner model then takes the goals, the spatial map and the agent's pose into account to estimate short-term goals on the way from the current location to the long-term goals.
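The steps above can be sketched with two hypothetical helpers: one mapping world coordinates onto 25 sq cm map cells (i.e. 5 cm on a side), and a naive stand-in for the Planner that simply heads toward the long-term goal in bounded steps. Both are illustrative assumptions, not the paper's implementation, which plans around obstacles in the map:

```python
import math

def world_to_cell(x_m, y_m, cell_cm=5):
    """Map world coordinates (metres) to spatial-map cell indices,
    assuming square cells of 25 sq cm, i.e. 5 cm on a side
    (hypothetical helper)."""
    return int(round(x_m * 100)) // cell_cm, int(round(y_m * 100)) // cell_cm

def short_term_goal(current, long_term, max_dist_m=1.0):
    """Naive stand-in for the Planner: move straight toward the
    long-term goal, capped at max_dist_m per short-term step."""
    dx, dy = long_term[0] - current[0], long_term[1] - current[1]
    dist = math.hypot(dx, dy)
    if dist <= max_dist_m:
        return long_term
    scale = max_dist_m / dist
    return (current[0] + dx * scale, current[1] + dy * scale)
```

Chaining such short-term goals is what lets a hierarchical agent pursue a distant long-term goal one bounded step at a time.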
For their experiments, the team used Facebook's open-source Habitat platform, which comprises a modular high-level library for training agents across a variety of tasks, environments, and simulators. It also includes datasets with 3D reconstructions of real-world environments such as office, school and home interiors. In these environments, agents execute discrete actions: move 25 centimetres forward, turn 10 degrees left, or turn 10 degrees right. All the components of Active Neural SLAM were trained simultaneously over 994 episodes consisting of 1,000 steps, or 10 million frames.
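The discrete action space described above is easy to model: each action either translates the agent 25 cm along its heading or rotates it by 10 degrees. This is a minimal sketch of that kinematics, with the pose representation chosen for illustration:

```python
import math

def apply_action(pose, action):
    """Apply one of the discrete actions mentioned above: forward
    25 cm, or turn 10 degrees left/right. pose is assumed to be
    (x metres, y metres, heading in radians); illustrative only."""
    x, y, theta = pose
    if action == "forward":
        return (x + 0.25 * math.cos(theta), y + 0.25 * math.sin(theta), theta)
    if action == "left":
        return (x, y, theta + math.radians(10))
    if action == "right":
        return (x, y, theta - math.radians(10))
    raise ValueError(f"unknown action: {action}")
```

Composing these tiny motions is all the low-level control an agent needs; the interesting work happens in deciding which action to take next.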
The results show that Active Neural SLAM was able to explore a small environment within 500 steps, whereas the baselines covered only 85% to 90% of the same environment in 1,000 steps. The baseline models also got stuck in areas they had already explored, which made it clear that they could not remember the environment. This was not the case with Active Neural SLAM.
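The exploration percentages in this comparison amount to a coverage metric: the fraction of explorable map cells the agent has observed. A sketch of that metric, assuming boolean occupancy masks over the spatial map:

```python
import numpy as np

def coverage(explored, explorable):
    """Fraction of explorable map cells the agent has observed --
    the kind of exploration metric the comparison above refers to.
    Both arguments are boolean arrays over the spatial map (sketch)."""
    seen = np.logical_and(explored, explorable).sum()
    return float(seen) / float(explorable.sum())
```

Plotting this value against the step count is what yields curves like "85% to 90% coverage after 1,000 steps".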
Encouraged by these results, the team deployed the model from simulation onto a real-world Locobot robot, adjusting the camera height and vertical field of view to match the Habitat simulator. The Locobot was able to explore a living-room environment despite its complexities. The co-authors wrote, “In the future, Active Neural SLAM can be extended to complex semantic tasks, such as semantic goal navigation and embodied question answering by using a semantic Neural SLAM module, which creates a map capturing semantic properties of the objects in the environment.” They concluded that the model can also be combined with prior work on localisation to relocalise in a previously created map, enabling efficient navigation in subsequent episodes.