Last updated February 23, 2022

Google & Waymo introduce Block-NeRF to enable large-scale scene reconstruction

Block-NeRF is built on NeRFs and the recently introduced mip-NeRF extension.


Published on February 23, 2022

by SharathKumar Nair

Researchers from UC Berkeley, Waymo and Google Research have proposed Block-NeRF, a variant of Neural Radiance Fields (NeRF) for representing large-scale environments. In the paper, Block-NeRF: Scalable Large Scene Neural View Synthesis, the researchers demonstrate that when scaling NeRF to render city-scale scenes spanning multiple blocks, it is vital to decompose the scene into individually trained NeRFs.

Block-NeRF is built upon NeRF and the recently introduced mip-NeRF extension, a multiscale representation that reduces the aliasing artefacts which hurt NeRF performance when the input images observe a scene from different distances. The team also incorporates techniques from NeRF in the Wild (NeRF-W), which was developed to deal with inconsistent scene appearance when applying NeRF to landmarks from the Photo Tourism dataset. The proposed Block-NeRF can thus combine many NeRFs to reconstruct a coherent large environment from millions of images.

Decomposing the scene into individually trained NeRFs decouples rendering time from scene size, enables rendering to scale to arbitrarily large environments, and allows per-block updates of the environment. The team also adopted several architectural changes to make NeRF robust to data captured over months under different environmental conditions: they added appearance embeddings, learned pose refinement, and controllable exposure to each individual NeRF, and introduced a procedure for aligning appearance between adjacent NeRFs so that they can be seamlessly combined.
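To make the idea of per-block rendering concrete, here is a minimal Python sketch of how a viewer might pick only the blocks near the camera and blend their outputs. It is an illustration under stated assumptions, not the authors' implementation: the `Block` class, the radius-based selection, the inverse-distance weighting with `power=4.0`, and the `render` stub standing in for a trained NeRF are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Color = Tuple[float, float, float]
Point = Tuple[float, float]


@dataclass
class Block:
    """One independently trained NeRF, reduced to a center, a radius and a render stub."""
    center: Point
    radius: float
    render: Callable[[Point], Color]  # hypothetical stand-in for a trained NeRF


def select_blocks(blocks: Sequence[Block], camera: Point) -> List[Block]:
    """Keep only the blocks whose coverage radius contains the camera (assumed rule)."""
    return [b for b in blocks if math.dist(b.center, camera) <= b.radius]


def blend_weights(blocks: Sequence[Block], camera: Point,
                  power: float = 4.0, eps: float = 1e-6) -> List[float]:
    """Normalised inverse-distance weights; closer block centers get more weight."""
    raw = [1.0 / (math.dist(b.center, camera) + eps) ** power for b in blocks]
    total = sum(raw)
    return [w / total for w in raw]


def render_view(blocks: Sequence[Block], camera: Point) -> Color:
    """Render from nearby blocks only, so cost depends on local overlap, not scene size."""
    nearby = select_blocks(blocks, camera)
    if not nearby:
        raise ValueError("camera is outside every block")
    weights = blend_weights(nearby, camera)
    images = [b.render(camera) for b in nearby]
    return tuple(sum(w * img[c] for w, img in zip(weights, images)) for c in range(3))
```

In this sketch, a block far from the camera is never evaluated, which is one way to read the claim that rendering time is decoupled from scene size. Replacing or retraining a single `Block` leaves the others untouched, mirroring the per-block updates described above.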

The researchers used San Francisco’s Alamo Square neighbourhood as the target area and the city’s Mission Bay District as the baseline. The Alamo Square training dataset was derived from 13.4 hours of driving time, sourced from 1,330 different data collection runs, for a total of 2,818,745 training images.


SharathKumar Nair

Sharath is an ardent believer in the ‘Transhumanism’ movement. Anything and everything about technology excites him. At Analytics India Magazine, he writes about artificial intelligence, cybersecurity and the impact these emerging technologies have on day-to-day human lives. When not working on a story, he spends his time reading tech novels and watching sci-fi movies and series.