Guide to Intel’s Stable View Synthesis – A State-of-the-Art 3D Photorealistic Framework

Stable View Synthesis achieves state-of-the-art performance in 3D photorealistic view synthesis, significantly outperforming current approaches


Some of the synthesized 3D photorealistic images produced by Stable View Synthesis

Stable View Synthesis achieves state-of-the-art performance in 3D photorealistic view synthesis, significantly outperforming current approaches. It was developed by Gernot Riegler and Vladlen Koltun of Intel Labs and published recently (see the original research paper). Photorealistic view synthesis is the task of rendering a new viewpoint of a subject by learning from real images of that subject captured from different views and orientations with identical camera settings.

Photorealistic view synthesis can help in exploring space and other settings where real photography is hardly possible. Stable View Synthesis builds a representation of a scene from images and allows one to view the same scene from almost any viewpoint, rendered as a sequence of images. The input to the computer vision system can be a short video of a subject, captured by moving the camera around the subject while keeping it in focus.

Stable View Synthesis, SVS for short, applies structure-from-motion (SfM) to the input images to estimate the camera pose, settings, and orientation of each image. These poses are then used in multi-view stereo to generate a dense 3D point cloud. A 3D geometric scaffold of the scene is constructed by meshing these points. In parallel, a convolutional encoder network encodes each input image into a feature tensor.
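
Below is a minimal sketch of the meshing step, using Open3D (one of the installed dependencies). The input file name fused.ply and the use of Poisson reconstruction are illustrative assumptions; the actual pipeline builds its scaffold from the output of its SfM/MVS preprocessing stage.

 # A rough sketch of turning a dense MVS point cloud into a geometric scaffold.
 # "fused.ply" and Poisson reconstruction are assumptions for illustration only.
 import open3d as o3d

 # Load the dense point cloud produced by multi-view stereo
 pcd = o3d.io.read_point_cloud("fused.ply")

 # Poisson surface reconstruction requires per-point normals
 pcd.estimate_normals()

 # Mesh the points to obtain a 3D geometric scaffold of the scene
 mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
     pcd, depth=9)
 o3d.io.write_triangle_mesh("scaffold.ply", mesh)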

Geometric scaffolding in Stable View Synthesis – an example

Encoding of input images into feature tensors in Stable View Synthesis

Decoding synthesized feature tensors into an output 3D image in Stable View Synthesis

To synthesize a new view, the points on the geometric scaffold visible in the target view are located in many of the original images. Feature vectors are gathered from each such image along the corresponding rays. SVS then performs on-surface aggregation with a differentiable set network, fusing this gathered data into a feature vector for each target ray (a minimal sketch follows the figure below).

Surface aggregation of different input rays in Stable View Synthesis
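
As a rough illustration of this aggregation, here is a minimal set-network sketch in PyTorch: a per-view MLP combines each source-image feature with its viewing direction, and order-invariant mean pooling fuses the set into one vector per target ray. This loosely mirrors the mlpdir+mean aggregation named in the experiment string used later; the shapes and layer sizes are assumptions, not the paper's exact network.

 import torch
 import torch.nn as nn

 class SetAggregator(nn.Module):
     # Illustrative stand-in for SVS's differentiable set network
     def __init__(self, feat_dim=16, hidden=64):
         super().__init__()
         # Per-view MLP: mixes each source-image feature with its viewing direction
         self.per_view = nn.Sequential(
             nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
             nn.Linear(hidden, feat_dim))

     def forward(self, feats, dirs):
         # feats: (rays, views, feat_dim) features where each target ray meets the scaffold
         # dirs:  (rays, views, 3) unit viewing directions of the source images
         per_view = self.per_view(torch.cat([feats, dirs], dim=-1))
         # Mean pooling over the set of source views: one feature vector per target ray
         return per_view.mean(dim=1)

 agg = SetAggregator()
 out = agg(torch.randn(1024, 5, 16), torch.randn(1024, 5, 3))  # shape (1024, 16)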

To render the output image, a depth map is first computed from the target camera pose and the geometric scaffold. This depth map defines how far each pixel of the target view is unprojected onto the scaffold. The resulting view-dependent feature vectors are assembled into a feature tensor, which the already-trained convolutional decoder transforms into the final rendered view of the scene.
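
The unprojection itself is standard pinhole geometry. The sketch below lifts each pixel to a 3D point in camera space given a depth map and an intrinsic matrix; the intrinsic values and constant depth map here are illustrative assumptions.

 import numpy as np

 def unproject(depth, K):
     # Inverse pinhole projection: x = (u - cx) * d / fx, y = (v - cy) * d / fy
     h, w = depth.shape
     u, v = np.meshgrid(np.arange(w), np.arange(h))
     fx, fy = K[0, 0], K[1, 1]
     cx, cy = K[0, 2], K[1, 2]
     x = (u - cx) * depth / fx
     y = (v - cy) * depth / fy
     return np.stack([x, y, depth], axis=-1)  # (h, w, 3) points in camera space

 # Illustrative intrinsics and a constant 2-metre depth map
 K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
 points = unproject(np.full((480, 640), 2.0), K)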

A few sampled images in a sequence capturing a playground scene from the Tanks and Temples dataset are shown below.


Coding Stable View Synthesis in Python

To install Stable View Synthesis and its dependencies on your local machine, run the following commands. Note that Stable View Synthesis can be trained and run only on a CUDA GPU; users working in notebook environments should therefore enable a CUDA GPU runtime before installing and training the system.

 %%bash
 # install necessary libraries
 sudo apt-add-repository universe
 sudo apt-get install libeigen3-dev
 pip install torch torchvision
 pip install torch-scatter
 pip install torch-sparse
 pip install torch-geometric
 pip install open3d
 pip install opencv-python
 pip install ninja

To obtain the necessary source files, clone the GitHub repository and update its submodules.

 %%bash
 git clone https://github.com/intel-isl/StableViewSynthesis.git
 cd StableViewSynthesis
 git submodule update --init --recursive --remote 

Build the extension modules:

 %%bash
 cd StableViewSynthesis/ext/preprocess
 cmake -DCMAKE_BUILD_TYPE=Release .
 make 
 cd ../mytorch
 python setup.py build_ext --inplace 

Open the experiments directory and run the evaluation with the following commands. This invokes the pretrained model on four sampled sequences from the Tanks and Temples dataset.

 %%bash
 cd StableViewSynthesis/experiments
 python exp.py --net resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16 --cmd eval --iter last --eval-dsets tat-subseq 

The whole model can also be retrained from scratch using the command,

 %%bash
 cd StableViewSynthesis/experiments
 python exp.py --net resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16 --cmd retrain 

Stable View Synthesis outperforms well-acclaimed approaches such as Free View Synthesis (FVS), Local Light Field Fusion (LLFF), Neural Radiance Fields (NeRF), improved NeRF (NeRF++), Extreme View Synthesis (EVS), and Neural Point-Based Graphics (NPBG), both qualitatively and quantitatively.

Note: The article’s illustrations are obtained from the Tanks and Temples dataset, the FVS dataset, and the original research paper.

Some useful references:

GitHub official code repository

Original research paper

Performance analysis of SVS

View Synthesis – Wiki

Free View Synthesis – Research paper

Tanks and Temples dataset

FVS dataset


Rajkumar Lakshmanamoorthy

A geek in Machine Learning with a Master's degree in Engineering and a passion for writing and exploring new things. Loves reading novels, cooking, practicing martial arts, and occasionally writing novels and poems.