
Introduction To AutoInt: Automatic Integration For Fast Neural Volume Rendering


Rendering, also known as image synthesis, generates an image from a description of a scene; it can be thought of as turning an idea or model into a concrete picture. Rendering systems find application across a wide spectrum of areas and industries: architecture, video games, simulators, movie and TV visual effects, and design visualization. They can be used to work out the final layout of a building, to visualize what a finished product might look like in manufacturing, and much more. One popular use case is image processing, where rendering techniques transform the description of an image into the actual image, with computational power serving as the tool that delivers the output. Both photorealistic and non-photorealistic images can be produced using 2D and 3D modelling techniques.

The resulting output image is called a “render”. A model is created to drive the rendering process, and multiple such models can be combined into a scene file, a file holding predefined objects in a data structure. The data contained in the scene file is then passed through a rendering program, which processes it and outputs a digital image or graphics file. Image rendering is an essential step in graphics pipelines, as it gives bare models their final appearance. Parameters like viewpoint, texture, lighting and shading can be easily adjusted during synthesis. With the rapid growth of computational power, rendering has become one of the most actively developed areas of graphics.

As a product, a wide variety of renderers are available in the market. Some are integrated into larger modelling programs and animation packages, some are stand-alone, and others are free and open-source projects. Once the pre-image, typically a wireframe sketch, is complete, rendering adds the bitmap and procedural textures, lights, mapping, and relative positioning of the objects in the image being processed. The result is the completed image the consumer or intended viewer sees.

A few efficient families of rendering techniques have emerged. Rasterization, which includes scanline rendering, considers the objects in the scene and projects them onto the image plane to form an image. Ray casting considers the scene as observed from a specific point of view and computes the observed image from its geometry alone, applying basic optical laws of reflection intensity, sometimes with Monte Carlo sampling to reduce artifacts. Monte Carlo techniques are used in most modern renderers to obtain more realistic results, often at speeds orders of magnitude slower.

What is AutoInt?

AutoInt, short for Automatic Integration, is a modern framework for fast neural volume rendering built on deep networks. It learns closed-form solutions to the volume rendering equation, an integral equation that accumulates transmittance and emittance along rays to render an image. Conventional neural renderers require hundreds of samples along each ray, and hence hundreds of costly forward passes through a network, to evaluate such integrals; AutoInt allows evaluating these integrals with far fewer forward passes.
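For reference, the integral being learned is the standard volume rendering equation used by NeRF-style renderers, which composites density and color along each camera ray $\mathbf{r}(t) = \mathbf{o} + t\mathbf{d}$:

$$C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t), \mathbf{d})\,dt, \qquad T(t) = \exp\!\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\right)$$

where $\sigma$ is the volume density, $\mathbf{c}$ the view-dependent color, and $T(t)$ the transmittance accumulated from the near bound $t_n$. AutoInt learns antiderivative networks so that such integrals can be evaluated from only a handful of network queries.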

For training, AutoInt first instantiates the computational graph corresponding to the derivative of a coordinate-based network. This graph is then fitted to the signal to be integrated. After optimization, the graph is reassembled into a network that represents the antiderivative. The fundamental theorem of calculus then enables the calculation of any definite integral in just two evaluations of the network. Applying this approach to neural rendering improves the tradeoff between rendering speed and image quality considerably, cutting render times by greater than 10× at the cost of slightly reduced image quality.
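To make the idea concrete, here is a minimal plain-PyTorch sketch of the same trick, independent of the AutoInt library; the names F and target_fn are illustrative, not AutoInt API. We fit the derivative of a small MLP F to an integrand f, then read off a definite integral as F(b) − F(a):

 import torch

 # small MLP F(x); we will fit its *derivative* to the integrand f
 F = torch.nn.Sequential(
     torch.nn.Linear(1, 64), torch.nn.Tanh(),
     torch.nn.Linear(64, 64), torch.nn.Tanh(),
     torch.nn.Linear(64, 1))

 def target_fn(x):            # integrand: f(x) = cos(10x)
     return torch.cos(10 * x)

 opt = torch.optim.Adam(F.parameters(), lr=1e-4)
 for step in range(5000):
     x = torch.rand(256, 1) * 2 - 1    # sample coordinates in [-1, 1]
     x.requires_grad_(True)
     y = F(x)
     # dF/dx via autograd -- this plays the role of the grad network
     dFdx = torch.autograd.grad(y, x, torch.ones_like(y), create_graph=True)[0]
     loss = ((dFdx - target_fn(x)) ** 2).mean()
     opt.zero_grad()
     loss.backward()
     opt.step()

 # fundamental theorem of calculus: a definite integral in two forward passes
 a, b = torch.tensor([[-0.5]]), torch.tensor([[0.5]])
 with torch.no_grad():
     print((F(b) - F(a)).item())   # approaches sin(5)/5 ≈ -0.192

Note that AutoInt itself parameterizes the grad network directly and reassembles its weights into the integral network, avoiding the double backpropagation this sketch uses during training.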

The AutoInt Framework

The AutoInt framework breaks down into four essential steps:

  • A predefined integral network architecture is instantiated.
  • The corresponding grad network is built from it.
  • The grad network is optimized to represent a target function.
  • Definite integrals are then computed by evaluating the integral network, which shares its parameters with the grad network.

During training in the volume rendering pipeline, the grad networks representing the volume density σ and the color c are optimized for a given set of multi-view images. For inference, the grad networks’ parameters are reassembled to form the integral networks, which represent antiderivatives that can be efficiently evaluated to calculate ray integrals through the volume. A sampling network then predicts the locations of the piecewise sections used for evaluating the definite integrals.

The grad network is then fit to the input signal with direct supervision. The integral network is reassembled from its parameters, and querying it yields the integral of the signal with respect to the input coordinate.

AutoInt uses a piecewise approximation to learn efficient closed-form solutions to integrals of the input signal along sections. At inference time, rather than running hundreds of forward passes, it evaluates the integral efficiently over a piecewise division of the ray, and the rendered piecewise sections retain high image quality.
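A minimal sketch of the piecewise evaluation idea, with F standing in for a trained integral network (hypothetical names, not the AutoInt API): splitting a ray into N sections turns the integral into N per-section definite integrals, each read off from the antiderivative, for N + 1 batched forward passes in total.

 import torch

 def piecewise_integrals(F, t_near, t_far, num_sections=8):
     # endpoints of the piecewise sections along the ray
     t = torch.linspace(t_near, t_far, num_sections + 1).view(-1, 1)
     with torch.no_grad():
         Ft = F(t)               # N + 1 forward passes, batched together
     return Ft[1:] - Ft[:-1]     # per-section integrals F(t_{i+1}) - F(t_i)

In the full renderer, such per-section density and color integrals are then combined through the alpha-compositing form of the volume rendering equation, since the nested transmittance integral is not a single antiderivative.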

Implementing AutoInt

We will now implement an example of AutoInt. We will first create a class that generates coordinates from a sample 1D function we wish to fit, define the different functions we want to integrate, then set up the integral and grad networks, and finally fit our grad network to the chosen function. So let’s get started.

Installing the AutoInt Libraries

We will first install and import a handful of third-party libraries. We use torchmeta, a collection of extensions and data loaders for few-shot learning and meta-learning in PyTorch; it also bundles popular meta-learning benchmarks.

The following is the official implementation from AutoInt’s tutorial; the link to the Colab notebook can be found here.

 #install third-party dependencies first (torchmeta provides MetaModule)
 !pip install --no-dependencies torchmeta==1.4.6
 !pip install ordered-set
 !pip install colour

 #imports
 import torch
 import matplotlib.pyplot as plt
 import numpy as np
 from functools import partial
 from torch.utils.data import DataLoader
 from torchmeta.modules import MetaModule 
Importing the AutoInt library:
 import sys
 import os
 from autoint.session import Session
 import autoint.autograd_modules as autoint 
Getting Started

Creating a class that generates coordinates from a 1D function we wish to fit:

 #creating a class for 1D functions to fit
 class Implicit1DWrapper(torch.utils.data.Dataset):
     def __init__(self, range, fn, grad_fn=None, integral_fn=None, sampling_density=100,
                  train_every=10):
         avg = (range[0] + range[1]) / 2
         coords = self.get_samples(range, sampling_density)
         self.fn_vals = fn(coords)
         self.train_idx = torch.arange(0, coords.shape[0], train_every).float()
         #coords = (coords - avg) / (range[1] - avg)
         self.grid = coords
         self.grid.requires_grad_(True)
         #self.val_grid = val_coords
         if grad_fn is None:
             grid_gt_with_grad = coords
             grid_gt_with_grad.requires_grad_(True)
             fn_vals_with_grad = fn((grid_gt_with_grad * (range[1] - avg)) + avg)
             gt_gradient = torch.autograd.grad(fn_vals_with_grad, [grid_gt_with_grad],
                                               grad_outputs=torch.ones_like(grid_gt_with_grad), create_graph=True,
                                               retain_graph=True)[0]
             try:
                 gt_hessian = torch.autograd.grad(gt_gradient, [grid_gt_with_grad],
                                                  grad_outputs=torch.ones_like(gt_gradient), retain_graph=True)[0]
             except Exception as e:
                 gt_hessian = torch.zeros_like(gt_gradient)
         else:
             gt_gradient = grad_fn(coords) 
             gt_hessian = torch.zeros_like(gt_gradient)
         self.integral_fn = integral_fn
         if integral_fn:
             self.integral_vals = integral_fn(coords)
         self.gt_gradient = gt_gradient.detach() #store ground-truth gradient, detached from the graph
         self.gt_hessian = gt_hessian.detach() 
     def get_samples(self, range, sampling_density):
         num = int(range[1] - range[0])*sampling_density
         avg = (range[0] + range[1]) / 2
         coords = np.linspace(start=range[0], stop=range[1], num=num)
         coords = coords.astype(np.float32) #cast to float32 before building the tensor
         coords = torch.Tensor(coords).view(-1, 1)
         return coords
     def get_num_samples(self):
         return self.grid.shape[0]
     def __len__(self):
         return 1
     def __getitem__(self, idx):
         if self.integral_fn is not None:
             return {'coords': self.grid}, \
                    {'integral_func': self.integral_vals, 'func': self.fn_vals,
                     'gradients': self.gt_gradient, 'hessian': self.gt_hessian}
         else:
             return {'idx': self.train_idx, 'coords': self.grid}, \
                    {'func': self.fn_vals, 'gradients': self.gt_gradient,
                     'coords': self.grid}
Defining the different 1D functions we want to integrate in our network:

 #integrating different 1D Functions
 def cos_fn(coords):
   return torch.cos(10*coords)
 def polynomial_fn(coords):
     return .1*coords**5 - .2*coords**4 + .2*coords**3 - .4*coords**2 + .1*coords
 def sinc_fn(coords):
     coords[coords == 0] += 1
     return torch.div(torch.sin(20*coords), 20*coords)
 def linear_fn(coords):
     return 1.0 * coords
 def xcosx_fn(coords):
     return coords * torch.cos(coords)
 def integral_xcosx_fn(coords):
     return coords*torch.sin(coords) + torch.cos(coords) 
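
As a quick sanity check, my addition rather than part of the tutorial, autograd can confirm that integral_xcosx_fn is indeed an antiderivative of xcosx_fn, since d/dx [x·sin(x) + cos(x)] = x·cos(x):

 #verify d/dx [x*sin(x) + cos(x)] == x*cos(x) with autograd
 x = torch.linspace(-3, 3, 7).view(-1, 1).requires_grad_(True)
 y = integral_xcosx_fn(x)
 dydx = torch.autograd.grad(y, x, torch.ones_like(y))[0]
 print(torch.allclose(dydx, xcosx_fn(x).detach(), atol=1e-5))  # True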

We will now set up the integral and grad networks. First, we define the integral network using the AutoInt API; here the integral network is an MLP with sine non-linearities, i.e. a SIREN.

 #creating the SIREN class and defining the structure
 class SIREN(MetaModule):
     def __init__(self, session):
         super().__init__()
         self.net = [] 
         self.input = autoint.Input(torch.Tensor(1, 1), id='x_coords')
         self.net.append(autoint.Linear(1, 128))
         self.net.append(autoint.Sine())
         self.net.append(autoint.Linear(128, 128))
         self.net.append(autoint.Sine())
         self.net.append(autoint.Linear(128, 128))
         self.net.append(autoint.Sine())
         self.net.append(autoint.Linear(128, 128))
         self.net.append(autoint.Sine())
         self.net.append(autoint.Linear(128, 1))
         self.net = torch.nn.Sequential(*self.net)
         self.session = session
     def input_init(self, input_tensor, m):
         with torch.no_grad():
             if isinstance(m, autoint.Input):
                 m.set_value(input_tensor, grad=True)
     def constant_init(self, input_tensor, m):
         with torch.no_grad():
             if isinstance(m, autoint.Constant):
                 m.set_value(input_tensor, grad=False)
     def forward(self, x):
         with torch.no_grad():
             input_init_func = partial(self.input_init, x[:, 0, None])
             self.input.apply(input_init_func)
         input_ctx = autoint.Value(x, self.session)
         out1 = self.input(input_ctx)
         return self.net(out1) 
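
For context, the Sine layers give the network SIREN’s sinusoidal activations (Sitzmann et al.), which are well suited to fitting signals and their derivatives. A plain-PyTorch equivalent of such an activation might look like the following; this is an illustrative sketch, not the actual autoint.Sine implementation, and the frequency factor w0=30 is the SIREN paper’s default rather than a confirmed AutoInt setting.

 class Sine(torch.nn.Module):
     """Sinusoidal activation: x -> sin(w0 * x), as in SIREN."""
     def __init__(self, w0=30.0):
         super().__init__()
         self.w0 = w0  # frequency scaling from the SIREN paper
     def forward(self, x):
         return torch.sin(self.w0 * x)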

In AutoInt, a session handles deriving the grad network from the integral network, and it also takes care of reassembling the weights.

integralnet_session = Session() #creating session

We now instantiate the integral network defined earlier; the session can thus be thought of as representing the integral network.

 #instantiate the integral network using cuda
 net = SIREN(integralnet_session)
 net.cuda() 

We will get the following output,

 SIREN(
   (input): Input()
   (net): Sequential(
     (0): Linear()
     (1): Sine()
     (2): Linear()
     (3): Sine()
     (4): Linear()
     (5): Sine()
     (6): Linear()
     (7): Sine()
     (8): Linear()
   )
   (session): Session()
 ) 

We can evaluate the SIREN we instantiated using the forward function, as we would for any PyTorch module.

 x = torch.ones(1, 1).cuda()  # defines a dummy input coordinate
 x.requires_grad_(True)
 y = net(x)
 forward_siren_evaluation = y.data  # result of the forward evaluation
 print(f"result of forward SIREN evaluation={forward_siren_evaluation}") 

Output :

result of forward SIREN evaluation=tensor([[0.0843]], device='cuda:0', grad_fn=<AddBackward0>)

We can now visualize the integral network we created by drawing its associated session. The grad network likewise has a session, gradnet_session, which the accompanying notebook derives from the integral network’s session:

integralnet_session.draw() #visualizing the integral network

gradnet_session.draw() #visualizing the grad network

Fitting the grad network 

We first choose the function whose integral we want to compute with AutoInt.

func_to_fit = cos_fn #using cos function

We create the data loader that will produce pairs of the form (input coordinate, value of the function to integrate).

 dataset = Implicit1DWrapper([-1,1], fn=func_to_fit, \
                             sampling_density=1000, train_every=1)
 dataloader = DataLoader(dataset,shuffle=True, batch_size=1, \
                         pin_memory=True, num_workers=0)
 def dict2cuda(d):
     tmp = {}
     for key, value in d.items():
         if isinstance(value, torch.Tensor):
             tmp.update({key: value.cuda()})
         else:
             tmp.update({key: value})
     return tmp 
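
Before training, we can pull a single batch from the loader to check the tensors the loop will consume; this quick inspection is my addition, not part of the tutorial. With a sampling density of 1000 over [-1, 1], the dataset holds 2000 coordinates:

 #peek at one (input, ground truth) pair produced by the loader
 sample_input, sample_gt = next(iter(dataloader))
 print(sample_input['coords'].shape)   # expected: torch.Size([1, 2000, 1])
 print(sample_gt['func'].shape)        # function values supervising the grad network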

Training loop fitting the function for 500 epochs with the Adam optimizer:

 epochs = 500 #setting the number of epochs
 loss_fn = torch.nn.MSELoss() #setting loss function
 optimizer = torch.optim.Adam(lr=5e-5, params=net.parameters(),amsgrad=True) #setting optimizer
 print_loss_every = 50
 for e in range(epochs):
   for step, (input, gt) in enumerate(dataloader):
       input = dict2cuda(input)
       gt = dict2cuda(gt)
       gradnet_output = gradnet_session.compute_graph_fast({'x_coords': input['coords'],
                                                            'params': None})
       loss = loss_fn(gradnet_output,gt['func']).mean()
       optimizer.zero_grad()
       loss.backward()
       optimizer.step()
   if not e % print_loss_every:
       print(f"{e}/{epochs}: loss={loss}") 
Output :

 0/500: loss=0.5242253541946411
 50/500: loss=0.0002673180715646595
 100/500: loss=8.674392120155971e-06
 150/500: loss=3.343612206663238e-06
 200/500: loss=2.0269669676054036e-06
 250/500: loss=1.393691377415962e-06
 300/500: loss=1.0891841384363943e-06
 350/500: loss=9.32661862407258e-07
 400/500: loss=0.0003990948316641152
 450/500: loss=7.659041330043692e-06 

Plotting the results using matplotlib:

 #plotting our results 
 x_coords = torch.linspace(-1,1,100)[:,None].cuda()
 grad_vals = func_to_fit(x_coords).cpu()
 fitted_grad_vals = gradnet_session.compute_graph_fast({'x_coords': x_coords,
                                                        'params': None}).cpu()
 integral_vals = integralnet_session.compute_graph_fast({'x_coords': x_coords,
                                                         'params': None}).cpu()
 x_coords = x_coords.cpu()
 plt.plot(x_coords,grad_vals,'-k', label='Function to integrate')
 plt.plot(x_coords,fitted_grad_vals.detach(),'.r', label='Grad network')
 plt.plot(x_coords,integral_vals.detach(),'-b', label='Integral network')
 plt.legend(bbox_to_anchor=(1.05, 1.0), loc='upper left') 
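
Since cos(10x) has the analytic antiderivative sin(10x)/10 up to an additive constant, we can overlay it as a check; the integral network should match it after shifting by a constant offset. This overlay is my addition to the tutorial:

 #overlay the analytic antiderivative, aligned up to the integration constant
 analytic = torch.sin(10 * x_coords) / 10
 offset = (integral_vals.detach() - analytic).mean()
 plt.plot(x_coords, analytic + offset, '--g', label='Analytic antiderivative')
 plt.legend(bbox_to_anchor=(1.05, 1.0), loc='upper left')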

EndNotes 

This article discussed how the AutoInt rendering library works and took a hands-on look at what happens under the hood as input signals are processed. I would recommend exploring the library further, along with its other modules, to better understand its capabilities. You can access my implemented Colab notebook here.

Happy Learning!
