NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections, Martin-Brualla, Radwan, Sajjadi, Barron, Dosovitskiy, Duckworth; 2020 - Summary
author: nilesh2797
score: 9 / 10
What is the core idea?
- NeRF works well on images of a static subject captured under controlled settings, but it fails to model real-world variations such as changing illumination and transient occluders
- NeRF-W (this paper) extends NeRF to address these issues and is able to render accurate reconstructions from unconstrained image collections taken from the internet!
Sample image renders taken from https://nerf-w.github.io
How is it realized (technically)?
Overview of the NeRF-W model (Source)
Latent appearance modelling
- To adapt to varying lighting and camera settings, NeRF-W learns a separate low-dimensional “appearance” embedding for each training image.
- The appearance embedding is fed only to the MLP that outputs color, not to the MLP that estimates density; this ensures that the 3D geometry is shared across all training images of a scene (see the sketch below)
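A minimal sketch of this conditioning scheme, assuming PyTorch. The class name, layer sizes, and input dimensions (`pos_dim`, `dir_dim`, `appearance_dim`) are illustrative placeholders, not the paper's exact architecture (the real model also applies positional encodings to inputs):

```python
import torch
import torch.nn as nn

class NeRFWStaticModel(nn.Module):
    """Sketch: density is conditioned only on position, color additionally
    on view direction and a learned per-image appearance embedding."""
    def __init__(self, num_images, pos_dim=63, dir_dim=27,
                 appearance_dim=48, hidden=256):
        super().__init__()
        # One learnable appearance code per training image
        self.appearance = nn.Embedding(num_images, appearance_dim)
        # Density branch sees only the (encoded) 3D position,
        # so geometry is shared across all images of the scene
        self.trunk = nn.Sequential(
            nn.Linear(pos_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)
        # Color branch is additionally conditioned on view direction
        # and the per-image appearance embedding
        self.color_head = nn.Sequential(
            nn.Linear(hidden + dir_dim + appearance_dim, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, x, d, image_ids):
        feat = self.trunk(x)
        sigma = torch.relu(self.sigma_head(feat))   # appearance-independent density
        emb = self.appearance(image_ids)            # looked up by training-image index
        rgb = self.color_head(torch.cat([feat, d, emb], dim=-1))
        return rgb, sigma
```

A nice consequence of this design, shown in the paper, is that at render time the appearance embedding becomes a control knob: interpolating between two images' embeddings smoothly changes the lighting of the render without altering the geometry.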
Transient objects modelling
- A second MLP outputs its own color and density at each 3D location, which models the transient objects in the image
- Note that in this branch, the density is allowed to differ across training images of the same scene
- The transient MLP also outputs an uncertainty parameter, which lets the reconstruction loss down-weight unreliable pixels and 3D locations that are likely to contain occluders (see the sketch after this list)
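A sketch of the transient branch and the uncertainty-weighted loss, again assuming PyTorch. `TransientHead`, the layer sizes, `beta_min`, and the `lambda_u` default are illustrative assumptions; the loss mirrors the form of the paper's per-ray objective, though the paper composites beta along each ray before it enters the loss, which is simplified away here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransientHead(nn.Module):
    """Sketch: transient color/density conditioned on a per-image embedding,
    plus a per-sample uncertainty output beta."""
    def __init__(self, num_images, feat_dim=256, transient_dim=16,
                 hidden=128, beta_min=0.03):
        super().__init__()
        self.beta_min = beta_min
        # Per-image transient embedding: lets transient density vary per image
        self.transient = nn.Embedding(num_images, transient_dim)
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + transient_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.rgb_head = nn.Sequential(nn.Linear(hidden, 3), nn.Sigmoid())
        self.sigma_head = nn.Linear(hidden, 1)
        self.beta_head = nn.Linear(hidden, 1)

    def forward(self, feat, image_ids):
        h = self.mlp(torch.cat([feat, self.transient(image_ids)], dim=-1))
        rgb = self.rgb_head(h)
        sigma = F.relu(self.sigma_head(h))                    # transient density
        beta = F.softplus(self.beta_head(h)) + self.beta_min  # keep beta > 0
        return rgb, sigma, beta

def nerf_w_style_loss(pred_rgb, gt_rgb, beta, transient_sigma, lambda_u=0.01):
    """Uncertainty-weighted reconstruction loss in the spirit of the paper:
    pixels with large predicted beta contribute less to the color error,
    log(beta) stops the model from inflating uncertainty everywhere, and
    the last term discourages spurious transient density."""
    color_term = ((pred_rgb - gt_rgb) ** 2 / (2 * beta ** 2)).sum(-1).mean()
    beta_term = torch.log(beta).mean()
    sparsity_term = lambda_u * transient_sigma.mean()
    return color_term + beta_term + sparsity_term
```

The key design choice is that the transient branch and beta are used only during training; at test time the transient head is discarded and only the static model renders, so occluders seen in individual training photos do not leak into novel views.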
How well does the paper perform?
NeRF-W outperforms baselines on various scenes from the Phototourism dataset
TL;DR
- Most neural rendering algorithms, including NeRF, don't work well on images taken in uncontrolled settings
- NeRF-W models the photometric variability and disentangles static and transient elements of the scene to perform novel view synthesis on unconstrained photo collections