Traditional Scene Reconstruction and Rendering
The first novel-view synthesis approaches were based on light fields, initially densely sampled and later extended to unstructured capture.
The advent of Structure-from-Motion (SfM) enabled an entirely new domain in which a collection of photos could be used to synthesize novel views. SfM estimates a sparse point cloud during camera calibration, which was initially used for simple visualization of 3D space.
Subsequent multi-view stereo (MVS) research produced impressive full 3D reconstruction algorithms over the years, enabling the development of several view synthesis algorithms.
All these methods re-project and blend the input images into the novel view camera, and use the geometry to guide this re-projection.
These methods produce excellent results in many cases, but typically cannot completely recover from unreconstructed regions, or from “over-reconstruction”, where MVS generates nonexistent geometry.
Recent neural rendering algorithms vastly reduce such artifacts and avoid the overwhelming cost of storing all input images on the GPU, outperforming these methods on most fronts.
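The re-project-and-blend step shared by the image-based methods described above can be illustrated with a minimal sketch. It assumes an ideal pinhole camera and a 3D point already expressed in that camera's coordinate frame. The function names (`project_point`, `blend_colors`), the parameters, and the simple normalized weighting are hypothetical choices for illustration, not the scheme of any particular method cited here.

```python
def project_point(point, focal, cx, cy):
    """Project a camera-space 3D point to pixel coordinates (ideal pinhole model).

    Assumption: `point` is (x, y, z) in the camera frame with z pointing forward.
    Returns None when the point lies behind the camera.
    """
    x, y, z = point
    if z <= 0:
        return None
    return (focal * x / z + cx, focal * y / z + cy)


def blend_colors(colors, weights):
    """Blend the RGB colors sampled from several input images.

    Each color is an (r, g, b) tuple; weights are non-negative and are
    normalized here so that they sum to one.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one blending weight must be positive")
    return tuple(
        sum(w * c[i] for c, w in zip(colors, weights)) / total
        for i in range(3)
    )
```

In a real pipeline, the geometry supplies the 3D point for each novel-view pixel, the point is projected into every input camera to fetch a color, and the weights typically depend on view angle and visibility.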
Neural Rendering and Radiance Fields
Deep learning techniques were adopted early for novel-view synthesis; convolutional neural networks (CNNs) were used to estimate blending weights, or for texture-space solutions. The reliance on MVS-based geometry is a major drawback of most of these methods; in addition, the use of CNNs for final rendering frequently results in temporal flickering.
Volumetric representations for novel-view synthesis were initiated by Soft3D; deep-learning techniques coupled with volumetric ray-marching were subsequently proposed, building on a continuous differentiable density field to represent geometry. Rendering using volumetric ray-marching has a significant cost due to the large number of samples required to query the volume.
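To make the cost of ray-marching concrete, the following is a minimal sketch of the standard front-to-back compositing quadrature along a single ray. It assumes the density and color at each sample have already been obtained by querying the volume. The function name `composite_ray` and its arguments are hypothetical, and real systems evaluate this for many samples on every pixel's ray, which is where the expense comes from.

```python
import math


def composite_ray(densities, colors, deltas):
    """Accumulate color along one ray by front-to-back alpha compositing.

    densities: non-negative density sigma_i at each sample
    colors:    (r, g, b) color at each sample
    deltas:    distance delta_i between consecutive samples
    Returns the accumulated color and the remaining transmittance.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    out = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        # Opacity contributed by this segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for c in range(3):
            out[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return out, transmittance
```

Each sample requires one query of the density field, so the per-pixel cost grows linearly with the number of samples along the ray.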
Neural Radiance Fields (NeRFs) introduced importance sampling and positional encoding to improve quality, but used a large Multi-Layer Perceptron (MLP), which negatively affected speed. The success of NeRF has resulted in an explosion of follow-up methods that address quality and speed, often by introducing regularization strategies; the current state of the art in image quality for novel-view synthesis is Mip-NeRF360.
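As an illustration of the positional encoding mentioned above, here is a minimal sketch of the commonly used sinusoidal form, which maps a scalar coordinate to sines and cosines at geometrically increasing frequencies before it is fed to the MLP. The function name `positional_encoding` and the parameter `num_frequencies` are hypothetical, and the exact frequency scaling varies between implementations.

```python
import math


def positional_encoding(p, num_frequencies):
    """Encode a scalar coordinate p with sinusoids at frequencies 2^k * pi.

    Returns [sin(2^0 pi p), cos(2^0 pi p), ..., sin(2^(L-1) pi p), cos(2^(L-1) pi p)]
    where L = num_frequencies. Applied independently to each input coordinate.
    """
    features = []
    for k in range(num_frequencies):
        freq = (2.0 ** k) * math.pi
        features.append(math.sin(freq * p))
        features.append(math.cos(freq * p))
    return features
```

Each input coordinate yields `2 * num_frequencies` features, which lets a network represent higher-frequency detail than it typically learns from raw coordinates.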
Wh