Novel View Synthesis with Diffusion Models
3DiM is a diffusion model for 3D novel view synthesis from as few as a single image. On the SRN ShapeNet dataset, the videos 3DiM generates from a single view achieve much higher fidelity than prior work while remaining approximately 3D consistent.
Abstract
We present 3DiM (pronounced "three-dim"), a diffusion model for 3D novel view synthesis from as few as a single image. The core of 3DiM is an image-to-image diffusion model: given a single reference view and a relative pose as input, it generates a novel view via diffusion. 3DiM can then generate a full, 3D-consistent scene using our novel stochastic conditioning sampler. The output frames of the scene are generated autoregressively, and during the reverse diffusion process of each individual frame, we select a random conditioning frame from the set of previous frames at each denoising step. We demonstrate that stochastic conditioning yields much more 3D-consistent results than the naïve sampling process, which conditions only on a single previous frame. We compare 3DiM to prior work on the SRN ShapeNet dataset, demonstrating that 3DiM's videos generated from a single view achieve much higher fidelity while being approximately 3D consistent. We also introduce a new evaluation methodology, 3D consistency scoring, which measures the 3D consistency of a generated object by training a neural field on the model's output views. 3DiM is geometry-free, does not rely on hyper-networks or test-time optimization for novel view synthesis, and allows a single model to easily scale to a large number of scenes.
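To make the stochastic conditioning sampler concrete, here is a minimal sketch in Python/NumPy. The `denoise` callable and its signature are assumptions for illustration, not the paper's actual interface, and the pose handling is simplified: a full implementation would also account for the pose of the chosen conditioning frame, not just the target pose.

```python
import numpy as np

def stochastic_conditioning_sample(denoise, ref_view, poses, num_steps=256, rng=None):
    """Autoregressively generate novel views, re-drawing the conditioning
    frame at every denoising step (stochastic conditioning).

    denoise:  assumed pose-conditional model mapping
              (noisy_view, cond_view, target_pose, step) -> less-noisy view.
    ref_view: (H, W, 3) reference image the generation starts from.
    poses:    target camera poses, one per novel view to generate.
    """
    if rng is None:
        rng = np.random.default_rng()
    frames = [ref_view]  # conditioning pool starts with the input view
    for pose in poses:
        # Each frame's reverse diffusion starts from pure Gaussian noise.
        x = rng.standard_normal(ref_view.shape)
        for t in reversed(range(num_steps)):
            # Key idea: pick a *random* previous frame as the condition at
            # every denoising step, so all earlier views can constrain this one.
            cond = frames[rng.integers(len(frames))]
            x = denoise(x, cond, pose, t)
        frames.append(x)  # the finished view joins the conditioning pool
    return frames[1:]
```

Because the conditioning frame is re-sampled at each of the many denoising steps, every previously generated view gets a chance to influence the new one. This approximates conditioning on the entire generation history with a model that only ever sees a single conditioning frame at a time.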