Generative 3D Models

2023-06-01
2 min read

Generative models have made impressive breakthroughs on the text-to-image task since 2022. Their success rests on diffusion models trained on billions of image-text pairs. However, no comparably large labeled dataset exists for 3D samples, and one would be expensive to acquire.

Inspired by DreamFusion, the common practice for text-to-3D is to employ a pretrained text-to-image diffusion model to generate 3D content from in-the-wild text prompts, circumventing the need for any 3D data. The idea is to optimize a single 3D scene $\theta$, or a distribution of 3D scenes $\mu(\theta)$, such that the distribution induced on images rendered from all views aligns, in terms of KL divergence, with the distribution defined by the pretrained 2D diffusion model.
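To make this concrete, DreamFusion's Score Distillation Sampling (SDS) realizes the alignment above with the gradient $\nabla_\theta \mathcal{L}_{\mathrm{SDS}} = \mathbb{E}_{t,\epsilon}\big[w(t)\,(\hat{\epsilon}_\phi(x_t; t) - \epsilon)\,\partial x / \partial \theta\big]$, where $x = g(\theta)$ is a rendered view and $x_t$ its noised version. Below is a minimal toy sketch of this estimator in NumPy, under heavy simplifying assumptions: rendering is the identity (so $\partial x/\partial\theta = I$), $w(t)=1$, and the "pretrained" denoiser (`toy_denoiser`, an illustrative name, not from any paper) assumes a standard-normal data prior, for which the optimal noise prediction has a closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: "rendering" is the identity, so the scene parameters theta
# ARE the image. D is the flattened image dimension.
D = 16
theta = rng.normal(size=D)

def toy_denoiser(x_t, t):
    """Stand-in for a pretrained epsilon-prediction diffusion model.
    Assumes the data distribution is N(0, I); under the trig schedule
    x_t = cos(t)*x + sin(t)*eps, the optimal noise prediction for that
    prior is sin(t) * x_t."""
    return np.sin(t) * x_t

def sds_grad(theta, rng, n_samples=256):
    """Monte-Carlo estimate of the SDS gradient:
       E_{t,eps}[ w(t) * (eps_hat(x_t; t) - eps) * dx/dtheta ],
    with w(t) = 1 and dx/dtheta = I because rendering is the identity."""
    g = np.zeros_like(theta)
    for _ in range(n_samples):
        t = rng.uniform(0.02, 1.5)          # sampled diffusion timestep
        eps = rng.normal(size=theta.shape)  # sampled noise
        alpha, sigma = np.cos(t), np.sin(t)
        x_t = alpha * theta + sigma * eps   # forward-diffuse the render
        g += toy_denoiser(x_t, t) - eps     # score-distillation residual
    return g / n_samples

# Descending this gradient pulls theta toward the toy prior's mode (zero),
# i.e. toward high density under the "pretrained" diffusion model.
for _ in range(200):
    theta -= 0.1 * sds_grad(theta, rng)
print(np.linalg.norm(theta))
```

In a real text-to-3D pipeline, `theta` parameterizes a NeRF or mesh, the identity map is replaced by a differentiable renderer over random camera poses, and the denoiser is a large text-conditioned diffusion model; the structure of the gradient estimate is the same.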

3D Generative Model

While most work in this area uses the Score Distillation Sampling (SDS) loss proposed by DreamFusion, ProlificDreamer recently introduced the Variational Score Distillation (VSD) loss. For this reason, the diagram labels the objective generically as $\mathcal{L}$ rather than $\mathcal{L}_{\mathrm{SDS}}$.

Papers Worth Reading

Papers are listed in reverse chronological order (newest first).

ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation
Project Page | arxiv | Github
Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
Project Page | arxiv | Github
Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation
Project Page | arxiv | Github
Zero-1-to-3: Zero-shot One Image to 3D Object
Project Page | arxiv | Github
Magic3D: High-Resolution Text-to-3D Content Creation
Project Page | arxiv
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
arxiv | Github
DreamFusion: Text-to-3D using 2D Diffusion
Project Page | arxiv | stable-dreamfusion
Highlights
  • Proposed the Score Distillation Sampling (SDS) loss