No code

DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance

2023 / ACM Transactions on Graphics / DOI 10.1145/3592094

Longwen Zhang, Qiwei Qiu, Hongyang Lin, Qixuan Zhang, Shi Cheng, Wei Yang, Ye Shi, Sibei Yang, Lan Xu, Jingyi Yu

Emerging Metaverse applications demand accessible, accurate and easy-to-use tools for 3D digital human creation in order to depict different cultures and societies as if in the physical world. Recent large-scale vision-language advances pave the way for novices to conveniently customize 3D content. However, the generated CG-friendly assets still cannot represent the desired facial traits for human characteristics. In this paper, we present DreamFace, a progressive scheme to generate personalized 3D faces under text guidance. It enables lay users to naturally customize 3D facial assets that are compatible with CG pipelines, with desired shapes, textures and fine-grained animation capabilities. From a text input describing the facial traits, we first introduce a coarse-to-fine scheme to generate the neutral facial geometry with a unified topology. We employ a selection strategy in the CLIP embedding space to generate coarse geometry, and subsequently optimize both the detailed displacements and normals using Score Distillation Sampling (SDS) from the generic Latent Diffusion Model (LDM). Then, for neutral appearance generation, we introduce a dual-path mechanism, which combines the generic LDM with a novel texture LDM to ensure both diversity and textural specification in the UV space. We also employ a two-stage optimization that performs SDS in both the latent and image spaces to provide compact priors for fine-grained synthesis. It also enables learning the mapping from the compact latent space into physically-based textures (diffuse albedo, specular intensity, normal maps, etc.). Our generated neutral assets naturally support blendshape-based facial animations, thanks to the unified geometric topology. We further improve the animation ability with personalized deformation characteristics. To this end, we learn a universal expression prior in a latent space with neutral asset conditioning using a cross-identity hypernetwork. We subsequently train a neural facial tracker that maps the video input space into the pre-trained expression space for personalized fine-grained animation. Extensive qualitative and quantitative experiments validate the effectiveness and generalizability of DreamFace. Notably, DreamFace can generate realistic 3D facial assets with physically-based rendering quality and rich animation ability from video footage, even for fashion icons or exotic characters in cartoons and fiction movies.
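Two ideas in the abstract can be illustrated with a toy sketch: picking a coarse candidate by similarity in an embedding space, and animating a unified-topology mesh with blendshapes. This is not the authors' implementation. It assumes the selection is a nearest match by cosine similarity (the abstract does not specify the strategy), uses short hand-written vectors in place of real CLIP embeddings, and all function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_candidate(text_embedding, candidate_embeddings):
    """Return (index, scores) of the candidate closest to the text embedding.

    Assumption: "selection" means argmax of cosine similarity. The vectors
    here are toy stand-ins for CLIP text and image features.
    """
    scores = [cosine_similarity(text_embedding, c) for c in candidate_embeddings]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


def apply_blendshapes(neutral, deltas, weights):
    """Linear blendshape model: v = neutral + sum_i w_i * delta_i.

    neutral: list of (x, y, z) vertices.
    deltas:  one list of (dx, dy, dz) offsets per blendshape, same vertex order.
    weights: one float per blendshape.
    A shared topology is what lets the same deltas index the same vertices.
    """
    animated = []
    for v_idx, (x, y, z) in enumerate(neutral):
        for delta, w in zip(deltas, weights):
            dx, dy, dz = delta[v_idx]
            x, y, z = x + w * dx, y + w * dy, z + w * dz
        animated.append((x, y, z))
    return animated


# Toy usage: three candidate "geometry" embeddings and a two-vertex mesh.
best_index, _ = select_candidate([1.0, 0.0], [[0.0, 1.0], [2.0, 0.1], [-1.0, 0.0]])
posed = apply_blendshapes(
    neutral=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    deltas=[[(0.0, 1.0, 0.0), (0.0, 0.0, 0.0)]],  # one blendshape moving vertex 0 upward
    weights=[0.5],
)
```

In a real pipeline the embeddings would come from a CLIP encoder and the deltas from an artist-made or learned blendshape basis; the arithmetic above only shows the shape of the two operations.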


Reproducibility Dossier

Artifact located / Confidence: editor verified / checked Apr 2026

GEOMDIGEST treats reproducibility as an evidence trail: public artifacts, documentation, data, packaging, archival stability, and verification checks. Numeric scores are only exposed for audited records; public pages prioritize the evidence itself.

Evidence

Verified

not yet

Code

not yet

Data

not yet

Docs

not yet

Build checks

supplementary / verified / editor verified

Detected evidence link


Implementation Index

No implementations indexed yet

This paper is in the knowledge graph, but we have not attached a runnable artifact yet.

Citation Lineage

References (3)

2022 / StyleGAN-NADA / 453 cites

2022 / AvatarCLIP / 200 cites

2022 / Authentic volumetric avatars from a phone scan / 110 cites

Selected paper

DreamFace: Progressive Generation of Animatable 3D Faces ...

2023 / 59 citations

Cited by (3)

2024 / CLAY: A Controllable Large-scale Generative M... / 75 cites

2025 / Facial Appearance Capture at Home with Patch-... / 1 cites

2026 / Bringing Diversity from Diffusion Models to S... / 0 cites