AGORA adopts a 3D GAN framework to learn an animatable 3D head generation model from 2D image datasets. The architecture consists of two main components: a generator that predicts 3D Gaussians in the UV space of an articulated FLAME mesh, and a discriminator conditioned on the target expression.
The final 3D position of each Gaussian is obtained via 3D lifting: we interpolate a base position from the articulated FLAME mesh at the Gaussian's UV coordinates and add the predicted offset. This design anchors the generated Gaussians to the underlying parametric mesh, providing a structured basis for animation.
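For illustration, here is a minimal PyTorch sketch of this lifting step (all names are hypothetical; `face_idx` and `barys` stand for a precomputed mapping from each Gaussian's UV location to a FLAME triangle and its barycentric weights):

```python
import torch

def lift_gaussians(mesh_verts, faces, face_idx, barys, offsets):
    """Lift UV-space Gaussians onto the articulated FLAME mesh.

    mesh_verts: (V, 3) vertices of the posed, expressive FLAME mesh
    faces:      (F, 3) triangle vertex indices of the FLAME topology
    face_idx:   (N,)   index of the triangle containing each UV sample
    barys:      (N, 3) barycentric weights of the sample in that triangle
    offsets:    (N, 3) per-Gaussian offsets predicted by the generator
    """
    tri = mesh_verts[faces[face_idx]]              # (N, 3, 3) triangle corners
    base = (barys.unsqueeze(-1) * tri).sum(dim=1)  # (N, 3) interpolated base positions
    return base + offsets                          # final 3D Gaussian means
```

Because the base positions track the articulated mesh, re-posing FLAME re-animates the Gaussians directly.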
To enforce expression consistency, we adopt a dual-discrimination scheme. The discriminator is conditioned on the target expression by concatenating the rendered image with a synthetic rendering of the FLAME mesh, where vertices are color-coded by their expression-isolated displacement from the neutral pose. This allows the discriminator to penalize fine-grained deviations in expression.
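A rough sketch of how such a conditioning signal could be assembled (hypothetical names; `verts_expr` and `verts_neutral` are FLAME vertices with and without the target expression, both in the neutral pose, and `render_mesh` stands in for any mesh rasterizer):

```python
import torch

def expression_condition_colors(verts_expr, verts_neutral, max_disp=0.01):
    """Color-code FLAME vertices by expression-isolated displacement.

    Both vertex sets share the neutral head pose, so the per-vertex
    difference reflects the expression alone; max_disp (in meters) is
    an assumed normalization constant.
    """
    disp = verts_expr - verts_neutral       # (V, 3) expression-only displacement
    rgb = 0.5 + 0.5 * disp / max_disp       # signed displacement -> [0, 1] colors
    return rgb.clamp(0.0, 1.0)

# The color-coded mesh rendering is concatenated channel-wise with the
# generated image before entering the discriminator:
# cond = render_mesh(verts_expr, faces, colors)      # (B, 3, H, W), hypothetical
# disc_input = torch.cat([fake_rgb, cond], dim=1)    # (B, 6, H, W) dual input
```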
Generated avatars (seeds 0–32) reenacting the driving video on the left.
Avatars generated from single images, driven by the video on the left.
Comparison of avatar reenactment methods. Left: Ours, Center: Driving video, Right: Next3D.
This work builds upon GGHEAD, whose UV-based 3D Gaussian generation framework served as the foundation for our approach. We are grateful for their excellent codebase and insights.
We also acknowledge EG3D for its pioneering work in 3D-aware generative models, and Next3D, which inspired key aspects of our methodology. We thank the authors of concurrent work GAIA for providing visual results for comparison.
We are also grateful to the developers of gsplat for their efficient and easy-to-use Gaussian splatting library, which we use for rasterization.
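For reference, a minimal, hypothetical call to gsplat's `rasterization` entry point (random toy data; tensor shapes follow the gsplat 1.x API, and a CUDA device is assumed):

```python
import torch
from gsplat import rasterization

# A toy scene: random Gaussians rendered by one camera 2 m in front of the origin.
device = "cuda"
N, H, W = 10_000, 512, 512

means = torch.randn(N, 3, device=device) * 0.1         # Gaussian centers
quats = torch.randn(N, 4, device=device)               # rotations (normalized internally)
scales = torch.rand(N, 3, device=device) * 0.02        # per-axis scales
opacities = torch.rand(N, device=device)               # opacity in [0, 1]
colors = torch.rand(N, 3, device=device)               # per-Gaussian RGB

viewmats = torch.eye(4, device=device)[None]           # (1, 4, 4) world-to-camera
viewmats[0, 2, 3] = 2.0                                # push the scene 2 m along +z
Ks = torch.tensor([[[500.0, 0.0, W / 2],
                    [0.0, 500.0, H / 2],
                    [0.0, 0.0, 1.0]]], device=device)  # (1, 3, 3) intrinsics

rgb, alpha, meta = rasterization(
    means, quats, scales, opacities, colors, viewmats, Ks, W, H
)
print(rgb.shape)  # (1, 512, 512, 3) rendered RGB
```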
```bibtex
@article{fazylov2025agora,
  author  = {Fazylov, Ramazan and Zagoruyko, Sergey and Parkin, Aleksandr and Lefkimmiatis, Stamatis and Laptev, Ivan},
  title   = {{AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars}},
  journal = {arXiv preprint arXiv:2512.06438},
  year    = {2025}
}
```