Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation Paper • 2412.06016 • Published Dec 8, 2024 • 20
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published about 1 month ago • 86
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published 24 days ago • 18
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up Paper • 2412.16112 • Published 23 days ago • 21
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 3 days ago • 54