Soundwave: Less is More for Speech-Text Alignment in LLMs Paper • 2502.12900 • Published 19 days ago • 76
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3 • 186
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published Jan 14 • 57
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published Jul 12, 2024 • 135
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Paper • 2409.02634 • Published Sep 4, 2024 • 94
Minitron Collection • A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated Jan 17 • 60
Evaluating RAG-Fusion with RAGElo: an Automated Elo-based Framework Paper • 2406.14783 • Published Jun 20, 2024 • 17
LightIt: Illumination Modeling and Control for Diffusion Models Paper • 2403.10615 • Published Mar 15, 2024 • 17
Bio-Inspired Night Image Enhancement Based on Contrast Enhancement and Denoising Paper • 2307.05447 • Published Jul 11, 2023 • 2
VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis Paper • 2403.08764 • Published Mar 13, 2024 • 36