Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 3 items • Updated 7 days ago • 309
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • 2501.12948 • Published 12 days ago • 282
FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces • Paper • 2501.12909 • Published 12 days ago • 62
Diffusion Adversarial Post-Training for One-Step Video Generation • Paper • 2501.08316 • Published 19 days ago • 32
DataComp-LM: In search of the next generation of training sets for language models • Paper • 2406.11794 • Published Jun 17, 2024 • 51
ViTamin Family • Collection • Designing Scalable Vision Models in the Vision-language Era. The best performing model is 'jienengchen/ViTamin-XL-384px'. • 16 items • Updated Apr 11, 2024 • 8
PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter • Paper • 2402.10896 • Published Feb 16, 2024 • 15