Collections
Discover the best community collections!
Collections including paper arxiv:2502.15814
-
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU
Paper • 2403.06504 • Published • 53 -
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Paper • 2502.06703 • Published • 140 -
Slamming: Training a Speech Language Model on One GPU in a Day
Paper • 2502.15814 • Published • 57
-
gradientai/Llama-3-8B-Instruct-Gradient-1048k
Text Generation • Updated • 4.28k • 680 -
Are Your LLMs Capable of Stable Reasoning?
Paper • 2412.13147 • Published • 92 -
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
Paper • 2412.11919 • Published • 34 -
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper • 2412.18925 • Published • 97
-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 33 -
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 26 -
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 123 -
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 22