Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs • Paper • 2501.18585 • Published 4 days ago • 40
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate • Paper • 2501.17703 • Published 5 days ago • 45
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • 2501.12948 • Published 12 days ago • 284
Do generative video models learn physical principles from watching videos? • Paper • 2501.09038 • Published 20 days ago • 31
MiniMax-01: Scaling Foundation Models with Lightning Attention • Paper • 2501.08313 • Published 20 days ago • 271