Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Paper • 2501.18585 • Published 3 days ago • 34 • 5
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer Paper • 2501.18427 • Published 4 days ago • 12 • 2
PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding Paper • 2501.16411 • Published 6 days ago • 16 • 3
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding Paper • 2501.18362 • Published 4 days ago • 17 • 2
GuardReasoner: Towards Reasoning-based LLM Safeguards Paper • 2501.18492 • Published 3 days ago • 63 • 3
CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation Paper • 2501.16609 • Published 6 days ago • 5 • 2
Large Language Models Think Too Fast To Explore Effectively Paper • 2501.18009 • Published 4 days ago • 16 • 3
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training Paper • 2501.18511 • Published 3 days ago • 14 • 4
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch Paper • 2501.18512 • Published 3 days ago • 21 • 5
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published 5 days ago • 27 • 2
Histoires Morales: A French Dataset for Assessing Moral Alignment Paper • 2501.17117 • Published 5 days ago • 3 • 2