O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning Paper • 2501.12570 • Published 2 days ago • 15
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published 1 day ago • 43
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published 1 day ago • 49
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 1 day ago • 127
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published 4 days ago • 70
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 10 days ago • 47
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Paper • 2501.09686 • Published 7 days ago • 35
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 7 days ago • 64
PaSa: An LLM Agent for Comprehensive Academic Paper Search Paper • 2501.10120 • Published 7 days ago • 37
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents Paper • 2501.08828 • Published 9 days ago • 28
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning Paper • 2501.06458 • Published 13 days ago • 29
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published 13 days ago • 59
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 9 days ago • 268
Multi-task retriever fine-tuning for domain-specific and efficient RAG Paper • 2501.04652 • Published 15 days ago • 10
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 15 days ago • 79
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 16 days ago • 80