-
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Paper • 2412.18319 • Published • 37 -
Token-Budget-Aware LLM Reasoning
Paper • 2412.18547 • Published • 45 -
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper • 2412.20993 • Published • 35 -
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Paper • 2412.17256 • Published • 45
Collections
Discover the best community collections!
Collections including paper arxiv:2412.20993
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 147 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 13 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 54 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 46
-
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Paper • 2501.09747 • Published • 22 -
Tensor Product Attention Is All You Need
Paper • 2501.06425 • Published • 74 -
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Paper • 2501.06842 • Published • 15 -
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token
Paper • 2501.03895 • Published • 48
-
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Paper • 2412.19723 • Published • 81 -
PERSE: Personalized 3D Generative Avatars from A Single Portrait
Paper • 2412.21206 • Published • 17 -
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper • 2412.20993 • Published • 35
-
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
Paper • 2408.06195 • Published • 70 -
Thinking LLMs: General Instruction Following with Thought Generation
Paper • 2410.10630 • Published • 18 -
Democratizing Reasoning Ability: Tailored Learning from Large Language Model
Paper • 2310.13332 • Published • 15 -
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
Paper • 2412.16849 • Published • 9
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 58 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 52 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 42 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 54
-
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 77 -
Table-GPT: Table-tuned GPT for Diverse Table Tasks
Paper • 2310.09263 • Published • 39 -
Context-Aware Meta-Learning
Paper • 2310.10971 • Published • 16