Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data Paper • 2302.12822 • Published Feb 24, 2023
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment Paper • 2304.06767 • Published Apr 13, 2023 • 2
FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation Paper • 2408.12168 • Published Aug 22, 2024
Demystifying Long Chain-of-Thought Reasoning in LLMs Paper • 2502.03373 • Published 23 days ago • 55
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 335
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published Dec 23, 2024 • 46
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published Dec 23, 2024 • 46
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published Dec 23, 2024 • 43
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Paper • 2409.17115 • Published Sep 25, 2024 • 61
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation Paper • 2304.05977 • Published Apr 12, 2023 • 1
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving Paper • 2407.13690 • Published Jun 18, 2024 • 2
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning Paper • 2312.15685 • Published Dec 25, 2023 • 16
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning Paper • 2402.09136 • Published Feb 14, 2024 • 1
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery Paper • 2406.08587 • Published Jun 12, 2024 • 16
Aqulia-Med LLM: Pioneering Full-Process Open-Source Medical Language Models Paper • 2406.12182 • Published Jun 18, 2024
Automatic Instruction Evolving for Large Language Models Paper • 2406.00770 • Published Jun 2, 2024 • 2
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist Paper • 2407.08733 • Published Jul 11, 2024 • 23
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning Paper • 2312.15685 • Published Dec 25, 2023 • 16
HARE: HumAn pRiors, a key to small language model Efficiency Paper • 2406.11410 • Published Jun 17, 2024 • 39