SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper • 2501.17161 • Published 7 days ago • 98
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level • Paper • 2411.03562 • Published Nov 5, 2024 • 65
Zero-shot Model-based Reinforcement Learning using Large Language Models • Paper • 2410.11711 • Published Oct 15, 2024 • 8