RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning Paper • 2410.02089 • Published Oct 2 • 10
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 8 items • Updated 6 days ago • 160
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning Paper • 2406.12050 • Published Jun 17 • 18
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21 • 53
view article Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14 • 48
WaveUI Collection WaveUI is a collection of datasets and tools to improve UI object detection • 6 items • Updated Jul 31 • 9