The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 11 days ago • 85
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1, 2024 • 59
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 16 days ago • 245
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 15 days ago • 83
Article Fine-tune a SmolLM on domain-specific synthetic data from a LLM By davidberenstein1957 • 21 days ago • 31
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published Dec 23, 2024 • 45
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published about 1 month ago • 95
InternVL2.5-MPO Collection Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 15 days ago • 26
Llama 3.3 (All Versions) Collection Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bnb and original versions. • 3 items • Updated 4 days ago • 33
Article Use Models from the Hugging Face Hub in LM Studio By yagilb • Nov 28, 2024 • 132
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 7 items • Updated 18 days ago • 33
Multi-Granularity Prediction for Scene Text Recognition Paper • 2209.03592 • Published Sep 8, 2022 • 2
OpenScholar_V1 Collection The set of models, index, data associated with the paper "OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs". • 8 items • Updated Nov 22, 2024 • 31