Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 2 days ago • 44
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 3 days ago • 176
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching Paper • 2412.17153 • Published 20 days ago • 34
Autoregressive Video Generation without Vector Quantization Paper • 2412.14169 • Published 24 days ago • 14
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 23 days ago • 122
ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer Paper • 2412.07720 • Published Dec 10, 2024 • 30
EXAONE-3.5 Collection EXAONE 3.5 language model series, including instruction-tuned models with 2.4B, 7.8B, and 32B parameters. • 10 items • Updated Dec 10, 2024 • 87
Evaluating Language Models as Synthetic Data Generators Paper • 2412.03679 • Published Dec 4, 2024 • 46
Toxic Commons Collection Tools for de-toxifying public domain data, especially multilingual and historical text data and data with OCR errors. • 3 items • Updated Oct 31, 2024 • 5
Common Models Collection The first generation of models pretrained on Common Corpus. • 5 items • Updated Dec 5, 2024 • 28
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens Paper • 2411.17691 • Published Nov 26, 2024 • 11
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 58
ESFT Collection Models for the expert-specialized fine-tuning (ESFT) paper • 15 items • Updated Aug 16, 2024 • 5
HyenaDNA Models Collection HyenaDNA models usable directly with Hugging Face classes like AutoModel. • 8 items • Updated Nov 14, 2023 • 16
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 113