Mistral-Small-24B-2501 (All Versions) Collection A collection of Mistral's new Small 2501 models including GGUF, 4-bit and more! • 9 items • Updated 1 day ago • 5
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 1 day ago • 202
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 258
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published Jan 8 • 91
Phi-4 (All Versions) Collection Microsoft's new Phi-4 models including mini & multimodal in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 7 items • Updated 1 day ago • 42
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published Jan 7 • 50
Load 4bit models 4x faster Collection Native bitsandbytes 4bit pre quantized models • 25 items • Updated 1 day ago • 55
Qwen 2.5 Coder Collection Complete collection of Code-specific model series for Qwen2.5 in bnb 4bit, 16bit and GGUF formats. • 35 items • Updated 1 day ago • 26
Llama 3.2 Vision Collection Meta's Llama 3.2 vision models 11B and 90B. Includes 4-bit bnb and original versions. • 8 items • Updated 1 day ago • 7
Llama 3.2 Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 27 items • Updated 1 day ago • 54
Vision/multimodal Models Collection Collection of the most popular vision models including Llama 3.2, LLaVA, Qwen2 VL, Pixtral, PaliGemma and more! • 25 items • Updated 1 day ago • 6
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while using only <10% more VRAM than BnB 4-bit • 22 items • Updated 1 day ago • 53
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 20 items • Updated Jan 15 • 118
Gated Slot Attention for Efficient Linear-Time Sequence Modeling Paper • 2409.07146 • Published Sep 11, 2024 • 20
Attention Heads of Large Language Models: A Survey Paper • 2409.03752 • Published Sep 5, 2024 • 89
Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14, 2024 • 59
Article Indexify: Bringing HuggingFace Models to Real-Time Pipelines for Production Applications By rishiraj • May 31, 2024 • 7
Blackhole Collection A black hole of high-quality, multilingual dialogue datasets across many fields, making it easier to train LLMs with SFT and DPO methods. • 32 items • Updated Aug 18, 2024 • 6