DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 12 days ago • 284
Article Exploring Synthetic Data Generation with DataDreamer By asoria • 13 days ago • 6
Article A Beginner-Friendly PyTorch Tutorial: Build and Train Your First Model By dvgodoy • 14 days ago • 4
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated about 21 hours ago • 137
Article Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) By ariG23498 • 15 days ago • 13
Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese Paper • 2408.12480 • Published Aug 22, 2024 • 21
Article How to generate text: using different decoding methods for language generation with Transformers Mar 1, 2020 • 145
Article Building an AI-powered search engine from scratch By as-cle-bert • Dec 12, 2024 • 9
Article 🦸🏻#2: Your Go-To Vocabulary to Navigate the World of AI Agents and Agentic Workflows By Kseniase • Dec 28, 2024 • 10
Article 🐺🐦⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark By wolfram • Jan 2 • 39