Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • 2 days ago • 6
Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 584
Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation Paper • 2406.02347 • Published Jun 4, 2024 • 3
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 135