InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 16 days ago • 142
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning Paper • 2502.06060 • Published 19 days ago • 32
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 334