siyeng feng

siyengfeng

AI & ML interests

None yet

Organizations

None yet

siyengfeng's activity

upvoted a paper about 2 hours ago

Autonomy-of-Experts Models

Paper • 2501.13074 • Published 1 day ago • 29

upvoted 2 papers about 10 hours ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 2 days ago • 33

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 1 day ago • 94

upvoted 2 papers about 20 hours ago

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Paper • 2501.12368 • Published 2 days ago • 31

Reasoning Language Models: A Blueprint

Paper • 2501.11223 • Published 4 days ago • 19

upvoted an article about 21 hours ago

Process Reinforcement through Implicit Rewards

Article • 21 days ago • 19

upvoted 2 papers 1 day ago

Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments

Paper • 2501.10893 • Published 5 days ago • 20

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published 3 days ago • 68

upvoted 3 papers 3 days ago

Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 33

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 7 days ago • 93

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 78

upvoted 2 papers 7 days ago

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published 15 days ago • 50

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published 29 days ago • 95

upvoted a paper 8 days ago

Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published 13 days ago • 75

upvoted 6 papers 9 days ago

o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published Nov 29, 2024 • 43

O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning

Paper • 2501.06458 • Published 13 days ago • 29

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published 15 days ago • 89