Yihua Zhang

NormalUhr

AI & ML interests

None yet

Organizations

OPTML Group @ MSU

NormalUhr's activity

published an article about 9 hours ago

DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background

By NormalUhr
published an article 17 days ago

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

By NormalUhr
6
published an article 21 days ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

By NormalUhr
44
published an article 24 days ago

A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons

By NormalUhr
2
published an article 24 days ago

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

By NormalUhr
11
published an article 24 days ago

MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression

By NormalUhr
5