Article DualPipe could be better without the Dual By ufotalent • about 17 hours ago • 9
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference Paper • 2502.18137 • Published 3 days ago • 46
CodeDPO/qwen25-coder-inst-7b-reinforce-plus_v2_mini_processed_r1_cold_start Updated 2 days ago • 17
MoBA: Mixture of Block Attention for Long-Context LLMs Paper • 2502.13189 • Published 10 days ago • 12
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 8 days ago • 92