microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated about 5 hours ago • 7.35k • 514
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published 5 days ago • 22
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 3 days ago • 54
Expect the Unexpected: FailSafe Long Context QA for Finance Paper • 2502.06329 • Published 18 days ago • 124
InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback Paper • 2502.15027 • Published 8 days ago • 6
SIFT: Grounding LLM Reasoning in Contexts via Stickers Paper • 2502.14922 • Published 9 days ago • 28
Sky-T1-7B Collection A series of 7B models trained with different recipes and the corresponding training data. • 8 items • Updated 15 days ago • 5
Running 1.79k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Process Reward Models Collection Model and Datasets for Qwen 2.5 Math PRM 7B • 6 items • Updated 10 days ago • 1
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment Paper • 2502.10391 • Published 14 days ago • 30
ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning Paper • 2502.04689 • Published 22 days ago • 7
Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models Paper • 2502.04404 • Published 23 days ago • 22