SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 6 days ago • 88
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization Paper • 2403.17031 • Published Mar 24, 2024 • 5
Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos Paper • 2501.13826 • Published 11 days ago • 22
Solving math word problems with process- and outcome-based feedback Paper • 2211.14275 • Published Nov 25, 2022 • 8
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published Jan 1 • 99
Direct Preference Optimization: Your Language Model is Secretly a Reward Model Paper • 2305.18290 • Published May 29, 2023 • 52
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12, 2024 • 64
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only Paper • 2306.01116 • Published Jun 1, 2023 • 33
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models Paper • 2411.14982 • Published Nov 22, 2024 • 16
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution Paper • 2409.12961 • Published Sep 19, 2024 • 25
Oryx-1.5 Collection Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution • 4 items • Updated 19 days ago • 5
Oryx Collection Oryx: One Multi-Modal LLM for On-Demand Spatial-Temporal Understanding • 6 items • Updated Dec 11, 2024 • 16
LongVA Collection Long Context Transfer From Text To Vision: https://lmms-lab.github.io/posts/longva/ • 5 items • Updated Oct 4, 2024 • 13
Quantifying the Carbon Emissions of Machine Learning Paper • 1910.09700 • Published Oct 21, 2019 • 13
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024 • 75