
Collections


Collections including paper arxiv:2501.08313

2025 January Papers 🧐

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 19 days ago • 271
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 12 days ago • 281
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 99
The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published 21 days ago • 89

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 147
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20, 2024 • 13
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 54
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 46

SciML/PhysicsML papers

PINNACLE: PINN Adaptive ColLocation and Experimental points selection

Paper • 2404.07662 • Published Apr 11, 2024
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 19 days ago • 271

LLM Architecture

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 19 days ago • 271

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 17 days ago • 103
ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 79
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling

Paper • 2412.15084 • Published Dec 19, 2024 • 13
The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published 21 days ago • 89

Attention improvement metrics

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 19 days ago • 271

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 19 days ago • 271
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 26 days ago • 250
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 125
Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 139

LLM Pretraining

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 19 days ago • 271

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 19 days ago • 271

Model Architecture

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 19 days ago • 271
