- OpenAI o1 System Card
  Paper • 2412.16720 • Published • 31
- LearnLM: Improving Gemini for Learning
  Paper • 2412.16429 • Published • 22
- NILE: Internal Consistency Alignment in Large Language Models
  Paper • 2412.16686 • Published • 8
- Offline Reinforcement Learning for LLM Multi-Step Reasoning
  Paper • 2412.16145 • Published • 38
Sheikh Jubair
sheikhjubair
AI & ML interests
None yet
Recent Activity
Updated a collection 10 days ago: reasoning-agentic
Organizations
None yet
Collections
3
- InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning
  Paper • 2408.07089 • Published • 14
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models
  Paper • 2409.16191 • Published • 42
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 136
- Self-Boosting Large Language Models with Synthetic Preference Data
  Paper • 2410.06961 • Published • 16
models
None public yet
datasets
None public yet