Scaling Laws for Floating Point Quantization Training Paper • 2501.02423 • Published about 1 month ago • 25
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use Paper • 2501.02506 • Published about 1 month ago • 11
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning Paper • 2501.03226 • Published 29 days ago • 37
Test-time Computing: from System-1 Thinking to System-2 Thinking Paper • 2501.02497 • Published about 1 month ago • 41