Article TerjamaBench: A Cultural Benchmark for English-Darija Machine Translation By imomayiz • 16 days ago • 24
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference 10 days ago • 60
Article Train 400x faster Static Embedding Models with Sentence Transformers 11 days ago • 121
Article CO₂ Emissions and Models Performance: Insights from the Open LLM Leaderboard 17 days ago • 14
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards Paper • 2402.01781 • Published Feb 1, 2024 • 2
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 17 days ago • 89
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published 24 days ago • 48
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 78
Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation Paper • 2412.15255 • Published Dec 15, 2024 • 3
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 18 days ago • 81
Multilingual LLM Evaluation Collection Multilingual Evaluation Benchmarks • 6 items • Updated Dec 13, 2024 • 10
🧪 FineWeb v1 data experiments Collection Ablation models trained for our data experiments. • 22 items • Updated Jun 12, 2024 • 4
📀 Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 35
AraDICE Collection AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs • 12 items • Updated Dec 13, 2024 • 4
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 132
Article Rethinking Backpropagation: Thoughts on What's Wrong with Backpropagation By Jaward • Dec 2, 2024 • 5