On Teacher Hacking in Language Model Distillation Paper • 2502.02671 • Published 24 days ago • 17
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 12 items • Updated 9 days ago • 84
Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention Aug 21, 2024 • 30
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 3 days ago • 535
The Big Benchmarks Collection Collection Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated Nov 18, 2024 • 204
Standard-format-preference-dataset Collection We collect the open-source datasets and process them into the standard format. • 14 items • Updated May 8, 2024 • 24
FP8 LLMs for vLLM Collection Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 44 items • Updated Oct 17, 2024 • 66
Korean Datasets I've released so far. Collection A collection of the Korean datasets I have uploaded so far. • 8 items • Updated May 24, 2024 • 17