MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering Paper • 2410.07095 • Published Oct 9, 2024 • 6
Aria: An Open Multimodal Native Mixture-of-Experts Model Paper • 2410.05993 • Published Oct 8, 2024 • 108
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27, 2024 • 92
Whisper Release Collection Whisper includes both English-only and multilingual checkpoints for automatic speech recognition (ASR) and speech translation (ST), ranging from 39M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 93
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6, 2024 • 114
Handbook v0.1 models and datasets Collection Models and datasets for v0.1 of the alignment handbook • 6 items • Updated Nov 10, 2023 • 24
DPO vs KTO vs IPO Collection A collection of datasets and models used for the "Aligning LLMs with Direct Preference Optimization Methods" blog post • 2 items • Updated Jan 16, 2024 • 12
Constitutional AI Collection A collection of datasets and models that accompany the Constitutional AI recipe. See hf.co/blog/constitutional-ai for more details. • 9 items • Updated Feb 1, 2024 • 5
Tulu V2 Suite Collection The set of models associated with the paper "Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2" • 19 items • Updated 5 days ago • 43
Paloma Collection Dataset and baseline models for Paloma, a benchmark of language model fit to 585 text domains • 8 items • Updated 5 days ago • 15
OLMo Suite Collection Artifacts for the first set of OLMo models. • 18 items • Updated 5 days ago • 70