PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated 18 days ago • 64
Tulu 3 Models Collection All models released with Tulu 3 -- state-of-the-art open post-training recipes. • 11 items • Updated 16 days ago • 91
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state-of-the-art open post-training recipes. • 33 items • Updated 18 days ago • 71
OLMoE (November 2024) Collection Artifacts for open mixture-of-experts language models. • 13 items • Updated 18 days ago • 29
OLMoE (January 2025) Collection Improved OLMoE for iOS app. Read more: https://allenai.org/blog/olmoe-app • 10 items • Updated 17 days ago • 9
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Paper • 2411.14199 • Published Nov 21, 2024 • 30
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 18 days ago • 297
OLMo Suite Collection Artifacts for the first set of OLMo models. • 18 items • Updated 18 days ago • 71
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research Paper • 2402.00159 • Published Jan 31, 2024 • 62
Paloma Collection Dataset and baseline models for Paloma, a benchmark of language model fit to 546 textual domains. • 8 items • Updated 18 days ago • 15