Welcome FalconMamba: The first strong attention-free 7B model Article • Published Aug 12, 2024 • 108
OLMoE Collection • Artifacts for open mixture-of-experts language models • 13 items • Updated 5 days ago • 29
Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models Paper • 2108.08877 • Published Aug 19, 2021 • 2
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024 • 69
GritLM Collection • Generative Representational Instruction Tuning (GRIT) • 64 items • Updated Apr 17, 2024 • 7
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension Paper • 1910.13461 • Published Oct 29, 2019 • 3
Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models Paper • 2405.05374 • Published May 8, 2024 • 2
Training data-efficient image transformers & distillation through attention Paper • 2012.12877 • Published Dec 23, 2020 • 2
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 126