Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.00838

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 143
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20 • 11
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24 • 50
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24 • 44

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

Paper • 2406.06563 • Published Jun 3 • 17

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

Paper • 2405.19327 • Published May 29 • 46
LLM360/K2

Text Generation • Updated Jul 29 • 609 • 80
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80
LLM360: Towards Fully Transparent Open-Source LLMs

Paper • 2312.06550 • Published Dec 11, 2023 • 56

Foundation Models

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Paper • 2403.05530 • Published Mar 8 • 60
StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 29
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Paper • 2312.15166 • Published Dec 23, 2023 • 56

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28 • 104
sDPO: Don't Use Your Data All at Once

Paper • 2403.19270 • Published Mar 28 • 39
ViTAR: Vision Transformer with Any Resolution

Paper • 2403.18361 • Published Mar 27 • 52
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

Paper • 2403.18814 • Published Mar 27 • 44

Issa collection 1

allenai/basic_arithmetic

Viewer • Updated Mar 8 • 3k • 27 • 1
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 99
How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15 • 38
BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 17
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

Paper • 2402.09727 • Published Feb 15 • 35

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 13 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1 • 21
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 143
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30 • 25

Previous
1
2
3
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs