Collections
Collections including paper arxiv:2401.10241

- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 52
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 18
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 19
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 6

- Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM
  Paper • 2401.02994 • Published • 47
- Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
  Paper • 2401.05566 • Published • 25
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 64
- Zero Bubble Pipeline Parallelism
  Paper • 2401.10241 • Published • 22

- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 258
- PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
  Paper • 2312.12456 • Published • 41
- Accelerating LLM Inference with Staged Speculative Decoding
  Paper • 2308.04623 • Published • 23
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  Paper • 2208.07339 • Published • 4

- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
  Paper • 2312.08578 • Published • 16
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
  Paper • 2312.08583 • Published • 9
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 11
- StemGen: A music generation model that listens
  Paper • 2312.08723 • Published • 47