- TinyGSM: achieving >80% on GSM8k with small language models
  Paper • 2312.09241 • Published • 37
- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 62
- Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction
  Paper • 2403.18795 • Published • 18
- Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models
  Paper • 2404.04478 • Published • 12
Collections including paper arxiv:2403.03853
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
  Paper • 2312.00752 • Published • 138
- SparQ Attention: Bandwidth-Efficient LLM Inference
  Paper • 2312.04985 • Published • 38
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
  Paper • 2402.00159 • Published • 59
- Neural Network Diffusion
  Paper • 2402.13144 • Published • 94
- UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
  Paper • 2311.09257 • Published • 45
- Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
  Paper • 2310.04378 • Published • 19
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 44
- Exponentially Faster Language Modelling
  Paper • 2311.10770 • Published • 118