Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2403.19928

Papers - Huawei

DiJiang: Efficient Large Language Models through Compact Kernelization

Paper • 2403.19928 • Published Mar 29 • 10
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Paper • 2309.12284 • Published Sep 21, 2023 • 18
TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models

Paper • 2404.09204 • Published Apr 14 • 10
SAGS: Structure-Aware 3D Gaussian Splatting

Paper • 2404.19149 • Published Apr 29 • 13

Papers - Training a 2.8B Model in 38 days

DiJiang: Efficient Large Language Models through Compact Kernelization

Paper • 2403.19928 • Published Mar 29 • 10

Papers - Training Research

Measuring the Effects of Data Parallelism on Neural Network Training

Paper • 1811.03600 • Published Nov 8, 2018 • 2
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

Paper • 1804.04235 • Published Apr 11, 2018 • 2
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Paper • 1905.11946 • Published May 28, 2019 • 3
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7 • 62

Beyond Language Models: Byte Models are Digital World Simulators

Paper • 2402.19155 • Published Feb 29 • 49
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Paper • 2402.19427 • Published Feb 29 • 52
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks

Paper • 2403.00522 • Published Mar 1 • 44
Resonance RoPE: Improving Context Length Generalization of Large Language Models

Paper • 2403.00071 • Published Feb 29 • 22

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19 • 53
APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding

Paper • 2401.06761 • Published Jan 12 • 1
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache

Paper • 2401.02669 • Published Jan 5 • 14
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24 • 50

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs