Collections
Discover the best community collections!
Collections including paper arxiv:2408.04619
-
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks
Paper • 2406.12925 • Published • 22 -
Scaling Laws for Linear Complexity Language Models
Paper • 2406.16690 • Published • 22 -
DiffusionPDE: Generative PDE-Solving Under Partial Observation
Paper • 2406.17763 • Published • 23 -
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds
Paper • 2407.01494 • Published • 13
-
Large Language Model Unlearning via Embedding-Corrupted Prompts
Paper • 2406.07933 • Published • 7 -
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Paper • 2406.02657 • Published • 36 -
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
Paper • 2406.12050 • Published • 18 -
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Paper • 2406.11813 • Published • 30
-
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
Paper • 2311.17049 • Published -
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Paper • 2405.04434 • Published • 13 -
A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision
Paper • 2303.17376 • Published -
Sigmoid Loss for Language Image Pre-Training
Paper • 2303.15343 • Published • 4
-
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Paper • 2404.08801 • Published • 63 -
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Paper • 2404.07839 • Published • 41 -
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Paper • 2404.05892 • Published • 31 -
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 138