- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 143
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 11
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 50
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 44
Collections including paper arxiv:2410.18745
- Visual Context Window Extension: A New Perspective for Long Video Understanding
  Paper • 2409.20018 • Published • 8
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
  Paper • 2409.02889 • Published • 54
- Long Context Transfer from Language to Vision
  Paper • 2406.16852 • Published • 32
- lmms-lab/LongVA-7B-DPO
  Text Generation • Updated • 4.14k • 7

- Writing in the Margins: Better Inference Pattern for Long Context Retrieval
  Paper • 2408.14906 • Published • 138
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 133
- Towards a Unified View of Preference Learning for Large Language Models: A Survey
  Paper • 2409.02795 • Published • 72
- Attention Heads of Large Language Models: A Survey
  Paper • 2409.03752 • Published • 87

- LLM Pruning and Distillation in Practice: The Minitron Approach
  Paper • 2408.11796 • Published • 53
- TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
  Paper • 2408.09174 • Published • 51
- To Code, or Not To Code? Exploring Impact of Code in Pre-training
  Paper • 2408.10914 • Published • 40
- Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
  Paper • 2408.11878 • Published • 50

- LLoCO: Learning Long Contexts Offline
  Paper • 2404.07979 • Published • 20
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 111
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 15
- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 21

- Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
  Paper • 2402.14848 • Published • 18
- The Prompt Report: A Systematic Survey of Prompting Techniques
  Paper • 2406.06608 • Published • 53
- CRAG -- Comprehensive RAG Benchmark
  Paper • 2406.04744 • Published • 41
- Transformers meet Neural Algorithmic Reasoners
  Paper • 2406.09308 • Published • 43

- A Language Model's Guide Through Latent Space
  Paper • 2402.14433 • Published • 1
- The Hidden Space of Transformer Language Adapters
  Paper • 2402.13137 • Published
- Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models
  Paper • 2402.16438 • Published
- AtP*: An efficient and scalable method for localizing LLM behaviour to components
  Paper • 2403.00745 • Published • 11

- Chain-of-Verification Reduces Hallucination in Large Language Models
  Paper • 2309.11495 • Published • 38
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 77
- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
  Paper • 2309.09400 • Published • 82
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 82