FLAME: Factuality-Aware Alignment for Large Language Models Paper • 2405.01525 • Published May 2 • 24
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published May 23 • 34
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture Paper • 2405.18991 • Published May 29 • 12
The Prompt Report: A Systematic Survey of Prompting Techniques Paper • 2406.06608 • Published Jun 6 • 53
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10 • 64
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts Paper • 2406.12034 • Published Jun 17 • 13
A Closer Look into Mixture-of-Experts in Large Language Models Paper • 2406.18219 • Published Jun 26 • 15
DiffusionPDE: Generative PDE-Solving Under Partial Observation Paper • 2406.17763 • Published Jun 25 • 23
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data Paper • 2406.18790 • Published Jun 26 • 33
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps Paper • 2407.07071 • Published Jul 9 • 11
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications Paper • 2408.11878 • Published Aug 20 • 50
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models Paper • 2408.15915 • Published Aug 28 • 19
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers Paper • 2409.04109 • Published Sep 6 • 43
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19 • 134
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization Paper • 2409.12903 • Published Sep 19 • 21
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts Paper • 2409.16040 • Published Sep 24 • 13
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Paper • 2409.20566 • Published Sep 30 • 51
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Paper • 2410.10814 • Published 26 days ago • 48
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization Paper • 2411.02355 • Published 5 days ago • 44