FLAME: Factuality-Aware Alignment for Large Language Models
Paper
•
2405.01525
•
Published
•
26
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper
•
2405.14333
•
Published
•
37
Transformers Can Do Arithmetic with the Right Embeddings
Paper
•
2405.17399
•
Published
•
52
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper
•
2405.18991
•
Published
•
12
The Prompt Report: A Systematic Survey of Prompting Techniques
Paper
•
2406.06608
•
Published
•
58
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Paper
•
2406.06525
•
Published
•
67
Transformers meet Neural Algorithmic Reasoners
Paper
•
2406.09308
•
Published
•
44
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
Paper
•
2406.12034
•
Published
•
15
A Closer Look into Mixture-of-Experts in Large Language Models
Paper
•
2406.18219
•
Published
•
16
DiffusionPDE: Generative PDE-Solving Under Partial Observation
Paper
•
2406.17763
•
Published
•
24
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data
Paper
•
2406.18790
•
Published
•
34
Controlling Space and Time with Diffusion Models
Paper
•
2407.07860
•
Published
•
16
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps
Paper
•
2407.07071
•
Published
•
12
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper
•
2408.11878
•
Published
•
54
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models
Paper
•
2408.15915
•
Published
•
19
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Paper
•
2409.04109
•
Published
•
44
Training Language Models to Self-Correct via Reinforcement Learning
Paper
•
2409.12917
•
Published
•
136
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
Paper
•
2409.12903
•
Published
•
22
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
Paper
•
2409.16040
•
Published
•
13
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
Paper
•
2409.20566
•
Published
•
55
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Paper
•
2410.10814
•
Published
•
49
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM
Quantization
Paper
•
2411.02355
•
Published
•
47
POINTS1.5: Building a Vision-Language Model towards Real World Applications
Paper
•
2412.08443
•
Published
•
38
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions
Paper
•
2412.08737
•
Published
•
53
Multimodal Latent Language Modeling with Next-Token Diffusion
Paper
•
2412.08635
•
Published
•
44
Apollo: An Exploration of Video Understanding in Large Multimodal Models
Paper
•
2412.10360
•
Published
•
139
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
Paper
•
2412.11919
•
Published
•
33
Smaller Language Models Are Better Instruction Evolvers
Paper
•
2412.11231
•
Published
•
27
Learned Compression for Compressed Learning
Paper
•
2412.09405
•
Published
•
13
Paper
•
2412.13501
•
Published
•
25
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
Paper
•
2412.14922
•
Published
•
85
YuLan-Mini: An Open Data-efficient Language Model
Paper
•
2412.17743
•
Published
•
64
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
Paper
•
2412.18619
•
Published
•
54
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
Paper
•
2412.19326
•
Published
•
18
LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models
Paper
•
2501.00874
•
Published
•
12
Personalized Graph-Based Retrieval for Large Language Models
Paper
•
2501.02157
•
Published
•
28
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper
•
2501.03262
•
Published
•
87
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
Paper
•
2501.03847
•
Published
•
23
LLM4SR: A Survey on Large Language Models for Scientific Research
Paper
•
2501.04306
•
Published
•
33
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Paper
•
2501.05366
•
Published
•
79
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction
Paper
•
2501.06282
•
Published
•
39
Transformer^2: Self-adaptive LLMs
Paper
•
2501.06252
•
Published
•
50
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning
Paper
•
2501.06590
•
Published
•
8
deepseek-ai/DeepSeek-V3
Text Generation
•
Updated
•
209k
•
2.26k
Learnings from Scaling Visual Tokenizers for Reconstruction and Generation
Paper
•
2501.09755
•
Published
•
33
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation
Paper
•
2501.08617
•
Published
•
10
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper
•
2501.09686
•
Published
•
35
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities
Paper
•
2501.08983
•
Published
•
19
Evolving Deeper LLM Thinking
Paper
•
2501.09891
•
Published
•
95
HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution
Paper
•
2501.10045
•
Published
•
8
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
Paper
•
2501.12202
•
Published
•
25
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
Paper
•
2501.13106
•
Published
•
57
Autonomy-of-Experts Models
Paper
•
2501.13074
•
Published
•
34