HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems • Paper • 2411.02959 • Published Nov 5 • 64
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization • Paper • 2411.02355 • Published Nov 4 • 46
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation • Paper • 2410.23090 • Published Oct 30 • 54
RARe: Retrieval Augmented Retrieval with In-Context Examples • Paper • 2410.20088 • Published Oct 26 • 5
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss • Paper • 2410.17243 • Published Oct 22 • 89
LOGO -- Long cOntext aliGnment via efficient preference Optimization • Paper • 2410.18533 • Published Oct 24 • 42
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch • Paper • 2410.18693 • Published Oct 24 • 40
MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms • Paper • 2410.18977 • Published Oct 24 • 14
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding • Paper • 2411.04952 • Published Nov 7 • 28
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models • Paper • 2411.04905 • Published Nov 7 • 111
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion • Paper • 2412.04424 • Published 20 days ago • 55
VisionZip: Longer is Better but Not Necessary in Vision Language Models • Paper • 2412.04467 • Published 20 days ago • 104
PaliGemma 2: A Family of Versatile VLMs for Transfer • Paper • 2412.03555 • Published 21 days ago • 118
Imagine360: Immersive 360 Video Generation from Perspective Anchor • Paper • 2412.03552 • Published 21 days ago • 26
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment • Paper • 2412.13746 • Published 7 days ago • 8
OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain • Paper • 2412.13018 • Published 8 days ago • 40
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation • Paper • 2412.10704 • Published 11 days ago • 14
SUGAR: Subject-Driven Video Customization in a Zero-Shot Manner • Paper • 2412.10533 • Published 12 days ago • 5
When to Speak, When to Abstain: Contrastive Decoding with Abstention • Paper • 2412.12527 • Published 9 days ago • 4
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation • Paper • 2412.11919 • Published 9 days ago • 33
Smaller Language Models Are Better Instruction Evolvers • Paper • 2412.11231 • Published 10 days ago • 24
VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping • Paper • 2412.11279 • Published 10 days ago • 12
SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video • Paper • 2412.09982 • Published 12 days ago • 7
TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning • Paper • 2412.10447 • Published 14 days ago • 5
Whisper-GPT: A Hybrid Representation Audio Large Language Model • Paper • 2412.11449 • Published 9 days ago • 4
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training • Paper • 2412.11863 • Published 9 days ago • 2
Reliable, Reproducible, and Really Fast Leaderboards with Evalica • Paper • 2412.11314 • Published 10 days ago • 2