Daily Papers

by AK and the research community

Jul 2

Submitted by

dongguanting

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

·
18 authors

Submitted by

hba123

ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

·
22 authors

Submitted by

manu

ColPali: Efficient Document Retrieval with Vision Language Models

·
6 authors

Submitted by

freesunshine0316

LiteSearch: Efficacious Tree Search for LLM

·
8 authors

Submitted by

SivilTaram

RegMix: Data Mixture as Regression for Language Model Pre-training

·
8 authors

Submitted by

leonardPKU

MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation

·
16 authors

Submitted by

idanlevy

Wavelets Are All You Need for Autoregressive Image Generation

·
4 authors

Submitted by

AJZhou

Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning

·
7 authors

Submitted by

wanghaofan

InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation

·
6 authors

Submitted by

Koi953215

DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models

·
6 authors

Submitted by

omergoldman

Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP

·
6 authors

Submitted by

naoyuki82

E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS

·
13 authors

Submitted by

yingtai

RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network

·
10 authors

Submitted by

ydeng9

MIRAI: Evaluating LLM Agents for Event Forecasting

·
7 authors

Submitted by

zhwang4ai

OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents

·
10 authors

Submitted by

LXT

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

·
7 authors

Submitted by

Neph0s

Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs

·
6 authors

Submitted by

Shijie

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

·
7 authors

Submitted by

wanchichen

Towards Robust Speech Representation Learning for Thousands of Languages

·
10 authors

Submitted by

akhaliq

SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix

·
8 authors

Submitted by

davanstrien

Show Less, Instruct More: Enriching Prompts with Definitions and Guidelines for Zero-Shot NER

·
5 authors

Submitted by

gsarti

Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs

·
4 authors

Submitted by

BFauber

Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models

·
1 author

Submitted by

iliashum

UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI

·
9 authors

Submitted by

hank0316

DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging

·
4 authors

Submitted by

JRQi

The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models

·
7 authors

Submitted by

TianyiQ

ProgressGym: Alignment with a Millennium of Moral Progress

·
6 authors