arxiv:2501.05037
Yuxuan Wang
ColorfulAI
AI & ML interests
Multimodal Learning
Recent Activity
updated a dataset 1 day ago
ColorfulAI/NeedleInAVideoHaystack
authored a paper 8 days ago
VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions
authored a paper 8 days ago
Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation