DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? Paper • 2409.07703 • Published Sep 12, 2024
FAITHSCORE: Evaluating Hallucinations in Large Vision-Language Models Paper • 2311.01477 • Published Nov 2, 2023