Challenges in Trustworthy Human Evaluation of Chatbots Paper • 2412.04363 • Published 19 days ago • 2 • 2
A Controlled Study on Long Context Extension and Generalization in LLMs Paper • 2409.12181 • Published Sep 18 • 43
A Controlled Study on Long Context Extension and Generalization in LLMs Paper • 2409.12181 • Published Sep 18 • 43
UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations Paper • 2311.08469 • Published Nov 14, 2023 • 10