metadata
title: Verbal Reasoning Challenge
emoji: 🤔
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.15.0
app_file: app.py
pinned: false
license: bsd-3-clause
PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models
This application presents results for several models that we evaluated on a verbal reasoning challenge (Papers, ArXiv). The overall results are below. Use the tabs above to explore them in more detail.