Spaces: autoevaluate/model-evaluator

Availability to evaluate LLMs like in the HF blog post

#24

by sjrhuschlee - opened Oct 4, 2022

Discussion

sjrhuschlee

Oct 4, 2022

Hello, I just saw this blog post https://huggingface.co./blog/zero-shot-eval-on-the-hub which I am really excited about! I wanted to ask when the example from the blog post will be available on the HF leaderboards?

lewtun

Evaluation on the Hub org Oct 5, 2022

Hi @sjrlee ! Great feature request idea - gently pinging @Tristan to add this task to the leaderboards so people can view the scores from the LLM evaluations on https://huggingface.co./spaces/autoevaluate/leaderboards?dataset=-any-

lewtun

Evaluation on the Hub org Oct 5, 2022

Hi @sjrlee, you can find the evaluations coming from the zero-shot pipeline under the text-generation task on the leaderboards, e.g. https://huggingface.co./spaces/autoevaluate/leaderboards?dataset=mathemakitten%2Fwinobias_antistereotype_test

Feel free to close this issue if that addresses your query.

sjrhuschlee

Oct 6, 2022

Thanks, @lewtun ! That is great to see. Before I close this I wanted to ask if there are plans to add the results of all the model sizes shown in the blog post?
