MMLU Average Score

#100

by paopao0226 - opened Jul 8, 2023

Discussion

paopao0226

Jul 8, 2023

Thanks for the detailed information on the "About" tab. I am confused, though: there is only one MMLU score on the leaderboard, but MMLU is made up of 57 tasks. How are these tasks' scores combined into a single one? Do you just add them up and divide by 57 (the number of tasks)? Or is there some trick to the calculation? Thanks.

clefourrier

Open LLM Leaderboard org Jul 10, 2023

If you run the harness as mentioned, it will provide an average score at the end :)

paopao0226

Jul 11, 2023

@clefourrier But when I run the harness as mentioned, it only gives the results for the subtasks. Here is the command I ran :(
python main.py --model=hf-causal-experimental --model_args="pretrained=<model_path>,use_accelerate=True" --num_fewshot=5 --device=cuda --task=hendrycksTest-* --batch_size=4 --output_path=<output_path>

clefourrier

Open LLM Leaderboard org Jul 11, 2023

Don't you have an "all" value at the end of the table displayed or in the files saved?

paopao0226

Jul 11, 2023

@clefourrier Sadly, no :(((

clefourrier

Open LLM Leaderboard org Jul 11, 2023

Ha, my bad, sorry, it's an internal thing we added for logging!
We just do an average :)
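
For anyone who wants to reproduce that number from the per-subtask output, here is a minimal sketch of a plain (unweighted) average. It assumes the saved results file has a top-level "results" mapping keyed by task name, that each entry holds an "acc" value, and that MMLU subtasks are the ones whose names start with "hendrycksTest-". The function names and the toy numbers are made up for illustration and are not taken from the leaderboard code.

```python
import json
from statistics import mean


def mmlu_average(results, prefix="hendrycksTest-", metric="acc"):
    """Unweighted mean of `metric` over every task whose name starts with `prefix`.

    `results` is assumed to map task name -> {metric name: value}.
    """
    scores = [
        task_scores[metric]
        for task_name, task_scores in results.items()
        if task_name.startswith(prefix)
    ]
    if not scores:
        raise ValueError(f"no tasks found with prefix {prefix!r}")
    return mean(scores)


def mmlu_average_from_file(path):
    """Hypothetical helper: read a saved results JSON and average its MMLU subtasks.

    Assumes the per-task scores sit under a top-level "results" key.
    """
    with open(path) as f:
        return mmlu_average(json.load(f)["results"])


# Toy data (made-up numbers, three subtasks instead of 57).
toy_results = {
    "hendrycksTest-abstract_algebra": {"acc": 0.25},
    "hendrycksTest-anatomy": {"acc": 0.50},
    "hendrycksTest-astronomy": {"acc": 0.75},
    "arc_challenge": {"acc": 0.40},  # not an MMLU subtask, so it is ignored
}

print(mmlu_average(toy_results))  # 0.5
```

Because this averages over tasks rather than over questions, every subtask counts equally no matter how many questions it contains.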

paopao0226

Jul 11, 2023

@clefourrier Okkkk, thanks! Hoping you have a good time!

paopao0226 changed discussion status to closed Jul 11, 2023
