Failed orca_mini_v8_* Evaluation
Opening a new discussion, as suggested in a previous comment on another discussion:
Hi @alozowski ,
Happy Monday! I'm reaching out to make sense of the following eval request commits for the model "pankajmathur/orca_mini_v8_0_70b". The commit below shows a file rename and a change from the incorrect "params": 35.277:
https://huggingface.co./datasets/open-llm-leaderboard/requests/commit/5660c4c4b9156fa0f15d99be7eee061d5de24764#d2h-741276
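For anyone who wants to double-check, here is a minimal sketch of how the actual parameter count can be read from the safetensors metadata without downloading the weights (assuming a recent huggingface_hub); notably, 35.277 is almost exactly half of the ~70B this model should report:

```python
# Sketch: verify the real parameter count from safetensors metadata,
# without downloading the weights. Assumes a recent huggingface_hub.
from huggingface_hub import get_safetensors_metadata

meta = get_safetensors_metadata("pankajmathur/orca_mini_v8_0_70b")
total = sum(meta.parameter_count.values())  # per-dtype counts, summed
print(f"{total / 1e9:.3f}B parameters")     # expected ~70B, not 35.277B
```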
Did the model fail to evaluate, and do these changes reflect a resubmission for evaluation?
If so, can we also resubmit "pankajmathur/orca_mini_v8_1_70b", since it appears to have failed too?
https://huggingface.co./datasets/open-llm-leaderboard/requests/commit/8b40ba212c48dc470be4f661b67cc085ed456477#d2h-702908
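If it helps, here is a rough sketch of how one could inspect the eval request file directly to check its "status" and "params" fields; the exact filename below is an assumption on my part, based on how request files appear to be named in that dataset:

```python
# Sketch: download and inspect an eval request JSON from the requests
# dataset. The filename pattern is an assumption, not a confirmed path.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="open-llm-leaderboard/requests",
    repo_type="dataset",
    # Assumed filename pattern; adjust to match the actual file in the repo.
    filename="pankajmathur/orca_mini_v8_1_70b_eval_request_False_bfloat16_Original.json",
)
with open(path) as f:
    request = json.load(f)
print(request.get("status"), request.get("params"))
```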
Is there any reason they are failing? For background, I successfully evaluated both of them on my own servers before submitting them to the HF Open LLM Leaderboard, following the reproducibility instructions:
https://huggingface.co./docs/leaderboards/open_llm_leaderboard/about#reproducibility
```shell
lm_eval --model hf --model_args pretrained=pankajmathur/orca_mini_v8_1_70b,dtype=bfloat16,parallelize=True --tasks leaderboard --output_path lm_eval_results/leaderboard --batch_size auto
```
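For reference, a rough Python-API equivalent of that CLI command (assuming lm-evaluation-harness with the leaderboard task group installed) would be something like:

```python
# Sketch: Python-API equivalent of the lm_eval CLI invocation above.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=pankajmathur/orca_mini_v8_1_70b,dtype=bfloat16,parallelize=True",
    tasks=["leaderboard"],  # the leaderboard task group
    batch_size="auto",
)
print(results["results"])
```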
These results are now updated on both model cards:
https://huggingface.co./pankajmathur/orca_mini_v8_0_70b
https://huggingface.co./pankajmathur/orca_mini_v8_1_70b
Thanks again for helping out on this; it's really appreciated.
Regards,
Pankaj