Commit 59ba8a1 by Sean Cho (parent: c770c05): update text
src/assets/text_content.py (CHANGED)
````diff
@@ -32,7 +32,9 @@ Please provide information about the model through an issue! 🤩
 
 ## How it works
 
-📈 We
+📈 We evaluate models using the [Eleuther AI Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness), a unified framework to test generative language models on a large number of different evaluation tasks.
+
+We have set up a benchmark based on the four tasks (HellaSwag, MMLU, Arc, TruthfulQA) used by the HuggingFace Open LLM Leaderboard, with the datasets translated into Korean and varied by human experts. We have also added a new dataset prepared from scratch.
 - Ko-HellaSwag (provided by __[Upstage](https://www.upstage.ai/)__, machine translation)
 - Ko-MMLU (provided by __[Upstage](https://www.upstage.ai/)__, human translation and variation)
 - Ko-Arc (provided by __[Upstage](https://www.upstage.ai/)__, human translation and variation)
````
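For readers wondering what an evaluation run with the harness can look like, here is a minimal sketch using the lm-evaluation-harness Python API. It is not the leaderboard's actual pipeline: the `ko_hellaswag`/`ko_mmlu` task names, the `"hf"` model type string, and the few-shot count are assumptions that depend on the harness version and on how the leaderboard registers its Korean tasks.

```python
# Minimal sketch, not the leaderboard's actual pipeline: evaluate a model on
# Korean tasks through the lm-evaluation-harness Python API.
# The task names, the "hf" model type string, and num_fewshot are assumptions;
# they differ between harness versions and task registries.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",                                   # HuggingFace causal-LM backend
    model_args="pretrained=your-org/your-model",  # placeholder model id
    tasks=["ko_hellaswag", "ko_mmlu"],            # hypothetical Ko- task names
    num_fewshot=5,                                # example few-shot setting
)
print(results["results"])                         # per-task metrics
```

The harness also exposes a command-line entry point with equivalent `--model`, `--model_args`, `--tasks`, and `--num_fewshot` flags.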
````diff
@@ -66,7 +68,11 @@ tokenizer = AutoTokenizer.from_pretrained("your model name", revision=revision)
 ```
 
 If this step fails, follow the error messages to debug your model before submitting it. It's likely your model has been improperly uploaded.
+
 ⚠️ Make sure your model is public!
+
+⚠️ Make sure your model runs with the [Eleuther AI Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness)!
+
 ⚠️ If your model needs trust_remote_code=True, we do not support this option yet but we are working on adding it, stay posted!
 
 ### 2️⃣ Convert your model weights to [safetensors](https://huggingface.co/docs/safetensors/index)
````
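As a companion to the conversion step referenced in the heading above, here is a hedged sketch of one common way to produce safetensors weights: reload the checkpoint with transformers and re-save it with safe serialization. The model id and output directory are placeholders, and this is only one option; the safetensors docs linked above describe the format itself.

```python
# Minimal sketch, assuming a standard PyTorch checkpoint on the Hub: convert
# existing weights to safetensors via transformers' safe serialization.
# "your-org/your-model" and "converted-model" are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# safe_serialization=True writes model.safetensors (sharded if large)
# instead of pytorch_model.bin.
model.save_pretrained("converted-model", safe_serialization=True)
tokenizer.save_pretrained("converted-model")

# Optionally push the converted weights back to a (public!) Hub repo:
# model.push_to_hub("your-org/your-model", safe_serialization=True)
# tokenizer.push_to_hub("your-org/your-model")
```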