rhysjones/phi-2-orange

@@ -35,7 +35,7 @@ Evaluations done using mlabonne's useful [Colab notebook llm-autoeval](https://
 Also check out the alternative leaderboard at [Yet_Another_LLM_Leaderboard](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard)
 |                             Model                              |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
 |----------------------------------------------------------------|------:|------:|---------:|-------:|------:|
-|[phi-2-orange](https://huggingface.co/rhysjones/phi-2-orange)|  **33.29**|  71.39|      49.9|   37.14|  **47.93**|
 |[phi-2-dpo](https://huggingface.co/lxuechen/phi-2-dpo)|  30.39|  **71.68**|     **50.75**|    34.9|  46.93|
-|[dolphin-2_6-phi-2](https://huggingface.co/cognitivecomputations/dolphin-2_6-phi-2)|  33.12|  69.85|     47.39|    **37.2**|  46.89|
 |[phi-2](https://huggingface.co/microsoft/phi-2)|  27.98|   70.8|     44.43|   35.21|  44.61|

 Also check out the alternative leaderboard at [Yet_Another_LLM_Leaderboard](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard)
 |                             Model                              |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
 |----------------------------------------------------------------|------:|------:|---------:|-------:|------:|
+|[phi-2-orange](https://huggingface.co/rhysjones/phi-2-orange)|  **33.37**|  71.33|      49.87|   **37.3**|  **47.97**|
 |[phi-2-dpo](https://huggingface.co/lxuechen/phi-2-dpo)|  30.39|  **71.68**|     **50.75**|    34.9|  46.93|
+|[dolphin-2_6-phi-2](https://huggingface.co/cognitivecomputations/dolphin-2_6-phi-2)|  33.12|  69.85|     47.39|    37.2|  46.89|
 |[phi-2](https://huggingface.co/microsoft/phi-2)|  27.98|   70.8|     44.43|   35.21|  44.61|
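
The updated row changes both the per-benchmark scores and the bold markers, so a quick consistency check may be useful. The sketch below assumes that the Average column is the unweighted mean of the four benchmark columns and that bold marks the highest value in each column. Neither rule is stated in the diff itself, and the function names (`check_averages`, `best_per_column`) are hypothetical helpers, not part of llm-autoeval.

```python
# Scores copied from the updated table rows, in column order:
# AGIEval, GPT4All, TruthfulQA, Bigbench, reported Average.
COLUMNS = ("AGIEval", "GPT4All", "TruthfulQA", "Bigbench", "Average")

rows = {
    "phi-2-orange": (33.37, 71.33, 49.87, 37.3, 47.97),
    "phi-2-dpo": (30.39, 71.68, 50.75, 34.9, 46.93),
    "dolphin-2_6-phi-2": (33.12, 69.85, 47.39, 37.2, 46.89),
    "phi-2": (27.98, 70.8, 44.43, 35.21, 44.61),
}


def check_averages(table, tolerance=0.006):
    """Return models whose reported Average differs from the plain mean.

    Assumption: Average is the unweighted mean of the four benchmarks,
    rounded to two decimals, hence the small tolerance.
    """
    mismatches = []
    for model, (*scores, reported) in table.items():
        mean = sum(scores) / len(scores)
        if abs(mean - reported) > tolerance:
            mismatches.append(model)
    return mismatches


def best_per_column(table):
    """Return the model with the highest value in each column.

    Assumption: this is the rule used to decide which cell is bold.
    """
    best = {}
    for index, column in enumerate(COLUMNS):
        best[column] = max(table, key=lambda model: table[model][index])
    return best


if __name__ == "__main__":
    print("Average mismatches:", check_averages(rows))
    for column, model in best_per_column(rows).items():
        print(f"{column}: {model}")
```

Under these assumptions, every reported average agrees with the mean of its row, and the column leaders match the bold cells in the updated table, including the Bigbench marker moving from dolphin-2_6-phi-2 to phi-2-orange.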