1024m committed
Commit dd01bb2 · verified · 1 Parent(s): 524bb32

Update README.md

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -115,6 +115,7 @@ trained on a mixed language dataset.
  - ~1% better performance on English Tasks compared to the original (average benchmark scores)
  - ~4% better performance on Hindi Tasks compared to the original (average benchmark scores)
  - ~10% less emissions than the original (as reported on benchmark evaluations like open-llm-leaderboard)
+ - Less bias due to the ordering of choices when answering MCQs
 
  ### Model Details:
 
@@ -253,6 +254,14 @@ Unlike distillation from reasoning or CoT models which produced unnecessarily l
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/645c60dd7d655680b57ddbff/vgNk0bKthxNsxO0oAdaPD.png)
 
+ ### Model Responses vs. Order of Choices in MCQs
+
+ Since benchmarks like MMLU-Pro have up to 10 choices while most training datasets typically contain 4-5, we modified the ordering and labelling of choices in 5% of the MCQ samples for better robustness, i.e. re-ordering choices to create an imbalance opposing the original model's choice distribution and replacing the labels A/B/C/D with a/b/c/d, 1/2/3/4, w/x/y/z, etc.
+ This resulted in less bias towards the earlier choices in MCQs compared to the original phi-4. The images below show the distribution of choices selected by the model while being evaluated on MMLU-Pro.
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/645c60dd7d655680b57ddbff/5DYCkLHpdk2jaTsALcwN8.png)
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/645c60dd7d655680b57ddbff/hhNNE4s8mALYsxdVf-UCq.png)
+
  ### Team
 
  - Ram Mohan Rao Kadiyala
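
For clarity, here is a minimal sketch of the kind of choice re-ordering and relabelling described in the new section above. The sample format, function names, and label sets are assumptions for illustration, not the actual training pipeline behind this commit.

```python
import random

# Example label sets; the README mentions swapping A/B/C/D for a/b/c/d, 1/2/3/4, w/x/y/z, etc.
DEFAULT_LABELS = [chr(ord("A") + i) for i in range(10)]
ALT_LABEL_SETS = [
    [chr(ord("a") + i) for i in range(10)],
    [str(i + 1) for i in range(10)],
    list("wxyzvutsrq"),
]

def perturb_mcq(sample, rng):
    """Re-order and relabel the choices of one MCQ sample.

    `sample` is assumed (hypothetically) to look like
    {"question": str, "choices": [str, ...], "answer_idx": int}.
    """
    order = list(range(len(sample["choices"])))
    rng.shuffle(order)  # a real pipeline could bias this against the model's preferred positions
    labels = rng.choice(ALT_LABEL_SETS)
    return {
        "question": sample["question"],
        "choices": [sample["choices"][i] for i in order],
        "labels": labels[: len(order)],
        # the correct answer follows its choice to the new slot
        "answer_idx": order.index(sample["answer_idx"]),
    }

def augment(dataset, fraction=0.05, seed=0):
    """Perturb roughly `fraction` of the MCQ samples (the README cites ~5%)."""
    rng = random.Random(seed)
    out = []
    for sample in dataset:
        if rng.random() < fraction:
            out.append(perturb_mcq(sample, rng))
        else:
            out.append({**sample, "labels": DEFAULT_LABELS[: len(sample["choices"])]})
    return out
```

Keeping the perturbed fraction small (~5%) leaves the training distribution largely unchanged while still exposing the model to permuted answer positions and unfamiliar label sets.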