This is just a badly tuned LoRA of Qwen 2.5 3B
#3 opened by notlober
Qwen 2.5 3B already gets 0.4373 and yours gets 0.4320; it's worse. What is the point of this model? It does not get the extraordinary scores you claim.
Do not use this model; this company does nothing serious.
I understand your disappointment with the benchmark score. It's true that 0.4320 is lower than Qwen 2.5 3B's 0.4373 on that particular metric, and you're right to point that out.
However, fine-tuning often involves trade-offs. This model may be stronger at tasks that this benchmark doesn't measure; for example, it could excel at coding or other areas that MMLU Pro doesn't cover.
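For scale, here is a quick back-of-envelope comparison of the two scores quoted in this thread (the scores come from the posts above; reading the gap as a small relative change is my own framing, not a claim by either side):

```python
# Benchmark scores quoted in this thread (MMLU Pro)
base_score = 0.4373       # Qwen 2.5 3B, per the original post
finetuned_score = 0.4320  # this model, per the original post

delta = finetuned_score - base_score   # absolute difference
relative = delta / base_score          # difference relative to the base model

print(f"absolute change: {delta:+.4f}")    # -0.0053
print(f"relative change: {relative:+.2%}")  # about -1.21%
```

Whether a roughly 1% relative drop on one benchmark is acceptable depends on what the fine-tune was optimized for.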
ozone-research changed discussion status to closed