This is just a badly tuned LoRA of Qwen 2.5 3B
#3 opened by notlober
Qwen 2.5 3B already gets 0.4373 and yours gets 0.4320; it's worse. What is the point of this model? It does not get the extraordinary scores you claim.
Do not use this model; this company does nothing serious.
I understand your disappointment with the benchmark score. It's true that 0.4320 is lower than Qwen 2.5 3B's 0.4373 on that particular metric, and you're right to point that out.
However, fine-tuning often involves trade-offs. This model may be stronger at tasks that this benchmark doesn't measure; for example, it could excel at coding or other areas that MMLU Pro doesn't cover.
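For scale, here is a quick back-of-envelope comparison of the two scores quoted in this thread (the scores come from the posts above; reading the gap as a small relative change is my own framing, not a claim by either side):

```python
# Benchmark scores quoted in this thread (MMLU Pro)
base_score = 0.4373       # Qwen 2.5 3B, per the original post
finetuned_score = 0.4320  # this model, per the original post

delta = finetuned_score - base_score   # absolute difference
relative = delta / base_score          # difference relative to the base model

print(f"absolute change: {delta:+.4f}")    # -0.0053
print(f"relative change: {relative:+.2%}")  # about -1.21%
```

Whether a roughly 1% relative drop on one benchmark is acceptable depends on what the fine-tune was optimized for.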
ozone-research changed discussion status to closed