Leaderboard benchmark?

#5
by djuna - opened

I'm curious how it compares to the original Mistral Small.

@ehartford I evaluated this model on MATH500:
Dolphin3-R1: 87%
Mistral-Small3: 70%

Cognitive Computations org

I'm not engaging with the Hugging Face leaderboard.
They can eval it, or not. Not my concern.

ehartford changed discussion status to closed
