Leaderboard benchmark?

by djuna - opened 7 days ago

djuna

7 days ago

I'm curious how it compares to the original Mistral Small.

PSM24

5 days ago

@ehartford I evaluated the MATH500 score for this:
Dolphin3-R1: 87%
Mistral-Small3: 70%

Cognitive Computations org 5 days ago

I'm not engaging with the Hugging Face leaderboard.
They can eval it, or not. Not my concern.

ehartford changed discussion status to closed 5 days ago
