kyle PRO

kaikaidai

AI & ML interests

None yet

Recent Activity

updated a Space 8 days ago

AtlaAI/selene

updated a Space 9 days ago

kaikaidai/Sandbox_Test

updated a model 11 days ago

AtlaAI/Selene-1-Mini-Llama-3.1-8B

Organizations

Posts 1

Post

📈 Early results on the 8B evaluation model we've been training...

@NinaCalvi wrote about the progress we've made this quarter towards training the best 'LLM-as-a-judge' evaluator. We've significantly improved against the baseline and are approaching state-of-the-art evaluation performance with an 8B model.

Next up: training Llama-3.1-70B 👀

Here's the full article: https://www.atla-ai.com/post/evaluating-the-evaluator

Articles 3

Article

Upload and analyze datasets with evaluation criteria

Models

None public yet

Datasets

None public yet

Articles 3

Selene 1 Mini: the best small language model-as-a-judge

Papers 1

Spaces 1

Sandbox Test
