Kh

raidhon

AI & ML interests

Fine-tuning, Dataset creation, Time Series

Organizations

None yet

raidhon's activity

liked a model 12 days ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated 3 days ago • 129k • 2.76k

liked a model 13 days ago

bytedance-research/UI-TARS-72B-DPO

Image-Text-to-Text • Updated 10 days ago • 8.21k • 76

reacted to onekq's post with 🔥 15 days ago

Post

4665

🐋DeepSeek 🐋 is the real OpenAI 😯

6 replies

liked a dataset 22 days ago

NovaSky-AI/Sky-T1_data_17k

Viewer • Updated 21 days ago • 16.4k • 4.22k • 168

New activity in Qwen/QwQ-32B-Preview about 2 months ago

Can't reproduce the evaluation result of GPQA dataset

#47 opened about 2 months ago by

Rinn000

liked a model 4 months ago

rhymes-ai/Aria

Image-Text-to-Text • Updated 8 days ago • 34.6k • 610

liked a dataset 5 months ago

KbsdJames/Omni-MATH

Viewer • Updated Oct 12, 2024 • 4.43k • 2.31k • 68

replied to m-ric's post 5 months ago

Yes, it's been tested, and the claim is false. It's even worse than the regular Llama 3.1 70B. Comparing it to Claude is just funny.
https://www.reddit.com/r/LocalLLaMA/s/BH5A2ngyui

liked 2 models 9 months ago

imone/Llama-3-8B-fixed-special-embedding

Text Generation • Updated Apr 25, 2024 • 38 • 17

Xenova/gpt-4o

Updated May 13, 2024 • 58

replied to hrishbhdalal's post 9 months ago

Yeah, I was thinking the same thing. A large vocabulary does improve the performance of smaller LLMs, and judging by GPT-4o, the same is true for larger LLMs. Give it a try. I'm only doing this for small models, up to 3B parameters.

liked a model 9 months ago