Kh

raidhon

AI & ML interests

Fine-tuning, Dataset creation, Time Series

Recent Activity

liked a model 12 days ago
hexgrad/Kokoro-82M
liked a model 13 days ago
bytedance-research/UI-TARS-72B-DPO

Organizations

None yet

raidhon's activity

reacted to onekq's post with 🔥 15 days ago
🐋 DeepSeek 🐋 is the real OpenAI 😯
New activity in Qwen/QwQ-32B-Preview about 2 months ago
replied to m-ric's post 5 months ago
replied to hrishbhdalal's post 9 months ago

Yeah, I was thinking the same thing. A large vocabulary does improve the performance of smaller LLMs, and judging by GPT-4o, the same is true for larger LLMs. Give it a try. I'm only doing this for small models, up to 3B parameters.