CHONG YOE YAT

CHONGYOEYAT

AI & ML interests

LLM,CNN

Recent Activity

reacted to merve's post with πŸ‘ 3 days ago
This week in open AI was πŸ”₯ Let's recap! πŸ€— https://huggingface.co./collections/merve/january-31-releases-679a10669bd4030090c5de4d LLMs πŸ’¬ > Huge: AllenAI released new TΓΌlu models that outperform DeepSeek R1 using Reinforcement Learning with Verifiable Reward (RLVR) based on Llama 3.1 405B πŸ”₯ > Mistral AI is back to open-source with their "small" 24B models (base & SFT), with Apache 2.0 license 😱 > Alibaba Qwen released their 1M context length models Qwen2.5-Instruct-1M, great for agentic use with Apache 2.0 license πŸ”₯ > Arcee AI released Virtuoso-medium, 32.8B LLMs distilled from DeepSeek V3 with dataset of 5B+ tokens > Velvet-14B is a new family of 14B Italian LLMs trained on 10T tokens in six languages > OpenThinker-7B is fine-tuned version of Qwen2.5-7B-Instruct on OpenThoughts dataset VLMs & vision πŸ‘€ > Alibaba Qwen is back with Qwen2.5VL, amazing new capabilities ranging from agentic computer use to zero-shot localization πŸ”₯ > NVIDIA released new series of Eagle2 models with 1B and 9B sizes > DeepSeek released Janus-Pro, new any-to-any model (image-text generation from image-text input) with MIT license > BEN2 is a new background removal model with MIT license! Audio πŸ—£οΈ > YuE is a new open-source music generation foundation model, lyrics-to-song generation Codebase πŸ‘©πŸ»β€πŸ’» > We are open-sourcing our SmolVLM training and eval codebase! https://github.com/huggingface/smollm/tree/main/vision > Open-R1 is open-source reproduction of R1 by @huggingface science team https://huggingface.co./blog/open-r1
liked a model 3 days ago
PramaLLC/BEN2
liked a model 3 days ago
hexgrad/Kokoro-82M
View all activity

Organizations

None yet

CHONGYOEYAT's activity

reacted to merve's post with πŸ‘ 3 days ago
view post
Post
3407
This week in open AI was πŸ”₯ Let's recap! πŸ€— merve/january-31-releases-679a10669bd4030090c5de4d
LLMs πŸ’¬
> Huge: AllenAI released new TΓΌlu models that outperform DeepSeek R1 using Reinforcement Learning with Verifiable Reward (RLVR) based on Llama 3.1 405B πŸ”₯
> Mistral AI is back to open-source with their "small" 24B models (base & SFT), with Apache 2.0 license 😱
> Alibaba Qwen released their 1M context length models Qwen2.5-Instruct-1M, great for agentic use with Apache 2.0 license πŸ”₯
> Arcee AI released Virtuoso-medium, 32.8B LLMs distilled from DeepSeek V3 with dataset of 5B+ tokens
> Velvet-14B is a new family of 14B Italian LLMs trained on 10T tokens in six languages
> OpenThinker-7B is fine-tuned version of Qwen2.5-7B-Instruct on OpenThoughts dataset

VLMs & vision πŸ‘€
> Alibaba Qwen is back with Qwen2.5VL, amazing new capabilities ranging from agentic computer use to zero-shot localization πŸ”₯
> NVIDIA released new series of Eagle2 models with 1B and 9B sizes
> DeepSeek released Janus-Pro, new any-to-any model (image-text generation from image-text input) with MIT license
> BEN2 is a new background removal model with MIT license!

Audio πŸ—£οΈ
> YuE is a new open-source music generation foundation model, lyrics-to-song generation

Codebase πŸ‘©πŸ»β€πŸ’»
> We are open-sourcing our SmolVLM training and eval codebase! https://github.com/huggingface/smollm/tree/main/vision
> Open-R1 is open-source reproduction of R1 by @huggingface science team https://huggingface.co./blog/open-r1
  • 1 reply
Β·