YangWang92

yangwang92

AI & ML interests

None yet

yangwang92's activity

liked a model 2 days ago

alpindale/Meta-Llama-3.1-405B-Instruct-v16-k65536-256-woft-perm

Updated 3 days ago • 32 • 2

liked a model 3 days ago

nvidia/DeepSeek-R1-FP4

Text Generation • Updated 2 days ago • 1.74k • 159

liked a model 5 days ago

WebOrganizer/TopicClassifier

Text Classification • Updated 9 days ago • 74 • 5

liked 2 datasets 5 days ago

allenai/coconot

Viewer • Updated Jul 18, 2024 • 13.8k • 610 • 7

allenai/tulu-3-sft-mixture

Viewer • Updated Dec 2, 2024 • 939k • 4.62k • 116

upvoted a paper 8 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 9 days ago • 150

liked a Space 9 days ago

🌌 The Ultra-Scale Playbook

The ultimate guide to training LLM on large GPU Clusters • 1.78k

upvoted a paper 11 days ago

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published 14 days ago • 50

liked 2 datasets 11 days ago

Congliu/Chinese-DeepSeek-R1-Distill-data-110k

Viewer • Updated 8 days ago • 110k • 4.77k • 404

HuggingFaceFW/fineweb-edu

Viewer • Updated 28 days ago • 3.3B • 516k • 637

liked 2 models 11 days ago

stepfun-ai/stepvideo-t2v

Text-to-Video • Updated 10 days ago • 1.55k • 397

stepfun-ai/stepvideo-t2v-turbo

Updated 12 days ago • 81

upvoted a collection 16 days ago

CodeI/O

Collection

Collection for CodeI/O @ https://codei-o.github.io/ • 15 items • Updated 16 days ago • 6

upvoted a paper 16 days ago

CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Paper • 2502.07316 • Published 17 days ago • 45

upvoted 2 articles 17 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • Jan 28 • 782

Open R1: Update #2

Article • and 6 others • 18 days ago • 191

upvoted a paper 17 days ago

Matryoshka Quantization

Paper • 2502.06786 • Published 18 days ago • 29

liked a dataset 18 days ago

agentica-org/DeepScaleR-Preview-Dataset

Viewer • Updated 18 days ago • 40.3k • 1.89k • 68

upvoted a paper 18 days ago

QuEST: Stable Training of LLMs with 1-Bit Weights and Activations

Paper • 2502.05003 • Published 21 days ago • 41

liked a dataset 24 days ago

PRIME-RL/Eurus-2-Rollout

Viewer • Updated Jan 13 • 300k • 177 • 2