dinachen
DeeLearning
AI & ML interests
None yet
Recent Activity
published a model 6 days ago: DeeLearning/Qwen2.5-Math-1.5B-Distill-114k
published a model 8 days ago: DeeLearning/DeepSeek-R1-Distill-Qwen-1.5B-GRPO
updated a model 11 days ago: DeeLearning/Qwen2.5-1.5B-Open-R1-Distill
Organizations
None yet
DeeLearning's activity
CheckpointingException | nvidia/Llama3-70B-SteerLM-RM NOT a distributed checkpoint of Megatron
1
#4 opened 4 months ago by DeeLearning