About Training Detail
1
#4 opened 5 days ago by XinC6
Different max_position_embeddings and rope_theta in OpenR1-Qwen-7B-SFT and its base Qwen2.5-Math-7B-Instruct?
1
#3 opened 6 days ago by zhuzhuyue
About initial Model
#2 opened 12 days ago by wilye
training code
2
#1 opened 27 days ago by Ping404
