Qwen2.5-1.5B-Open-R1-GRPO / train_results.json
Model save
740915c verified
{
    "total_flos": 0.0,
    "train_loss": 1.029338818625547e-05,
    "train_runtime": 9222.5345,
    "train_samples": 100,
    "train_samples_per_second": 0.011,
    "train_steps_per_second": 0.001
}