# Llama-3.1-8B_auto
This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B-Instruct on the GaetanMichelet/chat-60_ft_task-2_auto and GaetanMichelet/chat-120_ft_task-2_auto datasets. On the evaluation set it reaches a best validation loss of 0.9982, around epoch 13 (see the training results table below).
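The card gives no usage snippet, so here is a minimal loading-and-generation sketch with transformers. The repository id is a placeholder (the exact model id is not stated in this card), and the dtype and device settings are assumptions, not the author's configuration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id: substitute the actual fine-tuned checkpoint.
repo_id = "GaetanMichelet/<fine-tuned-model-id>"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # assumption; use whatever dtype the hardware supports
    device_map="auto",
)

# Llama-3.1-Instruct derivatives expect the chat template.
messages = [{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```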
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

### Training results
| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 1.5395        | 0.9091  | 5    | 1.5119          |
| 1.4907        | 2.0     | 11   | 1.4172          |
| 1.3412        | 2.9091  | 16   | 1.3288          |
| 1.2282        | 4.0     | 22   | 1.2153          |
| 1.1141        | 4.9091  | 27   | 1.1136          |
| 1.0131        | 6.0     | 33   | 1.0540          |
| 1.0044        | 6.9091  | 38   | 1.0380          |
| 0.9748        | 8.0     | 44   | 1.0223          |
| 0.937         | 8.9091  | 49   | 1.0142          |
| 0.9481        | 10.0    | 55   | 1.0053          |
| 0.9023        | 10.9091 | 60   | 1.0011          |
| 0.8716        | 12.0    | 66   | 0.9987          |
| 0.849         | 12.9091 | 71   | 0.9982          |
| 0.836         | 14.0    | 77   | 1.0032          |
| 0.7365        | 14.9091 | 82   | 1.0176          |
| 0.7495        | 16.0    | 88   | 1.0259          |
| 0.6882        | 16.9091 | 93   | 1.0410          |
| 0.6301        | 18.0    | 99   | 1.0708          |
| 0.6114        | 18.9091 | 104  | 1.1067          |
| 0.591         | 20.0    | 110  | 1.1334          |
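The table shows validation loss bottoming out at 0.9982 around epoch 13 and climbing afterwards while training loss keeps falling, i.e. the model begins to overfit. The card does not state how the final checkpoint was selected; the sketch below shows one standard way to handle this with the transformers Trainer and EarlyStoppingCallback. Every value in it (output path, patience, dataset variables) is an illustrative assumption, not the author's actual configuration:

```python
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

# model, tokenizer, train_ds and eval_ds are assumed to already exist;
# none of the values below are taken from the actual training run.
args = TrainingArguments(
    output_dir="llama31-8b-task2",  # hypothetical output path
    num_train_epochs=20,
    eval_strategy="epoch",          # evaluate once per epoch, as in the table above
    save_strategy="epoch",
    load_best_model_at_end=True,    # restore the lowest-validation-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    # stop once validation loss has not improved for 3 consecutive evaluations
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```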
Base model: meta-llama/Llama-3.1-8B