llama3.1_8b_bwgenerator

Files changed (3)

README.md CHANGED

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1856
 ## Model description
@@ -45,23 +45,25 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 3
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.8017        | 0.2585 | 10   | 0.8807          |
-| 0.5653        | 0.5170 | 20   | 0.3488          |
-| 0.3026        | 0.7754 | 30   | 0.2705          |
-| 0.255         | 1.0339 | 40   | 0.2418          |
-| 0.2307        | 1.2924 | 50   | 0.2223          |
-| 0.2157        | 1.5509 | 60   | 0.2099          |
-| 0.206         | 1.8094 | 70   | 0.2021          |
-| 0.1978        | 2.0679 | 80   | 0.1962          |
-| 0.1913        | 2.3263 | 90   | 0.1910          |
-| 0.1882        | 2.5848 | 100  | 0.1877          |
-| 0.1861        | 2.8433 | 110  | 0.1856          |
 ### Framework versions

 This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1193
 ## Model description
 - total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 2
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
+| 1.2338        | 0.1456 | 20   | 0.3871          |
+| 0.3226        | 0.2911 | 40   | 0.2794          |
+| 0.2612        | 0.4367 | 60   | 0.2409          |
+| 0.2238        | 0.5822 | 80   | 0.2055          |
+| 0.1848        | 0.7278 | 100  | 0.1625          |
+| 0.1505        | 0.8733 | 120  | 0.1424          |
+| 0.1382        | 1.0189 | 140  | 0.1347          |
+| 0.1319        | 1.1644 | 160  | 0.1311          |
+| 0.1281        | 1.3100 | 180  | 0.1265          |
+| 0.1248        | 1.4555 | 200  | 0.1237          |
+| 0.1228        | 1.6011 | 220  | 0.1221          |
+| 0.1205        | 1.7466 | 240  | 0.1200          |
+| 0.1201        | 1.8922 | 260  | 0.1193          |
 ### Framework versions
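
The updated training table can be sanity-checked with a little arithmetic. The sketch below copies the added rows from this diff and estimates the number of optimizer steps per epoch, plus a rough count of training examples per epoch. It assumes each logged step consumes exactly `total_train_batch_size` (256) examples, which the README does not state explicitly, so the example count is only an estimate. The names `rows`, `steps_per_epoch`, and `OLD_FINAL_VAL_LOSS` are illustrative and not part of the repository. The two runs use different settings, so the loss comparison is nominal.

```python
# Rows copied from the "+" side of the README diff:
# (training_loss, epoch, step, validation_loss)
rows = [
    (1.2338, 0.1456, 20, 0.3871),
    (0.3226, 0.2911, 40, 0.2794),
    (0.2612, 0.4367, 60, 0.2409),
    (0.2238, 0.5822, 80, 0.2055),
    (0.1848, 0.7278, 100, 0.1625),
    (0.1505, 0.8733, 120, 0.1424),
    (0.1382, 1.0189, 140, 0.1347),
    (0.1319, 1.1644, 160, 0.1311),
    (0.1281, 1.3100, 180, 0.1265),
    (0.1248, 1.4555, 200, 0.1237),
    (0.1228, 1.6011, 220, 0.1221),
    (0.1205, 1.7466, 240, 0.1200),
    (0.1201, 1.8922, 260, 0.1193),
]

TOTAL_TRAIN_BATCH_SIZE = 256   # from the hyperparameter list in the README
OLD_FINAL_VAL_LOSS = 0.1856    # the "-" side of the diff


def steps_per_epoch(table):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _ = table[-1]
    return step / epoch


spe = steps_per_epoch(rows)

# Assumption: one step == one batch of TOTAL_TRAIN_BATCH_SIZE examples.
approx_examples_per_epoch = spe * TOTAL_TRAIN_BATCH_SIZE

# Row with the lowest validation loss.
best_row = min(rows, key=lambda r: r[3])

# Relative change in final validation loss between the two README versions.
relative_drop = (OLD_FINAL_VAL_LOSS - best_row[3]) / OLD_FINAL_VAL_LOSS

print(f"steps per epoch (estimate): {spe:.1f}")
print(f"examples per epoch (estimate): {approx_examples_per_epoch:.0f}")
print(f"best validation loss: {best_row[3]} at step {best_row[2]}")
print(f"relative drop vs. previous README: {relative_drop:.1%}")
```

In this table the lowest validation loss is also the last logged one, which matches the `Loss: 0.1193` line added at the top of the README.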

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c838c4a2f8eba919eae620cb00e8e19ff58b8c3a5286160e3cc7b8b8526930c5
 size 6832728

 version https://git-lfs.github.com/spec/v1
+oid sha256:4f69449384f7f9777816d641acffe5fc1721bd9cb8f35df65a278fd43fe70fa8
 size 6832728

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d19cb981a2f30ca18fe5cd1d3364688db63eda07bd0aa8a226d9f92a0e824a1a
 size 5560

 version https://git-lfs.github.com/spec/v1
+oid sha256:b369b6ad9bb7b80e5b5359a2eda8a6264f5ef7562c67f966d3aa7831507cce5c
 size 5560
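
Both binary files are stored as Git LFS pointers, so the diff only shows a new `oid` while `size` stays the same. A downloaded file can be checked against its pointer by hashing it locally. The following is a minimal sketch of that check, assuming the three-line pointer layout shown above (`version`, `oid sha256:…`, `size`). The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of Git LFS or of this repository.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer text copied from the "+" side of the training_args.bin diff.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b369b6ad9bb7b80e5b5359a2eda8a6264f5ef7562c67f966d3aa7831507cce5c
size 5560
"""
pointer = parse_lfs_pointer(pointer_text)
```

Usage would look like `matches_pointer("training_args.bin", pointer)` after downloading the file; the local path here is an assumption.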