End of training

Files changed (4)

README.md CHANGED

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.8493
 ## Model description
@@ -52,17 +52,17 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.8703        | 1.0   | 394  | 1.8750          |
-| 0.7032        | 2.0   | 788  | 1.8472          |
-| 0.6907        | 3.0   | 1182 | 1.8444          |
-| 0.7402        | 4.0   | 1576 | 1.8483          |
-| 0.6788        | 5.0   | 1970 | 1.8493          |
 ### Framework versions
-- PEFT 0.11.1
-- Transformers 4.44.0
-- Pytorch 2.3.1
-- Datasets 2.20.0
 - Tokenizers 0.19.1

 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.6392
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.561         | 1.0   | 394  | 1.5800          |
+| 0.5974        | 2.0   | 788  | 1.5667          |
+| 0.5388        | 3.0   | 1182 | 1.5681          |
+| 0.496         | 4.0   | 1576 | 1.5996          |
+| 0.4139        | 5.0   | 1970 | 1.6392          |
 ### Framework versions
+- PEFT 0.12.1.dev0
+- Transformers 4.45.0.dev0
+- Pytorch 2.4.1+cu121
+- Datasets 3.0.0
 - Tokenizers 0.19.1
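
In both versions of the card, the headline evaluation loss is the final-epoch value (1.8493 before, 1.6392 after), not the lowest value in the table. As a minimal sketch using only the numbers shown in this diff, the snippet below finds the epoch with the lowest validation loss for each run. The names `old_run`, `new_run`, and `best_epoch` are illustrative and do not come from the training code.

```python
# Validation loss per epoch, copied from the two README tables in this diff.
old_run = {1: 1.8750, 2: 1.8472, 3: 1.8444, 4: 1.8483, 5: 1.8493}
new_run = {1: 1.5800, 2: 1.5667, 3: 1.5681, 4: 1.5996, 5: 1.6392}


def best_epoch(val_loss):
    """Return (epoch, loss) for the lowest validation loss in the mapping."""
    epoch = min(val_loss, key=val_loss.get)
    return epoch, val_loss[epoch]


for name, run in (("old", old_run), ("new", new_run)):
    epoch, loss = best_epoch(run)
    final = run[max(run)]  # loss at the last epoch, which the card reports
    print(f"{name}: best epoch {epoch} ({loss:.4f}), "
          f"final {final:.4f}, gap {final - loss:.4f}")
```

In the new run the training loss keeps falling (0.5974 at epoch 2, 0.4139 at epoch 5) while the validation loss rises after epoch 2, which may indicate overfitting in the later epochs. Whether an earlier checkpoint was kept is not visible from this diff.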

adapter_config.json CHANGED

@@ -10,17 +10,18 @@
   "layers_pattern": null,
   "layers_to_transform": null,
   "loftq_config": {},
-  "lora_alpha": 32,
   "lora_dropout": 0.05,
   "megatron_config": null,
   "megatron_core": "megatron.core",
   "modules_to_save": null,
   "peft_type": "LORA",
-  "r": 16,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "lm_head"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

   "layers_pattern": null,
   "layers_to_transform": null,
   "loftq_config": {},
+  "lora_alpha": 16,
   "lora_dropout": 0.05,
   "megatron_config": null,
   "megatron_core": "megatron.core",
   "modules_to_save": null,
   "peft_type": "LORA",
+  "r": 8,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "gate_proj",
+    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
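
The config change halves both `lora_alpha` and `r` and moves the adapter from `lm_head` to `gate_proj` and `o_proj`. A rough sketch of what this means for the LoRA scaling factor follows. It assumes the standard `lora_alpha / r` scaling, i.e. that rank-stabilized LoRA is not enabled; that setting is not visible in this hunk. The dictionaries contain only the fields that changed, and `lora_scaling` is an illustrative helper, not a PEFT function.

```python
# Only the fields that differ between the two adapter_config.json versions.
old_cfg = {"lora_alpha": 32, "r": 16, "target_modules": ["lm_head"]}
new_cfg = {"lora_alpha": 16, "r": 8, "target_modules": ["gate_proj", "o_proj"]}


def lora_scaling(cfg):
    """Standard LoRA scaling applied to the low-rank update: alpha / r.

    Assumption: rank-stabilized scaling (alpha / sqrt(r)) is not in use.
    """
    return cfg["lora_alpha"] / cfg["r"]


for name, cfg in (("old", old_cfg), ("new", new_cfg)):
    print(f"{name}: r={cfg['r']}, alpha={cfg['lora_alpha']}, "
          f"scaling={lora_scaling(cfg)}, targets={cfg['target_modules']}")
```

Under that assumption the scaling factor is the same in both versions, so the effective differences between the runs are the lower rank and the different target modules.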

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f8dbd0f2aff0fefbdb352d1c94a22ee1aa6bcb08ab4181e3b6b5385d5e2bd852
-size 1054925248

 version https://git-lfs.github.com/spec/v1
+oid sha256:653c17a00f1b1b83258a5941869da3e55c043965851913b1b193ffe68a4f9b33
+size 2128659168
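
The two Git LFS pointers above record only a hash and a byte count. As a small sketch, the snippet below parses the pointer text and compares the two sizes. `parse_lfs_pointer` is a hypothetical helper written for this example, and the pointer strings are copied from this diff.

```python
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f8dbd0f2aff0fefbdb352d1c94a22ee1aa6bcb08ab4181e3b6b5385d5e2bd852
size 1054925248
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:653c17a00f1b1b83258a5941869da3e55c043965851913b1b193ffe68a4f9b33
size 2128659168
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])  # size is a byte count
    return fields


old_ptr = parse_lfs_pointer(OLD_POINTER)
new_ptr = parse_lfs_pointer(NEW_POINTER)
ratio = new_ptr["size"] / old_ptr["size"]
print(f"old: {old_ptr['size']} bytes, new: {new_ptr['size']} bytes, ratio {ratio:.2f}")
```

The adapter file roughly doubles in size between the two commits. The diff alone does not show which tensors account for the increase.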

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4cc34c9517b435487c2b71647d20346b334d2cf646c60452cb01e4ee0b381927
 size 5432

 version https://git-lfs.github.com/spec/v1
+oid sha256:b3f9c4081453a7ff67aa451b1b5484e84eb9bda7980f390466844f515273b467
 size 5432