NanQiangHF committed
Commit 7bd5125
1 Parent(s): c878da3

NanQiangHF/llama3_8b_instruct_BWRM
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3034
+- Loss: 0.2710
 
 ## Model description
 
@@ -35,7 +35,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.0003
+- learning_rate: 0.0005
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
@@ -49,22 +49,22 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 0.7965 | 0.1840 | 20 | 0.5969 |
-| 0.6001 | 0.3680 | 40 | 0.5778 |
-| 0.5739 | 0.5520 | 60 | 0.5419 |
-| 0.5411 | 0.7361 | 80 | 0.4995 |
-| 0.5147 | 0.9201 | 100 | 0.4758 |
-| 0.45 | 1.1041 | 120 | 0.4148 |
-| 0.4145 | 1.2881 | 140 | 0.4171 |
-| 0.4011 | 1.4721 | 160 | 0.3753 |
-| 0.371 | 1.6561 | 180 | 0.4154 |
-| 0.3702 | 1.8401 | 200 | 0.3424 |
-| 0.3438 | 2.0242 | 220 | 0.3332 |
-| 0.3298 | 2.2082 | 240 | 0.3231 |
-| 0.3185 | 2.3922 | 260 | 0.3174 |
-| 0.3127 | 2.5762 | 280 | 0.3130 |
-| 0.3073 | 2.7602 | 300 | 0.3060 |
-| 0.3033 | 2.9442 | 320 | 0.3034 |
+| 0.8058 | 0.1840 | 20 | 0.5918 |
+| 0.5986 | 0.3680 | 40 | 0.5643 |
+| 0.5513 | 0.5520 | 60 | 0.5113 |
+| 0.5039 | 0.7361 | 80 | 0.4433 |
+| 0.4539 | 0.9201 | 100 | 0.4424 |
+| 0.4083 | 1.1041 | 120 | 0.4024 |
+| 0.3823 | 1.2881 | 140 | 0.3805 |
+| 0.3644 | 1.4721 | 160 | 0.3400 |
+| 0.336 | 1.6561 | 180 | 0.3206 |
+| 0.3314 | 1.8401 | 200 | 0.3185 |
+| 0.3105 | 2.0242 | 220 | 0.3078 |
+| 0.2929 | 2.2082 | 240 | 0.2948 |
+| 0.2855 | 2.3922 | 260 | 0.2831 |
+| 0.2787 | 2.5762 | 280 | 0.2821 |
+| 0.2717 | 2.7602 | 300 | 0.2757 |
+| 0.2703 | 2.9442 | 320 | 0.2710 |
 
 
 ### Framework versions
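As a quick sanity check on the README's training table: the epoch column advances by roughly 0.1840 every 20 logged steps with a train batch size of 32. Assuming no gradient accumulation (an assumption — the hyperparameter list does not mention it), this lets us back out the approximate number of training examples:

```python
# Back out steps-per-epoch and dataset size from the README's log table.
# Assumes effective batch size == train_batch_size (no gradient
# accumulation), which the hyperparameter list does not state explicitly.
train_batch_size = 32          # from the hyperparameter list
steps_per_log = 20             # "Step" column increment
epoch_per_log = 0.1840         # "Epoch" column increment

steps_per_epoch = steps_per_log / epoch_per_log
approx_examples = steps_per_epoch * train_batch_size
print(round(steps_per_epoch))   # ~109 optimizer steps per epoch
print(round(approx_examples))   # ~3478 training examples
```

The same arithmetic applies to both the old and new runs, since only the learning rate changed between them.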
adapter_config.json CHANGED
@@ -25,8 +25,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "v_proj",
-    "q_proj"
+    "q_proj",
+    "v_proj"
   ],
   "task_type": null,
   "use_dora": false,
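Note that this adapter_config.json change only reorders `target_modules`. PEFT matches LoRA target modules by name, irrespective of list order, so both revisions inject adapters into the same `q_proj` and `v_proj` layers; the edit is cosmetic. A one-line check of that equivalence (pure Python, no PEFT required):

```python
# The diff swaps the order of target_modules; as an unordered selection
# of module names, the before and after configs are identical.
before = ["v_proj", "q_proj"]  # old adapter_config.json
after = ["q_proj", "v_proj"]   # new adapter_config.json
print(set(before) == set(after))  # True
```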
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d944ffb20ad72738fe6fb132bcbeb89f2037e49569828d98a760aa9c4faa1f40
+oid sha256:a723410b8b5283b2ca0dfd2497b87c4c0bbaa9f452f6faa5a6b01de93dcf2c64
 size 6849208
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7d5ac3b902f27c7b4b9913bc96bbb157a2688a50a705bb834dd6179b666b8445
-size 5304
+oid sha256:56e6500f2b35510d5ad2fde91222b70fdc3ba67e3b05a808223e14ee7b01385f
+size 5304