EllieS committed on
Commit c38fd08
1 Parent(s): 0826bd0

Model save

README.md ADDED
@@ -0,0 +1,81 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - trl
+ - dpo
+ - generated_from_trainer
+ base_model: alignment-handbook/zephyr-7b-sft-full
+ model-index:
+ - name: zephyr-7b-dpo-selfgen
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # zephyr-7b-dpo-selfgen
+
+ This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unspecified dataset.
+ It achieves the following results on the evaluation set (a sketch of how these metrics are defined follows the list):
+ - Loss: 0.0000
+ - Rewards/chosen: -6.6466
+ - Rewards/rejected: -19.5106
+ - Rewards/accuracies: 1.0
+ - Rewards/margins: 12.8639
+ - Logps/rejected: -1996.6047
+ - Logps/chosen: -731.7379
+ - Logits/rejected: -2.0588
+ - Logits/chosen: -2.4883
+
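+ For readers of the metric names above, a brief sketch (not taken from this repository) of how TRL's `DPOTrainer` typically derives the reward metrics from policy and reference log-probabilities; the `beta` value and the reference log-probs in the usage example are assumptions, since neither is recorded in this card. `Logps/*` are the mean policy log-probabilities of the chosen and rejected completions, and `Logits/*` the corresponding mean logits.
+
+ ```python
+ import torch
+
+ beta = 0.1  # assumed DPO temperature; not recorded in this card
+
+ def dpo_reward_metrics(policy_chosen_logps, policy_rejected_logps,
+                        ref_chosen_logps, ref_rejected_logps):
+     """Reward metrics in the form TRL's DPOTrainer usually logs them."""
+     chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
+     rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
+     return {
+         "rewards/chosen": chosen_rewards.mean().item(),
+         "rewards/rejected": rejected_rewards.mean().item(),
+         "rewards/accuracies": (chosen_rewards > rejected_rewards).float().mean().item(),
+         "rewards/margins": (chosen_rewards - rejected_rewards).mean().item(),
+     }
+
+ # Tiny usage example with a single preference pair; the reference log-probs are
+ # illustrative placeholders picked so the outputs roughly match the eval numbers above.
+ metrics = dpo_reward_metrics(
+     policy_chosen_logps=torch.tensor([-731.7]),
+     policy_rejected_logps=torch.tensor([-1996.6]),
+     ref_chosen_logps=torch.tensor([-665.2]),
+     ref_rejected_logps=torch.tensor([-1801.5]),
+ )
+ print(metrics)
+ ```
+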
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed. A minimal usage sketch is given below in the meantime.
+
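+ A minimal usage sketch, not part of the original card: it assumes the LoRA adapter in this repository is published as `EllieS/zephyr-7b-dpo-selfgen` and that the base model's tokenizer ships a chat template.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ base_id = "alignment-handbook/zephyr-7b-sft-full"  # base model named in this card
+ adapter_id = "EllieS/zephyr-7b-dpo-selfgen"         # assumed repo id for this adapter
+
+ tokenizer = AutoTokenizer.from_pretrained(base_id)
+ base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
+
+ # Attach the DPO-trained LoRA weights stored in adapter_model.safetensors.
+ model = PeftModel.from_pretrained(base_model, adapter_id)
+ model.eval()
+
+ messages = [{"role": "user", "content": "Summarize what DPO training does."}]
+ input_ids = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(base_model.device)
+
+ with torch.no_grad():
+     output = model.generate(input_ids=input_ids, max_new_tokens=128, do_sample=False)
+ print(tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True))
+ ```
+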
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (a configuration sketch follows the list):
+ - learning_rate: 5e-06
+ - train_batch_size: 2
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 2
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 8
+ - total_eval_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 3.0
+
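+ A hedged reconstruction of how such a run is usually wired up with TRL, not taken from this repository: the dataset, LoRA settings, and DPO `beta` are not recorded in this card, so they appear below as labelled placeholders; only the hyperparameters listed above are copied from the card.
+
+ ```python
+ from transformers import TrainingArguments
+ from peft import LoraConfig
+ from trl import DPOTrainer  # API as of TRL releases contemporary with the versions below
+
+ training_args = TrainingArguments(
+     output_dir="zephyr-7b-dpo-selfgen",  # model name from this card
+     # hyperparameters copied from the list above
+     learning_rate=5e-6,
+     per_device_train_batch_size=2,
+     per_device_eval_batch_size=8,
+     gradient_accumulation_steps=2,
+     num_train_epochs=3.0,
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.1,
+     seed=42,
+ )
+
+ trainer = DPOTrainer(
+     model=base_model,                # SFT policy model (see the usage sketch above)
+     ref_model=None,                  # with a PEFT adapter, TRL derives the reference model
+     args=training_args,
+     beta=0.1,                        # assumption: beta is not recorded in this card
+     train_dataset=train_dataset,     # placeholder: preference pairs (prompt, chosen, rejected)
+     eval_dataset=eval_dataset,       # placeholder
+     tokenizer=tokenizer,
+     peft_config=LoraConfig(task_type="CAUSAL_LM"),  # placeholder: LoRA settings not recorded
+ )
+ trainer.train()
+ ```
+
+ Run under two processes (e.g. via `accelerate launch`), the per-device batch size of 2 with 2 gradient-accumulation steps reproduces the total train batch size of 8 listed above.
+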
+ ### Training results
+
+ | Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
+ |:-------------:|:-----:|:-----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
+ | 0.0241 | 0.42 | 7000 | -2.8328 | -2.8312 | -143.5124 | -856.1008 | 0.0101 | 1.0 | -0.7644 | 7.3411 | -8.1055 |
+ | 0.0001 | 0.83 | 14000 | -2.3450 | -1.9435 | -714.5292 | -1741.5647 | 0.0002 | 1.0 | -6.4745 | 10.4856 | -16.9602 |
+ | 0.0003 | 1.25 | 21000 | -2.4293 | -2.0264 | -695.5377 | -1973.5151 | 0.0001 | 1.0 | -6.2846 | 12.9950 | -19.2797 |
+ | 0.0 | 1.67 | 28000 | -2.5393 | -2.1793 | -619.2334 | -1821.8682 | 0.0001 | 1.0 | -5.5216 | 12.2416 | -17.7632 |
+ | 0.0001 | 2.09 | 35000 | -2.4633 | -1.9800 | -817.4478 | -2071.8862 | 0.0000 | 1.0 | -7.5037 | 12.7596 | -20.2634 |
+ | 0.0 | 2.5 | 42000 | -2.4883 | -2.0593 | -730.7642 | -2000.8484 | 0.0000 | 1.0 | -6.6369 | 12.9161 | -19.5530 |
+ | 0.0001 | 2.92 | 49000 | -2.4895 | -2.0591 | -732.9475 | -1999.9326 | 0.0000 | 1.0 | -6.6587 | 12.8851 | -19.5438 |
+
+ ### Framework versions
+
+ - PEFT 0.7.1
+ - Transformers 4.36.2
+ - Pytorch 2.1.2+cu121
+ - Datasets 2.14.6
+ - Tokenizers 0.15.2
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fed9feb9660c79541e4f7d840b782f1007830b67bfe2e0c1bc0e52620e9a8af1
+ oid sha256:b3c4aba4f624a5d2286c1af771b09fbf8375644a29a54e000ce1707bb811672a
  size 83946192
all_results.json ADDED
@@ -0,0 +1,21 @@
+ {
+     "epoch": 3.0,
+     "eval_logits/chosen": -2.4882912635803223,
+     "eval_logits/rejected": -2.058779001235962,
+     "eval_logps/chosen": -731.7378540039062,
+     "eval_logps/rejected": -1996.604736328125,
+     "eval_loss": 3.1195009796647355e-05,
+     "eval_rewards/accuracies": 1.0,
+     "eval_rewards/chosen": -6.646634101867676,
+     "eval_rewards/margins": 12.863931655883789,
+     "eval_rewards/rejected": -19.51056480407715,
+     "eval_runtime": 4.5785,
+     "eval_samples": 5,
+     "eval_samples_per_second": 1.092,
+     "eval_steps_per_second": 0.218,
+     "train_loss": 0.0,
+     "train_runtime": 0.0617,
+     "train_samples": 134157,
+     "train_samples_per_second": 6527181.256,
+     "train_steps_per_second": 815915.902
+ }
eval_results.json ADDED
@@ -0,0 +1,16 @@
+ {
+     "epoch": 3.0,
+     "eval_logits/chosen": -2.4882912635803223,
+     "eval_logits/rejected": -2.058779001235962,
+     "eval_logps/chosen": -731.7378540039062,
+     "eval_logps/rejected": -1996.604736328125,
+     "eval_loss": 3.1195009796647355e-05,
+     "eval_rewards/accuracies": 1.0,
+     "eval_rewards/chosen": -6.646634101867676,
+     "eval_rewards/margins": 12.863931655883789,
+     "eval_rewards/rejected": -19.51056480407715,
+     "eval_runtime": 4.5785,
+     "eval_samples": 5,
+     "eval_samples_per_second": 1.092,
+     "eval_steps_per_second": 0.218
+ }
runs/Feb17_06-00-13_586cb8b6da8c/events.out.tfevents.1708149717.586cb8b6da8c.6725.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c7b8ba58c5d9ea0b19acc17b6306d60366fcb53f8d7849a417b11cfefdda48a6
- size 3128162
+ oid sha256:fdc8ce747bcf9ccb9ed38169256ecfcba6ef64db63e3fea28274482c57b72d4b
+ size 3237635
runs/Feb19_15-17-29_586cb8b6da8c/events.out.tfevents.1708355952.586cb8b6da8c.7306.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8e4ef1da8969e62e00f02ffb26062114a618392759c5e4889d8d11710778dcf7
+ size 4922
runs/Feb19_15-21-21_586cb8b6da8c/events.out.tfevents.1708356184.586cb8b6da8c.7413.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7e8a86ccc5830948f3ab95b9d739ea7beaa4137f8a503b7f0b3598bea7d661a9
+ size 4922
runs/Feb19_15-21-21_586cb8b6da8c/events.out.tfevents.1708356189.586cb8b6da8c.7413.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a704804a70eb9a16ed585b9bf3634cf4df63d3d3008e5e18a8dd011691429b2b
+ size 841
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+     "epoch": 3.0,
+     "train_loss": 0.0,
+     "train_runtime": 0.0617,
+     "train_samples": 134157,
+     "train_samples_per_second": 6527181.256,
+     "train_steps_per_second": 815915.902
+ }
trainer_state.json ADDED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:341f4436f67e1ed83a11ac2b26478385b031fe4ce66193ac7d0f2c822f5e5bb5
+ oid sha256:968c952e8a64d00415def5b84eb67205d6d93d788281eac4b631b0ef3a5f9357
  size 4856