LeeSB committed on
Commit 0ee6e36
1 Parent(s): 0eeda98

Model save

Files changed (4)
  1. README.md +14 -12
  2. all_results.json +5 -5
  3. train_results.json +5 -5
  4. trainer_state.json +0 -0
README.md CHANGED
@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.6710
- - Rewards/chosen: 0.0193
- - Rewards/rejected: -0.0357
- - Rewards/accuracies: 0.5900
- - Rewards/margins: 0.0550
- - Logps/rejected: -266.8149
- - Logps/chosen: -283.2045
- - Logits/rejected: -2.5703
- - Logits/chosen: -2.7049
+ - Loss: 0.3553
+ - Rewards/chosen: -6.3145
+ - Rewards/rejected: -9.0890
+ - Rewards/accuracies: 0.8223
+ - Rewards/margins: 2.7746
+ - Logps/rejected: -1238.5492
+ - Logps/chosen: -1017.7702
+ - Logits/rejected: -2.0087
+ - Logits/chosen: -2.0557

  ## Model description

@@ -45,13 +45,15 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 5e-07
- - train_batch_size: 4
+ - learning_rate: 5e-06
+ - train_batch_size: 2
  - eval_batch_size: 8
  - seed: 42
  - distributed_type: multi-GPU
+ - num_devices: 2
  - gradient_accumulation_steps: 4
  - total_train_batch_size: 16
+ - total_eval_batch_size: 16
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_ratio: 0.1
@@ -61,7 +63,7 @@ The following hyperparameters were used during training:

  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 0.4786 | 1.0 | 877 | 0.6710 | 0.0193 | -0.0357 | 0.5900 | 0.0550 | -266.8149 | -283.2045 | -2.5703 | -2.7049 |
+ | 0.3948 | 1.0 | 9121 | 0.3553 | -6.3145 | -9.0890 | 0.8223 | 2.7746 | -1238.5492 | -1017.7702 | -2.0087 | -2.0557 |


  ### Framework versions
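
The updated evaluation numbers and hyperparameters in this diff are internally consistent. The sketch below is a quick consistency check rather than part of the training code; it assumes the usual TRL/DPO convention that Rewards/margins equals Rewards/chosen minus Rewards/rejected (the card does not name the trainer), and the standard rule that the total train batch size is the per-device batch size times the number of devices times the gradient accumulation steps.

```python
# Quick consistency check of the numbers reported in the updated README above.
# Assumption: Rewards/margins follows the usual TRL/DPO definition
# (mean chosen reward minus mean rejected reward); the card itself does not say so.

rewards_chosen = -6.3145
rewards_rejected = -9.0890

margin = rewards_chosen - rewards_rejected
print(f"{margin:.4f}")            # 2.7745, matches the reported 2.7746 up to rounding

# Effective batch sizes implied by the hyperparameter list.
train_batch_size = 2              # per device
eval_batch_size = 8               # per device
num_devices = 2
gradient_accumulation_steps = 4

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices
print(total_train_batch_size)     # 16, as listed
print(total_eval_batch_size)      # 16, as listed
```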
all_results.json CHANGED
@@ -1,8 +1,8 @@
  {
      "epoch": 1.0,
-     "train_loss": 0.542221034538243,
-     "train_runtime": 10885.06,
-     "train_samples": 14031,
-     "train_samples_per_second": 1.289,
-     "train_steps_per_second": 0.081
+     "train_loss": 0.35941912531329934,
+     "train_runtime": 141935.9761,
+     "train_samples": 145937,
+     "train_samples_per_second": 1.028,
+     "train_steps_per_second": 0.064
  }
train_results.json CHANGED
@@ -1,8 +1,8 @@
  {
      "epoch": 1.0,
-     "train_loss": 0.542221034538243,
-     "train_runtime": 10885.06,
-     "train_samples": 14031,
-     "train_samples_per_second": 1.289,
-     "train_steps_per_second": 0.081
+     "train_loss": 0.35941912531329934,
+     "train_runtime": 141935.9761,
+     "train_samples": 145937,
+     "train_samples_per_second": 1.028,
+     "train_steps_per_second": 0.064
  }
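
The new throughput figures in all_results.json and train_results.json also line up with the step count in the README table; a minimal check, assuming a single epoch over the full training set (the 9121 steps and total batch size of 16 come from the README diff above):

```python
# Relating train_results.json / all_results.json to the README training table above.

train_runtime = 141935.9761              # seconds
train_samples = 145937
optimizer_steps = 9121                   # from the README results table
total_train_batch_size = 16              # from the hyperparameter list

print(round(train_samples / train_runtime, 3))      # 1.028 -> train_samples_per_second
print(round(optimizer_steps / train_runtime, 3))    # 0.064 -> train_steps_per_second
print(optimizer_steps * total_train_batch_size)     # 145936, i.e. ~train_samples (145937)
```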
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff