Model save

Files changed (4)

README.md +14 -12
all_results.json +5 -5
train_results.json +5 -5
trainer_state.json +0 -0

README.md CHANGED

@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6710
-- Rewards/chosen: 0.0193
-- Rewards/rejected: -0.0357
-- Rewards/accuracies: 0.5900
-- Rewards/margins: 0.0550
-- Logps/rejected: -266.8149
-- Logps/chosen: -283.2045
-- Logits/rejected: -2.5703
-- Logits/chosen: -2.7049
 ## Model description
@@ -45,13 +45,15 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5e-07
-- train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
 - distributed_type: multi-GPU
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
@@ -61,7 +63,7 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.4786        | 1.0   | 877  | 0.6710          | 0.0193         | -0.0357          | 0.5900             | 0.0550          | -266.8149      | -283.2045    | -2.5703         | -2.7049       |
 ### Framework versions

 This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.3553
+- Rewards/chosen: -6.3145
+- Rewards/rejected: -9.0890
+- Rewards/accuracies: 0.8223
+- Rewards/margins: 2.7746
+- Logps/rejected: -1238.5492
+- Logps/chosen: -1017.7702
+- Logits/rejected: -2.0087
+- Logits/chosen: -2.0557
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5e-06
+- train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
 - distributed_type: multi-GPU
+- num_devices: 2
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 16
+- total_eval_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.3948        | 1.0   | 9121 | 0.3553          | -6.3145        | -9.0890          | 0.8223             | 2.7746          | -1238.5492     | -1017.7702   | -2.0087         | -2.0557       |
 ### Framework versions
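
As a quick consistency check on the updated README values, the sketch below recomputes the total train batch size, the number of optimizer steps in one epoch, and the reward margin from the numbers in this diff. The step formula (batches per epoch rounded up, then integer-divided by the accumulation steps) is an assumption about how the trainer counts steps, the sample count is taken from all_results.json, and the function names are illustrative only.

```python
import math


def effective_batch_size(per_device_batch, num_devices, grad_accum_steps):
    """Samples consumed per optimizer step across all devices."""
    return per_device_batch * num_devices * grad_accum_steps


def steps_per_epoch(num_samples, per_device_batch, num_devices, grad_accum_steps):
    """Optimizer steps in one epoch.

    Assumption: the number of batches is rounded up and the number of
    optimizer steps is rounded down, with a minimum of one step.
    """
    batches = math.ceil(num_samples / (per_device_batch * num_devices))
    return max(batches // grad_accum_steps, 1)


# Values from the "+" side of the README diff and from all_results.json.
total = effective_batch_size(per_device_batch=2, num_devices=2, grad_accum_steps=4)
steps = steps_per_epoch(145937, per_device_batch=2, num_devices=2, grad_accum_steps=4)

# Margin recomputed as chosen minus rejected from the rounded README values.
margin = round(-6.3145 - (-9.0890), 4)

print(total, steps, margin)
```

Under these assumptions the batch size and step count agree with the reported total_train_batch_size of 16 and Step 9121. The recomputed margin differs from the reported 2.7746 by 0.0001, which is plausibly a rounding effect, since each reported metric is rounded separately.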

all_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 0.542221034538243,
-    "train_runtime": 10885.06,
-    "train_samples": 14031,
-    "train_samples_per_second": 1.289,
-    "train_steps_per_second": 0.081
 }

 {
     "epoch": 1.0,
+    "train_loss": 0.35941912531329934,
+    "train_runtime": 141935.9761,
+    "train_samples": 145937,
+    "train_samples_per_second": 1.028,
+    "train_steps_per_second": 0.064
 }
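
The throughput fields in the new results can be cross-checked against the runtime. This is a minimal sketch that assumes both rates are simple ratios over train_runtime rounded to three decimals; the step count of 9121 comes from the README training table, and the helper name is hypothetical.

```python
def throughput(count, runtime_seconds):
    """Rate per second, rounded to three decimals as in the JSON."""
    return round(count / runtime_seconds, 3)


# Values from the "+" side of all_results.json.
runtime = 141935.9761
train_samples = 145937
train_steps = 9121  # from the README training results table

samples_per_second = throughput(train_samples, runtime)
steps_per_second = throughput(train_steps, runtime)

print(samples_per_second, steps_per_second)
```

Both ratios reproduce the reported train_samples_per_second of 1.028 and train_steps_per_second of 0.064.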

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 0.542221034538243,
-    "train_runtime": 10885.06,
-    "train_samples": 14031,
-    "train_samples_per_second": 1.289,
-    "train_steps_per_second": 0.081
 }

 {
     "epoch": 1.0,
+    "train_loss": 0.35941912531329934,
+    "train_runtime": 141935.9761,
+    "train_samples": 145937,
+    "train_samples_per_second": 1.028,
+    "train_steps_per_second": 0.064
 }

trainer_state.json CHANGED

The diff for this file is too large to render.