silviasapora committed
Commit 9dc97f6 · verified · 1 Parent(s): df52d15

Model save

README.md CHANGED
@@ -3,16 +3,9 @@ library_name: transformers
 license: gemma
 base_model: google/gemma-7b
 tags:
-- alignment-handbook
 - trl
 - orpo
 - generated_from_trainer
-- trl
-- orpo
-- alignment-handbook
-- generated_from_trainer
-datasets:
-- argilla/dpo-mix-7k
 model-index:
 - name: gemma-7b-orpo-low-quality
   results: []
@@ -23,20 +16,20 @@ should probably proofread and complete it, then remove this comment. -->
 
 # gemma-7b-orpo-low-quality
 
-This model is a fine-tuned version of [google/gemma-7b](https://huggingface.co/google/gemma-7b) on the argilla/dpo-mix-7k dataset.
+This model is a fine-tuned version of [google/gemma-7b](https://huggingface.co/google/gemma-7b) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.6395
-- Rewards/chosen: -0.0601
-- Rewards/rejected: -0.0755
-- Rewards/accuracies: 0.6029
-- Rewards/margins: 0.0153
-- Logps/rejected: -1.5091
-- Logps/chosen: -1.2026
-- Logits/rejected: 275.9735
-- Logits/chosen: 286.3763
-- Nll Loss: 1.5847
-- Log Odds Ratio: -0.6702
-- Log Odds Chosen: 0.4438
+- Loss: 1.5398
+- Rewards/chosen: -0.0540
+- Rewards/rejected: -0.0625
+- Rewards/accuracies: 0.5396
+- Rewards/margins: 0.0085
+- Logps/rejected: -1.2503
+- Logps/chosen: -1.0803
+- Logits/rejected: 271.8756
+- Logits/chosen: 300.6891
+- Nll Loss: 1.4724
+- Log Odds Ratio: -0.6945
+- Log Odds Chosen: 0.2937
 
 ## Model description
 
@@ -60,10 +53,10 @@ The following hyperparameters were used during training:
 - eval_batch_size: 1
 - seed: 42
 - distributed_type: multi-GPU
-- num_devices: 2
-- gradient_accumulation_steps: 8
+- num_devices: 4
+- gradient_accumulation_steps: 4
 - total_train_batch_size: 32
-- total_eval_batch_size: 2
+- total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: inverse_sqrt
 - lr_scheduler_warmup_steps: 100
@@ -73,14 +66,14 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
 |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
-| 1.4933 | 0.9976 | 157 | 1.4686 | -0.0501 | -0.0608 | 0.5776 | 0.0107 | -1.2166 | -1.0023 | 307.1602 | 318.2524 | 1.4127 | -0.6558 | 0.3240 |
-| 1.036 | 1.9952 | 314 | 1.4194 | -0.0493 | -0.0612 | 0.5668 | 0.0118 | -1.2231 | -0.9867 | 302.5974 | 312.9305 | 1.3670 | -0.6609 | 0.3487 |
-| 0.56 | 2.9929 | 471 | 1.6395 | -0.0601 | -0.0755 | 0.6029 | 0.0153 | -1.5091 | -1.2026 | 275.9735 | 286.3763 | 1.5847 | -0.6702 | 0.4438 |
+| 1.441 | 0.9955 | 167 | 1.4762 | -0.0510 | -0.0574 | 0.5324 | 0.0064 | -1.1485 | -1.0204 | 290.1581 | 318.9965 | 1.4310 | -0.6990 | 0.1934 |
+| 1.0908 | 1.9970 | 335 | 1.4250 | -0.0497 | -0.0576 | 0.5324 | 0.0079 | -1.1528 | -0.9950 | 285.8206 | 314.6779 | 1.3697 | -0.6970 | 0.2360 |
+| 0.5724 | 2.9866 | 501 | 1.5398 | -0.0540 | -0.0625 | 0.5396 | 0.0085 | -1.2503 | -1.0803 | 271.8756 | 300.6891 | 1.4724 | -0.6945 | 0.2937 |
 
 
 ### Framework versions
 
 - Transformers 4.44.2
-- Pytorch 2.3.1+cu121
-- Datasets 2.20.0
+- Pytorch 2.4.0+cu121
+- Datasets 3.0.0
 - Tokenizers 0.19.1
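Two quick consistency checks on the updated card, sketched in Python below. The per-device batch size of 2 and the odds-ratio weight `beta = 0.1` (TRL's `ORPOConfig` default) are assumptions; neither is stated in the card itself.

```python
# Sanity checks on the updated card's numbers (a sketch; per-device batch
# size and the ORPO weight beta are assumptions, not stated in the card).

# Effective batch size: num_devices x gradient_accumulation_steps x
# per-device batch. The card lists 4 x 4 with total_train_batch_size 32,
# implying a per-device train batch size of 2.
num_devices = 4
gradient_accumulation_steps = 4
total_train_batch_size = 32
per_device = total_train_batch_size // (num_devices * gradient_accumulation_steps)
assert per_device == 2

# ORPO's loss is approximately NLL minus a weighted log-odds-ratio term:
# loss ~ nll_loss - beta * log_odds_ratio.
beta = 0.1  # assumed TRL default, not confirmed by the card
nll_loss, log_odds_ratio = 1.4724, -0.6945
print(nll_loss - beta * log_odds_ratio)  # ~1.5418, close to the reported 1.5398
```

The small gap between the reconstructed value and the reported eval loss (1.5398) is consistent with batch-level averaging rather than averaging of the already-aggregated metrics.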
all_results.json CHANGED
@@ -1,25 +1,9 @@
 {
-  "epoch": 2.9928514694201747,
-  "eval_log_odds_chosen": 0.44380733370780945,
-  "eval_log_odds_ratio": -0.6702221632003784,
-  "eval_logits/chosen": 286.3763122558594,
-  "eval_logits/rejected": 275.9735412597656,
-  "eval_logps/chosen": -1.2025552988052368,
-  "eval_logps/rejected": -1.5090675354003906,
-  "eval_loss": 1.639459252357483,
-  "eval_nll_loss": 1.5846672058105469,
-  "eval_rewards/accuracies": 0.6028881072998047,
-  "eval_rewards/chosen": -0.060127776116132736,
-  "eval_rewards/margins": 0.015325604937970638,
-  "eval_rewards/rejected": -0.07545337826013565,
-  "eval_runtime": 278.5267,
-  "eval_samples": 553,
-  "eval_samples_per_second": 1.985,
-  "eval_steps_per_second": 0.995,
+  "epoch": 2.9865871833084947,
   "total_flos": 0.0,
-  "train_loss": 1.4771008792703066,
-  "train_runtime": 40012.5124,
-  "train_samples": 5034,
-  "train_samples_per_second": 0.377,
-  "train_steps_per_second": 0.012
+  "train_loss": 1.4594077459590402,
+  "train_runtime": 13816.0738,
+  "train_samples": 5364,
+  "train_samples_per_second": 1.165,
+  "train_steps_per_second": 0.036
 }
config.json CHANGED
@@ -24,6 +24,6 @@
   "rope_theta": 10000.0,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.44.2",
-  "use_cache": true,
+  "use_cache": false,
   "vocab_size": 256000
 }
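The only substantive config change flips `use_cache` to `false`, which `Trainer`-based runs commonly do when gradient checkpointing is enabled. For generation you would normally turn the KV cache back on. A minimal sketch, assuming the repo id from the commit author and model name:

```python
# Re-enable the KV cache for inference (a sketch; the repo id is assumed
# from the commit author and model name, adjust to the actual repository).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "silviasapora/gemma-7b-orpo-low-quality"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="bfloat16",  # matches the torch_dtype in config.json
    use_cache=True,          # overrides the saved "use_cache": false
)

inputs = tokenizer("The capital of France is", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```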
model-00001-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e8975fff46f104ed6824905d333347eed409fc7b575867d746e9e9f97e5cf75e
+oid sha256:dec52e755b8dc3619df2908f535b0555e2dc06427850ceaee163434c259b1fbd
 size 4995496656
model-00002-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:469d8954d355e58a7ac4f40647d3b4efefc8feb95a076b5963a4c22420231ac9
+oid sha256:ea059af29dbe1d13cd4663702981098306a719842c8b6daa22775a2d75c899b8
 size 4982953168
model-00003-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:54de348e47bc68e158a2fc77b941ca1122fe7648a57c5ff4283af6b7a74b0af3
+oid sha256:7b2900743ebfa120b1bd6c6aede29909b467ef078355948272c970ceeb044543
 size 4982953200
model-00004-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1b7bc0ca4c8e5d3903afe6f37e839c1a05496b97ff4caebc1125e0cdde717182
+oid sha256:3cac536ba4dd9bee8fa767d40d69a1aa96062853affb4442bc5f1f4549548ace
 size 2113988336
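All four weight-shard diffs change only the `oid` line of a Git LFS pointer; the sizes are identical across the two commits, as expected when the same architecture is retrained. A pointer file is just three `key value` lines, so it can be inspected without pulling the blob. The parser below is a hypothetical helper, not a library function:

```python
# Parse a Git LFS pointer file (a sketch; parse_lfs_pointer is a
# hypothetical helper, not part of any library).
def parse_lfs_pointer(text: str) -> dict:
    return dict(line.split(" ", 1) for line in text.strip().splitlines())

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:dec52e755b8dc3619df2908f535b0555e2dc06427850ceaee163434c259b1fbd
size 4995496656"""

fields = parse_lfs_pointer(pointer)
print(fields["oid"])   # sha256:dec52e75... (the only line these diffs change)
print(fields["size"])  # 4995496656 (unchanged between the two commits)
```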
runs/Sep18_14-15-33_65ecb96dba42/events.out.tfevents.1726669044.65ecb96dba42.785.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b2f3461f1d36bb6682beb0cecf57852989edf71b3f7433410ec30a381b9ffe32
+size 9629
runs/Sep18_19-37-56_65ecb96dba42/events.out.tfevents.1726688361.65ecb96dba42.41972.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b9c950d475a67eb227982d8fb066cad17bc944304c6950e2bd62bd31b9764b3c
+size 15544
runs/Sep18_20-46-35_65ecb96dba42/events.out.tfevents.1726692478.65ecb96dba42.1160.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:58c26bca4b1077540001f1170d8682c05ddde0aa550a9ce764cbdb21f1dc750c
+size 94110
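The three added `runs/...` files are TensorBoard event logs (also stored as LFS pointers). Once pulled, they can be read with TensorBoard's event accumulator. A sketch, where the scalar tag name is an assumption:

```python
# Read the added TensorBoard logs after `git lfs pull` (a sketch; the
# "train/loss" tag is an assumption, print ea.Tags() to see what exists).
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

ea = EventAccumulator("runs/Sep18_20-46-35_65ecb96dba42")
ea.Reload()
print(ea.Tags()["scalars"])             # tags actually present in the log
for event in ea.Scalars("train/loss"):  # assumed tag name
    print(event.step, event.value)
```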
train_results.json CHANGED
@@ -1,9 +1,9 @@
 {
-  "epoch": 2.9928514694201747,
+  "epoch": 2.9865871833084947,
   "total_flos": 0.0,
-  "train_loss": 1.4771008792703066,
-  "train_runtime": 40012.5124,
-  "train_samples": 5034,
-  "train_samples_per_second": 0.377,
-  "train_steps_per_second": 0.012
+  "train_loss": 1.4594077459590402,
+  "train_runtime": 13816.0738,
+  "train_samples": 5364,
+  "train_samples_per_second": 1.165,
+  "train_steps_per_second": 0.036
 }
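The new throughput numbers are internally consistent, which is worth verifying given the card's own warning that it was auto-generated. A sketch, taking the final step count of 501 from the README's training-log table:

```python
# Consistency check on train_results.json (a sketch; the step count of 501
# comes from the README's training-log table, not from this file).
train_samples = 5364
num_epochs = 3
train_runtime = 13816.0738  # seconds
total_steps = 501

print(round(train_samples * num_epochs / train_runtime, 3))  # 1.165, as reported
print(round(total_steps / train_runtime, 3))                 # 0.036, as reported
```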
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:020dd97bf10615e92a193c0fe52d389ece35c255e11bf61e10e432a5f60343c8
+oid sha256:7652e215bce2c4fe6a8a69c60dc1736d525ab7940382c277bdbc4dac5af83a39
 size 6776
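`training_args.bin` is a pickled `TrainingArguments` object (for an ORPO run, typically TRL's `ORPOConfig` subclass), so its new hash simply reflects the changed hyperparameters. A sketch for inspecting it locally; note that it loads via pickle, so only open files from a source you trust:

```python
# Inspect the saved training arguments (a sketch; the file is a pickle,
# so this executes arbitrary code if the file is untrusted).
import torch

args = torch.load("training_args.bin", weights_only=False)
print(type(args).__name__)  # e.g. ORPOConfig
print(args.learning_rate, args.gradient_accumulation_steps)  # expect accum = 4
```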