Model save

Files changed (6)

README.md +120 -0
adapter_model.safetensors +1 -1
all_results.json +8 -0
runs/Jul16_03-11-50_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1721100015.notebook-deployment-48-7d9b6c99-p5kv4.69846.0 +2 -2
train_results.json +8 -0
trainer_state.json +0 -0

README.md ADDED

	@@ -0,0 +1,120 @@

+---
+base_model: alignment-handbook/zephyr-7b-sft-full
+library_name: peft
+license: apache-2.0
+tags:
+- trl
+- dpo
+- generated_from_trainer
+model-index:
+- name: zephyr-dpop-qlora-uf-ours-uffull-5e-7
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# zephyr-dpop-qlora-uf-ours-uffull-5e-7
+This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.6825
+- Positive Losses: 0.1480
+- Dpo Losses: 0.6646
+- Rewards/chosen: 0.1662
+- Rewards/rejected: 0.1036
+- Rewards/accuracies: 0.6810
+- Rewards/margins: 0.0626
+- Rewards/margins Max: 0.2720
+- Rewards/margins Min: -0.1174
+- Rewards/margins Std: 0.1305
+- Logps/rejected: -255.4913
+- Logps/chosen: -267.8348
+- Logits/rejected: -2.7189
+- Logits/chosen: -2.7542
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-07
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
+- total_eval_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Positive Losses | Dpo Losses | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:-----:|:----:|:---------------:|:---------------:|:----------:|:--------------:|:----------------:|:------------------:|:---------------:|:-------------------:|:-------------------:|:-------------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.694         | 0.02  | 100  | 0.6937          | 0.0064          | 0.6931     | 0.0049         | 0.0049           | 0.5075             | 0.0001          | 0.0049              | -0.0046             | 0.0032              | -265.3661      | -283.9625    | -2.7648         | -2.8001       |
+| 0.6922        | 0.05  | 200  | 0.6930          | 0.0035          | 0.6926     | 0.0082         | 0.0071           | 0.5875             | 0.0011          | 0.0082              | -0.0056             | 0.0046              | -265.1425      | -283.6357    | -2.7650         | -2.8002       |
+| 0.692         | 0.07  | 300  | 0.6921          | 0.0052          | 0.6914     | 0.0190         | 0.0154           | 0.6175             | 0.0035          | 0.0195              | -0.0103             | 0.0099              | -264.3096      | -282.5598    | -2.7662         | -2.8012       |
+| 0.6914        | 0.1   | 400  | 0.6907          | 0.0081          | 0.6896     | 0.0324         | 0.0252           | 0.6435             | 0.0072          | 0.0364              | -0.0176             | 0.0181              | -263.3349      | -281.2179    | -2.7620         | -2.7972       |
+| 0.6867        | 0.12  | 500  | 0.6887          | 0.0124          | 0.6868     | 0.0581         | 0.0451           | 0.6360             | 0.0130          | 0.0654              | -0.0313             | 0.0323              | -261.3455      | -278.6435    | -2.7580         | -2.7932       |
+| 0.6903        | 0.14  | 600  | 0.6869          | 0.0213          | 0.6837     | 0.0696         | 0.0499           | 0.6565             | 0.0197          | 0.0949              | -0.0434             | 0.0461              | -260.8595      | -277.4952    | -2.7576         | -2.7926       |
+| 0.6828        | 0.17  | 700  | 0.6855          | 0.0302          | 0.6813     | 0.0840         | 0.0592           | 0.6595             | 0.0248          | 0.1199              | -0.0539             | 0.0580              | -259.9324      | -276.0511    | -2.7490         | -2.7843       |
+| 0.6758        | 0.19  | 800  | 0.6855          | 0.0526          | 0.6791     | 0.0969         | 0.0672           | 0.6550             | 0.0297          | 0.1423              | -0.0640             | 0.0688              | -259.1296      | -274.7613    | -2.7450         | -2.7804       |
+| 0.6811        | 0.22  | 900  | 0.6854          | 0.0594          | 0.6771     | 0.1064         | 0.0725           | 0.6645             | 0.0339          | 0.1596              | -0.0715             | 0.0771              | -258.6040      | -273.8141    | -2.7378         | -2.7726       |
+| 0.6803        | 0.24  | 1000 | 0.6845          | 0.0609          | 0.6762     | 0.1167         | 0.0807           | 0.6645             | 0.0360          | 0.1687              | -0.0763             | 0.0818              | -257.7856      | -272.7885    | -2.7285         | -2.7634       |
+| 0.6759        | 0.26  | 1100 | 0.6842          | 0.0676          | 0.6750     | 0.1250         | 0.0862           | 0.6610             | 0.0388          | 0.1815              | -0.0829             | 0.0881              | -257.2345      | -271.9526    | -2.7320         | -2.7672       |
+| 0.6732        | 0.29  | 1200 | 0.6896          | 0.1405          | 0.6722     | 0.1179         | 0.0727           | 0.6695             | 0.0452          | 0.2076              | -0.0939             | 0.1005              | -258.5845      | -272.6641    | -2.7315         | -2.7664       |
+| 0.6748        | 0.31  | 1300 | 0.6835          | 0.0876          | 0.6734     | 0.1391         | 0.0966           | 0.6665             | 0.0425          | 0.1965              | -0.0897             | 0.0954              | -256.1944      | -270.5492    | -2.7357         | -2.7709       |
+| 0.6872        | 0.34  | 1400 | 0.6834          | 0.0973          | 0.6721     | 0.1392         | 0.0939           | 0.6670             | 0.0453          | 0.2070              | -0.0930             | 0.1000              | -256.4647      | -270.5385    | -2.7367         | -2.7719       |
+| 0.6926        | 0.36  | 1500 | 0.6833          | 0.1058          | 0.6710     | 0.1402         | 0.0925           | 0.6685             | 0.0477          | 0.2165              | -0.0956             | 0.1042              | -256.6026      | -270.4324    | -2.7329         | -2.7681       |
+| 0.6862        | 0.38  | 1600 | 0.6891          | 0.1729          | 0.6689     | 0.1322         | 0.0796           | 0.6750             | 0.0526          | 0.2361              | -0.1039             | 0.1134              | -257.8935      | -271.2309    | -2.7292         | -2.7642       |
+| 0.6779        | 0.41  | 1700 | 0.6821          | 0.0962          | 0.6698     | 0.1486         | 0.0979           | 0.6705             | 0.0507          | 0.2293              | -0.1016             | 0.1104              | -256.0604      | -269.5961    | -2.7308         | -2.7658       |
+| 0.6726        | 0.43  | 1800 | 0.6842          | 0.1209          | 0.6687     | 0.1467         | 0.0934           | 0.6730             | 0.0533          | 0.2380              | -0.1060             | 0.1149              | -256.5087      | -269.7857    | -2.7266         | -2.7615       |
+| 0.6688        | 0.45  | 1900 | 0.6834          | 0.1202          | 0.6681     | 0.1483         | 0.0938           | 0.6745             | 0.0545          | 0.2410              | -0.1065             | 0.1162              | -256.4724      | -269.6281    | -2.7300         | -2.7651       |
+| 0.6616        | 0.48  | 2000 | 0.6818          | 0.1092          | 0.6681     | 0.1532         | 0.0987           | 0.6720             | 0.0545          | 0.2409              | -0.1069             | 0.1164              | -255.9825      | -269.1367    | -2.7336         | -2.7687       |
+| 0.6707        | 0.5   | 2100 | 0.6804          | 0.0930          | 0.6684     | 0.1588         | 0.1049           | 0.6710             | 0.0538          | 0.2405              | -0.1069             | 0.1162              | -255.3586      | -268.5765    | -2.7300         | -2.7651       |
+| 0.6796        | 0.53  | 2200 | 0.6849          | 0.1551          | 0.6666     | 0.1500         | 0.0920           | 0.6755             | 0.0580          | 0.2565              | -0.1121             | 0.1234              | -256.6537      | -269.4551    | -2.7228         | -2.7582       |
+| 0.6672        | 0.55  | 2300 | 0.6830          | 0.1404          | 0.6668     | 0.1562         | 0.0986           | 0.6725             | 0.0576          | 0.2557              | -0.1114             | 0.1231              | -255.9975      | -268.8366    | -2.7203         | -2.7554       |
+| 0.6769        | 0.57  | 2400 | 0.6819          | 0.1252          | 0.6668     | 0.1596         | 0.1019           | 0.6740             | 0.0577          | 0.2565              | -0.1128             | 0.1238              | -255.6599      | -268.4941    | -2.7159         | -2.7508       |
+| 0.6725        | 0.6   | 2500 | 0.6903          | 0.2239          | 0.6645     | 0.1488         | 0.0859           | 0.6850             | 0.0630          | 0.2751              | -0.1201             | 0.1325              | -257.2663      | -269.5727    | -2.7161         | -2.7509       |
+| 0.6762        | 0.62  | 2600 | 0.6834          | 0.1472          | 0.6655     | 0.1615         | 0.1008           | 0.6760             | 0.0606          | 0.2671              | -0.1166             | 0.1287              | -255.7709      | -268.3081    | -2.7154         | -2.7503       |
+| 0.6867        | 0.65  | 2700 | 0.6846          | 0.1619          | 0.6649     | 0.1605         | 0.0985           | 0.6820             | 0.0620          | 0.2708              | -0.1178             | 0.1304              | -256.0078      | -268.4086    | -2.7205         | -2.7554       |
+| 0.702         | 0.67  | 2800 | 0.6836          | 0.1510          | 0.6651     | 0.1623         | 0.1007           | 0.6815             | 0.0616          | 0.2697              | -0.1175             | 0.1299              | -255.7832      | -268.2218    | -2.7157         | -2.7510       |
+| 0.6822        | 0.69  | 2900 | 0.6818          | 0.1312          | 0.6653     | 0.1655         | 0.1045           | 0.6800             | 0.0610          | 0.2669              | -0.1156             | 0.1282              | -255.4075      | -267.9095    | -2.7201         | -2.7554       |
+| 0.6751        | 0.72  | 3000 | 0.6809          | 0.1235          | 0.6656     | 0.1674         | 0.1070           | 0.6745             | 0.0604          | 0.2651              | -0.1144             | 0.1272              | -255.1547      | -267.7156    | -2.7193         | -2.7547       |
+| 0.673         | 0.74  | 3100 | 0.6830          | 0.1523          | 0.6648     | 0.1643         | 0.1022           | 0.6815             | 0.0621          | 0.2709              | -0.1168             | 0.1301              | -255.6314      | -268.0210    | -2.7211         | -2.7563       |
+| 0.6666        | 0.77  | 3200 | 0.6818          | 0.1381          | 0.6653     | 0.1672         | 0.1062           | 0.6785             | 0.0611          | 0.2675              | -0.1157             | 0.1284              | -255.2344      | -267.7304    | -2.7202         | -2.7554       |
+| 0.6619        | 0.79  | 3300 | 0.6829          | 0.1523          | 0.6647     | 0.1652         | 0.1028           | 0.6810             | 0.0624          | 0.2717              | -0.1172             | 0.1304              | -255.5768      | -267.9396    | -2.7207         | -2.7559       |
+| 0.6752        | 0.81  | 3400 | 0.6830          | 0.1530          | 0.6647     | 0.1653         | 0.1029           | 0.6805             | 0.0625          | 0.2718              | -0.1177             | 0.1306              | -255.5670      | -267.9222    | -2.7197         | -2.7548       |
+| 0.6711        | 0.84  | 3500 | 0.6841          | 0.1663          | 0.6643     | 0.1634         | 0.1000           | 0.6795             | 0.0633          | 0.2740              | -0.1183             | 0.1317              | -255.8493      | -268.1196    | -2.7188         | -2.7540       |
+| 0.669         | 0.86  | 3600 | 0.6843          | 0.1689          | 0.6642     | 0.1628         | 0.0992           | 0.6815             | 0.0637          | 0.2755              | -0.1190             | 0.1323              | -255.9366      | -268.1706    | -2.7180         | -2.7533       |
+| 0.6563        | 0.89  | 3700 | 0.6835          | 0.1602          | 0.6643     | 0.1642         | 0.1009           | 0.6815             | 0.0633          | 0.2740              | -0.1182             | 0.1316              | -255.7627      | -268.0358    | -2.7189         | -2.7540       |
+| 0.6811        | 0.91  | 3800 | 0.6828          | 0.1517          | 0.6646     | 0.1658         | 0.1032           | 0.6820             | 0.0627          | 0.2721              | -0.1176             | 0.1307              | -255.5359      | -267.8722    | -2.7190         | -2.7541       |
+| 0.664         | 0.93  | 3900 | 0.6823          | 0.1453          | 0.6647     | 0.1664         | 0.1039           | 0.6780             | 0.0625          | 0.2717              | -0.1171             | 0.1305              | -255.4641      | -267.8119    | -2.7221         | -2.7571       |
+| 0.6771        | 0.96  | 4000 | 0.6824          | 0.1453          | 0.6647     | 0.1662         | 0.1037           | 0.6775             | 0.0625          | 0.2716              | -0.1174             | 0.1304              | -255.4852      | -267.8388    | -2.7216         | -2.7566       |
+| 0.6644        | 0.98  | 4100 | 0.6825          | 0.1480          | 0.6646     | 0.1662         | 0.1036           | 0.6810             | 0.0626          | 0.2720              | -0.1174             | 0.1305              | -255.4913      | -267.8348    | -2.7189         | -2.7542       |
+### Framework versions
+- PEFT 0.7.1
+- Transformers 4.39.0.dev0
+- Pytorch 2.1.2+cu121
+- Datasets 2.14.6
+- Tokenizers 0.15.2
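
A quick consistency check on the model card above may be useful when reviewing this commit. The sketch below assumes the usual convention that the total batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and that the reported reward margin is the chosen reward minus the rejected reward (up to rounding). The helper names are illustrative and not part of the training code.

```python
# Values copied from the model card in this diff.
train_batch_size = 4             # per-device training batch size
eval_batch_size = 8              # per-device evaluation batch size
num_devices = 2
gradient_accumulation_steps = 2


def effective_batch_size(per_device: int, devices: int, accumulation: int = 1) -> int:
    """Examples contributing to one optimizer (or evaluation) step."""
    return per_device * devices * accumulation


total_train_batch_size = effective_batch_size(
    train_batch_size, num_devices, gradient_accumulation_steps
)
# Evaluation does not accumulate gradients, so accumulation stays at 1.
total_eval_batch_size = effective_batch_size(eval_batch_size, num_devices)

# Final evaluation row (step 4100) from the results table.
rewards_chosen = 0.1662
rewards_rejected = 0.1036
reported_margin = 0.0626


def margin_matches(chosen: float, rejected: float, reported: float,
                   tol: float = 5e-4) -> bool:
    """True if reported is chosen - rejected within a rounding tolerance."""
    return abs((chosen - rejected) - reported) <= tol


print(total_train_batch_size, total_eval_batch_size)
print(margin_matches(rewards_chosen, rewards_rejected, reported_margin))
```

Both totals agree with the `total_train_batch_size` and `total_eval_batch_size` entries listed in the card.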

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9398ec187b62913fee3cd344ec00e5b16e33899ff7c0f89e1cddcc850bd0b1ba
+oid sha256:0371fbb85b053648e8e0b36a6104b5f3b86e6c8e5913bb3e7c0dff078989bde3
 size 671150064

all_results.json ADDED

	@@ -0,0 +1,8 @@

+{
+    "epoch": 1.0,
+    "train_loss": 0.6776896565581647,
+    "train_runtime": 67917.559,
+    "train_samples": 66812,
+    "train_samples_per_second": 0.984,
+    "train_steps_per_second": 0.061
+}
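
The throughput figures in this file can be cross-checked against each other. The sketch below is a sanity check, not the Trainer's own computation: it assumes one epoch takes `ceil(train_samples / total_train_batch_size)` optimizer steps, with the effective batch size of 16 taken from the model card in this commit.

```python
import math

# Contents of all_results.json as added in this commit.
results = {
    "epoch": 1.0,
    "train_loss": 0.6776896565581647,
    "train_runtime": 67917.559,          # seconds
    "train_samples": 66812,
    "train_samples_per_second": 0.984,
    "train_steps_per_second": 0.061,
}

total_train_batch_size = 16  # from the model card's hyperparameters

# Samples processed per second over the whole run.
samples_per_second = results["train_samples"] / results["train_runtime"]

# Assumption: the last partial batch still counts as one optimizer step.
optimizer_steps = math.ceil(results["train_samples"] / total_train_batch_size)
steps_per_second = optimizer_steps / results["train_runtime"]

runtime_hours = results["train_runtime"] / 3600

print(f"samples/s: {samples_per_second:.3f}")
print(f"optimizer steps: {optimizer_steps}")
print(f"steps/s: {steps_per_second:.3f}")
print(f"runtime: {runtime_hours:.1f} h")
```

Under that assumption the recomputed rates round to the reported `train_samples_per_second` and `train_steps_per_second`, and the run lasts a little under 19 hours.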

runs/Jul16_03-11-50_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1721100015.notebook-deployment-48-7d9b6c99-p5kv4.69846.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3535a16671435d9de876b57f96d4270c2d275b0f96cb1919b529c2f3d1440a05
-size 456809
+oid sha256:7e8a568606180f4e015aaf931bdc62cb23d4f4c17f3aa08d339089b70cee5b14
+size 464128

train_results.json ADDED

	@@ -0,0 +1,8 @@

+{
+    "epoch": 1.0,
+    "train_loss": 0.6776896565581647,
+    "train_runtime": 67917.559,
+    "train_samples": 66812,
+    "train_samples_per_second": 0.984,
+    "train_steps_per_second": 0.061
+}

trainer_state.json ADDED

The diff for this file is too large to render. See raw diff