Model save

Files changed (6)

README.md +49 -49
adapter_model.safetensors +1 -1
all_results.json +3 -3
runs/Apr05_19-58-41_allennlp-cirrascale-50.reviz.ai2.in/events.out.tfevents.1712372721.allennlp-cirrascale-50.reviz.ai2.in.49327.0 +2 -2
train_results.json +3 -3
trainer_state.json +0 -0

README.md CHANGED

@@ -5,7 +5,7 @@ tags:
 - trl
 - dpo
 - generated_from_trainer
-base_model: mistralai/Mistral-7B-v0.1
 model-index:
 - name: zephyr-7b-dpo-qlora
   results: []
@@ -16,17 +16,17 @@ should probably proofread and complete it, then remove this comment. -->
 # zephyr-7b-dpo-qlora
-This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.4889
-- Rewards/chosen: -2.5570
-- Rewards/rejected: -3.6602
-- Rewards/accuracies: 0.7420
-- Rewards/margins: 1.1032
-- Logps/rejected: -610.6146
-- Logps/chosen: -520.3311
-- Logits/rejected: -1.1934
-- Logits/chosen: -1.3156
 ## Model description
@@ -61,44 +61,44 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.6822        | 0.03  | 100  | 0.6822          | 0.0500         | 0.0271           | 0.6620             | 0.0229          | -241.8811      | -259.6322    | -1.9601         | -2.0996       |
-| 0.6491        | 0.05  | 200  | 0.6486          | -0.0620        | -0.1691          | 0.6800             | 0.1071          | -261.5014      | -270.8282    | -1.9371         | -2.0738       |
-| 0.6171        | 0.08  | 300  | 0.6234          | -0.3309        | -0.5345          | 0.6725             | 0.2036          | -298.0363      | -297.7160    | -1.8915         | -2.0253       |
-| 0.6176        | 0.1   | 400  | 0.5988          | -0.7494        | -1.0723          | 0.6810             | 0.3229          | -351.8179      | -339.5648    | -1.8181         | -1.9461       |
-| 0.5761        | 0.13  | 500  | 0.5704          | -1.2031        | -1.7242          | 0.6925             | 0.5211          | -417.0116      | -384.9406    | -1.6429         | -1.7682       |
-| 0.5583        | 0.16  | 600  | 0.5622          | -0.8290        | -1.3172          | 0.7055             | 0.4881          | -376.3094      | -347.5316    | -1.6199         | -1.7496       |
-| 0.5297        | 0.18  | 700  | 0.5517          | -1.1832        | -1.8211          | 0.7115             | 0.6379          | -426.7012      | -382.9457    | -1.5803         | -1.7075       |
-| 0.5161        | 0.21  | 800  | 0.5413          | -1.6079        | -2.2528          | 0.7135             | 0.6449          | -469.8732      | -425.4197    | -1.5290         | -1.6555       |
-| 0.5089        | 0.24  | 900  | 0.5513          | -1.1205        | -1.7563          | 0.7120             | 0.6357          | -420.2160      | -376.6812    | -1.3977         | -1.5150       |
-| 0.5577        | 0.26  | 1000 | 0.5359          | -1.4373        | -2.1710          | 0.7200             | 0.7337          | -461.6903      | -408.3596    | -1.3512         | -1.4712       |
-| 0.5701        | 0.29  | 1100 | 0.5276          | -1.0004        | -1.6579          | 0.7315             | 0.6575          | -410.3777      | -364.6696    | -1.3514         | -1.4728       |
-| 0.5581        | 0.31  | 1200 | 0.5236          | -1.1076        | -1.8024          | 0.7300             | 0.6948          | -424.8326      | -375.3857    | -1.3054         | -1.4257       |
-| 0.5446        | 0.34  | 1300 | 0.5300          | -2.2586        | -3.2287          | 0.7235             | 0.9701          | -567.4619      | -490.4902    | -0.9841         | -1.0957       |
-| 0.5288        | 0.37  | 1400 | 0.5134          | -1.6785        | -2.5620          | 0.7350             | 0.8835          | -500.7915      | -432.4789    | -1.1189         | -1.2369       |
-| 0.4638        | 0.39  | 1500 | 0.5280          | -2.2152        | -3.3623          | 0.7325             | 1.1471          | -580.8159      | -486.1478    | -1.0978         | -1.2233       |
-| 0.5653        | 0.42  | 1600 | 0.5065          | -1.8073        | -2.7765          | 0.7360             | 0.9692          | -522.2392      | -445.3528    | -1.2289         | -1.3479       |
-| 0.5129        | 0.44  | 1700 | 0.5115          | -2.6322        | -3.6578          | 0.7290             | 1.0256          | -610.3751      | -527.8514    | -1.1655         | -1.2898       |
-| 0.464         | 0.47  | 1800 | 0.5067          | -2.5458        | -3.6660          | 0.7360             | 1.1202          | -611.1868      | -519.2065    | -1.1092         | -1.2262       |
-| 0.4435        | 0.5   | 1900 | 0.5028          | -2.4198        | -3.5101          | 0.7295             | 1.0903          | -595.5961      | -506.6063    | -1.1773         | -1.2970       |
-| 0.4722        | 0.52  | 2000 | 0.5024          | -2.8634        | -3.9045          | 0.7370             | 1.0411          | -635.0359      | -550.9646    | -1.1621         | -1.2850       |
-| 0.4946        | 0.55  | 2100 | 0.4990          | -2.5939        | -3.6584          | 0.7405             | 1.0645          | -610.4345      | -524.0187    | -1.2223         | -1.3435       |
-| 0.4809        | 0.58  | 2200 | 0.4960          | -1.9937        | -2.9287          | 0.7400             | 0.9350          | -537.4633      | -464.0007    | -1.2750         | -1.3983       |
-| 0.4721        | 0.6   | 2300 | 0.4994          | -2.7426        | -3.9056          | 0.7410             | 1.1630          | -635.1489      | -538.8865    | -1.1593         | -1.2804       |
-| 0.4693        | 0.63  | 2400 | 0.4980          | -2.6255        | -3.7698          | 0.7405             | 1.1443          | -621.5709      | -527.1746    | -1.0849         | -1.2053       |
-| 0.5           | 0.65  | 2500 | 0.4928          | -2.3522        | -3.4480          | 0.7425             | 1.0959          | -589.3930      | -499.8447    | -1.1667         | -1.2915       |
-| 0.4706        | 0.68  | 2600 | 0.4921          | -2.3971        | -3.4902          | 0.7390             | 1.0931          | -593.6089      | -504.3380    | -1.1721         | -1.2961       |
-| 0.5242        | 0.71  | 2700 | 0.4933          | -2.5905        | -3.7015          | 0.7390             | 1.1110          | -614.7410      | -523.6794    | -1.1556         | -1.2788       |
-| 0.4557        | 0.73  | 2800 | 0.4921          | -2.4710        | -3.5949          | 0.7400             | 1.1239          | -604.0808      | -511.7323    | -1.1781         | -1.3009       |
-| 0.523         | 0.76  | 2900 | 0.4899          | -2.5572        | -3.6406          | 0.7435             | 1.0834          | -608.6472      | -520.3428    | -1.1831         | -1.3050       |
-| 0.4588        | 0.79  | 3000 | 0.4897          | -2.5669        | -3.6213          | 0.7415             | 1.0544          | -606.7161      | -521.3174    | -1.1914         | -1.3136       |
-| 0.5038        | 0.81  | 3100 | 0.4894          | -2.6148        | -3.7110          | 0.7400             | 1.0961          | -615.6882      | -526.1104    | -1.1866         | -1.3089       |
-| 0.5           | 0.84  | 3200 | 0.4889          | -2.5558        | -3.6512          | 0.7435             | 1.0955          | -609.7109      | -520.2028    | -1.1907         | -1.3130       |
-| 0.5164        | 0.86  | 3300 | 0.4891          | -2.5467        | -3.6430          | 0.7415             | 1.0963          | -608.8884      | -519.2968    | -1.1940         | -1.3162       |
-| 0.4554        | 0.89  | 3400 | 0.4889          | -2.5665        | -3.6678          | 0.7410             | 1.1014          | -611.3746      | -521.2744    | -1.1941         | -1.3162       |
-| 0.5354        | 0.92  | 3500 | 0.4888          | -2.5581        | -3.6613          | 0.7410             | 1.1032          | -610.7186      | -520.4333    | -1.1966         | -1.3187       |
-| 0.4576        | 0.94  | 3600 | 0.4890          | -2.5580        | -3.6613          | 0.7395             | 1.1033          | -610.7242      | -520.4294    | -1.1960         | -1.3180       |
-| 0.4816        | 0.97  | 3700 | 0.4889          | -2.5574        | -3.6608          | 0.7410             | 1.1034          | -610.6686      | -520.3651    | -1.1920         | -1.3143       |
-| 0.5057        | 0.99  | 3800 | 0.4889          | -2.5570        | -3.6602          | 0.7420             | 1.1032          | -610.6146      | -520.3311    | -1.1934         | -1.3156       |
 ### Framework versions

 - trl
 - dpo
 - generated_from_trainer
+base_model: alignment-handbook/zephyr-7b-sft-full
 model-index:
 - name: zephyr-7b-dpo-qlora
   results: []
 # zephyr-7b-dpo-qlora
+This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.5056
+- Rewards/chosen: -1.4058
+- Rewards/rejected: -2.2921
+- Rewards/accuracies: 0.7345
+- Rewards/margins: 0.8863
+- Logps/rejected: -492.4564
+- Logps/chosen: -425.7130
+- Logits/rejected: -1.8131
+- Logits/chosen: -1.9265
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6896        | 0.03  | 100  | 0.6884          | 0.0072         | -0.0023          | 0.6745             | 0.0096          | -263.4773      | -284.4093    | -2.5586         | -2.6909       |
+| 0.6699        | 0.05  | 200  | 0.6726          | 0.0082         | -0.0359          | 0.6895             | 0.0441          | -266.8299      | -284.3103    | -2.5495         | -2.6813       |
+| 0.636         | 0.08  | 300  | 0.6466          | -0.0002        | -0.1125          | 0.6780             | 0.1123          | -274.4987      | -285.1534    | -2.5520         | -2.6823       |
+| 0.6312        | 0.1   | 400  | 0.6191          | -0.2138        | -0.4222          | 0.6805             | 0.2084          | -305.4655      | -306.5131    | -2.5251         | -2.6530       |
+| 0.5918        | 0.13  | 500  | 0.6031          | -0.2627        | -0.5412          | 0.6880             | 0.2785          | -317.3649      | -311.4058    | -2.5298         | -2.6575       |
+| 0.6012        | 0.16  | 600  | 0.5928          | -0.5129        | -0.8554          | 0.6935             | 0.3424          | -348.7829      | -336.4283    | -2.5443         | -2.6737       |
+| 0.5823        | 0.18  | 700  | 0.5811          | -0.5775        | -1.0207          | 0.7000             | 0.4432          | -365.3115      | -342.8825    | -2.3446         | -2.4662       |
+| 0.5502        | 0.21  | 800  | 0.5688          | -0.5710        | -1.0329          | 0.7040             | 0.4619          | -366.5324      | -342.2334    | -2.3173         | -2.4395       |
+| 0.551         | 0.24  | 900  | 0.5723          | -0.5585        | -1.0146          | 0.7100             | 0.4561          | -364.7085      | -340.9870    | -2.2573         | -2.3767       |
+| 0.5684        | 0.26  | 1000 | 0.5602          | -0.7542        | -1.3111          | 0.7070             | 0.5569          | -394.3551      | -360.5555    | -2.2283         | -2.3464       |
+| 0.5722        | 0.29  | 1100 | 0.5429          | -0.7936        | -1.4574          | 0.7240             | 0.6638          | -408.9803      | -364.4904    | -2.0677         | -2.1820       |
+| 0.5866        | 0.31  | 1200 | 0.5338          | -1.0463        | -1.7337          | 0.7205             | 0.6874          | -436.6128      | -389.7662    | -2.0249         | -2.1388       |
+| 0.5659        | 0.34  | 1300 | 0.5310          | -0.8607        | -1.5398          | 0.7310             | 0.6792          | -417.2296      | -371.2006    | -1.9893         | -2.1049       |
+| 0.5625        | 0.37  | 1400 | 0.5295          | -0.7999        | -1.5056          | 0.7215             | 0.7058          | -413.8092      | -365.1206    | -1.9254         | -2.0391       |
+| 0.4575        | 0.39  | 1500 | 0.5266          | -1.1455        | -1.9646          | 0.7260             | 0.8191          | -459.7086      | -399.6889    | -1.9105         | -2.0252       |
+| 0.5855        | 0.42  | 1600 | 0.5227          | -1.0359        | -1.7628          | 0.7345             | 0.7269          | -439.5246      | -388.7278    | -1.9276         | -2.0403       |
+| 0.5333        | 0.44  | 1700 | 0.5155          | -1.1618        | -1.9731          | 0.7310             | 0.8113          | -460.5566      | -401.3148    | -1.9572         | -2.0732       |
+| 0.5055        | 0.47  | 1800 | 0.5181          | -1.1105        | -1.8968          | 0.7330             | 0.7863          | -452.9257      | -396.1870    | -1.9572         | -2.0727       |
+| 0.4687        | 0.5   | 1900 | 0.5198          | -1.4078        | -2.3064          | 0.7290             | 0.8986          | -493.8867      | -425.9163    | -1.8519         | -1.9678       |
+| 0.4936        | 0.52  | 2000 | 0.5123          | -1.4097        | -2.2536          | 0.7290             | 0.8438          | -488.6001      | -426.1056    | -1.8371         | -1.9508       |
+| 0.5058        | 0.55  | 2100 | 0.5121          | -1.4030        | -2.2804          | 0.7320             | 0.8773          | -491.2808      | -425.4353    | -1.8156         | -1.9302       |
+| 0.491         | 0.58  | 2200 | 0.5102          | -1.2883        | -2.1172          | 0.7300             | 0.8289          | -474.9657      | -413.9656    | -1.8893         | -2.0053       |
+| 0.4923        | 0.6   | 2300 | 0.5107          | -1.2460        | -2.0925          | 0.7320             | 0.8465          | -472.4916      | -409.7295    | -1.8949         | -2.0101       |
+| 0.4718        | 0.63  | 2400 | 0.5093          | -1.3443        | -2.1905          | 0.7265             | 0.8462          | -482.2936      | -419.5653    | -1.8618         | -1.9757       |
+| 0.5187        | 0.65  | 2500 | 0.5103          | -1.3233        | -2.2095          | 0.7285             | 0.8862          | -484.1988      | -417.4668    | -1.8641         | -1.9796       |
+| 0.5025        | 0.68  | 2600 | 0.5115          | -1.2910        | -2.1842          | 0.7315             | 0.8932          | -481.6620      | -414.2359    | -1.8388         | -1.9538       |
+| 0.4946        | 0.71  | 2700 | 0.5094          | -1.3454        | -2.2424          | 0.7300             | 0.8970          | -487.4804      | -419.6713    | -1.8200         | -1.9339       |
+| 0.5054        | 0.73  | 2800 | 0.5085          | -1.4083        | -2.3252          | 0.7320             | 0.9169          | -495.7629      | -425.9614    | -1.8042         | -1.9180       |
+| 0.5159        | 0.76  | 2900 | 0.5066          | -1.3467        | -2.2328          | 0.7320             | 0.8861          | -486.5227      | -419.8022    | -1.8193         | -1.9330       |
+| 0.4671        | 0.79  | 3000 | 0.5062          | -1.4194        | -2.3064          | 0.7325             | 0.8870          | -493.8865      | -427.0751    | -1.8140         | -1.9274       |
+| 0.4864        | 0.81  | 3100 | 0.5059          | -1.4248        | -2.3084          | 0.7330             | 0.8836          | -494.0863      | -427.6172    | -1.8158         | -1.9291       |
+| 0.5101        | 0.84  | 3200 | 0.5056          | -1.4159        | -2.2981          | 0.7340             | 0.8821          | -493.0526      | -426.7279    | -1.8167         | -1.9300       |
+| 0.5317        | 0.86  | 3300 | 0.5056          | -1.4029        | -2.2863          | 0.7355             | 0.8834          | -491.8742      | -425.4280    | -1.8139         | -1.9273       |
+| 0.4668        | 0.89  | 3400 | 0.5055          | -1.4064        | -2.2921          | 0.7350             | 0.8857          | -492.4527      | -425.7719    | -1.8132         | -1.9266       |
+| 0.5671        | 0.92  | 3500 | 0.5056          | -1.4036        | -2.2899          | 0.7345             | 0.8863          | -492.2395      | -425.4986    | -1.8158         | -1.9291       |
+| 0.4708        | 0.94  | 3600 | 0.5056          | -1.4050        | -2.2912          | 0.7345             | 0.8862          | -492.3603      | -425.6342    | -1.8127         | -1.9261       |
+| 0.4904        | 0.97  | 3700 | 0.5054          | -1.4047        | -2.2913          | 0.7355             | 0.8866          | -492.3736      | -425.6043    | -1.8155         | -1.9289       |
+| 0.5001        | 0.99  | 3800 | 0.5056          | -1.4058        | -2.2921          | 0.7345             | 0.8863          | -492.4564      | -425.7130    | -1.8131         | -1.9265       |
 ### Framework versions
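
Review note: the final evaluation numbers on both sides of this README diff can be sanity-checked against each other. The sketch below assumes that `Rewards/margins` is reported as `Rewards/chosen` minus `Rewards/rejected`, up to four-decimal rounding; the helper name `margin_gap` is hypothetical and not part of the repository.

```python
# Final evaluation metrics copied from the old (-) and new (+) README sides.
old = {"chosen": -2.5570, "rejected": -3.6602, "margin": 1.1032}
new = {"chosen": -1.4058, "rejected": -2.2921, "margin": 0.8863}


def margin_gap(metrics):
    """Absolute gap between the reported margin and chosen - rejected."""
    derived = metrics["chosen"] - metrics["rejected"]
    return abs(derived - metrics["margin"])


for name, metrics in (("old", old), ("new", new)):
    # Tolerance covers rounding of each value to four decimals in the card.
    assert margin_gap(metrics) < 2e-4, name
    print(name, "margin is consistent with chosen - rejected")
```

Under that assumption both sides are internally consistent, so the smaller margin in the new run (0.8863 against 1.1032) reflects the reported rewards rather than a transcription slip.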

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:923ceec33abc352e1d8ba4ff849344216f7dc442a4fe5738f6d1fb44d2f665bd
 size 671150064

 version https://git-lfs.github.com/spec/v1
+oid sha256:e237784759ae82d05007c64de0d6b6905c61fbdaed669850c91f6ca78555089f
 size 671150064

all_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 0.5232822467709236,
-    "train_runtime": 82986.4165,
     "train_samples": 61134,
-    "train_samples_per_second": 0.737,
     "train_steps_per_second": 0.046
 }

 {
     "epoch": 1.0,
+    "train_loss": 0.5390157630246398,
+    "train_runtime": 82744.187,
     "train_samples": 61134,
+    "train_samples_per_second": 0.739,
     "train_steps_per_second": 0.046
 }
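
Review note: the changed throughput value follows from the changed runtime. A minimal sketch, assuming `train_samples_per_second` is `train_samples / train_runtime` rounded to three decimals (the function name below is hypothetical):

```python
def samples_per_second(train_samples, train_runtime):
    """Throughput in samples per second, rounded to three decimals."""
    return round(train_samples / train_runtime, 3)


# Values copied from the old (-) and new (+) sides of all_results.json.
old_rate = samples_per_second(61134, 82986.4165)
new_rate = samples_per_second(61134, 82744.187)

print(old_rate, new_rate)
```

The same values appear in train_results.json, so the check applies to both files.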

runs/Apr05_19-58-41_allennlp-cirrascale-50.reviz.ai2.in/events.out.tfevents.1712372721.allennlp-cirrascale-50.reviz.ai2.in.49327.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:408c8ddad6076c0eb3ab86a33a189cd1986669f39c47319b2f50c8da76749b55
-size 295334

 version https://git-lfs.github.com/spec/v1
+oid sha256:9e0c1bce4cedb7cff0133d1cbff2ee9d404c387bcc76b522de150a4ab7862798
+size 297064

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 0.5232822467709236,
-    "train_runtime": 82986.4165,
     "train_samples": 61134,
-    "train_samples_per_second": 0.737,
     "train_steps_per_second": 0.046
 }

 {
     "epoch": 1.0,
+    "train_loss": 0.5390157630246398,
+    "train_runtime": 82744.187,
     "train_samples": 61134,
+    "train_samples_per_second": 0.739,
     "train_steps_per_second": 0.046
 }

trainer_state.json CHANGED

The diff for this file is too large to render.