---
license: apache-2.0
base_model: TheBloke/OpenHermes-2-Mistral-7B-GPTQ
tags:
- generated_from_trainer
model-index:
- name: covid3-mistral-dpo-gptq
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# covid3-mistral-dpo-gptq

This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ). The training dataset was not recorded (the Trainer reported it as `None`).
It achieves the following results on the evaluation set:
- Loss: 2.2375
- Rewards/chosen: -2.8294
- Rewards/rejected: -1.7077
- Rewards/accuracies: 0.25
- Rewards/margins: -1.1217
- Logps/rejected: -24.0692
- Logps/chosen: -35.7956
- Logits/rejected: -2.8653
- Logits/chosen: -2.8666
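
The metric names above match those logged by preference-optimization trainers such as DPO, where the reward margin is conventionally the chosen reward minus the rejected reward, and the default loss is a sigmoid loss on that margin. The card confirms neither convention, so the following is only a minimal sketch under those assumptions; `sigmoid_preference_loss` is a hypothetical helper, not a library function.

```python
import math

# Final evaluation metrics copied from the list above.
rewards_chosen = -2.8294
rewards_rejected = -1.7077
reported_margin = -1.1217

# Assumed convention: margin = chosen reward - rejected reward.
margin = rewards_chosen - rewards_rejected
margin_matches = math.isclose(margin, reported_margin, abs_tol=1e-4)


def sigmoid_preference_loss(reward_margin: float) -> float:
    """Per-example loss -log(sigmoid(margin)), assuming a sigmoid-style DPO loss."""
    return math.log1p(math.exp(-reward_margin))


# At zero margin the loss equals ln(2), close to the ~0.69 values
# at which the training log starts.
loss_at_zero = sigmoid_preference_loss(0.0)

print(f"margin = {margin:.4f}, matches reported: {margin_matches}")
print(f"loss at zero margin = {loss_at_zero:.4f}")
```

Under these assumptions, a negative margin means the rejected responses receive a higher implicit reward than the chosen ones on the evaluation set, which is consistent with the reported accuracy of 0.25.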

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- training_steps: 1000
- mixed_precision_training: Native AMP
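
To make the schedule concrete, here is a rough sketch of the learning rate implied by the values above, assuming `linear` means linear warmup from zero followed by linear decay to zero over the remaining steps. `linear_lr` is a hypothetical helper written for illustration, not the Trainer's own code.

```python
def linear_lr(step: int, base_lr: float = 2e-05,
              warmup_steps: int = 2, total_steps: int = 1000) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay (assumed shape)."""
    if step < warmup_steps:
        # Warmup: ramp from 0 up to base_lr over `warmup_steps` updates.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp from base_lr down to 0 at `total_steps`.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1, 2, 501, 1000):
    print(step, linear_lr(step))
```

With only 2 warmup steps, the peak rate of 2e-05 is reached almost immediately, so nearly all of the 1000 steps run on the decaying part of the schedule.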

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6957        | 0.0   | 10   | 0.6940          | 0.0226         | 0.0252           | 0.375              | -0.0026         | -6.7409        | -7.2761      | -2.8058         | -2.8067       |
| 0.6925        | 0.0   | 20   | 0.6971          | 0.0317         | 0.0422           | 0.3333             | -0.0105         | -6.5702        | -7.1844      | -2.8074         | -2.8082       |
| 0.6876        | 0.01  | 30   | 0.6995          | 0.0202         | 0.0373           | 0.375              | -0.0170         | -6.6197        | -7.2995      | -2.8093         | -2.8102       |
| 0.6961        | 0.01  | 40   | 0.6982          | 0.0054         | 0.0189           | 0.375              | -0.0135         | -6.8034        | -7.4475      | -2.8113         | -2.8122       |
| 0.6863        | 0.01  | 50   | 0.6998          | 0.0019         | 0.0188           | 0.3333             | -0.0169         | -6.8044        | -7.4830      | -2.8121         | -2.8130       |
| 0.6965        | 0.01  | 60   | 0.6977          | 0.0119         | 0.0251           | 0.2917             | -0.0132         | -6.7419        | -7.3829      | -2.8120         | -2.8129       |
| 0.7209        | 0.01  | 70   | 0.6993          | 0.0336         | 0.0497           | 0.3333             | -0.0161         | -6.4949        | -7.1656      | -2.8103         | -2.8112       |
| 0.6988        | 0.01  | 80   | 0.6984          | 0.0294         | 0.0432           | 0.375              | -0.0138         | -6.5605        | -7.2080      | -2.8085         | -2.8094       |
| 0.6913        | 0.01  | 90   | 0.6981          | 0.0216         | 0.0342           | 0.4167             | -0.0126         | -6.6501        | -7.2856      | -2.8084         | -2.8093       |
| 0.6641        | 0.02  | 100  | 0.7030          | 0.0493         | 0.0702           | 0.3333             | -0.0209         | -6.2907        | -7.0088      | -2.8098         | -2.8107       |
| 0.7083        | 0.02  | 110  | 0.7072          | 0.0575         | 0.0870           | 0.3333             | -0.0295         | -6.1225        | -6.9268      | -2.8105         | -2.8114       |
| 0.6307        | 0.02  | 120  | 0.7128          | 0.0727         | 0.1120           | 0.3333             | -0.0393         | -5.8727        | -6.7749      | -2.8105         | -2.8114       |
| 0.7216        | 0.02  | 130  | 0.7158          | 0.0814         | 0.1250           | 0.3333             | -0.0436         | -5.7422        | -6.6879      | -2.8108         | -2.8117       |
| 0.7189        | 0.02  | 140  | 0.7135          | 0.0948         | 0.1343           | 0.3333             | -0.0395         | -5.6489        | -6.5536      | -2.8099         | -2.8108       |
| 0.7177        | 0.03  | 150  | 0.7128          | 0.0954         | 0.1335           | 0.3333             | -0.0381         | -5.6579        | -6.5481      | -2.8100         | -2.8109       |
| 0.639         | 0.03  | 160  | 0.7232          | 0.0823         | 0.1404           | 0.3333             | -0.0581         | -5.5880        | -6.6785      | -2.8135         | -2.8144       |
| 0.7128        | 0.03  | 170  | 0.7361          | 0.0571         | 0.1393           | 0.375              | -0.0822         | -5.5991        | -6.9308      | -2.8165         | -2.8174       |
| 0.709         | 0.03  | 180  | 0.7361          | 0.0690         | 0.1519           | 0.375              | -0.0829         | -5.4739        | -6.8120      | -2.8159         | -2.8168       |
| 0.6167        | 0.03  | 190  | 0.7483          | 0.0424         | 0.1461           | 0.375              | -0.1038         | -5.5311        | -7.0782      | -2.8180         | -2.8189       |
| 0.7521        | 0.03  | 200  | 0.7589          | 0.0180         | 0.1360           | 0.3333             | -0.1180         | -5.6325        | -7.3223      | -2.8199         | -2.8209       |
| 0.6204        | 0.04  | 210  | 0.7726          | -0.0220        | 0.1130           | 0.375              | -0.1350         | -5.8622        | -7.7214      | -2.8217         | -2.8227       |
| 0.6578        | 0.04  | 220  | 0.7839          | -0.0525        | 0.0994           | 0.3333             | -0.1520         | -5.9980        | -8.0273      | -2.8232         | -2.8242       |
| 0.7633        | 0.04  | 230  | 0.7868          | -0.0613        | 0.0902           | 0.375              | -0.1516         | -6.0903        | -8.1152      | -2.8235         | -2.8245       |
| 0.7391        | 0.04  | 240  | 0.7917          | -0.0742        | 0.0850           | 0.375              | -0.1592         | -6.1429        | -8.2441      | -2.8246         | -2.8256       |
| 0.6759        | 0.04  | 250  | 0.8023          | -0.1101        | 0.0656           | 0.3333             | -0.1757         | -6.3368        | -8.6031      | -2.8262         | -2.8272       |
| 0.6768        | 0.04  | 260  | 0.8107          | -0.1470        | 0.0326           | 0.375              | -0.1796         | -6.6662        | -8.9720      | -2.8264         | -2.8274       |
| 0.5398        | 0.04  | 270  | 0.8411          | -0.2390        | -0.0341          | 0.375              | -0.2049         | -7.3331        | -9.8918      | -2.8279         | -2.8289       |
| 0.5617        | 0.05  | 280  | 0.8797          | -0.3532        | -0.1075          | 0.375              | -0.2457         | -8.0674        | -11.0340     | -2.8306         | -2.8317       |
| 0.7585        | 0.05  | 290  | 0.9009          | -0.4183        | -0.1540          | 0.375              | -0.2642         | -8.5328        | -11.6845     | -2.8318         | -2.8329       |
| 0.4971        | 0.05  | 300  | 0.9602          | -0.5793        | -0.2520          | 0.375              | -0.3274         | -9.5121        | -13.2952     | -2.8362         | -2.8373       |
| 0.5759        | 0.05  | 310  | 1.0568          | -0.8155        | -0.3982          | 0.375              | -0.4173         | -10.9749       | -15.6568     | -2.8426         | -2.8437       |
| 0.451         | 0.05  | 320  | 1.1605          | -1.0527        | -0.5383          | 0.375              | -0.5144         | -12.3754       | -18.0287     | -2.8482         | -2.8493       |
| 1.4199        | 0.06  | 330  | 1.1756          | -1.1393        | -0.6287          | 0.375              | -0.5106         | -13.2791       | -18.8948     | -2.8505         | -2.8516       |
| 0.6853        | 0.06  | 340  | 1.1875          | -1.1840        | -0.6936          | 0.375              | -0.4904         | -13.9281       | -19.3416     | -2.8530         | -2.8541       |
| 0.3956        | 0.06  | 350  | 1.2550          | -1.2944        | -0.7654          | 0.375              | -0.5291         | -14.6460       | -20.4463     | -2.8568         | -2.8579       |
| 0.8692        | 0.06  | 360  | 1.3093          | -1.4107        | -0.8644          | 0.375              | -0.5463         | -15.6363       | -21.6084     | -2.8602         | -2.8613       |
| 1.4214        | 0.06  | 370  | 1.2759          | -1.3853        | -0.8782          | 0.375              | -0.5071         | -15.7746       | -21.3549     | -2.8579         | -2.8590       |
| 0.6163        | 0.06  | 380  | 1.3124          | -1.4537        | -0.9274          | 0.375              | -0.5263         | -16.2665       | -22.0389     | -2.8580         | -2.8591       |
| 0.586         | 0.07  | 390  | 1.4060          | -1.6073        | -1.0263          | 0.375              | -0.5810         | -17.2554       | -23.5750     | -2.8594         | -2.8605       |
| 1.7565        | 0.07  | 400  | 1.3869          | -1.5469        | -0.9534          | 0.375              | -0.5936         | -16.5259       | -22.9709     | -2.8611         | -2.8623       |
| 0.749         | 0.07  | 410  | 1.4037          | -1.5658        | -0.9400          | 0.375              | -0.6258         | -16.3927       | -23.1602     | -2.8615         | -2.8626       |
| 0.7682        | 0.07  | 420  | 1.4444          | -1.6154        | -0.9575          | 0.375              | -0.6578         | -16.5678       | -23.6556     | -2.8618         | -2.8630       |
| 0.5276        | 0.07  | 430  | 1.5646          | -1.7833        | -1.0365          | 0.375              | -0.7467         | -17.3576       | -25.3345     | -2.8645         | -2.8658       |
| 1.2132        | 0.07  | 440  | 1.6229          | -1.8510        | -1.0641          | 0.375              | -0.7869         | -17.6336       | -26.0119     | -2.8657         | -2.8670       |
| 1.0323        | 0.07  | 450  | 1.6468          | -1.8672        | -1.0528          | 0.3333             | -0.8143         | -17.5208       | -26.1736     | -2.8655         | -2.8668       |
| 1.1453        | 0.08  | 460  | 1.6741          | -1.8759        | -1.0266          | 0.3333             | -0.8494         | -17.2580       | -26.2613     | -2.8659         | -2.8672       |
| 1.526         | 0.08  | 470  | 1.6465          | -1.8347        | -1.0076          | 0.3333             | -0.8271         | -17.0681       | -25.8488     | -2.8671         | -2.8684       |
| 1.1323        | 0.08  | 480  | 1.5543          | -1.7064        | -0.9557          | 0.3333             | -0.7507         | -16.5494       | -24.5655     | -2.8682         | -2.8694       |
| 1.0389        | 0.08  | 490  | 1.5824          | -1.7717        | -1.0002          | 0.3333             | -0.7715         | -16.9945       | -25.2190     | -2.8694         | -2.8706       |
| 0.8626        | 0.08  | 500  | 1.6038          | -1.8376        | -1.0545          | 0.3333             | -0.7831         | -17.5374       | -25.8781     | -2.8693         | -2.8706       |
| 0.8392        | 0.09  | 510  | 1.6952          | -1.9873        | -1.1387          | 0.3333             | -0.8486         | -18.3790       | -27.3744     | -2.8697         | -2.8710       |
| 0.6528        | 0.09  | 520  | 1.7895          | -2.1144        | -1.1842          | 0.25               | -0.9302         | -18.8344       | -28.6457     | -2.8693         | -2.8707       |
| 1.3843        | 0.09  | 530  | 1.8088          | -2.1501        | -1.2043          | 0.25               | -0.9458         | -19.0354       | -29.0030     | -2.8696         | -2.8710       |
| 1.296         | 0.09  | 540  | 1.7833          | -2.1309        | -1.2130          | 0.25               | -0.9178         | -19.1228       | -28.8106     | -2.8691         | -2.8705       |
| 0.7343        | 0.09  | 550  | 1.8244          | -2.1833        | -1.2404          | 0.25               | -0.9428         | -19.3968       | -29.3344     | -2.8676         | -2.8689       |
| 1.089         | 0.09  | 560  | 1.8288          | -2.1789        | -1.2313          | 0.25               | -0.9476         | -19.3059       | -29.2912     | -2.8690         | -2.8704       |
| 0.8322        | 0.1   | 570  | 1.9009          | -2.2715        | -1.2811          | 0.25               | -0.9903         | -19.8038       | -30.2165     | -2.8697         | -2.8711       |
| 0.8684        | 0.1   | 580  | 1.9310          | -2.3144        | -1.3151          | 0.25               | -0.9993         | -20.1433       | -30.6454     | -2.8722         | -2.8736       |
| 0.9827        | 0.1   | 590  | 1.9558          | -2.3309        | -1.3222          | 0.25               | -1.0087         | -20.2145       | -30.8112     | -2.8740         | -2.8754       |
| 0.5176        | 0.1   | 600  | 1.9731          | -2.3665        | -1.3574          | 0.25               | -1.0091         | -20.5666       | -31.1672     | -2.8754         | -2.8768       |
| 1.0789        | 0.1   | 610  | 2.0276          | -2.4550        | -1.4152          | 0.25               | -1.0398         | -21.1444       | -32.0516     | -2.8756         | -2.8769       |
| 0.8444        | 0.1   | 620  | 2.1331          | -2.6253        | -1.5121          | 0.25               | -1.1132         | -22.1132       | -33.7550     | -2.8726         | -2.8739       |
| 1.6609        | 0.1   | 630  | 2.1160          | -2.6511        | -1.5573          | 0.25               | -1.0938         | -22.5657       | -34.0127     | -2.8740         | -2.8753       |
| 1.3086        | 0.11  | 640  | 2.0791          | -2.6721        | -1.6152          | 0.25               | -1.0569         | -23.1446       | -34.2231     | -2.8749         | -2.8762       |
| 1.0659        | 0.11  | 650  | 2.0520          | -2.6575        | -1.6184          | 0.25               | -1.0391         | -23.1760       | -34.0763     | -2.8766         | -2.8778       |
| 1.3081        | 0.11  | 660  | 2.0481          | -2.6650        | -1.6332          | 0.25               | -1.0318         | -23.3247       | -34.1520     | -2.8756         | -2.8769       |
| 0.769         | 0.11  | 670  | 2.0971          | -2.7165        | -1.6666          | 0.25               | -1.0500         | -23.6581       | -34.6672     | -2.8745         | -2.8758       |
| 1.1385        | 0.11  | 680  | 2.1554          | -2.7771        | -1.7021          | 0.25               | -1.0750         | -24.0137       | -35.2731     | -2.8735         | -2.8748       |
| 1.0306        | 0.12  | 690  | 2.2076          | -2.8501        | -1.7587          | 0.25               | -1.0914         | -24.5793       | -36.0025     | -2.8714         | -2.8727       |
| 1.3893        | 0.12  | 700  | 2.2299          | -2.8955        | -1.7944          | 0.25               | -1.1010         | -24.9367       | -36.4564     | -2.8682         | -2.8695       |
| 2.2234        | 0.12  | 710  | 2.2237          | -2.9162        | -1.8126          | 0.25               | -1.1036         | -25.1184       | -36.6639     | -2.8654         | -2.8667       |
| 0.4678        | 0.12  | 720  | 2.2379          | -2.9096        | -1.7873          | 0.25               | -1.1223         | -24.8652       | -36.5974     | -2.8658         | -2.8671       |
| 0.8098        | 0.12  | 730  | 2.2768          | -2.9290        | -1.7762          | 0.25               | -1.1529         | -24.7543       | -36.7922     | -2.8652         | -2.8665       |
| 1.8821        | 0.12  | 740  | 2.2740          | -2.9198        | -1.7623          | 0.25               | -1.1574         | -24.6159       | -36.6994     | -2.8641         | -2.8654       |
| 1.095         | 0.12  | 750  | 2.2689          | -2.8862        | -1.7174          | 0.25               | -1.1688         | -24.1662       | -36.3637     | -2.8647         | -2.8660       |
| 1.7464        | 0.13  | 760  | 2.2488          | -2.8828        | -1.7320          | 0.25               | -1.1508         | -24.3128       | -36.3297     | -2.8640         | -2.8653       |
| 0.9967        | 0.13  | 770  | 2.2235          | -2.8783        | -1.7502          | 0.25               | -1.1281         | -24.4945       | -36.2849     | -2.8622         | -2.8634       |
| 0.7823        | 0.13  | 780  | 2.2370          | -2.9074        | -1.7744          | 0.25               | -1.1330         | -24.7361       | -36.5759     | -2.8593         | -2.8606       |
| 1.3903        | 0.13  | 790  | 2.2755          | -2.9143        | -1.7485          | 0.25               | -1.1658         | -24.4774       | -36.6450     | -2.8587         | -2.8600       |
| 2.0372        | 0.13  | 800  | 2.2250          | -2.7892        | -1.6505          | 0.25               | -1.1387         | -23.4972       | -35.3939     | -2.8629         | -2.8642       |
| 0.7111        | 0.14  | 810  | 2.2409          | -2.7911        | -1.6348          | 0.25               | -1.1562         | -23.3407       | -35.4124     | -2.8642         | -2.8654       |
| 0.8446        | 0.14  | 820  | 2.2740          | -2.8395        | -1.6646          | 0.25               | -1.1749         | -23.6383       | -35.8968     | -2.8638         | -2.8651       |
| 1.2303        | 0.14  | 830  | 2.2812          | -2.8540        | -1.6787          | 0.25               | -1.1752         | -23.7798       | -36.0417     | -2.8648         | -2.8661       |
| 0.5053        | 0.14  | 840  | 2.2834          | -2.8740        | -1.7065          | 0.25               | -1.1675         | -24.0571       | -36.2418     | -2.8640         | -2.8653       |
| 0.5767        | 0.14  | 850  | 2.3105          | -2.9262        | -1.7448          | 0.25               | -1.1814         | -24.4399       | -36.7635     | -2.8618         | -2.8631       |
| 1.7435        | 0.14  | 860  | 2.3174          | -2.9360        | -1.7519          | 0.25               | -1.1841         | -24.5119       | -36.8619     | -2.8627         | -2.8639       |
| 1.6134        | 0.14  | 870  | 2.3028          | -2.9288        | -1.7659          | 0.25               | -1.1629         | -24.6517       | -36.7902     | -2.8635         | -2.8647       |
| 1.747         | 0.15  | 880  | 2.2686          | -2.8780        | -1.7398          | 0.25               | -1.1382         | -24.3902       | -36.2816     | -2.8658         | -2.8671       |
| 1.3341        | 0.15  | 890  | 2.2555          | -2.8559        | -1.7244          | 0.25               | -1.1315         | -24.2361       | -36.0610     | -2.8673         | -2.8686       |
| 1.884         | 0.15  | 900  | 2.2349          | -2.8291        | -1.7129          | 0.25               | -1.1162         | -24.1211       | -35.7924     | -2.8677         | -2.8689       |
| 0.5031        | 0.15  | 910  | 2.2361          | -2.8327        | -1.7156          | 0.25               | -1.1171         | -24.1479       | -35.8284     | -2.8671         | -2.8684       |
| 0.7273        | 0.15  | 920  | 2.2545          | -2.8595        | -1.7291          | 0.25               | -1.1304         | -24.2834       | -36.0963     | -2.8665         | -2.8678       |
| 1.2208        | 0.15  | 930  | 2.2655          | -2.8756        | -1.7364          | 0.25               | -1.1393         | -24.3561       | -36.2580     | -2.8656         | -2.8669       |
| 0.6928        | 0.16  | 940  | 2.2697          | -2.8817        | -1.7405          | 0.25               | -1.1412         | -24.3971       | -36.3184     | -2.8652         | -2.8665       |
| 2.2099        | 0.16  | 950  | 2.2581          | -2.8642        | -1.7302          | 0.25               | -1.1340         | -24.2945       | -36.1442     | -2.8656         | -2.8668       |
| 1.6883        | 0.16  | 960  | 2.2544          | -2.8575        | -1.7258          | 0.25               | -1.1318         | -24.2503       | -36.0772     | -2.8656         | -2.8668       |
| 1.9968        | 0.16  | 970  | 2.2455          | -2.8405        | -1.7135          | 0.25               | -1.1271         | -24.1270       | -35.9072     | -2.8657         | -2.8670       |
| 2.1044        | 0.16  | 980  | 2.2400          | -2.8308        | -1.7064          | 0.25               | -1.1243         | -24.0569       | -35.8097     | -2.8656         | -2.8668       |
| 0.7207        | 0.17  | 990  | 2.2376          | -2.8286        | -1.7067          | 0.25               | -1.1218         | -24.0597       | -35.7875     | -2.8654         | -2.8667       |
| 1.1388        | 0.17  | 1000 | 2.2375          | -2.8294        | -1.7077          | 0.25               | -1.1217         | -24.0692       | -35.7956     | -2.8653         | -2.8666       |
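
One way to read the table is to look for the checkpoint with the lowest validation loss and to see how the reward margin moves over time. The sketch below does this for a handful of rows copied from the table; the `EvalRow` type and its field names are hypothetical.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    validation_loss: float
    reward_margin: float


# A sample of rows copied from the table above (not the full log).
rows = [
    EvalRow(10, 0.6940, -0.0026),
    EvalRow(100, 0.7030, -0.0209),
    EvalRow(300, 0.9602, -0.3274),
    EvalRow(500, 1.6038, -0.7831),
    EvalRow(1000, 2.2375, -1.1217),
]

# Checkpoint with the lowest validation loss among the sampled rows.
best = min(rows, key=lambda r: r.validation_loss)

# Check whether loss only rises and margin only falls across the sample.
loss_rising = all(a.validation_loss < b.validation_loss for a, b in zip(rows, rows[1:]))
margin_falling = all(a.reward_margin > b.reward_margin for a, b in zip(rows, rows[1:]))

print(f"lowest validation loss at step {best.step}: {best.validation_loss}")
print(f"loss rising: {loss_rising}, margin falling: {margin_falling}")
```

In this sample the lowest validation loss is at the first evaluation (step 10), and the margin becomes steadily more negative afterwards.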


### Framework versions

- Transformers 4.35.2
- PyTorch 2.0.1+cu117
- Datasets 2.15.0
- Tokenizers 0.15.0