---
tags:
- generated_from_trainer
model-index:
- name: t5-small-p-l-akk-en-20240809-220318
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# t5-small-p-l-akk-en-20240809-220318

This model was trained from scratch. The training dataset was not recorded (the Trainer reported it as `None`).
It achieves the following results on the evaluation set:
- Loss: 0.1791

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 250
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch    | Step   | Validation Loss |
|:-------------:|:--------:|:------:|:---------------:|
| 0.1938        | 1.3270   | 2500   | 0.2014          |
| 0.1921        | 2.6539   | 5000   | 0.2010          |
| 0.1884        | 3.9809   | 7500   | 0.1993          |
| 0.1919        | 5.3079   | 10000  | 0.1985          |
| 0.1849        | 6.6348   | 12500  | 0.1981          |
| 0.1907        | 7.9618   | 15000  | 0.1969          |
| 0.1869        | 9.2887   | 17500  | 0.1970          |
| 0.1872        | 10.6157  | 20000  | 0.1969          |
| 0.183         | 11.9427  | 22500  | 0.1963          |
| 0.183         | 13.2696  | 25000  | 0.1957          |
| 0.1872        | 14.5966  | 27500  | 0.1946          |
| 0.1802        | 15.9236  | 30000  | 0.1931          |
| 0.1823        | 17.2505  | 32500  | 0.1932          |
| 0.1791        | 18.5775  | 35000  | 0.1927          |
| 0.1798        | 19.9045  | 37500  | 0.1924          |
| 0.1803        | 21.2314  | 40000  | 0.1916          |
| 0.179         | 22.5584  | 42500  | 0.1912          |
| 0.1794        | 23.8854  | 45000  | 0.1905          |
| 0.1783        | 25.2123  | 47500  | 0.1904          |
| 0.1741        | 26.5393  | 50000  | 0.1900          |
| 0.1712        | 27.8662  | 52500  | 0.1900          |
| 0.1747        | 29.1932  | 55000  | 0.1901          |
| 0.1705        | 30.5202  | 57500  | 0.1892          |
| 0.1719        | 31.8471  | 60000  | 0.1889          |
| 0.1716        | 33.1741  | 62500  | 0.1891          |
| 0.1681        | 34.5011  | 65000  | 0.1890          |
| 0.1694        | 35.8280  | 67500  | 0.1875          |
| 0.1677        | 37.1550  | 70000  | 0.1878          |
| 0.169         | 38.4820  | 72500  | 0.1861          |
| 0.17          | 39.8089  | 75000  | 0.1863          |
| 0.1662        | 41.1359  | 77500  | 0.1858          |
| 0.163         | 42.4628  | 80000  | 0.1862          |
| 0.1637        | 43.7898  | 82500  | 0.1859          |
| 0.1647        | 45.1168  | 85000  | 0.1854          |
| 0.1609        | 46.4437  | 87500  | 0.1856          |
| 0.1678        | 47.7707  | 90000  | 0.1846          |
| 0.1595        | 49.0977  | 92500  | 0.1849          |
| 0.1605        | 50.4246  | 95000  | 0.1849          |
| 0.1609        | 51.7516  | 97500  | 0.1843          |
| 0.1635        | 53.0786  | 100000 | 0.1847          |
| 0.1583        | 54.4055  | 102500 | 0.1836          |
| 0.1564        | 55.7325  | 105000 | 0.1836          |
| 0.1606        | 57.0594  | 107500 | 0.1834          |
| 0.1555        | 58.3864  | 110000 | 0.1833          |
| 0.1572        | 59.7134  | 112500 | 0.1826          |
| 0.1601        | 61.0403  | 115000 | 0.1838          |
| 0.1567        | 62.3673  | 117500 | 0.1832          |
| 0.1551        | 63.6943  | 120000 | 0.1815          |
| 0.1558        | 65.0212  | 122500 | 0.1825          |
| 0.1531        | 66.3482  | 125000 | 0.1819          |
| 0.155         | 67.6752  | 127500 | 0.1823          |
| 0.1562        | 69.0021  | 130000 | 0.1815          |
| 0.1536        | 70.3291  | 132500 | 0.1820          |
| 0.1501        | 71.6561  | 135000 | 0.1819          |
| 0.1532        | 72.9830  | 137500 | 0.1813          |
| 0.1501        | 74.3100  | 140000 | 0.1816          |
| 0.1507        | 75.6369  | 142500 | 0.1809          |
| 0.1501        | 76.9639  | 145000 | 0.1812          |
| 0.1474        | 78.2909  | 147500 | 0.1802          |
| 0.1462        | 79.6178  | 150000 | 0.1819          |
| 0.1464        | 80.9448  | 152500 | 0.1807          |
| 0.1465        | 82.2718  | 155000 | 0.1802          |
| 0.1478        | 83.5987  | 157500 | 0.1810          |
| 0.1451        | 84.9257  | 160000 | 0.1794          |
| 0.144         | 86.2527  | 162500 | 0.1816          |
| 0.144         | 87.5796  | 165000 | 0.1803          |
| 0.1453        | 88.9066  | 167500 | 0.1795          |
| 0.1429        | 90.2335  | 170000 | 0.1792          |
| 0.1438        | 91.5605  | 172500 | 0.1804          |
| 0.1452        | 92.8875  | 175000 | 0.1790          |
| 0.1453        | 94.2144  | 177500 | 0.1791          |
| 0.1406        | 95.5414  | 180000 | 0.1799          |
| 0.1391        | 96.8684  | 182500 | 0.1792          |
| 0.144         | 98.1953  | 185000 | 0.1793          |
| 0.144         | 99.5223  | 187500 | 0.1787          |
| 0.1385        | 100.8493 | 190000 | 0.1784          |
| 0.1406        | 102.1762 | 192500 | 0.1787          |
| 0.142         | 103.5032 | 195000 | 0.1800          |
| 0.1394        | 104.8301 | 197500 | 0.1787          |
| 0.1391        | 106.1571 | 200000 | 0.1789          |
| 0.1357        | 107.4841 | 202500 | 0.1797          |
| 0.1384        | 108.8110 | 205000 | 0.1785          |
| 0.1408        | 110.1380 | 207500 | 0.1792          |
| 0.1366        | 111.4650 | 210000 | 0.1800          |
| 0.1375        | 112.7919 | 212500 | 0.1792          |
| 0.1383        | 114.1189 | 215000 | 0.1790          |
| 0.1351        | 115.4459 | 217500 | 0.1788          |
| 0.1382        | 116.7728 | 220000 | 0.1784          |
| 0.1341        | 118.0998 | 222500 | 0.1791          |
| 0.1385        | 119.4268 | 225000 | 0.1788          |
| 0.1353        | 120.7537 | 227500 | 0.1783          |
| 0.1362        | 122.0807 | 230000 | 0.1783          |
| 0.1343        | 123.4076 | 232500 | 0.1783          |
| 0.1419        | 124.7346 | 235000 | 0.1786          |
| 0.1332        | 126.0616 | 237500 | 0.1787          |
| 0.1333        | 127.3885 | 240000 | 0.1785          |
| 0.1336        | 128.7155 | 242500 | 0.1782          |
| 0.132         | 130.0425 | 245000 | 0.1783          |
| 0.1299        | 131.3694 | 247500 | 0.1776          |
| 0.1313        | 132.6964 | 250000 | 0.1790          |
| 0.1302        | 134.0234 | 252500 | 0.1775          |
| 0.1301        | 135.3503 | 255000 | 0.1786          |
| 0.1337        | 136.6773 | 257500 | 0.1785          |
| 0.1302        | 138.0042 | 260000 | 0.1791          |
| 0.1288        | 139.3312 | 262500 | 0.1789          |
| 0.1321        | 140.6582 | 265000 | 0.1785          |
| 0.1299        | 141.9851 | 267500 | 0.1779          |
| 0.129         | 143.3121 | 270000 | 0.1791          |
| 0.13          | 144.6391 | 272500 | 0.1780          |
| 0.133         | 145.9660 | 275000 | 0.1786          |
| 0.1295        | 147.2930 | 277500 | 0.1781          |
| 0.1283        | 148.6200 | 280000 | 0.1780          |
| 0.127         | 149.9469 | 282500 | 0.1778          |
| 0.1246        | 151.2739 | 285000 | 0.1785          |
| 0.1293        | 152.6008 | 287500 | 0.1783          |
| 0.1259        | 153.9278 | 290000 | 0.1781          |
| 0.129         | 155.2548 | 292500 | 0.1777          |
| 0.126         | 156.5817 | 295000 | 0.1778          |
| 0.1275        | 157.9087 | 297500 | 0.1777          |
| 0.1259        | 159.2357 | 300000 | 0.1784          |
| 0.1273        | 160.5626 | 302500 | 0.1774          |
| 0.1272        | 161.8896 | 305000 | 0.1786          |
| 0.1243        | 163.2166 | 307500 | 0.1787          |
| 0.1245        | 164.5435 | 310000 | 0.1784          |
| 0.1259        | 165.8705 | 312500 | 0.1785          |
| 0.1262        | 167.1975 | 315000 | 0.1779          |
| 0.1242        | 168.5244 | 317500 | 0.1783          |
| 0.1241        | 169.8514 | 320000 | 0.1779          |
| 0.1293        | 171.1783 | 322500 | 0.1792          |
| 0.1247        | 172.5053 | 325000 | 0.1777          |
| 0.1266        | 173.8323 | 327500 | 0.1790          |
| 0.1232        | 175.1592 | 330000 | 0.1787          |
| 0.1239        | 176.4862 | 332500 | 0.1788          |
| 0.1248        | 177.8132 | 335000 | 0.1789          |
| 0.1242        | 179.1401 | 337500 | 0.1787          |
| 0.1236        | 180.4671 | 340000 | 0.1786          |
| 0.1259        | 181.7941 | 342500 | 0.1787          |
| 0.1206        | 183.1210 | 345000 | 0.1779          |
| 0.1226        | 184.4480 | 347500 | 0.1778          |
| 0.1231        | 185.7749 | 350000 | 0.1782          |
| 0.1201        | 187.1019 | 352500 | 0.1789          |
| 0.121         | 188.4289 | 355000 | 0.1791          |
| 0.1223        | 189.7558 | 357500 | 0.1792          |
| 0.1227        | 191.0828 | 360000 | 0.1779          |
| 0.121         | 192.4098 | 362500 | 0.1783          |
| 0.1211        | 193.7367 | 365000 | 0.1790          |
| 0.1249        | 195.0637 | 367500 | 0.1787          |
| 0.1216        | 196.3907 | 370000 | 0.1781          |
| 0.1224        | 197.7176 | 372500 | 0.1785          |
| 0.1208        | 199.0446 | 375000 | 0.1794          |
| 0.1203        | 200.3715 | 377500 | 0.1787          |
| 0.1179        | 201.6985 | 380000 | 0.1786          |
| 0.1214        | 203.0255 | 382500 | 0.1785          |
| 0.1204        | 204.3524 | 385000 | 0.1790          |
| 0.118         | 205.6794 | 387500 | 0.1782          |
| 0.1224        | 207.0064 | 390000 | 0.1793          |
| 0.1225        | 208.3333 | 392500 | 0.1788          |
| 0.121         | 209.6603 | 395000 | 0.1790          |
| 0.1187        | 210.9873 | 397500 | 0.1788          |
| 0.1225        | 212.3142 | 400000 | 0.1787          |
| 0.119         | 213.6412 | 402500 | 0.1786          |
| 0.1179        | 214.9682 | 405000 | 0.1793          |
| 0.1212        | 216.2951 | 407500 | 0.1790          |
| 0.12          | 217.6221 | 410000 | 0.1791          |
| 0.1204        | 218.9490 | 412500 | 0.1788          |
| 0.1202        | 220.2760 | 415000 | 0.1786          |
| 0.1224        | 221.6030 | 417500 | 0.1794          |
| 0.1175        | 222.9299 | 420000 | 0.1785          |
| 0.1188        | 224.2569 | 422500 | 0.1783          |
| 0.118         | 225.5839 | 425000 | 0.1789          |
| 0.1197        | 226.9108 | 427500 | 0.1789          |
| 0.1181        | 228.2378 | 430000 | 0.1786          |
| 0.1195        | 229.5648 | 432500 | 0.1792          |
| 0.1206        | 230.8917 | 435000 | 0.1790          |
| 0.1174        | 232.2187 | 437500 | 0.1793          |
| 0.1189        | 233.5456 | 440000 | 0.1787          |
| 0.1183        | 234.8726 | 442500 | 0.1787          |
| 0.1193        | 236.1996 | 445000 | 0.1790          |
| 0.1171        | 237.5265 | 447500 | 0.1788          |
| 0.1179        | 238.8535 | 450000 | 0.1789          |
| 0.1202        | 240.1805 | 452500 | 0.1789          |
| 0.1206        | 241.5074 | 455000 | 0.1786          |
| 0.1183        | 242.8344 | 457500 | 0.1789          |
| 0.1183        | 244.1614 | 460000 | 0.1790          |
| 0.1181        | 245.4883 | 462500 | 0.1791          |
| 0.1205        | 246.8153 | 465000 | 0.1790          |
| 0.1208        | 248.1423 | 467500 | 0.1791          |
| 0.1175        | 249.4692 | 470000 | 0.1791          |
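
The Epoch and Step columns can be used to recover a few quantities the card does not state directly. The sketch below works on a handful of rows copied from the table. It assumes a single device and no gradient accumulation, neither of which the card confirms, so the example count is only an upper-bound estimate. The helper names are hypothetical.

```python
# (epoch, step, validation_loss) for a few rows copied from the table above.
rows = [
    (1.3270, 2500, 0.2014),
    (134.0234, 252500, 0.1775),
    (160.5626, 302500, 0.1774),
    (249.4692, 470000, 0.1791),
]

TRAIN_BATCH_SIZE = 24  # train_batch_size from the hyperparameter list


def steps_per_epoch(epoch: float, step: int) -> int:
    """Estimate optimizer steps per epoch from one logged (epoch, step) pair."""
    return round(step / epoch)


def best_row(table):
    """Return the sampled row with the lowest validation loss."""
    return min(table, key=lambda r: r[2])


if __name__ == "__main__":
    spe = steps_per_epoch(*rows[0][:2])
    print("steps per epoch:", spe)
    # ASSUMPTION: one device, no gradient accumulation, last batch may be partial.
    print("training examples (upper bound):", spe * TRAIN_BATCH_SIZE)
    epoch, step, val = best_row(rows)
    print(f"lowest sampled validation loss: {val} at step {step} (epoch {epoch})")
```

Among the rows sampled here, the lowest validation loss occurs well before the final step. Keeping the checkpoint with the best validation loss, rather than the last one, may therefore be worth considering.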


### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.1+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1