tanatapanun
committed on
Model save
README.md
ADDED
@@ -0,0 +1,114 @@
---
license: apache-2.0
base_model: GanjinZero/biobart-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: fine-tuned-BioBART-50-epochs-1024-input-160-output
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# fine-tuned-BioBART-50-epochs-1024-input-160-output

This model is a fine-tuned version of [GanjinZero/biobart-base](https://huggingface.co/GanjinZero/biobart-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 1.6492
- Rouge1: 0.173
- Rouge2: 0.0346
- Rougel: 0.1373
- Rougelsum: 0.1364
- Gen Len: 40.05

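A minimal inference sketch for this checkpoint. The repo id below is an assumption (commit author plus model name); the decoding arguments mirror the generation_config.json added in this commit, and the 1024/160 token limits follow the model name.

```python
# Assumed repo id (commit author + model name) -- adjust if the model lives elsewhere.
MODEL_ID = "tanatapanun/fine-tuned-BioBART-50-epochs-1024-input-160-output"

# Decoding settings mirroring generation_config.json in this commit.
GEN_KWARGS = {"max_length": 160, "num_beams": 4,
              "no_repeat_ngram_size": 3, "early_stopping": True}

def summarize(text: str) -> str:
    # Deferred import so the settings above can be inspected without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    # 1024-token inputs, 160-token summaries, per the model name.
    inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")
    ids = model.generate(**inputs, **GEN_KWARGS)
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```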
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 50

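The warmup ratio and epoch count above, combined with the 151 optimizer steps per epoch visible in the training log, pin down the schedule. A quick sanity check of that arithmetic (the step counts are taken from the log; the dataset-size figure is an upper bound inferred from them):

```python
# Schedule arithmetic from the hyperparameters above and the logged 151 steps/epoch.
train_batch_size = 8
steps_per_epoch = 151            # step count at epoch 1.0 in the training log
num_epochs = 50

total_steps = steps_per_epoch * num_epochs        # matches the final logged step, 7550
warmup_steps = int(0.1 * total_steps)             # linear warmup: lr_scheduler_warmup_ratio 0.1
approx_train_examples = steps_per_epoch * train_batch_size  # at most ~1208 training examples

print(total_steps, warmup_steps, approx_train_examples)
```

So the learning rate ramps up over the first 755 steps (roughly the first five epochs) and then decays linearly to zero at step 7550.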
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log | 1.0 | 151 | 8.7694 | 0.0 | 0.0 | 0.0 | 0.0 | 14.0 |
| No log | 2.0 | 302 | 4.3319 | 0.0047 | 0.0003 | 0.0047 | 0.0047 | 4.35 |
| No log | 3.0 | 453 | 1.6898 | 0.1088 | 0.0334 | 0.099 | 0.0996 | 15.96 |
| 6.0134 | 4.0 | 604 | 1.4547 | 0.1004 | 0.0207 | 0.0776 | 0.0769 | 24.61 |
| 6.0134 | 5.0 | 755 | 1.3712 | 0.1532 | 0.0285 | 0.1197 | 0.1194 | 38.39 |
| 6.0134 | 6.0 | 906 | 1.3235 | 0.1144 | 0.0282 | 0.0901 | 0.0913 | 23.41 |
| 1.2432 | 7.0 | 1057 | 1.2835 | 0.1011 | 0.025 | 0.0784 | 0.0798 | 25.49 |
| 1.2432 | 8.0 | 1208 | 1.2733 | 0.1536 | 0.0387 | 0.1169 | 0.1183 | 38.0 |
| 1.2432 | 9.0 | 1359 | 1.2842 | 0.1386 | 0.0244 | 0.1162 | 0.1162 | 20.83 |
| 0.7926 | 10.0 | 1510 | 1.2752 | 0.1812 | 0.0353 | 0.1352 | 0.1363 | 45.95 |
| 0.7926 | 11.0 | 1661 | 1.2846 | 0.1804 | 0.0378 | 0.1452 | 0.1464 | 31.63 |
| 0.7926 | 12.0 | 1812 | 1.2998 | 0.1899 | 0.0432 | 0.1346 | 0.1348 | 48.98 |
| 0.7926 | 13.0 | 1963 | 1.3226 | 0.1809 | 0.0474 | 0.143 | 0.1438 | 33.78 |
| 0.4817 | 14.0 | 2114 | 1.3471 | 0.1425 | 0.0341 | 0.1024 | 0.1028 | 37.38 |
| 0.4817 | 15.0 | 2265 | 1.3651 | 0.1805 | 0.0315 | 0.1402 | 0.1412 | 33.77 |
| 0.4817 | 16.0 | 2416 | 1.3818 | 0.1469 | 0.0333 | 0.1188 | 0.1191 | 30.55 |
| 0.2578 | 17.0 | 2567 | 1.3936 | 0.1734 | 0.0353 | 0.1339 | 0.133 | 36.63 |
| 0.2578 | 18.0 | 2718 | 1.4192 | 0.1988 | 0.0471 | 0.1576 | 0.1587 | 40.01 |
| 0.2578 | 19.0 | 2869 | 1.4183 | 0.1852 | 0.0378 | 0.1449 | 0.1444 | 39.72 |
| 0.1232 | 20.0 | 3020 | 1.4483 | 0.1625 | 0.0442 | 0.1285 | 0.1296 | 36.7 |
| 0.1232 | 21.0 | 3171 | 1.4582 | 0.1771 | 0.0408 | 0.1321 | 0.1329 | 41.33 |
| 0.1232 | 22.0 | 3322 | 1.4860 | 0.1813 | 0.0429 | 0.1458 | 0.1458 | 40.09 |
| 0.1232 | 23.0 | 3473 | 1.5091 | 0.1616 | 0.0373 | 0.1273 | 0.1269 | 37.73 |
| 0.0543 | 24.0 | 3624 | 1.4922 | 0.1914 | 0.0371 | 0.1429 | 0.143 | 45.71 |
| 0.0543 | 25.0 | 3775 | 1.5290 | 0.1642 | 0.0388 | 0.1307 | 0.1315 | 36.5 |
| 0.0543 | 26.0 | 3926 | 1.5310 | 0.1929 | 0.0428 | 0.1524 | 0.1521 | 40.69 |
| 0.0278 | 27.0 | 4077 | 1.5282 | 0.1691 | 0.0414 | 0.1355 | 0.1362 | 39.25 |
| 0.0278 | 28.0 | 4228 | 1.5424 | 0.1749 | 0.0424 | 0.1404 | 0.1408 | 44.13 |
| 0.0278 | 29.0 | 4379 | 1.5573 | 0.1922 | 0.0364 | 0.1549 | 0.1548 | 41.01 |
| 0.0174 | 30.0 | 4530 | 1.5614 | 0.1635 | 0.0358 | 0.1313 | 0.1318 | 39.58 |
| 0.0174 | 31.0 | 4681 | 1.5683 | 0.187 | 0.0427 | 0.1508 | 0.1508 | 39.52 |
| 0.0174 | 32.0 | 4832 | 1.5910 | 0.172 | 0.0262 | 0.1312 | 0.13 | 39.9 |
| 0.0174 | 33.0 | 4983 | 1.5748 | 0.1828 | 0.0429 | 0.1471 | 0.1483 | 38.88 |
| 0.0118 | 34.0 | 5134 | 1.5834 | 0.1702 | 0.034 | 0.1327 | 0.1321 | 38.71 |
| 0.0118 | 35.0 | 5285 | 1.5935 | 0.1987 | 0.0451 | 0.1576 | 0.1577 | 40.79 |
| 0.0118 | 36.0 | 5436 | 1.5993 | 0.193 | 0.0407 | 0.156 | 0.1555 | 41.14 |
| 0.009 | 37.0 | 5587 | 1.6120 | 0.1818 | 0.0393 | 0.1406 | 0.1408 | 40.82 |
| 0.009 | 38.0 | 5738 | 1.6203 | 0.1699 | 0.034 | 0.1344 | 0.1353 | 40.08 |
| 0.009 | 39.0 | 5889 | 1.6201 | 0.1866 | 0.0419 | 0.1446 | 0.1443 | 40.17 |
| 0.0068 | 40.0 | 6040 | 1.6161 | 0.1708 | 0.0279 | 0.136 | 0.1365 | 42.42 |
| 0.0068 | 41.0 | 6191 | 1.6334 | 0.1753 | 0.0396 | 0.14 | 0.14 | 38.92 |
| 0.0068 | 42.0 | 6342 | 1.6321 | 0.1806 | 0.0397 | 0.1448 | 0.1449 | 37.77 |
| 0.0068 | 43.0 | 6493 | 1.6399 | 0.1881 | 0.0373 | 0.1508 | 0.1499 | 40.48 |
| 0.0055 | 44.0 | 6644 | 1.6371 | 0.1847 | 0.0364 | 0.1486 | 0.1479 | 39.22 |
| 0.0055 | 45.0 | 6795 | 1.6421 | 0.1879 | 0.0368 | 0.1499 | 0.1491 | 40.72 |
| 0.0055 | 46.0 | 6946 | 1.6471 | 0.1862 | 0.0381 | 0.1484 | 0.1483 | 40.26 |
| 0.0044 | 47.0 | 7097 | 1.6503 | 0.1719 | 0.036 | 0.1362 | 0.1351 | 40.92 |
| 0.0044 | 48.0 | 7248 | 1.6493 | 0.1711 | 0.036 | 0.1375 | 0.1377 | 39.36 |
| 0.0044 | 49.0 | 7399 | 1.6492 | 0.1738 | 0.0353 | 0.1375 | 0.1365 | 40.71 |
| 0.004 | 50.0 | 7550 | 1.6492 | 0.173 | 0.0346 | 0.1373 | 0.1364 | 40.05 |

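Validation loss in the table bottoms out at epoch 8 (1.2733) and climbs steadily afterwards while training loss falls toward 0.004, the usual overfitting signature. Picking the best checkpoint is a one-liner over (epoch, validation loss) pairs; a sketch using a handful of values transcribed from the table:

```python
# A few (epoch, validation loss) pairs transcribed from the table above.
val_loss = {7: 1.2835, 8: 1.2733, 9: 1.2842, 10: 1.2752, 30: 1.5614, 50: 1.6492}

best_epoch = min(val_loss, key=val_loss.get)   # epoch with the lowest validation loss
print(best_epoch, val_loss[best_epoch])
```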
### Framework versions

- Transformers 4.36.2
- Pytorch 1.12.1+cu113
- Datasets 2.16.1
- Tokenizers 0.15.0
generation_config.json
ADDED
@@ -0,0 +1,12 @@
{
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_eos_token_id": 2,
  "max_length": 160,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.36.2"
}
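Of these decoding settings, `no_repeat_ngram_size: 3` is the least self-explanatory: during beam search, any token that would recreate a trigram already present in the hypothesis is banned. A toy sketch of that rule (illustrative only, not the transformers implementation):

```python
def banned_next_tokens(prev_tokens: list, n: int = 3) -> set:
    """Tokens that would complete an n-gram already seen in prev_tokens
    (the idea behind "no_repeat_ngram_size": 3 above)."""
    if len(prev_tokens) < n - 1:
        return set()
    prefix = tuple(prev_tokens[-(n - 1):])       # last n-1 generated tokens
    banned = set()
    for i in range(len(prev_tokens) - n + 1):
        if tuple(prev_tokens[i:i + n - 1]) == prefix:
            banned.add(prev_tokens[i + n - 1])   # would repeat an existing n-gram
    return banned

# After generating [5, 9, 5, 9], the trigram (5, 9, 5) already occurred,
# so 5 is banned as the next token.
print(banned_next_tokens([5, 9, 5, 9]))
```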
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:18f38a7aeaf8bdca3ccaf172517c6c556657c3350c197928ab49c37354b695df
 size 557912620
runs/Jan25_14-52-42_william-gpu-3090-10-8vlnc/events.out.tfevents.1706194363.william-gpu-3090-10-8vlnc.8365.1
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:797d7f482d9f1dc0013df669f1e3af189708322f5084aa4635c83d20a0b7e477
+size 34684