---
language: "en"
widget:
- text: "On the other hand, a decline of the arsenic content in hair and nail was observed after withdrawal of the drug."
- text: "These differences in gene expression have not been molecularly defined."
- text: "p65 was detected in the cytoplasm of FDC , whereas nuclei were negative."

datasets: 
- Genia
---

## A Biomedical POS Tagger for English
Trained on the GENIA corpus.

Eval:
```
              precision    recall  f1-score   support

           0       0.98      1.00      0.99       263
           3       0.93      1.00      0.97        14
           5       1.00      1.00      1.00         8
           6       0.99      0.99      0.99       169
           7       1.00      1.00      1.00       203
           8       0.99      1.00      1.00       195
           9       0.95      0.78      0.85        98
          10       0.83      1.00      0.91         5
          11       0.96      0.97      0.96       532
          12       1.00      1.00      1.00       252
          13       0.99      0.98      0.99      1575
          14       0.95      0.95      0.95       133
          15       0.89      0.89      0.89         9
          16       1.00      1.00      1.00         3
          18       0.99      1.00      0.99        69
          19       1.00      0.95      0.98        22
          20       0.99      1.00      1.00       395
          22       1.00      1.00      1.00      1328
          23       1.00      1.00      1.00       987
          24       1.00      1.00      1.00         6
          25       0.00      0.00      0.00         0
          26       1.00      1.00      1.00       620
          27       0.00      0.00      0.00         1
          28       1.00      1.00      1.00        39
          29       0.98      0.99      0.98      5674
          30       0.97      0.96      0.96      2075
          31       1.00      0.71      0.83         7
          32       1.00      0.80      0.89         5
          33       1.00      1.00      1.00        58
          34       1.00      1.00      1.00         2
          35       0.96      0.96      0.96       336
          37       0.99      1.00      1.00      1579
          38       1.00      1.00      1.00      1446
          39       1.00      0.98      0.99        57

    accuracy                           0.99     18165
   macro avg       0.92      0.91      0.91     18165
weighted avg       0.99      0.99      0.99     18165

F1:  0.985267446136761 Accuracy:  0.9853564547206166
```
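
The gap between the macro average (0.91) and the weighted average (0.99) comes from a few very rare labels: ids 25 and 27 score 0.00 with 0 and 1 supporting tokens, which lowers the unweighted mean but barely affects the support-weighted one. The sketch below recomputes both averages from the rounded per-label F1 values in the report, so the results only approximate the reported figures.

```python
# (label id, f1-score, support) rows copied from the evaluation report.
ROWS = [
    (0, 0.99, 263), (3, 0.97, 14), (5, 1.00, 8), (6, 0.99, 169),
    (7, 1.00, 203), (8, 1.00, 195), (9, 0.85, 98), (10, 0.91, 5),
    (11, 0.96, 532), (12, 1.00, 252), (13, 0.99, 1575), (14, 0.95, 133),
    (15, 0.89, 9), (16, 1.00, 3), (18, 0.99, 69), (19, 0.98, 22),
    (20, 1.00, 395), (22, 1.00, 1328), (23, 1.00, 987), (24, 1.00, 6),
    (25, 0.00, 0), (26, 1.00, 620), (27, 0.00, 1), (28, 1.00, 39),
    (29, 0.98, 5674), (30, 0.96, 2075), (31, 0.83, 7), (32, 0.89, 5),
    (33, 1.00, 58), (34, 1.00, 2), (35, 0.96, 336), (37, 1.00, 1579),
    (38, 1.00, 1446), (39, 0.99, 57),
]

total_support = sum(support for _, _, support in ROWS)

# Macro average: every label counts equally, however rare it is.
macro_f1 = sum(f1 for _, f1, _ in ROWS) / len(ROWS)

# Weighted average: each label counts in proportion to its support.
weighted_f1 = sum(f1 * support for _, f1, support in ROWS) / total_support

print(f"tokens={total_support} macro={macro_f1:.3f} weighted={weighted_f1:.3f}")
```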

Tags:
```
{0: 'VBD',
 1: 'N',
 2: 'XT',
 3: 'JJS',
 4: 'E2A',
 5: 'WRB',
 6: 'VB',
 7: 'TO',
 8: 'VBP',
 9: 'FW',
 10: 'EX',
 11: 'VBN',
 12: 'VBZ',
 13: 'NNS',
 14: 'VBG',
 15: 'RBR',
 16: 'WP',
 17: 'CT',
 18: 'PRP',
 19: 'JJR',
 20: 'CC',
 21: 'NNPS',
 22: 'CD',
 23: 'DT',
 24: 'NNP',
 25: 'PDT',
 26: 'LS',
 27: 'PP',
 28: 'PRP$',
 29: 'NN',
 30: 'JJ',
 31: 'RP',
 32: 'RBS',
 33: 'MD',
 34: 'WP$',
 35: 'RB',
 36: 'SYM',
 37: 'IN',
 38: 'PUNCT',
 39: 'WDT',
 40: 'POS',
 41: '<pad>'}
```
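
As an illustration of how this mapping is used, the sketch below turns a sequence of predicted label ids into tags and drops `<pad>` positions. The ids in `predicted_ids` are hand-written for the example, not real model output, and `decode_tags` is a hypothetical helper name.

```python
# Tag names in id order, copied from the mapping above (index == label id).
TAGS = [
    "VBD", "N", "XT", "JJS", "E2A", "WRB", "VB", "TO", "VBP", "FW",
    "EX", "VBN", "VBZ", "NNS", "VBG", "RBR", "WP", "CT", "PRP", "JJR",
    "CC", "NNPS", "CD", "DT", "NNP", "PDT", "LS", "PP", "PRP$", "NN",
    "JJ", "RP", "RBS", "MD", "WP$", "RB", "SYM", "IN", "PUNCT", "WDT",
    "POS", "<pad>",
]
ID2TAG = dict(enumerate(TAGS))


def decode_tags(tokens, label_ids, pad_tag="<pad>"):
    """Pair each token with its tag, skipping padded positions."""
    pairs = []
    for token, label_id in zip(tokens, label_ids):
        tag = ID2TAG[label_id]
        if tag != pad_tag:
            pairs.append((token, tag))
    return pairs


tokens = ["These", "differences", "in", "gene", "expression", "have",
          "not", "been", "molecularly", "defined", ".", "[PAD]", "[PAD]"]
# Hypothetical ids chosen by hand for illustration only.
predicted_ids = [23, 13, 37, 29, 29, 8, 35, 11, 35, 11, 38, 41, 41]

tagged = decode_tags(tokens, predicted_ids)
print(tagged)
```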
 
Parameters:
```
nepochs = 30 (stopped early at epoch 18)
batch_size = 32
batch_status = 32
learning_rate = 1e-5
early_stop = 3
max_length = 200
checkpoint: dmis-lab/biobert-base-cased-v1.2
```
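
The model card does not say how `early_stop` is applied. A common reading, assumed here, is a patience counter: training stops once the validation score has failed to improve for `early_stop` consecutive epochs. The sketch below shows that logic with made-up integer scores; `run_early_stopping` is a hypothetical helper and the scores are not the model's real validation results.

```python
def run_early_stopping(val_scores, nepochs=30, early_stop=3):
    """Return (stop_epoch, best_epoch) under a patience-based rule.

    val_scores[i] is the validation score after epoch i + 1.
    """
    best_score = float("-inf")
    best_epoch = 0
    epochs_without_gain = 0
    for epoch in range(1, nepochs + 1):
        score = val_scores[epoch - 1]
        if score > best_score:
            best_score, best_epoch = score, epoch
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
        if epochs_without_gain >= early_stop:
            return epoch, best_epoch
    return nepochs, best_epoch


# Made-up scores (in thousandths): steady gains for 15 epochs, then a plateau.
fake_scores = list(range(900, 975, 5)) + [970] * 15

stop_epoch, best_epoch = run_early_stopping(fake_scores)
print(f"stopped at epoch {stop_epoch}, best epoch {best_epoch}")
```

With these illustrative scores the last improvement is at epoch 15, so a patience of 3 ends training at epoch 18.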

See more at: https://github.com/lisaterumi/postagger-bio-english