init: model card
README.md
---
license: afl-3.0
language:
- ja
metrics:
- seqeval
library_name: transformers
pipeline_tag: token-classification
---
# SMM4H-2024 Task 2 Japanese NER

## Overview

This is a named entity recognition (NER) model created by fine-tuning [daisaku-s/medtxt_ner_roberta](https://huggingface.co/daisaku-s/medtxt_ner_roberta) on the [SMM4H 2024 Task 2a](https://healthlanguageprocessing.org/smm4h-2024/) corpus.

Tag set (IOB2 format):
* DRUG
* DISORDER
* FUNCTION
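
In IOB2, each entity type contributes a `B-` (begin) tag and an `I-` (inside) tag, and every token outside an entity is tagged `O`. The sketch below enumerates the label set this implies for the three entity types above. The helper name `iob2_labels` is hypothetical, and the exact label strings and their order are an assumption; the authoritative mapping is whatever `model.config.id2label` contains.

```python
# Entity types listed in the model card.
ENTITY_TYPES = ["DRUG", "DISORDER", "FUNCTION"]


def iob2_labels(entity_types):
    """Return the IOB2 label set: "O" plus B-/I- tags for each entity type.

    Hypothetical helper for illustration; the model's real label names and
    ordering should be read from model.config.id2label.
    """
    labels = ["O"]  # token outside any entity
    for ent in entity_types:
        labels.append(f"B-{ent}")  # first token of an entity
        labels.append(f"I-{ent}")  # continuation token of the same entity
    return labels


labels = iob2_labels(ENTITY_TYPES)
print(labels)
```

With three entity types this gives seven labels in total.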

## Usage

```python
import torch
from transformers import AutoTokenizer, BertForTokenClassification

text = "サンプルテキスト"
model_name = "yseop/SMM4H2024_Task2a_ja"

# Load the fine-tuned model (in eval mode) and its tokenizer.
model = BertForTokenClassification.from_pretrained(model_name).eval()
tokenizer = AutoTokenizer.from_pretrained(model_name)
idx2tag = model.config.id2label  # class index -> IOB2 tag

# Run the forward pass without tracking gradients.
with torch.inference_mode():
    vecs = tokenizer(text,
                     padding=True,
                     truncation=True,
                     return_tensors="pt")
    ner_logits = model(input_ids=vecs["input_ids"],
                       attention_mask=vecs["attention_mask"])

# Highest-scoring tag index for each token of the single input sentence.
idx = torch.argmax(ner_logits.logits, dim=2).tolist()[0]
# Drop the first and last positions (special tokens) so tokens and tags align.
token = tokenizer.convert_ids_to_tokens(vecs["input_ids"][0].tolist())[1:-1]
pred_tag = [idx2tag[x] for x in idx][1:-1]
```
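
The usage snippet ends with two aligned lists, `token` and `pred_tag`. One possible way to turn them into entity spans is sketched below. The function `iob2_to_spans` is a hypothetical helper, not part of the model or of `transformers`, and the example tokens and tags are made up for illustration rather than real model output. The sketch is lenient: an `I-` tag that does not continue an open span of the same type is treated as the start of a new span.

```python
def iob2_to_spans(tokens, tags):
    """Group aligned tokens and IOB2 tags into (entity_type, tokens) spans."""
    spans = []
    current_type, current_tokens = None, []
    for tok, tag in zip(tokens, tags):
        prefix, _, ent = tag.partition("-")
        if prefix == "I" and ent == current_type:
            # Continuation of the span that is currently open.
            current_tokens.append(tok)
            continue
        # Any other tag closes the open span, if there is one.
        if current_type is not None:
            spans.append((current_type, current_tokens))
            current_type, current_tokens = None, []
        if prefix in ("B", "I"):
            # "B-X" starts a span; a stray "I-X" is leniently treated the same.
            current_type, current_tokens = ent, [tok]
    if current_type is not None:
        spans.append((current_type, current_tokens))
    return spans


# Illustrative input only (assumed tokenization and tags, not model output).
example_tokens = ["頭痛", "が", "ひどい", "ので", "ロキソニン", "を", "飲んだ"]
example_tags = ["B-DISORDER", "O", "O", "O", "B-DRUG", "O", "O"]
print(iob2_to_spans(example_tokens, example_tags))
```

The spans are returned as token lists rather than joined strings, because how subword pieces should be merged depends on the tokenizer.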

## Results

| NE | TP | FP | FN | Precision | Recall | F1 |
|---|---:|---:|---:|---:|---:|---:|
| DISORDER | 588 | 409 | 330 | 0.5898 | 0.6405 | 0.6141 |
| DRUG | 307 | 143 | 169 | 0.6822 | 0.6450 | 0.6631 |
| FUNCTION | 69 | 160 | 170 | 0.3013 | 0.2887 | 0.2949 |
| all | 964 | 712 | 669 | 0.5752 | 0.5903 | 0.5827 |
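
The precision, recall, and F1 columns follow from the true positive, false positive, and false negative counts in the table. The sketch below recomputes them with the standard definitions; the helper name `prf` is hypothetical. The `all` row appears consistent with micro-averaging, that is, summing the counts over the three entity types before computing the scores.

```python
def prf(tp, fp, fn):
    """Return (precision, recall, f1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# (tp, fp, fn) per entity type, copied from the results table.
counts = {
    "DISORDER": (588, 409, 330),
    "DRUG": (307, 143, 169),
    "FUNCTION": (69, 160, 170),
}

# Micro-average: sum the counts over entity types, then score once.
total = tuple(sum(c[i] for c in counts.values()) for i in range(3))

for name, (tp, fp, fn) in {**counts, "all": total}.items():
    p, r, f = prf(tp, fp, fn)
    print(f"{name}: precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```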