Migrate model card from transformers-repo
Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/sachaarbonel/bert-italian-cased-finetuned-pos/README.md
README.md
ADDED
@@ -0,0 +1,96 @@
---
language: it
datasets:
- xtreme
---

# Italian BERT + POS

This model is a version of [Bert Base Italian](https://huggingface.co/dbmdz/bert-base-italian-cased) fine-tuned on [xtreme udpos Italian](https://huggingface.co/nlp/viewer/?dataset=xtreme&config=udpos.Italian) for the part-of-speech (**POS**) tagging downstream task.

## Details of the downstream task (POS) - Dataset

- [Dataset: xtreme udpos Italian](https://huggingface.co/nlp/viewer/?dataset=xtreme&config=udpos.Italian)

| Dataset | # Examples |
| ------- | ---------- |
| Train   | 716 K      |
| Dev     | 85 K       |

- [Fine-tune on NER script provided by @stefan-it](https://raw.githubusercontent.com/stefan-it/fine-tuned-berts-seq/master/scripts/preprocess.py)

- Labels covered:

```
ADJ
ADP
ADV
AUX
CCONJ
DET
INTJ
NOUN
NUM
PART
PRON
PROPN
PUNCT
SCONJ
SYM
VERB
X
```
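
Token-classification heads predict integer class ids, so tag names like the ones above are usually paired with ids through a mapping. The following is a minimal sketch, assuming the ids follow the order of the list above. That ordering is an assumption made for illustration; the checkpoint's own `config.json` is the authoritative source for the real mapping.

```python
# The 17 Universal Dependencies POS tags listed in this card.
UPOS_LABELS = [
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
]

# Hypothetical id assignment: each tag's position in the list above.
id2label = dict(enumerate(UPOS_LABELS))
label2id = {label: idx for idx, label in id2label.items()}

print(len(UPOS_LABELS))   # 17
print(label2id["NOUN"])   # 7
```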

## Metrics on evaluation set

| Metric    | Score (%) |
| :-------: | :-------: |
| F1        | **97.25** |
| Precision | **97.15** |
| Recall    | **97.36** |
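
As a quick sanity check, the three scores can be compared against each other. This sketch assumes the reported F1 is the harmonic mean of the reported precision and recall, which is the usual definition; the card itself does not state how the scores were computed.

```python
def f1_from(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Scores reported in the metrics table, in percent.
precision, recall = 97.15, 97.36
f1 = f1_from(precision, recall)

print(round(f1, 2))  # 97.25, matching the reported F1
```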

## Model in action

Example of usage:

```python
from transformers import AutoTokenizer, pipeline

model_name = "sachaarbonel/bert-italian-cased-finetuned-pos"

# Load the slow (pure Python) tokenizer from the same repository as the model.
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)

# POS tagging is a token-classification task, served by the "ner" pipeline.
nlp_pos = pipeline("ner", model=model_name, tokenizer=tokenizer)

text = "Roma è la Capitale d'Italia."

nlp_pos(text)

'''
Output:
--------
[{'entity': 'PROPN', 'index': 1, 'score': 0.9995346665382385, 'word': 'roma'},
 {'entity': 'AUX', 'index': 2, 'score': 0.9966597557067871, 'word': 'e'},
 {'entity': 'DET', 'index': 3, 'score': 0.9994786977767944, 'word': 'la'},
 {'entity': 'NOUN',
  'index': 4,
  'score': 0.9995198249816895,
  'word': 'capitale'},
 {'entity': 'ADP', 'index': 5, 'score': 0.9990894198417664, 'word': 'd'},
 {'entity': 'PART', 'index': 6, 'score': 0.57159024477005, 'word': "'"},
 {'entity': 'PROPN',
  'index': 7,
  'score': 0.9994804263114929,
  'word': 'italia'},
 {'entity': 'PUNCT', 'index': 8, 'score': 0.9772886633872986, 'word': '.'}]
'''
```
Yeah! Not too bad
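
The pipeline returns one dictionary per token, which is often more detail than a caller needs. The sketch below post-processes the sample output from this card into `(word, tag)` pairs and flags low-confidence tokens. The helper names `to_word_tag_pairs` and `low_confidence`, and the `0.9` threshold, are hypothetical choices for illustration and not part of `transformers`.

```python
from typing import Dict, List, Tuple

# Predictions copied from the sample output in this card, with scores
# rounded to four decimals for readability.
predictions = [
    {"entity": "PROPN", "index": 1, "score": 0.9995, "word": "roma"},
    {"entity": "AUX", "index": 2, "score": 0.9967, "word": "e"},
    {"entity": "DET", "index": 3, "score": 0.9995, "word": "la"},
    {"entity": "NOUN", "index": 4, "score": 0.9995, "word": "capitale"},
    {"entity": "ADP", "index": 5, "score": 0.9991, "word": "d"},
    {"entity": "PART", "index": 6, "score": 0.5716, "word": "'"},
    {"entity": "PROPN", "index": 7, "score": 0.9995, "word": "italia"},
    {"entity": "PUNCT", "index": 8, "score": 0.9773, "word": "."},
]


def to_word_tag_pairs(preds: List[Dict]) -> List[Tuple[str, str]]:
    """Order predictions by token index and keep only (word, tag)."""
    ordered = sorted(preds, key=lambda p: p["index"])
    return [(p["word"], p["entity"]) for p in ordered]


def low_confidence(preds: List[Dict], threshold: float = 0.9) -> List[str]:
    """Return the words whose predicted tag scored below the threshold."""
    return [p["word"] for p in preds if p["score"] < threshold]


print(to_word_tag_pairs(predictions)[:3])
# [('roma', 'PROPN'), ('e', 'AUX'), ('la', 'DET')]
print(low_confidence(predictions))
# ["'"]
```

Only the apostrophe falls below the threshold here, which matches the one visibly uncertain prediction (`PART`, score about 0.57) in the sample output. Longer inputs may also contain word-piece fragments that this sketch does not merge.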

> Created by [Sacha Arbonel/@sachaarbonel](https://twitter.com/sachaarbonel) | [LinkedIn](https://www.linkedin.com/in/sacha-arbonel)

> Made with <span style="color: #e25555;">♥</span> in Paris