Fill-Mask
Transformers
PyTorch
Portuguese
deberta-v2
albertina-pt*
albertina-100m-portuguese-ptpt
albertina-100m-portuguese-ptbr
albertina-900m-portuguese-ptpt
albertina-900m-portuguese-ptbr
albertina-1b5-portuguese-ptpt
albertina-1b5-portuguese-ptbr
bert
deberta
portuguese
encoder
foundation model
Inference Endpoints
jarodrigues committed on
Commit 0adcda1 • 1 Parent(s): d73d950
Update README.md

README.md CHANGED
@@ -39,7 +39,7 @@ widget:
 It is an **encoder** of the BERT family, based on the neural architecture Transformer and
 developed over the DeBERTa model, with most competitive performance for this language.
 It has different versions that were trained for different variants of Portuguese (PT),
-namely the European variant from Portugal (**
+namely the European variant from Portugal (**PTPT**) and the American variant from Brazil (**PTBR**),
 and it is distributed free of charge and under a most permissible license.
 
 | Albertina's Family of Models |
@@ -53,7 +53,7 @@ and it is distributed free of charge and under a most permissible license.
 | [**Albertina 100M PTPT**](https://huggingface.co/PORTULAN/albertina-100m-portuguese-ptpt-encoder) |
 | [**Albertina 100M PTBR**](https://huggingface.co/PORTULAN/albertina-100m-portuguese-ptbr-encoder) |
 
-**Albertina 1.5B PTPT** is the version for European
+**Albertina 1.5B PTPT** is the version for **European Portuguese** from **Portugal**,
 and to the best of our knowledge, this is an encoder specifically for this language and variant
 that, at the time of its initial distribution, sets a new state of the art for it, and is made publicly available
 and distributed for reuse.
@@ -127,8 +127,8 @@ We opted for a learning rate of 1e-5 with linear decay and 10k warm-up steps.
 # Evaluation
 
 
-We resorted to [
-We automatically translated the tasks from GLUE and SUPERGLUE using [DeepL Translate](https://www.deepl.com/), which specifically provides translation from English to
+We resorted to [ExtraGLUE](https://huggingface.co/datasets/PORTULAN/extraglue), a **PTPT version of the GLUE and SUPERGLUE** benchmark.
+We automatically translated the tasks from GLUE and SUPERGLUE using [DeepL Translate](https://www.deepl.com/), which specifically provides translation from English to PTPT as an option.
 
 | Model | RTE (Accuracy) | WNLI (Accuracy)| MRPC (F1) | STS-B (Pearson) | COPA (Accuracy) | CB (F1) | MultiRC (F1) | BoolQ (Accuracy) |
 |-------------------------------|----------------|----------------|-----------|-----------------|-----------------|------------|--------------|------------------|
@@ -137,8 +137,8 @@ We automatically translated the tasks from GLUE and SUPERGLUE using [DeepL Trans
 | **Albertina 900M PTPT** | 0.8339 | 0.4225 | **0.9171**| 0.8801 | 0.7033 | 0.6018 | 0.6728 | 0.8224 |
 | **Albertina 100M PTPT** | 0.6919 | 0.4742 | 0.8047 | 0.8590 | n.a. | 0.4529 | 0.6481 | 0.7578 |
 ||||||||||
-| **DeBERTa 1.5B
-| **DeBERTa 100M
+| **DeBERTa 1.5B (English)** | 0.8147 | 0.4554 | 0.8696 | 0.8557 | 0.5167 | 0.4901 | 0.6687 | 0.8347 |
+| **DeBERTa 100M (English)** | 0.6029 | **0.5634** | 0.7802 | 0.8320 | n.a. | 0.4698 | 0.6368 | 0.6829 |
 
 
 
@@ -153,8 +153,8 @@ We automatically translated the tasks from GLUE and SUPERGLUE using [DeepL Trans
 | **BERTimbau (335M)** | 0.6446 | **0.5634** | 0.8873 | 0.8842 | 0.6933 | 0.5438 | 0.6787 | 0.7783 |
 | **Albertina 100M PTBR** | 0.6582 | **0.5634** | 0.8149 | 0.8489 | n.a. | 0.4771 | 0.6469 | 0.7537 |
 ||||||||||
-| **DeBERTa 1.5B
-| **DeBERTa 100M
+| **DeBERTa 1.5B (English)** | 0.7112 | **0.5634** | 0.8545 | 0.0123 | 0.5700 | 0.4307 | 0.3639 | 0.6217 |
+| **DeBERTa 100M (English)** | 0.5716 | 0.5587 | 0.8060 | 0.8266 | n.a. | 0.4739 | 0.6391 | 0.6838 |
 
 
 <br>
@@ -212,9 +212,11 @@ The model can be used by fine-tuning it for a specific task:
 When using or citing this model, kindly cite the following [publication](https://arxiv.org/abs/?):
 
 ``` latex
-@misc{albertina-pt,
-title={Fostering the Ecosystem of Open Neural Encoders for Portuguese
-
+@misc{albertina-pt-fostering,
+title={Fostering the Ecosystem of Open Neural Encoders for Portuguese
+with Albertina PT-* family},
+author={Rodrigo Santos and João Rodrigues and Luís Gomes and João Silva
+and António Branco and Henrique Lopes Cardoso
 and Tomás Freitas Osório and Bernardo Leite},
 year={2024},
 eprint={?},
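The front-matter tags above mark these checkpoints for the fill-mask task. As a quick orientation alongside the diff, here is a minimal usage sketch, assuming the standard `transformers` fill-mask pipeline; the model ID is one of the encoders linked in the card, and the Portuguese example sentence is invented for illustration.

```python
# Minimal sketch: masked-token prediction with one of the Albertina encoders.
# The model ID comes from the card; the example sentence is illustrative only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="PORTULAN/albertina-100m-portuguese-ptpt-encoder")

# Use the tokenizer's own mask token rather than hard-coding "[MASK]".
sentence = f"A culinária portuguesa é rica em sabor e {fill_mask.tokenizer.mask_token}."
for prediction in fill_mask(sentence):
    print(f"{prediction['token_str']:>15}  score={prediction['score']:.4f}")
```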
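The final hunk's context line notes that "The model can be used by fine-tuning it for a specific task". As a rough sketch of what that could look like on a sentence-pair task such as the translated RTE used in the Evaluation section, the snippet below fine-tunes one of the smaller encoders with the `transformers` Trainer. The CSV files, column names, label count, and most hyperparameters are placeholders rather than the authors' published setup; the 1e-5 learning rate simply reuses the value the card reports for pretraining.

```python
# Illustrative fine-tuning sketch (not the authors' exact recipe): sentence-pair
# classification on top of an Albertina encoder using the Trainer API.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

model_name = "PORTULAN/albertina-100m-portuguese-ptpt-encoder"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Placeholder data: any CSV with "premise", "hypothesis", and integer "label" columns.
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "dev.csv"})

def tokenize(batch):
    # Encode both sentences of each pair together, truncating long inputs.
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="albertina-finetuned",
        learning_rate=1e-5,              # value the card reports for pretraining, reused as a placeholder
        num_train_epochs=3,              # placeholder
        per_device_train_batch_size=16,  # placeholder
    ),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```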