jarodrigues committed
Commit 0adcda1
Parent(s): d73d950

Update README.md

Files changed (1)
  1. README.md +13 -11
README.md CHANGED
@@ -39,7 +39,7 @@ widget:
  It is an **encoder** of the BERT family, based on the neural architecture Transformer and
  developed over the DeBERTa model, with most competitive performance for this language.
  It has different versions that were trained for different variants of Portuguese (PT),
- namely the European variant from Portugal (**PT-PT**) and the American variant from Brazil (**PT-BR**),
+ namely the European variant from Portugal (**PTPT**) and the American variant from Brazil (**PTBR**),
  and it is distributed free of charge and under a most permissible license.

  | Albertina's Family of Models |
@@ -53,7 +53,7 @@ and it is distributed free of charge and under a most permissible license.
  | [**Albertina 100M PTPT**](https://huggingface.co/PORTULAN/albertina-100m-portuguese-ptpt-encoder) |
  | [**Albertina 100M PTBR**](https://huggingface.co/PORTULAN/albertina-100m-portuguese-ptbr-encoder) |

- **Albertina 1.5B PTPT** is the version for European **Portuguese** from **Portugal**,
+ **Albertina 1.5B PTPT** is the version for **European Portuguese** from **Portugal**,
  and to the best of our knowledge, this is an encoder specifically for this language and variant
  that, at the time of its initial distribution, sets a new state of the art for it, and is made publicly available
  and distributed for reuse.
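As a quick, editor-added illustration of what the encoder described in the hunk above does (it is not part of the committed README), the sketch below runs masked-token prediction with the Hugging Face fill-mask pipeline. It assumes the 100M PTPT repository id already linked in the model table; the 1.5B PTPT repository id is not shown in this diff, so it is not hard-coded here, and the example sentence is invented.

```python
# Minimal sketch, not from the README: masked-token prediction with an
# Albertina encoder via the transformers fill-mask pipeline.
# The model id below is the 100M PTPT checkpoint linked in the table above;
# swap in the 1.5B PTPT repository id to use the model this card describes.
from transformers import pipeline

fill = pipeline("fill-mask", model="PORTULAN/albertina-100m-portuguese-ptpt-encoder")

# Example sentence is invented for illustration; DeBERTa-style encoders use [MASK].
for pred in fill("A capital de Portugal é [MASK].", top_k=3):
    print(pred["token_str"], round(pred["score"], 4))
```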
@@ -127,8 +127,8 @@ We opted for a learning rate of 1e-5 with linear decay and 10k warm-up steps.
  # Evaluation


- We resorted to [HyperGlue-PT](?), a **PTPT version of the GLUE and SUPERGLUE** benchmark.
- We automatically translated the tasks from GLUE and SUPERGLUE using [DeepL Translate](https://www.deepl.com/), which specifically provides translation from English to PT-PT as an option.
+ We resorted to [ExtraGLUE](https://huggingface.co/datasets/PORTULAN/extraglue), a **PTPT version of the GLUE and SUPERGLUE** benchmark.
+ We automatically translated the tasks from GLUE and SUPERGLUE using [DeepL Translate](https://www.deepl.com/), which specifically provides translation from English to PTPT as an option.

  | Model | RTE (Accuracy) | WNLI (Accuracy)| MRPC (F1) | STS-B (Pearson) | COPA (Accuracy) | CB (F1) | MultiRC (F1) | BoolQ (Accuracy) |
  |-------------------------------|----------------|----------------|-----------|-----------------|-----------------|------------|--------------|------------------|
@@ -137,8 +137,8 @@ We automatically translated the tasks from GLUE and SUPERGLUE using [DeepL Trans
  | **Albertina 900M PTPT** | 0.8339 | 0.4225 | **0.9171**| 0.8801 | 0.7033 | 0.6018 | 0.6728 | 0.8224 |
  | **Albertina 100M PTPT** | 0.6919 | 0.4742 | 0.8047 | 0.8590 | n.a. | 0.4529 | 0.6481 | 0.7578 |
  ||||||||||
- | **DeBERTa 1.5B EN** | 0.8147 | 0.4554 | 0.8696 | 0.8557 | 0.5167 | 0.4901 | 0.6687 | 0.8347 |
- | **DeBERTa 100M EN** | 0.6029 | **0.5634** | 0.7802 | 0.8320 | n.a. | 0.4698 | 0.6368 | 0.6829 |
+ | **DeBERTa 1.5B (English)** | 0.8147 | 0.4554 | 0.8696 | 0.8557 | 0.5167 | 0.4901 | 0.6687 | 0.8347 |
+ | **DeBERTa 100M (English)** | 0.6029 | **0.5634** | 0.7802 | 0.8320 | n.a. | 0.4698 | 0.6368 | 0.6829 |



@@ -153,8 +153,8 @@ We automatically translated the tasks from GLUE and SUPERGLUE using [DeepL Trans
  | **BERTimbau (335M)** | 0.6446 | **0.5634** | 0.8873 | 0.8842 | 0.6933 | 0.5438 | 0.6787 | 0.7783 |
  | **Albertina 100M PTBR** | 0.6582 | **0.5634** | 0.8149 | 0.8489 | n.a. | 0.4771 | 0.6469 | 0.7537 |
  ||||||||||
- | **DeBERTa 1.5B EN** | 0.7112 | **0.5634** | 0.8545 | 0.0123 | 0.5700 | 0.4307 | 0.3639 | 0.6217 |
- | **DeBERTa 100M EN** | 0.5716 | 0.5587 | 0.8060 | 0.8266 | n.a. | 0.4739 | 0.6391 | 0.6838 |
+ | **DeBERTa 1.5B (English)** | 0.7112 | **0.5634** | 0.8545 | 0.0123 | 0.5700 | 0.4307 | 0.3639 | 0.6217 |
+ | **DeBERTa 100M (English)** | 0.5716 | 0.5587 | 0.8060 | 0.8266 | n.a. | 0.4739 | 0.6391 | 0.6838 |


  <br>
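For readers who want to reproduce the scores in the two tables above, here is a minimal, editor-added sketch (not part of the committed README) of pulling the ExtraGLUE benchmark linked in the evaluation hunk with the `datasets` library. The task/configuration names are not stated in this diff, so they are queried at runtime rather than assumed.

```python
# Minimal sketch, not from the README: loading the ExtraGLUE benchmark used
# for the evaluation tables above with the Hugging Face datasets library.
from datasets import get_dataset_config_names, load_dataset

# The available task names are not listed in this diff, so query them first.
configs = get_dataset_config_names("PORTULAN/extraglue")
print(configs)

# Load one task (whichever comes first) and inspect it.
task = load_dataset("PORTULAN/extraglue", configs[0])
print(task)
```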
@@ -212,9 +212,11 @@ The model can be used by fine-tuning it for a specific task:
  When using or citing this model, kindly cite the following [publication](https://arxiv.org/abs/?):

  ``` latex
- @misc{albertina-pt,
- title={Fostering the Ecosystem of Open Neural Encoders for Portuguese with Albertina PT-* family},
- author={Rodrigo Santos and João Rodrigues and Luís Gomes and João Silva and António Branco and Henrique Lopes Cardoso
+ @misc{albertina-pt-fostering,
+ title={Fostering the Ecosystem of Open Neural Encoders for Portuguese
+ with Albertina PT-* family},
+ author={Rodrigo Santos and João Rodrigues and Luís Gomes and João Silva
+ and António Branco and Henrique Lopes Cardoso
  and Tomás Freitas Osório and Bernardo Leite},
  year={2024},
  eprint={?},
 