---
license: apache-2.0
datasets:
- tay-yozhik/NaturalText
language:
- ru
---
|
|
|
# NaturalRoBERTa |
|
|
|
NaturalRoBERTa is a pre-trained language model for Russian based on the [RoBERTa](https://arxiv.org/abs/1907.11692) architecture.

It was trained on data collected from open sources: three news sub-corpora of [Taiga](https://github.com/TatianaShavrina/taiga_site) (Lenta.ru, Interfax, N+1) and texts from the [Russian Wikipedia](https://ru.wikipedia.org/).
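
Below is a minimal usage sketch with the `transformers` fill-mask pipeline. It assumes the checkpoint is published on the Hugging Face Hub under the repository id `tay-yozhik/NaturalRoBERTa` (an assumption; adjust the id if it differs) and that the tokenizer follows the standard RoBERTa masked-language-modelling setup.

```python
# Minimal sketch: load the checkpoint and fill in a masked token.
# The repository id "tay-yozhik/NaturalRoBERTa" is an assumption;
# replace it with the actual model id if it differs.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="tay-yozhik/NaturalRoBERTa")

# RoBERTa-style tokenizers typically use "<mask>" as the mask token.
for prediction in fill_mask("Погода сегодня очень <mask>."):
    print(prediction["token_str"], round(prediction["score"], 3))
```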
|
|
|
# Evaluation |
|
|
|
The model was evaluated on the [RussianSuperGLUE](https://russiansuperglue.com/) benchmark:
|
| Task    | Result        | Metric                           |
|---------|---------------|----------------------------------|
| LiDiRus | 0.0           | Matthews Correlation Coefficient |
| RCB     | 0.217 / 0.484 | F1 / Accuracy                    |
| PARus   | 0.498         | Accuracy                         |
| TERRa   | 0.487         | Accuracy                         |
| RUSSE   | 0.587         | Accuracy                         |
| RWSD    | 0.669         | Accuracy                         |
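
RussianSuperGLUE scores are obtained by fine-tuning the pre-trained checkpoint on each task separately. The sketch below shows how such a fine-tuning run could be set up for a sentence-pair task such as TERRa; the model id, dataset id, field names, and hyperparameters are illustrative assumptions, not the exact configuration used for the reported numbers.

```python
# Illustrative fine-tuning sketch for a RussianSuperGLUE sentence-pair task
# (TERRa). Model id, dataset id, field names and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

model_name = "tay-yozhik/NaturalRoBERTa"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Assumed dataset id on the Hub; adjust if the benchmark is hosted elsewhere.
dataset = load_dataset("RussianNLP/russian_super_glue", "terra")

def tokenize(batch):
    # TERRa examples are premise/hypothesis pairs with a binary label.
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="naturalroberta-terra",
                           per_device_train_batch_size=16,
                           num_train_epochs=3),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
```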