---
language: multilingual
thumbnail: "url to a thumbnail used in social sharing"
tags: exbert
license: apache-2.0
---

# XLM-RoBERTa-large-sag

## Model description

This model is based on the [XLM-RoBERTa large](https://huggingface.co/xlm-roberta-large) architecture (released by Facebook, see the original [paper](https://arxiv.org/abs/1911.02116)) with additional training on two collections of medicine-domain texts:

* about 250,000 text reviews of medicines (roughly 1,000 tokens long on average) collected from the site irecommend.ru;
* the raw part of the [RuDReC corpus](https://github.com/cimm-kzn/RuDReC) (about 1.4 million texts, see the [paper](https://arxiv.org/abs/2004.03659)).

Training on this data was performed for one epoch using a single NVIDIA Tesla V100 GPU and the Hugging Face Transformers library.
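Since the checkpoint was trained with the Hugging Face Transformers library, it can be loaded the same way as any masked language model. The sketch below is illustrative, not part of the original card: `MODEL_ID` is a stand-in (the base `xlm-roberta-large` identifier) and should be replaced with this checkpoint's actual hub id, and the helper name `top_mask_fills` is hypothetical.

```python
from typing import List

# Placeholder: substitute the hub identifier of the fine-tuned checkpoint here.
MODEL_ID = "xlm-roberta-large"


def top_mask_fills(text: str, k: int = 5) -> List[str]:
    """Return the top-k predicted fillers for the <mask> token in `text`."""
    # Lazy import keeps the sketch importable without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID, top_k=k)
    return [result["token_str"] for result in fill(text)]


# Example call (downloads the model weights on first use):
# top_mask_fills("Этот препарат хорошо помогает от <mask>.")
```

The `fill-mask` pipeline is the simplest way to sanity-check a domain-adapted masked LM: predictions for medicine-related contexts should shift toward domain vocabulary relative to the base model.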