Commit 63949e3 · 1 Parent(s): 62e4d47
Update README.md
README.md CHANGED
@@ -16,6 +16,7 @@ tags:

 [**READ THE FULL PAPER**](https://arxiv.org/abs/2111.09453)
 [Github Repository](https://github.com/pysentimiento/robertuito)

 *RoBERTuito* is a pre-trained language model for user-generated content in Spanish, trained following RoBERTa guidelines on 500 million tweets. *RoBERTuito* comes in 3 flavors: cased, uncased, and uncased+deaccented.

@@ -75,7 +76,11 @@ tokenizer.tokenize(preprocessed_text)
 # ['<s>','▁Esto','▁es','▁un','▁tweet','▁estoy','▁usando','▁','▁hashtag','▁','▁ro','bert','uito','▁@usuario','▁','▁emoji','▁cara','▁revolviéndose','▁de','▁la','▁risa','▁emoji','</s>']
 ```

-We are working on integrating this preprocessing step into a Tokenizer within `transformers` library
 ## Citation

 If you use *RoBERTuito*, please cite our paper:

@@ -16,6 +16,7 @@ tags:

 [**READ THE FULL PAPER**](https://arxiv.org/abs/2111.09453)
 [Github Repository](https://github.com/pysentimiento/robertuito)
+[![Test it in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1WcubR0kbqT289XupSnN5-STe7HafyKpf#scrollTo=SF-n4IdjnoYk)

 *RoBERTuito* is a pre-trained language model for user-generated content in Spanish, trained following RoBERTa guidelines on 500 million tweets. *RoBERTuito* comes in 3 flavors: cased, uncased, and uncased+deaccented.

@@ -75,7 +76,11 @@ tokenizer.tokenize(preprocessed_text)
 # ['<s>','▁Esto','▁es','▁un','▁tweet','▁estoy','▁usando','▁','▁hashtag','▁','▁ro','bert','uito','▁@usuario','▁','▁emoji','▁cara','▁revolviéndose','▁de','▁la','▁risa','▁emoji','</s>']
 ```

+We are working on integrating this preprocessing step into a Tokenizer within `transformers` library.
+
+You can check a text classification example in this notebook:
+[![Test it in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1WcubR0kbqT289XupSnN5-STe7HafyKpf#scrollTo=SF-n4IdjnoYk)
+
 ## Citation

 If you use *RoBERTuito*, please cite our paper:
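
The token list in the diff suggests what the preprocessing step does before tokenization: user mentions appear as `@usuario` and hashtags appear as the word `hashtag` followed by the tag text. A rough sketch of that idea follows. It is not the pysentimiento implementation: the function name `sketch_preprocess` and both regular expressions are hypothetical, and emoji replacement, lowercasing, and any hashtag splitting are left out.

```python
import re

# Hypothetical patterns for illustration only; not taken from pysentimiento.
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")


def sketch_preprocess(text: str) -> str:
    """Replace user mentions and hashtags with placeholder words."""
    # Every @handle becomes the generic token "@usuario".
    text = MENTION_RE.sub("@usuario", text)
    # "#tag" becomes "hashtag tag", keeping the tag text itself.
    text = HASHTAG_RE.sub(r"hashtag \1", text)
    return text


if __name__ == "__main__":
    print(sketch_preprocess("Esto es un tweet estoy usando #robertuito @pysentimiento"))
```

For real use, the repository and the Colab notebook linked in the README are the references for the actual preprocessing; this sketch only mirrors the two substitutions visible in the token output.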