unicamp-dl/ptt5-v2-small



marcospiau committed on Jun 11, 2024

Commit ef5e1a3 · verified · 1 parent: 71977a8

Update README.md

Files changed (1)

README.md +12 -2

@@ -8,11 +8,21 @@ pipeline_tag: text2text-generation
 base_model: google-t5/t5-small
 ---
+# ptt5-v2-small
 ## Introduction
-ptt5-v2 models were trained for approximately 1 epoch over the "pt" subset of the mC4 dataset, on top of the Google T5 original checkpoints. These models need to be fine-tuned before being used on downstream tasks.
-# Citation
+ptt5-v2 models are pretrained T5 models tailored for the Portuguese language, continuing from Google's original checkpoints with sizes from t5-small to t5-3B.
+For further information about the pretraining process and the complete study, please refer to our paper [PTT5: Pretraining and validating the T5 model on Brazilian Portuguese data](https://arxiv.org/abs/2008.09144).
+## Usage
+```python
+from transformers import T5Tokenizer, T5ForConditionalGeneration
+tokenizer = T5Tokenizer.from_pretrained("unicamp-dl/ptt5-v2-small")
+model = T5ForConditionalGeneration.from_pretrained("unicamp-dl/ptt5-v2-small")
+```
+## Citation
 If you use our models, please cite:
     @article{ptt5_2020,
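
The usage snippet added in this commit only loads the tokenizer and model, while the README text it replaces noted that the checkpoints need fine-tuning before downstream use. As a rough illustration of what a single fine-tuning forward pass might look like, here is a minimal sketch. The helper names `mask_padding` and `fine_tuning_loss` are hypothetical and not part of the repository; the sketch assumes `torch`, `transformers`, and `sentencepiece` are installed and that the checkpoint can be downloaded.

```python
MODEL_NAME = "unicamp-dl/ptt5-v2-small"
IGNORE_INDEX = -100  # label value that the T5 loss (PyTorch cross-entropy) skips


def mask_padding(label_ids, pad_id):
    """Replace padding token ids with IGNORE_INDEX so they do not affect the loss.

    Hypothetical helper: takes a list of equal-length lists of token ids.
    """
    return [
        [IGNORE_INDEX if token == pad_id else token for token in row]
        for row in label_ids
    ]


def fine_tuning_loss(sources, targets, model_name=MODEL_NAME):
    """Compute the seq2seq loss for one batch of (source, target) text pairs.

    Hypothetical helper: a sketch of one forward pass, not a training loop.
    """
    # Heavy imports are deferred so the pure-Python helper above stays usable
    # without torch/transformers installed.
    import torch
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    # Tokenize inputs as padded PyTorch tensors.
    inputs = tokenizer(sources, padding=True, truncation=True, return_tensors="pt")

    # Tokenize targets as padded Python lists, then mask the padding positions.
    target_ids = tokenizer(targets, padding=True, truncation=True)["input_ids"]
    labels = torch.tensor(mask_padding(target_ids, tokenizer.pad_token_id))

    # Passing labels makes the model return the cross-entropy loss.
    outputs = model(**inputs, labels=labels)
    return outputs.loss.item()
```

Calling `fine_tuning_loss` with a list of Portuguese source strings and a matching list of target strings should return a single float. In an actual fine-tuning run, the model would be loaded once and this loss would be backpropagated inside an optimizer loop rather than recomputed from scratch per call.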