---
datasets:
- allenai/c4
- legacy-datasets/mc4
language:
- pt
pipeline_tag: text2text-generation
base_model: google-t5/t5-small
---
## Introduction
ptt5-v2 models were pretrained for approximately one epoch on the "pt" subset of the mC4 dataset, starting from Google's original T5 checkpoints. These models must be fine-tuned before being used on downstream tasks.
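As a minimal sketch of how the checkpoint can be loaded with 🤗 Transformers for fine-tuning (the repository id below is a placeholder; replace it with the id of this model):

```python
# Minimal sketch: load the pretrained checkpoint so it can be fine-tuned.
# NOTE: "your-org/ptt5-v2-checkpoint" is a placeholder repository id, not
# necessarily the id of this model.
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "your-org/ptt5-v2-checkpoint"  # placeholder id
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# The checkpoint is pretrained only; fine-tune it on your downstream task
# (e.g. with Seq2SeqTrainer) before using it for inference.
```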
## Citation
If you use our models, please cite:
```bibtex
@article{ptt5_2020,
  title={PTT5: Pretraining and validating the T5 model on Brazilian Portuguese data},
  author={Carmo, Diedre and Piau, Marcos and Campiotti, Israel and Nogueira, Rodrigo and Lotufo, Roberto},
  journal={arXiv preprint arXiv:2008.09144},
  year={2020}
}
```