---
language:
- it
license: apache-2.0
tags:
- italian
- sequence-to-sequence
- style-transfer
- efficient
- formality-style-transfer
datasets:
- yahoo/xformal_it
widget:
- text: "maronn qualcuno mi spieg' CHECCOSA SUCCEDE?!?!"
- text: "wellaaaaaaa, ma fraté sei proprio troppo simpatiko, grazieeee!!"
- text: "nn capisco xke tt i ragazzi lo fanno"
- text: "IT5 è SUPERMEGA BRAVISSIMO a capire tt il vernacolo italiano!!!"
metrics:
- rouge
- bertscore
model-index:
- name: it5-efficient-small-el32-informal-to-formal
  results:
  - task:
      type: formality-style-transfer
      name: "Informal-to-formal Style Transfer"
    dataset:
      type: xformal_it
      name: "XFORMAL (Italian Subset)"
    metrics:
    - type: rouge1
      value: 0.430
      name: "Avg. Test Rouge1"
    - type: rouge2
      value: 0.221
      name: "Avg. Test Rouge2"
    - type: rougeL
      value: 0.408
      name: "Avg. Test RougeL"
    - type: bertscore
      value: 0.630
      name: "Avg. Test BERTScore"
      args:
      - model_type: "dbmdz/bert-base-italian-xxl-uncased"
      - lang: "it"
      - num_layers: 10
      - rescale_with_baseline: True
      - baseline_path: "bertscore_baseline_ita.tsv"
---

# IT5 Cased Small Efficient EL32 for Informal-to-formal Style Transfer 🧐

*Shout-out to [Stefan Schweter](https://github.com/stefan-it) for contributing the pre-trained efficient model!*

This repository contains the checkpoint for the [IT5 Cased Small Efficient EL32](https://huggingface.co/it5/it5-efficient-small-el32) model fine-tuned on informal-to-formal style transfer on the Italian subset of the XFORMAL dataset, as part of the experiments of the paper [IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation](https://arxiv.org/abs/2203.03759) by [Gabriele Sarti](https://gsarti.com) and [Malvina Nissim](https://malvinanissim.github.io).

Efficient IT5 models differ from the standard ones by adopting a different vocabulary that enables cased text generation and an [optimized model architecture](https://arxiv.org/abs/2109.10686) that improves performance while reducing the parameter count. The Small-EL32 variant replaces the original encoder of the T5 Small architecture with a 32-layer deep encoder, showing improved performance over the base model.

A comprehensive overview of other released materials is provided in the [gsarti/it5](https://github.com/gsarti/it5) repository. Refer to the paper for additional details concerning the reported scores and the evaluation approach.
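
For a rough intuition of what the ROUGE scores listed in this card's metadata measure, the following is a minimal sketch of a unigram-overlap (ROUGE-1) F1 score. It assumes lowercased whitespace tokenization and a single reference per prediction; the function name `rouge1_f1` and the reference sentence are hypothetical. This is not the evaluation code used in the paper, which may differ in tokenization and aggregation.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap between two strings.

    Assumption: lowercased whitespace tokenization, one reference per prediction.
    """
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference sentence, written only for this illustration
score = rouge1_f1(
    "non comprendo perché tutti i ragazzi agiscono così",
    "non capisco perché tutti i ragazzi lo fanno",
)
print(f"ROUGE-1 F1: {score:.3f}")
```

The score rewards word overlap with the reference, so a fluent formal rewrite that uses different wording than the reference can still receive a low value.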

## Using the model

Model checkpoints are available for use in TensorFlow, PyTorch and JAX. They can be used directly with pipelines:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint into a text2text-generation pipeline
i2f = pipeline("text2text-generation", model="it5/it5-efficient-small-el32-informal-to-formal")
i2f("nn capisco xke tt i ragazzi lo fanno")
# Output: [{"generated_text": "non comprendo perché tutti i ragazzi agiscono così"}]
```

or loaded using autoclasses:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("it5/it5-efficient-small-el32-informal-to-formal")
model = AutoModelForSeq2SeqLM.from_pretrained("it5/it5-efficient-small-el32-informal-to-formal")
```
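
Once the tokenizer and model are loaded through autoclasses, generation can be sketched roughly as follows. This is a minimal example assuming the PyTorch backend; the helper name `formalize` and the decoding values (`max_length=128`, `num_beams=4`) are illustrative assumptions, not the settings used in the paper.

```python
from typing import List

MODEL_NAME = "it5/it5-efficient-small-el32-informal-to-formal"


def formalize(texts: List[str], max_length: int = 128, num_beams: int = 4) -> List[str]:
    """Rewrite informal Italian sentences formally with the fine-tuned checkpoint.

    Assumptions: PyTorch backend; max_length and num_beams are illustrative
    values, not the decoding settings used in the paper.
    """
    # Imported lazily so that defining the helper does not load the library
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    # Tokenize the whole batch, padding to the longest sentence
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    # Beam-search decoding of the formal rewrites
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `formalize(["nn capisco xke tt i ragazzi lo fanno"])` downloads the checkpoint on first use and returns one rewritten string per input. For repeated calls, loading the tokenizer and model once outside the function would avoid reloading them each time.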

If you use this model in your research, please cite our work as:

```bibtex
@article{sarti-nissim-2022-it5,
    title={{IT5}: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation},
    author={Sarti, Gabriele and Nissim, Malvina},
    journal={ArXiv preprint 2203.03759},
    url={https://arxiv.org/abs/2203.03759},
    year={2022},
    month={mar}
}
```

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0
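
As a rough illustration of what the `linear` scheduler setting means for the learning rate, here is a minimal sketch of a linear decay from the initial rate of 0.0003 down to zero. The total number of steps and the absence of warmup are assumptions (neither is stated above), and `linear_lr` is a hypothetical helper rather than the actual training code.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 3e-4, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr to 0 over the remaining steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length: the real number of optimization steps is not given above
TOTAL_STEPS = 1000
for step in (0, 250, 500, 750, 1000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

With no warmup, the rate at the halfway point is half the initial value, and it reaches zero at the final step.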

### Framework versions

- Transformers 4.15.0
- PyTorch 1.10.0+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3