---
language: es
tags:
- zero-shot-classification
- nli
- pytorch
datasets:
- xnli
pipeline_tag: zero-shot-classification
license: apache-2.0
widget:
- text: "El autor se perfila, a los 50 años de su muerte, como uno de los grandes de su siglo"
  candidate_labels: "cultura, sociedad, economia, salud, deportes"
---
# Zero-shot SELECTRA: A zero-shot classifier based on SELECTRA
*Zero-shot SELECTRA* is a [SELECTRA model](https://huggingface.co./Recognai/selectra_small) fine-tuned on the Spanish portion of the [XNLI dataset](https://huggingface.co./datasets/xnli). You can use it with Hugging Face's [Zero-shot pipeline](https://huggingface.co./transformers/master/main_classes/pipelines.html#transformers.ZeroShotClassificationPipeline) to make [zero-shot classifications](https://joeddav.github.io/blog/2020/05/29/ZSL.html).
Compared to our previous zero-shot classifier [based on BETO](https://huggingface.co./Recognai/bert-base-spanish-wwm-cased-xnli), zero-shot SELECTRA is **much more lightweight**. As shown in the *Metrics* section, the *small* version (5 times fewer parameters) performs slightly worse, while the *medium* version (about 3 times fewer parameters) **outperforms** the BETO-based zero-shot classifier.
## Usage
```python
from transformers import pipeline

# Load the zero-shot classification pipeline with the medium model
classifier = pipeline(
    "zero-shot-classification",
    model="Recognai/zeroshot_selectra_medium",
)

# Classify a Spanish sentence; the hypothesis template is also in Spanish
classifier(
    "El autor se perfila, a los 50 años de su muerte, como uno de los grandes de su siglo",
    candidate_labels=["cultura", "sociedad", "economia", "salud", "deportes"],
    hypothesis_template="Este ejemplo es {}.",
)
"""Output
{'sequence': 'El autor se perfila, a los 50 años de su muerte, como uno de los grandes de su siglo',
 'labels': ['sociedad', 'cultura', 'salud', 'economia', 'deportes'],
 'scores': [0.3711881935596466,
            0.25650349259376526,
            0.17355826497077942,
            0.1641489565372467,
            0.03460107371211052]}
"""
```
The `hypothesis_template` parameter is important and should be in Spanish, because the model was fine-tuned on Spanish NLI data. **In the widget on the right, this parameter is set to its default value, "This example is {}.", so expect different results there.**
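To illustrate why the template matters, here is a minimal sketch of how a zero-shot NLI classifier typically turns candidate labels into hypotheses and then into scores. It assumes the single-label setting, where the entailment logit of each hypothesis is normalized with a softmax across labels. The helper names (`build_hypotheses`, `rank_labels`) and the logit values are hypothetical and only serve the illustration; they are not produced by the model.

```python
import math


def build_hypotheses(candidate_labels, hypothesis_template="Este ejemplo es {}."):
    """Fill the template once per label; each result is paired with the input text."""
    return [hypothesis_template.format(label) for label in candidate_labels]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(candidate_labels, entailment_logits):
    """Normalize entailment logits across labels and sort by descending score."""
    scores = softmax(entailment_logits)
    return sorted(zip(candidate_labels, scores), key=lambda pair: pair[1], reverse=True)


labels = ["cultura", "sociedad", "economia", "salud", "deportes"]
hypotheses = build_hypotheses(labels)

# Hypothetical entailment logits, one per hypothesis (not real model output)
example_logits = [1.2, 1.6, 0.8, 0.9, -0.8]
ranked = rank_labels(labels, example_logits)
```

With an English template, the hypotheses would mix languages ("This example is cultura."), which is the mismatch the note about the widget refers to.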
## Metrics
| Model | Params | XNLI (acc) | \*MLSUM (acc) |
| --- | --- | --- | --- |
| [zs BETO](https://huggingface.co./Recognai/bert-base-spanish-wwm-cased-xnli) | 110M | 0.799 | 0.530 |
| [zs SELECTRA medium](https://huggingface.co./Recognai/zeroshot_selectra_medium) | 41M | **0.807** | **0.589** |
| zs SELECTRA small | **22M** | 0.795 | 0.446 |
\*evaluated with zero-shot learning (ZSL)
- **XNLI**: The stated accuracy refers to the test portion of the [XNLI dataset](https://huggingface.co./datasets/xnli), after fine-tuning the model on the training portion.
- **MLSUM**: For this accuracy we take the test set of the [MLSUM dataset](https://huggingface.co./datasets/mlsum) and classify the summaries for 5 selected labels. For details, check out our [evaluation notebook](https://github.com/recognai/selectra/blob/main/zero-shot_classifier/evaluation.ipynb).
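As a rough sketch of how a zero-shot accuracy like the MLSUM figure can be computed, the snippet below takes the top-ranked label of each pipeline output and compares it with the gold label. It relies only on the output format shown in the *Usage* section (labels sorted by descending score). The function names and the sample data are hypothetical; the evaluation notebook remains the reference for the actual procedure.

```python
def top_label(output):
    """Return the highest-scoring label; the pipeline lists labels in descending score order."""
    return output["labels"][0]


def zero_shot_accuracy(outputs, gold_labels):
    """Fraction of examples whose top predicted label matches the gold label."""
    if not gold_labels:
        raise ValueError("gold_labels must not be empty")
    if len(outputs) != len(gold_labels):
        raise ValueError("outputs and gold_labels must have the same length")
    correct = sum(top_label(out) == gold for out, gold in zip(outputs, gold_labels))
    return correct / len(gold_labels)


# Hypothetical pipeline outputs and gold labels (not real evaluation data)
outputs = [
    {"labels": ["cultura", "sociedad"], "scores": [0.7, 0.3]},
    {"labels": ["deportes", "salud"], "scores": [0.6, 0.4]},
    {"labels": ["economia", "salud"], "scores": [0.55, 0.45]},
    {"labels": ["salud", "cultura"], "scores": [0.9, 0.1]},
]
gold = ["cultura", "deportes", "salud", "salud"]

accuracy = zero_shot_accuracy(outputs, gold)
print(accuracy)  # 0.75
```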
## Training
Check out our [training notebook](https://github.com/recognai/selectra/blob/main/zero-shot_classifier/training.ipynb) for all the details.
## Authors
- David Fidalgo ([GitHub](https://github.com/dcfidalgo))
- Daniel Vila ([GitHub](https://github.com/dvsrepo))
- Francisco Aranda ([GitHub](https://github.com/frascuchon))
- Javier Lopez ([GitHub](https://github.com/javispp))