Create README.md
---
language: es
datasets:
- BSC-TeMU/SQAC
widget:
- text: "question: ¿Cuál es el nombre que se le da a la unidad morfológica y funcional de los seres vivos? context: La célula (del latín cellula, diminutivo de cella, ‘celda’) es la unidad morfológica y funcional de todo ser vivo. De hecho, la célula es el elemento de menor tamaño que puede considerarse vivo. De este modo, puede clasificarse a los organismos vivos según el número de células que posean: si solo tienen una, se les denomina unicelulares (como pueden ser los protozoos o las bacterias, organismos microscópicos); si poseen más, se les llama pluricelulares. En estos últimos el número de células es variable: de unos pocos cientos, como en algunos nematodos, a cientos de billones (10^14), como en el caso del ser humano. Las células suelen poseer un tamaño de 10 µm y una masa de 1 ng, si bien existen células mucho mayores."
---
# Spanish-T5-small fine-tuned on **SQAC** for QA 📖❓

[Spanish T5 (small)](https://huggingface.co/flax-community/spanish-t5-small) fine-tuned on [SQAC](https://huggingface.co/datasets/BSC-TeMU/SQAC) for the **Q&A** downstream task.

## Details of Spanish T5 (small)

## Details of the dataset 📚

## Results on test dataset 📝

| Metric | Value     |
| ------ | --------- |
| **EM** | **41.65** |

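
For readers unfamiliar with the EM (exact match) metric, the sketch below shows one common way to compute it. The card does not say how its score was calculated, so the normalization used here (lowercasing, dropping punctuation, collapsing whitespace) is an assumption, and the function names are hypothetical.

```python
import unicodedata
from typing import List, Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace (assumed convention)."""
    text = unicodedata.normalize("NFC", text).lower()
    # Remove every character in a Unicode punctuation category (covers ¿ and ¡ too).
    text = "".join(ch for ch in text if not unicodedata.category(ch).startswith("P"))
    return " ".join(text.split())


def exact_match(prediction: str, references: Sequence[str]) -> float:
    """Return 1.0 if the prediction equals any reference after normalization."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(ref) for ref in references))


def exact_match_score(predictions: Sequence[str],
                      references: Sequence[Sequence[str]]) -> float:
    """Percentage of predictions that exactly match at least one reference."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


# Hypothetical predictions and gold answers, for illustration only.
preds = ["La célula.", "el núcleo"]
golds = [["la célula"], ["la membrana"]]
print(exact_match_score(preds, golds))  # 50.0
```

Under this convention, a score of 41.65 would mean that roughly four in ten predicted answers match a reference answer exactly after normalization.
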
## Model in Action 🚀

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Checkpoint ID as given in this card; point it at the checkpoint you want to query.
MODEL_ID = "mrm8488/mT5-small-finetuned-tydiqa-for-xqa"

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# T5 is an encoder-decoder model, so it is loaded with the seq2seq auto class.
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID).to(device)

def get_response(question, context, max_length=32):
    # Build the "question: ... context: ..." prompt used for this model.
    input_text = 'question: %s context: %s' % (question, context)
    features = tokenizer([input_text], return_tensors='pt')

    output = model.generate(input_ids=features['input_ids'].to(device),
                            attention_mask=features['attention_mask'].to(device),
                            max_length=max_length)

    # Decode the single generated sequence into plain text.
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Some examples in different languages

context = 'HuggingFace won the best Demo paper at EMNLP2020.'
question = 'What did HuggingFace win?'
print(get_response(question, context))

context = 'HuggingFace ganó la mejor demostración con su paper en la EMNLP2020.'
question = '¿Qué ganó HuggingFace?'
print(get_response(question, context))

context = 'HuggingFace выиграл лучшую демонстрационную работу на EMNLP2020.'
question = 'Что выиграл HuggingFace?'
print(get_response(question, context))
```
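
The example above feeds the model a single string of the form `question: ... context: ...`, the same layout used in the widget text. As a minimal sketch, the helpers below build that string for one or many question/context pairs. The names `build_prompt` and `build_prompts` are hypothetical, and collapsing whitespace before formatting is an assumption rather than something the card requires.

```python
from typing import Iterable, List, Tuple

# Prompt layout taken from the card's own example and widget text.
PROMPT_TEMPLATE = "question: {question} context: {context}"


def build_prompt(question: str, context: str) -> str:
    """Format one question/context pair as a single model input string."""
    # Assumption: collapse runs of whitespace so stray newlines in the
    # context do not end up inside the prompt.
    question = " ".join(question.split())
    context = " ".join(context.split())
    return PROMPT_TEMPLATE.format(question=question, context=context)


def build_prompts(pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Format several (question, context) pairs, preserving their order."""
    return [build_prompt(q, c) for q, c in pairs]


prompts = build_prompts([
    ("¿Qué ganó HuggingFace?",
     "HuggingFace ganó la mejor demostración\ncon su paper en la EMNLP2020."),
])
print(prompts[0])
```

A list built this way could be passed to the tokenizer in one call (for example with `padding=True`) to answer several questions in a single batch.
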

> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)

> Made with <span style="color: #e25555;">♥</span> in Spain