Transducens/IbRo-nllb


agaliano committed on Oct 8

Commit f73ce2b (1 parent: e54dff5)

Update README.md

Files changed (1)

README.md +39 -1


@@ -2,6 +2,26 @@
 license: apache-2.0
 ---
 ## Usage
 ```python
@@ -18,4 +38,22 @@ inputs = tokenizer(sentence, return_tensors="pt")
 translated_tokens = model.generate(**inputs, forced_bos_token_id=tokenizer.lang_code_to_id["arg_Latn"])
 print(tokenizer.batch_decode(translated_tokens, skip_special_tokens=True))
-```

 license: apache-2.0
 ---
+## Overview
+This model was presented at the [WMT24 Shared Task on Translation into Low-Resource Languages of Spain](https://www2.statmt.org/wmt24/romance-task.html)
+as a submission by the [Transducens](https://transducens.dlsi.ua.es/) team from the [Universitat d'Alacant](https://www.ua.es/). It is a many-to-many model
+capable of translating between several languages of the Iberian Peninsula.
+**The model is based on [NLLB-1.3B](https://huggingface.co/facebook/nllb-200-1.3B), fine-tuned for the following language pairs:**
++ Spanish &harr; Asturian
++ Spanish &harr; Aragonese
++ Spanish &harr; Aranese
++ Spanish &harr; Galician
++ Spanish &harr; Catalan
++ Spanish &harr; Valencian
++ Catalan &harr; Aranese
+**The new language tokens are:**
++ Aragonese: arg_Latn
++ Aranese: arn_Latn
++ Valencian: val_Latn
 ## Usage
 ```python
 translated_tokens = model.generate(**inputs, forced_bos_token_id=tokenizer.lang_code_to_id["arg_Latn"])
 print(tokenizer.batch_decode(translated_tokens, skip_special_tokens=True))
+```
+## Citation
+If you use this model, please cite it as follows:
+```
+@inproceedings{wmt2024-galiano-jimenez,
+    title = "Universitat d'{A}lacant's Submission to the {WMT} 2024 {S}hared {T}ask on {T}ranslating into {L}ow-{R}esource {L}anguages of {S}pain",
+    author = "Galiano-Jim{\'e}nez, Aar{\'o}n and S{\'a}nchez-Cartagena, V{\'i}ctor M. and P{\'e}rez-Ortiz, Juan Antonio and S{\'a}nchez-Mart{\'i}nez, Felipe",
+    editor = "Koehn, Philipp  and Haddow, Barry  and Kocmi, Tom  and  Monz, Christof",
+    booktitle = "Proceedings of the Ninth Conference on Machine Translation",
+    month = nov,
+    year = "2024",
+    address = "Miami",
+    publisher = "Association for Computational Linguistics",
+}
+```
+## Acknowledgements
+This model has been produced as part of the research project [Lightweight neural translation technologies for low-resource languages (LiLowLa)](https://transducens.dlsi.ua.es/lilowla/) (PID2021-127999NB-I00), funded by the Spanish Ministry of Science and Innovation (MCIN), the Spanish Research Agency (AEI/10.13039/501100011033), and the European Regional Development Fund (ERDF), "A way to make Europe".
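
Note on the added Overview: the diff shows only part of the README's usage snippet, so the following is a minimal sketch of how the listed language pairs and the three new tokens might be used to pick a translation direction. It is not the authors' code. The three new codes (`arg_Latn`, `arn_Latn`, `val_Latn`) come from the README; the other four are the standard NLLB-200 codes and are assumed to be unchanged in this fine-tuned model. The names `LANG_CODES`, `PAIRS`, `supported_directions`, and `translate` are hypothetical helpers, and restricting input to the listed pairs is an illustrative choice, not something the model enforces. The target token is looked up with `convert_tokens_to_ids` rather than `lang_code_to_id`, on the assumption that the latter may be unavailable in recent `transformers` releases.

```python
# Language codes. The three marked "new token" are listed in the README;
# the rest are standard NLLB-200 codes (assumed unchanged by fine-tuning).
LANG_CODES = {
    "Spanish": "spa_Latn",
    "Asturian": "ast_Latn",
    "Galician": "glg_Latn",
    "Catalan": "cat_Latn",
    "Aragonese": "arg_Latn",  # new token
    "Aranese": "arn_Latn",    # new token
    "Valencian": "val_Latn",  # new token
}

# Bidirectional pairs exactly as listed in the README overview.
PAIRS = [
    ("Spanish", "Asturian"),
    ("Spanish", "Aragonese"),
    ("Spanish", "Aranese"),
    ("Spanish", "Galician"),
    ("Spanish", "Catalan"),
    ("Spanish", "Valencian"),
    ("Catalan", "Aranese"),
]


def supported_directions():
    """Return the set of (source_code, target_code) directions from PAIRS."""
    directions = set()
    for a, b in PAIRS:
        directions.add((LANG_CODES[a], LANG_CODES[b]))
        directions.add((LANG_CODES[b], LANG_CODES[a]))
    return directions


def translate(sentence, src, tgt, model_id="Transducens/IbRo-nllb"):
    """Translate one sentence; src and tgt are codes such as "spa_Latn".

    Hypothetical helper: it reloads the model on every call, which is
    fine for a demonstration but wasteful in real use.
    """
    if (src, tgt) not in supported_directions():
        raise ValueError(f"direction {src} -> {tgt} is not listed in the README")
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=src)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(sentence, return_tensors="pt")
    # Force the decoder to start with the target-language token.
    tokens = model.generate(
        **inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt)
    )
    return tokenizer.batch_decode(tokens, skip_special_tokens=True)[0]
```

With `transformers` and a PyTorch backend installed, a call such as `translate("Hola, ¿cómo estás?", "spa_Latn", "arg_Latn")` would request a Spanish-to-Aragonese translation.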