---
pipeline_tag: translation
library_name: comet
language:
- multilingual
- af
- am
- ar
- as
- az
- be
- bg
- bn
- br
- bs
- ca
- cs
- cy
- da
- de
- el
- en
- eo
- es
- et
- eu
- fa
- fi
- fr
- fy
- ga
- gd
- gl
- gu
- ha
- he
- hi
- hr
- hu
- hy
- id
- is
- it
- ja
- jv
- ka
- kk
- km
- kn
- ko
- ku
- ky
- la
- lo
- lt
- lv
- mg
- mk
- ml
- mn
- mr
- ms
- my
- ne
- nl
- 'no'
- om
- or
- pa
- pl
- ps
- pt
- ro
- ru
- sa
- sd
- si
- sk
- sl
- so
- sq
- sr
- su
- sv
- sw
- ta
- te
- th
- tl
- tr
- ug
- uk
- ur
- uz
- vi
- xh
- yi
- zh
license: apache-2.0
base_model:
- FacebookAI/xlm-roberta-large
---

# COMET-partial

This model is based on [COMET-early-exit](https://github.com/zouharvi/COMET-early-exit), a fork of Unbabel's original COMET that is not compatible with it.
To run the model, first install this version of COMET, either with pip directly from the repository:

```bash
pip install "git+https://github.com/zouharvi/COMET-early-exit#egg=comet-early-exit&subdirectory=comet_early_exit"
```

or in editable mode:

```bash
git clone https://github.com/zouharvi/COMET-early-exit.git
cd COMET-early-exit
pip3 install -e comet_early_exit
```

This model is described in the appendix of the paper cited below.
It can also score *incomplete* translations (i.e., prefixes of translations):

```python
import comet_early_exit

# Download the checkpoint from the Hugging Face Hub and load it.
model = comet_early_exit.load_from_checkpoint(comet_early_exit.download_model("zouharvi/COMET-partial"))

# Each item pairs a source sentence with a (possibly incomplete) translation.
data = [
    {
        "src": "Can I receive my food in 10 to 15 minutes?",
        "mt": "Mohl bych",
    },
    {
        "src": "Can I receive my food in 10 to 15 minutes?",
        "mt": "Mohl bych dostat jídlo",
    },
    {
        "src": "Can I receive my food in 10 to 15 minutes?",
        "mt": "Mohl bych dostat jídlo během 10 či 15 minut?",
    }
]

# gpus=1 runs prediction on a single GPU.
model_output = model.predict(data, batch_size=8, gpus=1)
print("scores", model_output["scores"])
```

Output (formatted):

```
scores 89.18 86.52 89.20
```
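
To score a translation as it is being produced, the input items can be generated from its prefixes instead of being written by hand. The following is a minimal sketch, not part of the package: `prefix_items` is a hypothetical helper, and splitting on whitespace is an assumption that suits space-delimited languages only.

```python
def prefix_items(src: str, mt: str) -> list[dict[str, str]]:
    """Build one scoring item per word-level prefix of `mt` (hypothetical helper)."""
    words = mt.split()  # assumption: whitespace-delimited tokens
    return [
        {"src": src, "mt": " ".join(words[:end])}
        for end in range(1, len(words) + 1)
    ]


items = prefix_items(
    "Can I receive my food in 10 to 15 minutes?",
    "Mohl bych dostat jídlo během 10 či 15 minut?",
)
for item in items:
    print(item["mt"])

# With the model loaded as in the example above, the prefixes could be scored with:
# scores = model.predict(items, batch_size=8, gpus=1)["scores"]
```

The resulting list has the same `src`/`mt` structure as `data` in the example above, so it can be passed to `model.predict` unchanged.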

This model is based on the paper [Early-Exit and Instant Confidence Translation Quality Estimation](http://arxiv.org/abs/2502.14429), which can be cited as:

```
@misc{zouhar2025earlyexitinstantconfidencetranslation,
      title={Early-Exit and Instant Confidence Translation Quality Estimation},
      author={Vilém Zouhar and Maike Züfle and Beni Egressy and Julius Cheng and Jan Niehues},
      year={2025},
      eprint={2502.14429},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.14429},
}
```