	---
	pipeline_tag: translation
	library_name: comet
	language:
	- multilingual
	- af
	- am
	- ar
	- as
	- az
	- be
	- bg
	- bn
	- br
	- bs
	- ca
	- cs
	- cy
	- da
	- de
	- el
	- en
	- eo
	- es
	- et
	- eu
	- fa
	- fi
	- fr
	- fy
	- ga
	- gd
	- gl
	- gu
	- ha
	- he
	- hi
	- hr
	- hu
	- hy
	- id
	- is
	- it
	- ja
	- jv
	- ka
	- kk
	- km
	- kn
	- ko
	- ku
	- ky
	- la
	- lo
	- lt
	- lv
	- mg
	- mk
	- ml
	- mn
	- mr
	- ms
	- my
	- ne
	- nl
	- 'no'
	- om
	- or
	- pa
	- pl
	- ps
	- pt
	- ro
	- ru
	- sa
	- sd
	- si
	- sk
	- sl
	- so
	- sq
	- sr
	- su
	- sv
	- sw
	- ta
	- te
	- th
	- tl
	- tr
	- ug
	- uk
	- ur
	- uz
	- vi
	- xh
	- yi
	- zh
	license: apache-2.0
	base_model:
	- FacebookAI/xlm-roberta-large
	---

	# COMET-partial

	This model is based on [COMET-early-exit](https://github.com/zouharvi/COMET-early-exit), a fork of Unbabel's COMET that is not compatible with the original.
	To run the model, first install this version of COMET, either with:
	```bash
	pip install "git+https://github.com/zouharvi/COMET-early-exit#egg=comet-early-exit&subdirectory=comet_early_exit"
	```
	or in editable mode:
	```bash
	git clone https://github.com/zouharvi/COMET-early-exit.git
	cd COMET-early-exit
	pip3 install -e comet_early_exit
	```
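Because the fork is not interchangeable with the original COMET package, it can help to confirm that the `comet_early_exit` module is the one visible to your interpreter before loading the model. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the package.

```python
import importlib.util


def is_installed(module_name: str = "comet_early_exit") -> bool:
    """Return True if a top-level module with this name can be found.

    `is_installed` is an illustrative helper, not part of comet_early_exit.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed():
        print("comet_early_exit is importable")
    else:
        print("comet_early_exit not found; install it with one of the commands above")
```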


	This model is described in the appendix of the paper cited below.
	It can score incomplete translations (i.e., prefixes of translations):
	```python
	import comet_early_exit

	# Download the checkpoint and load the model.
	model = comet_early_exit.load_from_checkpoint(comet_early_exit.download_model("zouharvi/COMET-partial"))

	# Three progressively longer prefixes of the same translation.
	data = [
	    {
	        "src": "Can I receive my food in 10 to 15 minutes?",
	        "mt": "Mohl bych",
	    },
	    {
	        "src": "Can I receive my food in 10 to 15 minutes?",
	        "mt": "Mohl bych dostat jídlo",
	    },
	    {
	        "src": "Can I receive my food in 10 to 15 minutes?",
	        "mt": "Mohl bych dostat jídlo během 10 či 15 minut?",
	    },
	]
	model_output = model.predict(data, batch_size=8, gpus=1)
	print("scores", model_output["scores"])
	```
	Outputs (formatted):
	```
	scores 89.18 86.52 89.20
	```
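To score a translation as it grows, the prefix inputs can be generated rather than written by hand. The sketch below assumes whitespace-delimited word boundaries, which is a simplification and not necessarily how prefixes were built for training; the helper name `make_prefix_inputs` is hypothetical.

```python
def make_prefix_inputs(src: str, mt: str) -> list[dict[str, str]]:
    """Build one {"src", "mt"} record per whitespace-delimited prefix of `mt`.

    Assumption: prefixes are cut at whitespace; this is illustrative only.
    """
    words = mt.split()
    return [
        {"src": src, "mt": " ".join(words[:end])}
        for end in range(1, len(words) + 1)
    ]


data = make_prefix_inputs(
    "Can I receive my food in 10 to 15 minutes?",
    "Mohl bych dostat jídlo během 10 či 15 minut?",
)
for item in data:
    print(item["mt"])
```

The resulting list has the same shape as `data` in the example above, so it can be passed to `model.predict` in the same way.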

	This model is based on the work [Early-Exit and Instant Confidence Translation Quality Estimation](https://arxiv.org/abs/2502.14429), which can be cited as:
	```bibtex
	@misc{zouhar2025earlyexitinstantconfidencetranslation,
	    title={Early-Exit and Instant Confidence Translation Quality Estimation},
	    author={Vilém Zouhar and Maike Züfle and Beni Egressy and Julius Cheng and Jan Niehues},
	    year={2025},
	    eprint={2502.14429},
	    archivePrefix={arXiv},
	    primaryClass={cs.CL},
	    url={https://arxiv.org/abs/2502.14429},
	}
	```