sebastian-hofstaetter/uni-colberter-128-1-msmarco


Sebastian Hofstätter

Initial Model & Readme

6d34d25 over 2 years ago


	---
	license: apache-2.0
	language: "en"
	tags:
	- bag-of-words
	- dense-passage-retrieval
	- knowledge-distillation
	datasets:
	- ms_marco
	---

	# Uni-ColBERTer (Dim: 1) for Passage Retrieval

	To learn more about our (Uni-)ColBERTer architecture, check out our paper: https://arxiv.org/abs/2203.13088 🎉

	For more information, source code, and a minimal usage example, please visit: https://github.com/sebastian-hofstaetter/colberter

	## Limitations & Bias

	- The model was trained only on English text.

	- The model inherits social biases from both DistilBERT and MSMARCO.

	- The model was trained only on the relatively short passages of MSMARCO (about 60 words on average), so it might struggle with longer text.
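One possible way to work around the passage-length limitation is to split longer documents into passage-sized windows before encoding them. The snippet below is a minimal sketch of that idea and is not part of the ColBERTer code base: the function name `split_into_passages`, the whitespace-based word count, the default of 60 words per window (taken from the average passage length mentioned above), and the 10-word overlap are all illustrative assumptions.

```python
from typing import List


def split_into_passages(text: str, max_words: int = 60, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of at most max_words words.

    Hypothetical helper: words are approximated by whitespace splitting,
    which is an assumption and not the model's own tokenization.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    if not words:
        return []

    step = max_words - overlap  # how far each window advances
    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + max_words]))
        # Stop once the current window already reaches the end of the text.
        if start + max_words >= len(words):
            break
    return passages


if __name__ == "__main__":
    long_text = " ".join(f"w{i}" for i in range(130))
    for passage in split_into_passages(long_text):
        print(len(passage.split()), passage[:30])
```

Each resulting window could then be indexed as its own passage. Whether this improves retrieval quality for a given collection is something to verify empirically.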

	## Citation

	If you use our model checkpoint, please cite our work as:

	```
	@article{Hofstaetter2022_colberter,
	author = {Sebastian Hofst{\"a}tter and Omar Khattab and Sophia Althammer and Mete Sertkan and Allan Hanbury},
	title = {Introducing Neural Bag of Whole-Words with ColBERTer: Contextualized Late Interactions using Enhanced Reduction},
	publisher = {arXiv},
	url = {https://arxiv.org/abs/2203.13088},
	doi = {10.48550/ARXIV.2203.13088},
	year = {2022},
	}
	```