---
license: mit
---

# Cross-Encoder for MS Marco

This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.

The model can be used for Information Retrieval: given a query, encode the query together with each candidate passage (e.g. retrieved with ElasticSearch), then sort the passages by score in decreasing order. See our paper [R2ANKER](https://arxiv.org/pdf/2206.08063.pdf) for more details.

## Usage with Transformers

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("YCZhou/R2ANKER")
model = AutoModelForSequenceClassification.from_pretrained("YCZhou/R2ANKER")

# The query is repeated once per passage so that each (query, passage) pair is encoded jointly.
queries = ['How many people live in Berlin?', 'How many people live in Berlin?']
passages = [
    'Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.',
    'New York City is famous for the Metropolitan Museum of Art.',
]
features = tokenizer(queries, passages, padding=True, truncation=True, return_tensors="pt")

# Score every pair without tracking gradients.
model.eval()
with torch.no_grad():
    scores = model(**features).logits
    print(scores)
```

## Citation

```
@inproceedings{DBLP:conf/acl/Zhou0GTXLJJ23,
  author       = {Yucheng Zhou and
                  Tao Shen and
                  Xiubo Geng and
                  Chongyang Tao and
                  Can Xu and
                  Guodong Long and
                  Binxing Jiao and
                  Daxin Jiang},
  title        = {Towards Robust Ranker for Text Retrieval},
  booktitle    = {Findings of the Association for Computational Linguistics: {ACL} 2023,
                  Toronto, Canada, July 9-14, 2023},
  pages        = {5387--5401},
  publisher    = {Association for Computational Linguistics},
  year         = {2023},
  url          = {https://doi.org/10.18653/v1/2023.findings-acl.332},
  doi          = {10.18653/V1/2023.FINDINGS-ACL.332},
  timestamp    = {Sat, 30 Sep 2023 09:33:34 +0200},
  biburl       = {https://dblp.org/rec/conf/acl/Zhou0GTXLJJ23.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
```
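
## Sorting passages by score

The usage example prints raw scores, while the description above also calls for sorting the passages in decreasing order. A minimal sketch of that last step follows. The helper `rank_passages` is a hypothetical name, not part of this model or of Transformers, and the scores below are placeholder numbers chosen for illustration, not real model output. With the model loaded, the scores could plausibly be obtained as `scores.squeeze(-1).tolist()`, assuming the model emits one logit per (query, passage) pair.

```python
from typing import List, Sequence, Tuple


def rank_passages(
    passages: Sequence[str], scores: Sequence[float]
) -> List[Tuple[str, float]]:
    """Pair each passage with its score and sort by score, highest first.

    Hypothetical helper for illustration; ties keep their input order.
    """
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)


passages = [
    "New York City is famous for the Metropolitan Museum of Art.",
    "Berlin has a population of 3,520,031 registered inhabitants "
    "in an area of 891.82 square kilometers.",
]
# Placeholder scores (assumption): stand-ins for the model's logits.
example_scores = [-2.5, 4.0]

ranked = rank_passages(passages, example_scores)
for passage, score in ranked:
    print(f"{score:.2f}\t{passage}")
```

Keeping the sort separate from the scoring call makes it easy to swap in scores computed in batches when the candidate list is too long for a single forward pass.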