arxiv:1907.04307

Multilingual Universal Sentence Encoder for Semantic Retrieval

Published on Jul 9, 2019
Abstract

We introduce two pre-trained, retrieval-focused multilingual sentence encoding models, based respectively on the Transformer and CNN architectures. The models embed text from 16 languages into a single semantic space using a multi-task trained dual encoder that learns tied representations via translation-based bridge tasks (Chidambaram et al., 2018). The models achieve performance competitive with the state of the art on semantic retrieval (SR), translation pair bitext retrieval (BR), and retrieval question answering (ReQA). On English transfer learning tasks, our sentence-level embeddings approach, and in some cases exceed, the performance of monolingual, English-only sentence embedding models. Our models are made available for download on TensorFlow Hub.
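Since the abstract points to TensorFlow Hub, here is a minimal sketch of loading one of the released models and checking that translations land close together in the shared embedding space. It assumes the tensorflow, tensorflow_hub, and tensorflow_text packages are installed; the module handle below is the published handle for the CNN variant (the Transformer variant is published as universal-sentence-encoder-multilingual-large), though the exact version number may differ from what you have available.

```python
import numpy as np
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401 -- registers the SentencePiece ops the model requires

# Load the multilingual encoder (CNN variant) from TensorFlow Hub.
encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual/3")

# An English/German translation pair plus an unrelated sentence.
sentences = [
    "The quick brown fox jumps over the lazy dog.",                # English
    "Der schnelle braune Fuchs springt über den faulen Hund.",     # German
    "A completely unrelated sentence about cooking pasta.",
]

# The model maps each sentence to a 512-dimensional, approximately
# unit-norm vector, regardless of input language.
embeddings = encoder(sentences).numpy()

# Because the vectors are roughly unit length, the dot product serves
# as cosine similarity.
similarity = embeddings @ embeddings.T
print(np.round(similarity, 3))
# Expected pattern: the EN/DE pair scores much higher with each other
# than either does with the unrelated sentence.
```

The same dot-product scoring is the basis of the retrieval tasks the abstract lists: for semantic retrieval or bitext retrieval, one would embed a corpus once and rank candidates by similarity against each query embedding.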


Models citing this paper 1

Datasets citing this paper 9


Spaces citing this paper 1
