Commit 2ceb9d3 by ArvinZhuang (1 parent: 5af6e5c)
Update README.md

README.md CHANGED
@@ -1,3 +1,67 @@
---
license: apache-2.0
library_name: transformers
pipeline_tag: text2text-generation
---

## Model description

An mT5-base query generation model trained on XOR QA data.

Used in the papers [Bridging the Gap Between Indexing and Retrieval for
Differentiable Search Index with Query Generation](https://arxiv.org/pdf/2206.10128.pdf)
and [Augmenting Passage Representations with Query Generation
for Enhanced Cross-Lingual Dense Retrieval]()

### How to use
```python
from transformers import pipeline

# Language codes mapped to the language names used in the prompt.
lang2mT5 = dict(
    ar='Arabic',
    bn='Bengali',
    fi='Finnish',
    ja='Japanese',
    ko='Korean',
    ru='Russian',
    te='Telugu'
)
PROMPT = 'Generate a {lang} question for this passage: {title} {passage}'

title = 'Transformer (machine learning model)'
passage = 'A transformer is a deep learning model that adopts the mechanism of self-attention, differentially ' \
          'weighting the significance of each part of the input (which includes the recursive output) data.'


model_name_or_path = 'ielabgroup/xor-tydi-docTquery-mt5-base'

# Build the prompt for the target language (Japanese here).
input_text = PROMPT.format_map({'lang': lang2mT5['ja'],
                                'title': title,
                                'passage': passage})

# Load the model; this example assumes a CUDA GPU is available as device 0.
generator = pipeline(model=model_name_or_path,
                     task='text2text-generation',
                     device="cuda:0",
                     )

# Sample 10 candidate queries, each capped at max_length=64 tokens.
results = generator(input_text,
                    do_sample=True,
                    max_length=64,
                    num_return_sequences=10,
                    )

for i, result in enumerate(results):
    print(f'{i + 1}. {result["generated_text"]}')
```
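
To generate queries in more than one language, the prompt can be built once per language code. The sketch below is a minimal illustration that reuses the prompt template and language table from the example above. The helper names `build_prompts` and `unique_queries` are hypothetical and are not part of this repository or of `transformers`. `unique_queries` assumes the pipeline output is a list of dicts with a `generated_text` key, as in the loop above. Because sampling can return the same query more than once, it drops repeats.

```python
# Hypothetical helpers for illustration only; not part of the model repo.
LANG2MT5 = dict(ar='Arabic', bn='Bengali', fi='Finnish', ja='Japanese',
                ko='Korean', ru='Russian', te='Telugu')
PROMPT = 'Generate a {lang} question for this passage: {title} {passage}'


def build_prompts(title, passage, lang_codes=None):
    """Return a {language code: prompt} dict for the requested languages."""
    codes = list(LANG2MT5) if lang_codes is None else lang_codes
    return {code: PROMPT.format(lang=LANG2MT5[code], title=title, passage=passage)
            for code in codes}


def unique_queries(results):
    """Drop repeated generations while keeping first-seen order."""
    seen = set()
    unique = []
    for result in results:
        text = result['generated_text'].strip()
        if text and text not in seen:
            seen.add(text)
            unique.append(text)
    return unique


prompts = build_prompts('Transformer (machine learning model)',
                        'A transformer is a deep learning model that adopts '
                        'the mechanism of self-attention.')
for code, prompt in prompts.items():
    print(code, '->', prompt[:60])
```

Each prompt can then be passed to `generator` as in the example above, with `unique_queries(results)` applied to the returned list.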

### BibTeX entry and citation info

```bibtex
@article{zhuang2022bridging,
  title={Bridging the gap between indexing and retrieval for differentiable search index with query generation},
  author={Zhuang, Shengyao and Ren, Houxing and Shou, Linjun and Pei, Jian and Gong, Ming and Zuccon, Guido and Jiang, Daxin},
  journal={arXiv preprint arXiv:2206.10128},
  year={2022}
}
```