update README
README.md
CHANGED
@@ -3330,7 +3330,7 @@ model = AutoModel.from_pretrained('Salesforce/SFR-Embedding-Mistral')
 # get the embeddings
 max_length = 4096
 input_texts = queries + passages
-batch_dict = tokenizer(input_texts, max_length=max_length
+batch_dict = tokenizer(input_texts, max_length=max_length, padding=True, truncation=True, return_tensors="pt")
 outputs = model(**batch_dict)
 embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])
 
@@ -3369,7 +3369,9 @@ print(scores.tolist())
 # [[86.71537780761719, 36.645721435546875], [35.00497055053711, 82.07388305664062]]
 ```
 
-
+### MTEB Benchmark Evaluation
+Check out [unilm/e5](https://github.com/microsoft/unilm/tree/master/e5) to reproduce evaluation results on the [BEIR](https://arxiv.org/abs/2104.08663) and [MTEB](https://arxiv.org/abs/2210.07316) benchmark.
+
 
 SFR-Embedding Team (∗indicates lead contributors).
 * Rui Meng*
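Review note: the first hunk changes the tokenizer call to pass `padding=True`, `truncation=True`, and `return_tensors="pt"`. The pure-Python sketch below illustrates roughly what padding and truncation do to a batch, and why the attention mask is then needed to find each sequence's last real token. It is an illustration only, not the tokenizer's implementation. The helper names `pad_and_truncate` and `last_token_index`, the pad id `0`, and the use of right padding are all assumptions; the README's `last_token_pool` is not shown in this diff and may behave differently (for example, if the tokenizer pads on the left).

```python
def pad_and_truncate(sequences, max_length, pad_id=0):
    """Truncate each token-id list to max_length, then right-pad to a common width.

    Hypothetical helper for illustration; pad_id=0 is an assumed value.
    Returns (input_ids, attention_mask) as lists of equal-length rows.
    """
    truncated = [seq[:max_length] for seq in sequences]
    width = max(len(seq) for seq in truncated)
    input_ids = [seq + [pad_id] * (width - len(seq)) for seq in truncated]
    attention_mask = [[1] * len(seq) + [0] * (width - len(seq)) for seq in truncated]
    return input_ids, attention_mask


def last_token_index(mask_row):
    """Index of the last real (non-padding) token, assuming right padding."""
    return sum(mask_row) - 1


# Three toy "tokenized" texts of different lengths.
batch = [[11, 12, 13, 14, 15], [21, 22], [31, 32, 33]]
input_ids, attention_mask = pad_and_truncate(batch, max_length=4)

# Pick the last real token of each row using the mask, not the last column.
last_tokens = [row[last_token_index(mask)] for row, mask in zip(input_ids, attention_mask)]

print(input_ids)
print(attention_mask)
print(last_tokens)
```

Without padding, rows of different lengths cannot be stacked into one rectangular batch, which is why the updated call matters for multi-text input such as `queries + passages`.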