update
README.md CHANGED
@@ -6,8 +6,8 @@ tags:
 - sentence-similarity
 ---
 
-# multi-qa-
-This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a
+# multi-qa-distilbert-cos-v1
+This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and was designed for **semantic search**. It has been trained on 215M (question, answer) pairs from diverse sources. For an introduction to semantic search, have a look at: [SBERT.net - Semantic Search](https://www.sbert.net/examples/applications/semantic-search/README.html)
 
 
 ## Usage (Sentence-Transformers)
@@ -25,7 +25,7 @@ query = "How many people live in London?"
 docs = ["Around 9 Million people live in London", "London is known for its financial district"]
 
 #Load the model
-model = SentenceTransformer('sentence-transformers/multi-qa-
+model = SentenceTransformer('sentence-transformers/multi-qa-distilbert-cos-v1')
 
 #Encode query and documents
 query_emb = model.encode(query)
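
(The hunk above shows only the changed model-name line. For readers skimming the diff, here is a minimal end-to-end sketch of the surrounding README snippet; everything outside the changed line, such as the import and the `util.dot_score` ranking, follows the standard sentence-transformers usage pattern rather than coming from this diff.)

```python
from sentence_transformers import SentenceTransformer, util

query = "How many people live in London?"
docs = ["Around 9 Million people live in London", "London is known for its financial district"]

# Load the model
model = SentenceTransformer('sentence-transformers/multi-qa-distilbert-cos-v1')

# Encode query and documents into dense vectors
query_emb = model.encode(query)
doc_emb = model.encode(docs)

# Score each document against the query with dot product
# (the embeddings are normalized, so this equals cosine similarity)
scores = util.dot_score(query_emb, doc_emb)[0].cpu().tolist()

# Print documents ranked by decreasing score
for doc, score in sorted(zip(docs, scores), key=lambda x: x[1], reverse=True):
    print(score, doc)
```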
@@ -84,8 +84,8 @@ query = "How many people live in London?"
 docs = ["Around 9 Million people live in London", "London is known for its financial district"]
 
 # Load model from HuggingFace Hub
-tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-
-model = AutoModel.from_pretrained("sentence-transformers/multi-qa-
+tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-distilbert-cos-v1")
+model = AutoModel.from_pretrained("sentence-transformers/multi-qa-distilbert-cos-v1")
 
 #Encode query and docs
 query_emb = encode(query)
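
(The `encode(query)` call in this hunk refers to a helper defined earlier in the README, outside the changed lines. Below is a sketch of the mean-pooling-plus-normalization pattern it stands for; the exact body is an assumption inferred from the "Mean pooling" and "normalized embeddings" settings in the technical-details table, not from this diff.)

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-distilbert-cos-v1")
model = AutoModel.from_pretrained("sentence-transformers/multi-qa-distilbert-cos-v1")

def mean_pooling(model_output, attention_mask):
    # Average the token embeddings, ignoring padding positions
    token_embeddings = model_output.last_hidden_state
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

def encode(texts):
    # Tokenize, run the transformer, mean-pool, then L2-normalize
    encoded_input = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')
    with torch.no_grad():
        model_output = model(**encoded_input, return_dict=True)
    embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
    return F.normalize(embeddings, p=2, dim=1)
```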
@@ -111,7 +111,7 @@ In the following some technical details how this model must be used:
 
 | Setting | Value |
 | --- | :---: |
-| Dimensions |
+| Dimensions | 768 |
 | Produces normalized embeddings | Yes |
 | Pooling-Method | Mean pooling |
 | Suitable score functions | dot-product (`util.dot_score`), cosine-similarity (`util.cos_sim`), or euclidean distance |
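
(Because the table states the model produces normalized embeddings, dot product and cosine similarity return the same scores. A quick sanity-check sketch using the `util` helpers the table names; the example sentences are placeholders.)

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('sentence-transformers/multi-qa-distilbert-cos-v1')
emb = model.encode(["How many people live in London?",
                    "Around 9 Million people live in London"],
                   convert_to_tensor=True)

# For unit-length vectors, dot(a, b) == cos_sim(a, b),
# so the two scores should match up to floating-point error.
print(util.dot_score(emb[0], emb[1]).item())
print(util.cos_sim(emb[0], emb[1]).item())
```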
@@ -145,7 +145,7 @@ The full training script is accessible in this current repository: `train_script
 
 ### Pre-training
 
-We use the pretrained [`
+We use the pretrained [`distilbert-base-uncased`](https://huggingface.co/distilbert-base-uncased) model. Please refer to the model card for more detailed information about the pre-training procedure.
 
 #### Training
 