Update README.md
README.md
CHANGED
@@ -54,7 +54,7 @@ pipeline_tag: sentence-similarity
 - [How can I reduce overall inference cost?](#how-can-i-reduce-overall-inference-cost)
 - [How do I reduce vector storage cost?](#how-do-i-reduce-vector-storage-cost)
 - [How do I offer hybrid search to improve accuracy?](#how-do-i-offer-hybrid-search-to-improve-accuracy)
-
+- [MTEB numbers](#mteb-numbers)
 - [Roadmap](#roadmap)
 - [Notes on Reproducing:](#notes-on-reproducing)
 - [Reference:](#reference)
@@ -177,11 +177,17 @@ The below numbers are with mDPR model, but miniDense_arabic_v1 should give a eve
 
 *Note: the MIRACL paper shows a different (higher) value for BM25 Arabic, so we take that value from the BGE-M3 paper; all the rest are from the MIRACL paper.*
 
-
+# MTEB numbers:
 MTEB is a general-purpose embedding evaluation benchmark covering a wide range of tasks, but miniDense models (like BGE-M3) are predominantly tuned for retrieval tasks aimed at search & IR use cases.
 So it makes sense to evaluate our models on the retrieval slice of the MTEB benchmark.
 
-
+#### MIRACL Retrieval
+
+Refer to the tables above.
+
+#### Long Document Retrieval
+
+This is a very ambitious eval because we have not trained for long context; max_len was 512 for all the models below.
 
 <center>
 <img src="./ar_metrics_4.png" width=150%/>
@@ -189,9 +195,10 @@ So it makes sense to evaluate our models on the retrieval slice of the MTEB benchmark
 </center>
 
 
-
+#### X-lingual Retrieval
 
-Almost all models below are monolingual arabic models so they have no notion of any other languages. But the below table shows how our model excels in cross-lingual scenarios.
+Almost all of the models below are monolingual Arabic models, so they have no notion of any other language. But the table below shows how our model excels in cross-lingual scenarios owing to its deep multilingual understanding.
+This also explains its competitive performance when compared to much larger models.
 
 <center>
 <img src="./ar_metrics_5.png" width=80%/>
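To make the MTEB retrieval evaluation in the diff above concrete, here is a minimal sketch of how one retrieval task can be scored with the `mteb` package and a sentence-transformers model. The model id, task name, and output folder are illustrative placeholders, not the exact setup behind the reported numbers, and the snippet assumes the classic `MTEB(tasks=...)` entry point.

```python
# Minimal sketch: scoring one MTEB retrieval task with a sentence-transformers model.
# Model id, task name, and output folder are placeholders; the numbers in the tables
# may have been produced with a different configuration.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")       # swap in the model you want to evaluate
evaluation = MTEB(tasks=["MIRACLRetrieval"])     # one retrieval task from the MTEB suite
results = evaluation.run(model, output_folder="results/bge-m3")
print(results)
```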
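The 512-token cap mentioned for the long-document eval is the crux of that setting: anything beyond the limit is simply cut off at encode time. A small sketch of how such a cap is typically applied with sentence-transformers (the model id is a placeholder):

```python
# Sketch: the hard input-length cap used in the long-document eval above.
# Tokens beyond max_seq_length are truncated, so only the beginning of a long
# document contributes to its embedding.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")   # placeholder model id
model.max_seq_length = 512                   # cap applied to all models in the table

long_doc = "passage text " * 2000            # stand-in for a document far longer than 512 tokens
embedding = model.encode(long_doc)           # silently truncated to the first 512 tokens
print(embedding.shape)
```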
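To illustrate the cross-lingual behaviour described in the X-lingual section: a multilingual dense retriever can score an English query directly against Arabic passages, whereas a monolingual Arabic encoder has no reliable representation for the English side. A minimal sketch, with a placeholder model id and illustrative texts:

```python
# Sketch: cross-lingual retrieval with a multilingual dense encoder.
# An English query is ranked against Arabic passages by cosine similarity;
# a purely monolingual Arabic model would embed the English query poorly.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("BAAI/bge-m3")        # placeholder multilingual model id

query = "What is the capital of Egypt?"           # English query
passages = [
    "القاهرة هي عاصمة جمهورية مصر العربية.",        # "Cairo is the capital of Egypt."
    "الرياض هي عاصمة المملكة العربية السعودية.",     # "Riyadh is the capital of Saudi Arabia."
]

query_emb = model.encode(query, convert_to_tensor=True)
passage_emb = model.encode(passages, convert_to_tensor=True)

scores = util.cos_sim(query_emb, passage_emb)     # shape: (1, len(passages))
best = int(scores.argmax())
print(passages[best], float(scores[0, best]))
```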