martinhillebrandtd committed
Commit d47d862 · 1 Parent(s): 6416cfe

model bge small en v 1.5

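The diff below adds MTEB evaluation metadata for bge-small-en-v1.5 to the model card's YAML front matter. For context, a minimal sketch of producing embeddings with the upstream BAAI/bge-small-en-v1.5 checkpoint via the sentence-transformers library (an assumed dependency, not part of this commit) could look like this:

```python
from sentence_transformers import SentenceTransformer

# Assumed upstream checkpoint that this card's metadata describes.
model = SentenceTransformer("BAAI/bge-small-en-v1.5")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
]

# BGE embeddings are typically L2-normalized, so a dot product equals cosine similarity.
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)                      # (2, 384) for the small model
print(float(embeddings[0] @ embeddings[1]))  # similarity of the two sentences
```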
README.md CHANGED
@@ -1,3 +1,2655 @@
- ---
- license: mit
- ---
1
+ ---
2
+ language:
3
+ - en
4
+ license: mit
5
+ model-index:
6
+ - name: bge-small-en-v1.5
7
+ results:
8
+ - dataset:
9
+ config: en
10
+ name: MTEB AmazonCounterfactualClassification (en)
11
+ revision: e8379541af4e31359cca9fbcf4b00f2671dba205
12
+ split: test
13
+ type: mteb/amazon_counterfactual
14
+ metrics:
15
+ - type: accuracy
16
+ value: 73.79104477611939
17
+ - type: ap
18
+ value: 37.21923821573361
19
+ - type: f1
20
+ value: 68.0914945617093
21
+ task:
22
+ type: Classification
23
+ - dataset:
24
+ config: default
25
+ name: MTEB AmazonPolarityClassification
26
+ revision: e2d317d38cd51312af73b3d32a06d1a08b442046
27
+ split: test
28
+ type: mteb/amazon_polarity
29
+ metrics:
30
+ - type: accuracy
31
+ value: 92.75377499999999
32
+ - type: ap
33
+ value: 89.46766124546022
34
+ - type: f1
35
+ value: 92.73884001331487
36
+ task:
37
+ type: Classification
38
+ - dataset:
39
+ config: en
40
+ name: MTEB AmazonReviewsClassification (en)
41
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
42
+ split: test
43
+ type: mteb/amazon_reviews_multi
44
+ metrics:
45
+ - type: accuracy
46
+ value: 46.986
47
+ - type: f1
48
+ value: 46.55936786727896
49
+ task:
50
+ type: Classification
51
+ - dataset:
52
+ config: default
53
+ name: MTEB ArguAna
54
+ revision: None
55
+ split: test
56
+ type: arguana
57
+ metrics:
58
+ - type: map_at_1
59
+ value: 35.846000000000004
60
+ - type: map_at_10
61
+ value: 51.388
62
+ - type: map_at_100
63
+ value: 52.132999999999996
64
+ - type: map_at_1000
65
+ value: 52.141000000000005
66
+ - type: map_at_3
67
+ value: 47.037
68
+ - type: map_at_5
69
+ value: 49.579
70
+ - type: mrr_at_1
71
+ value: 36.558
72
+ - type: mrr_at_10
73
+ value: 51.658
74
+ - type: mrr_at_100
75
+ value: 52.402
76
+ - type: mrr_at_1000
77
+ value: 52.410000000000004
78
+ - type: mrr_at_3
79
+ value: 47.345
80
+ - type: mrr_at_5
81
+ value: 49.797999999999995
82
+ - type: ndcg_at_1
83
+ value: 35.846000000000004
84
+ - type: ndcg_at_10
85
+ value: 59.550000000000004
86
+ - type: ndcg_at_100
87
+ value: 62.596
88
+ - type: ndcg_at_1000
89
+ value: 62.759
90
+ - type: ndcg_at_3
91
+ value: 50.666999999999994
92
+ - type: ndcg_at_5
93
+ value: 55.228
94
+ - type: precision_at_1
95
+ value: 35.846000000000004
96
+ - type: precision_at_10
97
+ value: 8.542
98
+ - type: precision_at_100
99
+ value: 0.984
100
+ - type: precision_at_1000
101
+ value: 0.1
102
+ - type: precision_at_3
103
+ value: 20.389
104
+ - type: precision_at_5
105
+ value: 14.438
106
+ - type: recall_at_1
107
+ value: 35.846000000000004
108
+ - type: recall_at_10
109
+ value: 85.42
110
+ - type: recall_at_100
111
+ value: 98.43499999999999
112
+ - type: recall_at_1000
113
+ value: 99.644
114
+ - type: recall_at_3
115
+ value: 61.166
116
+ - type: recall_at_5
117
+ value: 72.191
118
+ task:
119
+ type: Retrieval
120
+ - dataset:
121
+ config: default
122
+ name: MTEB ArxivClusteringP2P
123
+ revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
124
+ split: test
125
+ type: mteb/arxiv-clustering-p2p
126
+ metrics:
127
+ - type: v_measure
128
+ value: 47.402770198163594
129
+ task:
130
+ type: Clustering
131
+ - dataset:
132
+ config: default
133
+ name: MTEB ArxivClusteringS2S
134
+ revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
135
+ split: test
136
+ type: mteb/arxiv-clustering-s2s
137
+ metrics:
138
+ - type: v_measure
139
+ value: 40.01545436974177
140
+ task:
141
+ type: Clustering
142
+ - dataset:
143
+ config: default
144
+ name: MTEB AskUbuntuDupQuestions
145
+ revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
146
+ split: test
147
+ type: mteb/askubuntudupquestions-reranking
148
+ metrics:
149
+ - type: map
150
+ value: 62.586465273207196
151
+ - type: mrr
152
+ value: 74.42169019038825
153
+ task:
154
+ type: Reranking
155
+ - dataset:
156
+ config: default
157
+ name: MTEB BIOSSES
158
+ revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
159
+ split: test
160
+ type: mteb/biosses-sts
161
+ metrics:
162
+ - type: cos_sim_pearson
163
+ value: 85.1891186537969
164
+ - type: cos_sim_spearman
165
+ value: 83.75492046087288
166
+ - type: euclidean_pearson
167
+ value: 84.11766204805357
168
+ - type: euclidean_spearman
169
+ value: 84.01456493126516
170
+ - type: manhattan_pearson
171
+ value: 84.2132950502772
172
+ - type: manhattan_spearman
173
+ value: 83.89227298813377
174
+ task:
175
+ type: STS
176
+ - dataset:
177
+ config: default
178
+ name: MTEB Banking77Classification
179
+ revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
180
+ split: test
181
+ type: mteb/banking77
182
+ metrics:
183
+ - type: accuracy
184
+ value: 85.74025974025975
185
+ - type: f1
186
+ value: 85.71493566466381
187
+ task:
188
+ type: Classification
189
+ - dataset:
190
+ config: default
191
+ name: MTEB BiorxivClusteringP2P
192
+ revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
193
+ split: test
194
+ type: mteb/biorxiv-clustering-p2p
195
+ metrics:
196
+ - type: v_measure
197
+ value: 38.467181385006434
198
+ task:
199
+ type: Clustering
200
+ - dataset:
201
+ config: default
202
+ name: MTEB BiorxivClusteringS2S
203
+ revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
204
+ split: test
205
+ type: mteb/biorxiv-clustering-s2s
206
+ metrics:
207
+ - type: v_measure
208
+ value: 34.719496037339056
209
+ task:
210
+ type: Clustering
211
+ - dataset:
212
+ config: default
213
+ name: MTEB CQADupstackAndroidRetrieval
214
+ revision: None
215
+ split: test
216
+ type: BeIR/cqadupstack
217
+ metrics:
218
+ - type: map_at_1
219
+ value: 29.587000000000003
220
+ - type: map_at_10
221
+ value: 41.114
222
+ - type: map_at_100
223
+ value: 42.532
224
+ - type: map_at_1000
225
+ value: 42.661
226
+ - type: map_at_3
227
+ value: 37.483
228
+ - type: map_at_5
229
+ value: 39.652
230
+ - type: mrr_at_1
231
+ value: 36.338
232
+ - type: mrr_at_10
233
+ value: 46.763
234
+ - type: mrr_at_100
235
+ value: 47.393
236
+ - type: mrr_at_1000
237
+ value: 47.445
238
+ - type: mrr_at_3
239
+ value: 43.538
240
+ - type: mrr_at_5
241
+ value: 45.556000000000004
242
+ - type: ndcg_at_1
243
+ value: 36.338
244
+ - type: ndcg_at_10
245
+ value: 47.658
246
+ - type: ndcg_at_100
247
+ value: 52.824000000000005
248
+ - type: ndcg_at_1000
249
+ value: 54.913999999999994
250
+ - type: ndcg_at_3
251
+ value: 41.989
252
+ - type: ndcg_at_5
253
+ value: 44.944
254
+ - type: precision_at_1
255
+ value: 36.338
256
+ - type: precision_at_10
257
+ value: 9.156
258
+ - type: precision_at_100
259
+ value: 1.4789999999999999
260
+ - type: precision_at_1000
261
+ value: 0.196
262
+ - type: precision_at_3
263
+ value: 20.076
264
+ - type: precision_at_5
265
+ value: 14.85
266
+ - type: recall_at_1
267
+ value: 29.587000000000003
268
+ - type: recall_at_10
269
+ value: 60.746
270
+ - type: recall_at_100
271
+ value: 82.157
272
+ - type: recall_at_1000
273
+ value: 95.645
274
+ - type: recall_at_3
275
+ value: 44.821
276
+ - type: recall_at_5
277
+ value: 52.819
278
+ - type: map_at_1
279
+ value: 30.239
280
+ - type: map_at_10
281
+ value: 39.989000000000004
282
+ - type: map_at_100
283
+ value: 41.196
284
+ - type: map_at_1000
285
+ value: 41.325
286
+ - type: map_at_3
287
+ value: 37.261
288
+ - type: map_at_5
289
+ value: 38.833
290
+ - type: mrr_at_1
291
+ value: 37.516
292
+ - type: mrr_at_10
293
+ value: 46.177
294
+ - type: mrr_at_100
295
+ value: 46.806
296
+ - type: mrr_at_1000
297
+ value: 46.849000000000004
298
+ - type: mrr_at_3
299
+ value: 44.002
300
+ - type: mrr_at_5
301
+ value: 45.34
302
+ - type: ndcg_at_1
303
+ value: 37.516
304
+ - type: ndcg_at_10
305
+ value: 45.586
306
+ - type: ndcg_at_100
307
+ value: 49.897000000000006
308
+ - type: ndcg_at_1000
309
+ value: 51.955
310
+ - type: ndcg_at_3
311
+ value: 41.684
312
+ - type: ndcg_at_5
313
+ value: 43.617
314
+ - type: precision_at_1
315
+ value: 37.516
316
+ - type: precision_at_10
317
+ value: 8.522
318
+ - type: precision_at_100
319
+ value: 1.374
320
+ - type: precision_at_1000
321
+ value: 0.184
322
+ - type: precision_at_3
323
+ value: 20.105999999999998
324
+ - type: precision_at_5
325
+ value: 14.152999999999999
326
+ - type: recall_at_1
327
+ value: 30.239
328
+ - type: recall_at_10
329
+ value: 55.03
330
+ - type: recall_at_100
331
+ value: 73.375
332
+ - type: recall_at_1000
333
+ value: 86.29599999999999
334
+ - type: recall_at_3
335
+ value: 43.269000000000005
336
+ - type: recall_at_5
337
+ value: 48.878
338
+ - type: map_at_1
339
+ value: 38.338
340
+ - type: map_at_10
341
+ value: 50.468999999999994
342
+ - type: map_at_100
343
+ value: 51.553000000000004
344
+ - type: map_at_1000
345
+ value: 51.608
346
+ - type: map_at_3
347
+ value: 47.107
348
+ - type: map_at_5
349
+ value: 49.101
350
+ - type: mrr_at_1
351
+ value: 44.201
352
+ - type: mrr_at_10
353
+ value: 54.057
354
+ - type: mrr_at_100
355
+ value: 54.764
356
+ - type: mrr_at_1000
357
+ value: 54.791000000000004
358
+ - type: mrr_at_3
359
+ value: 51.56699999999999
360
+ - type: mrr_at_5
361
+ value: 53.05
362
+ - type: ndcg_at_1
363
+ value: 44.201
364
+ - type: ndcg_at_10
365
+ value: 56.379000000000005
366
+ - type: ndcg_at_100
367
+ value: 60.645
368
+ - type: ndcg_at_1000
369
+ value: 61.73499999999999
370
+ - type: ndcg_at_3
371
+ value: 50.726000000000006
372
+ - type: ndcg_at_5
373
+ value: 53.58500000000001
374
+ - type: precision_at_1
375
+ value: 44.201
376
+ - type: precision_at_10
377
+ value: 9.141
378
+ - type: precision_at_100
379
+ value: 1.216
380
+ - type: precision_at_1000
381
+ value: 0.135
382
+ - type: precision_at_3
383
+ value: 22.654
384
+ - type: precision_at_5
385
+ value: 15.723999999999998
386
+ - type: recall_at_1
387
+ value: 38.338
388
+ - type: recall_at_10
389
+ value: 70.30499999999999
390
+ - type: recall_at_100
391
+ value: 88.77199999999999
392
+ - type: recall_at_1000
393
+ value: 96.49799999999999
394
+ - type: recall_at_3
395
+ value: 55.218
396
+ - type: recall_at_5
397
+ value: 62.104000000000006
398
+ - type: map_at_1
399
+ value: 25.682
400
+ - type: map_at_10
401
+ value: 33.498
402
+ - type: map_at_100
403
+ value: 34.461000000000006
404
+ - type: map_at_1000
405
+ value: 34.544000000000004
406
+ - type: map_at_3
407
+ value: 30.503999999999998
408
+ - type: map_at_5
409
+ value: 32.216
410
+ - type: mrr_at_1
411
+ value: 27.683999999999997
412
+ - type: mrr_at_10
413
+ value: 35.467999999999996
414
+ - type: mrr_at_100
415
+ value: 36.32
416
+ - type: mrr_at_1000
417
+ value: 36.386
418
+ - type: mrr_at_3
419
+ value: 32.618
420
+ - type: mrr_at_5
421
+ value: 34.262
422
+ - type: ndcg_at_1
423
+ value: 27.683999999999997
424
+ - type: ndcg_at_10
425
+ value: 38.378
426
+ - type: ndcg_at_100
427
+ value: 43.288
428
+ - type: ndcg_at_1000
429
+ value: 45.413
430
+ - type: ndcg_at_3
431
+ value: 32.586
432
+ - type: ndcg_at_5
433
+ value: 35.499
434
+ - type: precision_at_1
435
+ value: 27.683999999999997
436
+ - type: precision_at_10
437
+ value: 5.864
438
+ - type: precision_at_100
439
+ value: 0.882
440
+ - type: precision_at_1000
441
+ value: 0.11
442
+ - type: precision_at_3
443
+ value: 13.446
444
+ - type: precision_at_5
445
+ value: 9.718
446
+ - type: recall_at_1
447
+ value: 25.682
448
+ - type: recall_at_10
449
+ value: 51.712
450
+ - type: recall_at_100
451
+ value: 74.446
452
+ - type: recall_at_1000
453
+ value: 90.472
454
+ - type: recall_at_3
455
+ value: 36.236000000000004
456
+ - type: recall_at_5
457
+ value: 43.234
458
+ - type: map_at_1
459
+ value: 16.073999999999998
460
+ - type: map_at_10
461
+ value: 24.352999999999998
462
+ - type: map_at_100
463
+ value: 25.438
464
+ - type: map_at_1000
465
+ value: 25.545
466
+ - type: map_at_3
467
+ value: 21.614
468
+ - type: map_at_5
469
+ value: 23.104
470
+ - type: mrr_at_1
471
+ value: 19.776
472
+ - type: mrr_at_10
473
+ value: 28.837000000000003
474
+ - type: mrr_at_100
475
+ value: 29.755
476
+ - type: mrr_at_1000
477
+ value: 29.817
478
+ - type: mrr_at_3
479
+ value: 26.201999999999998
480
+ - type: mrr_at_5
481
+ value: 27.714
482
+ - type: ndcg_at_1
483
+ value: 19.776
484
+ - type: ndcg_at_10
485
+ value: 29.701
486
+ - type: ndcg_at_100
487
+ value: 35.307
488
+ - type: ndcg_at_1000
489
+ value: 37.942
490
+ - type: ndcg_at_3
491
+ value: 24.764
492
+ - type: ndcg_at_5
493
+ value: 27.025
494
+ - type: precision_at_1
495
+ value: 19.776
496
+ - type: precision_at_10
497
+ value: 5.659
498
+ - type: precision_at_100
499
+ value: 0.971
500
+ - type: precision_at_1000
501
+ value: 0.133
502
+ - type: precision_at_3
503
+ value: 12.065
504
+ - type: precision_at_5
505
+ value: 8.905000000000001
506
+ - type: recall_at_1
507
+ value: 16.073999999999998
508
+ - type: recall_at_10
509
+ value: 41.647
510
+ - type: recall_at_100
511
+ value: 66.884
512
+ - type: recall_at_1000
513
+ value: 85.91499999999999
514
+ - type: recall_at_3
515
+ value: 27.916
516
+ - type: recall_at_5
517
+ value: 33.729
518
+ - type: map_at_1
519
+ value: 28.444999999999997
520
+ - type: map_at_10
521
+ value: 38.218999999999994
522
+ - type: map_at_100
523
+ value: 39.595
524
+ - type: map_at_1000
525
+ value: 39.709
526
+ - type: map_at_3
527
+ value: 35.586
528
+ - type: map_at_5
529
+ value: 36.895
530
+ - type: mrr_at_1
531
+ value: 34.841
532
+ - type: mrr_at_10
533
+ value: 44.106
534
+ - type: mrr_at_100
535
+ value: 44.98
536
+ - type: mrr_at_1000
537
+ value: 45.03
538
+ - type: mrr_at_3
539
+ value: 41.979
540
+ - type: mrr_at_5
541
+ value: 43.047999999999995
542
+ - type: ndcg_at_1
543
+ value: 34.841
544
+ - type: ndcg_at_10
545
+ value: 43.922
546
+ - type: ndcg_at_100
547
+ value: 49.504999999999995
548
+ - type: ndcg_at_1000
549
+ value: 51.675000000000004
550
+ - type: ndcg_at_3
551
+ value: 39.858
552
+ - type: ndcg_at_5
553
+ value: 41.408
554
+ - type: precision_at_1
555
+ value: 34.841
556
+ - type: precision_at_10
557
+ value: 7.872999999999999
558
+ - type: precision_at_100
559
+ value: 1.2449999999999999
560
+ - type: precision_at_1000
561
+ value: 0.161
562
+ - type: precision_at_3
563
+ value: 18.993
564
+ - type: precision_at_5
565
+ value: 13.032
566
+ - type: recall_at_1
567
+ value: 28.444999999999997
568
+ - type: recall_at_10
569
+ value: 54.984
570
+ - type: recall_at_100
571
+ value: 78.342
572
+ - type: recall_at_1000
573
+ value: 92.77
574
+ - type: recall_at_3
575
+ value: 42.842999999999996
576
+ - type: recall_at_5
577
+ value: 47.247
578
+ - type: map_at_1
579
+ value: 23.072
580
+ - type: map_at_10
581
+ value: 32.354
582
+ - type: map_at_100
583
+ value: 33.800000000000004
584
+ - type: map_at_1000
585
+ value: 33.908
586
+ - type: map_at_3
587
+ value: 29.232000000000003
588
+ - type: map_at_5
589
+ value: 31.049
590
+ - type: mrr_at_1
591
+ value: 29.110000000000003
592
+ - type: mrr_at_10
593
+ value: 38.03
594
+ - type: mrr_at_100
595
+ value: 39.032
596
+ - type: mrr_at_1000
597
+ value: 39.086999999999996
598
+ - type: mrr_at_3
599
+ value: 35.407
600
+ - type: mrr_at_5
601
+ value: 36.76
602
+ - type: ndcg_at_1
603
+ value: 29.110000000000003
604
+ - type: ndcg_at_10
605
+ value: 38.231
606
+ - type: ndcg_at_100
607
+ value: 44.425
608
+ - type: ndcg_at_1000
609
+ value: 46.771
610
+ - type: ndcg_at_3
611
+ value: 33.095
612
+ - type: ndcg_at_5
613
+ value: 35.459
614
+ - type: precision_at_1
615
+ value: 29.110000000000003
616
+ - type: precision_at_10
617
+ value: 7.215000000000001
618
+ - type: precision_at_100
619
+ value: 1.2109999999999999
620
+ - type: precision_at_1000
621
+ value: 0.157
622
+ - type: precision_at_3
623
+ value: 16.058
624
+ - type: precision_at_5
625
+ value: 11.644
626
+ - type: recall_at_1
627
+ value: 23.072
628
+ - type: recall_at_10
629
+ value: 50.285999999999994
630
+ - type: recall_at_100
631
+ value: 76.596
632
+ - type: recall_at_1000
633
+ value: 92.861
634
+ - type: recall_at_3
635
+ value: 35.702
636
+ - type: recall_at_5
637
+ value: 42.152
638
+ - type: map_at_1
639
+ value: 24.937916666666666
640
+ - type: map_at_10
641
+ value: 33.755250000000004
642
+ - type: map_at_100
643
+ value: 34.955999999999996
644
+ - type: map_at_1000
645
+ value: 35.070499999999996
646
+ - type: map_at_3
647
+ value: 30.98708333333333
648
+ - type: map_at_5
649
+ value: 32.51491666666666
650
+ - type: mrr_at_1
651
+ value: 29.48708333333333
652
+ - type: mrr_at_10
653
+ value: 37.92183333333334
654
+ - type: mrr_at_100
655
+ value: 38.76583333333333
656
+ - type: mrr_at_1000
657
+ value: 38.82466666666667
658
+ - type: mrr_at_3
659
+ value: 35.45125
660
+ - type: mrr_at_5
661
+ value: 36.827000000000005
662
+ - type: ndcg_at_1
663
+ value: 29.48708333333333
664
+ - type: ndcg_at_10
665
+ value: 39.05225
666
+ - type: ndcg_at_100
667
+ value: 44.25983333333334
668
+ - type: ndcg_at_1000
669
+ value: 46.568333333333335
670
+ - type: ndcg_at_3
671
+ value: 34.271583333333325
672
+ - type: ndcg_at_5
673
+ value: 36.483916666666666
674
+ - type: precision_at_1
675
+ value: 29.48708333333333
676
+ - type: precision_at_10
677
+ value: 6.865749999999999
678
+ - type: precision_at_100
679
+ value: 1.1195833333333332
680
+ - type: precision_at_1000
681
+ value: 0.15058333333333335
682
+ - type: precision_at_3
683
+ value: 15.742083333333333
684
+ - type: precision_at_5
685
+ value: 11.221916666666667
686
+ - type: recall_at_1
687
+ value: 24.937916666666666
688
+ - type: recall_at_10
689
+ value: 50.650416666666665
690
+ - type: recall_at_100
691
+ value: 73.55383333333334
692
+ - type: recall_at_1000
693
+ value: 89.61691666666667
694
+ - type: recall_at_3
695
+ value: 37.27808333333334
696
+ - type: recall_at_5
697
+ value: 42.99475
698
+ - type: map_at_1
699
+ value: 23.947
700
+ - type: map_at_10
701
+ value: 30.575000000000003
702
+ - type: map_at_100
703
+ value: 31.465
704
+ - type: map_at_1000
705
+ value: 31.558000000000003
706
+ - type: map_at_3
707
+ value: 28.814
708
+ - type: map_at_5
709
+ value: 29.738999999999997
710
+ - type: mrr_at_1
711
+ value: 26.994
712
+ - type: mrr_at_10
713
+ value: 33.415
714
+ - type: mrr_at_100
715
+ value: 34.18
716
+ - type: mrr_at_1000
717
+ value: 34.245
718
+ - type: mrr_at_3
719
+ value: 31.621
720
+ - type: mrr_at_5
721
+ value: 32.549
722
+ - type: ndcg_at_1
723
+ value: 26.994
724
+ - type: ndcg_at_10
725
+ value: 34.482
726
+ - type: ndcg_at_100
727
+ value: 38.915
728
+ - type: ndcg_at_1000
729
+ value: 41.355
730
+ - type: ndcg_at_3
731
+ value: 31.139
732
+ - type: ndcg_at_5
733
+ value: 32.589
734
+ - type: precision_at_1
735
+ value: 26.994
736
+ - type: precision_at_10
737
+ value: 5.322
738
+ - type: precision_at_100
739
+ value: 0.8160000000000001
740
+ - type: precision_at_1000
741
+ value: 0.11100000000000002
742
+ - type: precision_at_3
743
+ value: 13.344000000000001
744
+ - type: precision_at_5
745
+ value: 8.988
746
+ - type: recall_at_1
747
+ value: 23.947
748
+ - type: recall_at_10
749
+ value: 43.647999999999996
750
+ - type: recall_at_100
751
+ value: 63.851
752
+ - type: recall_at_1000
753
+ value: 82.0
754
+ - type: recall_at_3
755
+ value: 34.288000000000004
756
+ - type: recall_at_5
757
+ value: 38.117000000000004
758
+ - type: map_at_1
759
+ value: 16.197
760
+ - type: map_at_10
761
+ value: 22.968
762
+ - type: map_at_100
763
+ value: 24.095
764
+ - type: map_at_1000
765
+ value: 24.217
766
+ - type: map_at_3
767
+ value: 20.771
768
+ - type: map_at_5
769
+ value: 21.995
770
+ - type: mrr_at_1
771
+ value: 19.511
772
+ - type: mrr_at_10
773
+ value: 26.55
774
+ - type: mrr_at_100
775
+ value: 27.500999999999998
776
+ - type: mrr_at_1000
777
+ value: 27.578999999999997
778
+ - type: mrr_at_3
779
+ value: 24.421
780
+ - type: mrr_at_5
781
+ value: 25.604
782
+ - type: ndcg_at_1
783
+ value: 19.511
784
+ - type: ndcg_at_10
785
+ value: 27.386
786
+ - type: ndcg_at_100
787
+ value: 32.828
788
+ - type: ndcg_at_1000
789
+ value: 35.739
790
+ - type: ndcg_at_3
791
+ value: 23.405
792
+ - type: ndcg_at_5
793
+ value: 25.255
794
+ - type: precision_at_1
795
+ value: 19.511
796
+ - type: precision_at_10
797
+ value: 5.017
798
+ - type: precision_at_100
799
+ value: 0.91
800
+ - type: precision_at_1000
801
+ value: 0.133
802
+ - type: precision_at_3
803
+ value: 11.023
804
+ - type: precision_at_5
805
+ value: 8.025
806
+ - type: recall_at_1
807
+ value: 16.197
808
+ - type: recall_at_10
809
+ value: 37.09
810
+ - type: recall_at_100
811
+ value: 61.778
812
+ - type: recall_at_1000
813
+ value: 82.56599999999999
814
+ - type: recall_at_3
815
+ value: 26.034000000000002
816
+ - type: recall_at_5
817
+ value: 30.762
818
+ - type: map_at_1
819
+ value: 25.41
820
+ - type: map_at_10
821
+ value: 33.655
822
+ - type: map_at_100
823
+ value: 34.892
824
+ - type: map_at_1000
825
+ value: 34.995
826
+ - type: map_at_3
827
+ value: 30.94
828
+ - type: map_at_5
829
+ value: 32.303
830
+ - type: mrr_at_1
831
+ value: 29.477999999999998
832
+ - type: mrr_at_10
833
+ value: 37.443
834
+ - type: mrr_at_100
835
+ value: 38.383
836
+ - type: mrr_at_1000
837
+ value: 38.440000000000005
838
+ - type: mrr_at_3
839
+ value: 34.949999999999996
840
+ - type: mrr_at_5
841
+ value: 36.228
842
+ - type: ndcg_at_1
843
+ value: 29.477999999999998
844
+ - type: ndcg_at_10
845
+ value: 38.769
846
+ - type: ndcg_at_100
847
+ value: 44.245000000000005
848
+ - type: ndcg_at_1000
849
+ value: 46.593
850
+ - type: ndcg_at_3
851
+ value: 33.623
852
+ - type: ndcg_at_5
853
+ value: 35.766
854
+ - type: precision_at_1
855
+ value: 29.477999999999998
856
+ - type: precision_at_10
857
+ value: 6.455
858
+ - type: precision_at_100
859
+ value: 1.032
860
+ - type: precision_at_1000
861
+ value: 0.135
862
+ - type: precision_at_3
863
+ value: 14.893999999999998
864
+ - type: precision_at_5
865
+ value: 10.485
866
+ - type: recall_at_1
867
+ value: 25.41
868
+ - type: recall_at_10
869
+ value: 50.669
870
+ - type: recall_at_100
871
+ value: 74.084
872
+ - type: recall_at_1000
873
+ value: 90.435
874
+ - type: recall_at_3
875
+ value: 36.679
876
+ - type: recall_at_5
877
+ value: 41.94
878
+ - type: map_at_1
879
+ value: 23.339
880
+ - type: map_at_10
881
+ value: 31.852000000000004
882
+ - type: map_at_100
883
+ value: 33.411
884
+ - type: map_at_1000
885
+ value: 33.62
886
+ - type: map_at_3
887
+ value: 28.929
888
+ - type: map_at_5
889
+ value: 30.542
890
+ - type: mrr_at_1
891
+ value: 28.063
892
+ - type: mrr_at_10
893
+ value: 36.301
894
+ - type: mrr_at_100
895
+ value: 37.288
896
+ - type: mrr_at_1000
897
+ value: 37.349
898
+ - type: mrr_at_3
899
+ value: 33.663
900
+ - type: mrr_at_5
901
+ value: 35.165
902
+ - type: ndcg_at_1
903
+ value: 28.063
904
+ - type: ndcg_at_10
905
+ value: 37.462
906
+ - type: ndcg_at_100
907
+ value: 43.620999999999995
908
+ - type: ndcg_at_1000
909
+ value: 46.211
910
+ - type: ndcg_at_3
911
+ value: 32.68
912
+ - type: ndcg_at_5
913
+ value: 34.981
914
+ - type: precision_at_1
915
+ value: 28.063
916
+ - type: precision_at_10
917
+ value: 7.1739999999999995
918
+ - type: precision_at_100
919
+ value: 1.486
920
+ - type: precision_at_1000
921
+ value: 0.23500000000000001
922
+ - type: precision_at_3
923
+ value: 15.217
924
+ - type: precision_at_5
925
+ value: 11.265
926
+ - type: recall_at_1
927
+ value: 23.339
928
+ - type: recall_at_10
929
+ value: 48.376999999999995
930
+ - type: recall_at_100
931
+ value: 76.053
932
+ - type: recall_at_1000
933
+ value: 92.455
934
+ - type: recall_at_3
935
+ value: 34.735
936
+ - type: recall_at_5
937
+ value: 40.71
938
+ - type: map_at_1
939
+ value: 18.925
940
+ - type: map_at_10
941
+ value: 26.017000000000003
942
+ - type: map_at_100
943
+ value: 27.034000000000002
944
+ - type: map_at_1000
945
+ value: 27.156000000000002
946
+ - type: map_at_3
947
+ value: 23.604
948
+ - type: map_at_5
949
+ value: 24.75
950
+ - type: mrr_at_1
951
+ value: 20.333000000000002
952
+ - type: mrr_at_10
953
+ value: 27.915
954
+ - type: mrr_at_100
955
+ value: 28.788000000000004
956
+ - type: mrr_at_1000
957
+ value: 28.877999999999997
958
+ - type: mrr_at_3
959
+ value: 25.446999999999996
960
+ - type: mrr_at_5
961
+ value: 26.648
962
+ - type: ndcg_at_1
963
+ value: 20.333000000000002
964
+ - type: ndcg_at_10
965
+ value: 30.673000000000002
966
+ - type: ndcg_at_100
967
+ value: 35.618
968
+ - type: ndcg_at_1000
969
+ value: 38.517
970
+ - type: ndcg_at_3
971
+ value: 25.71
972
+ - type: ndcg_at_5
973
+ value: 27.679
974
+ - type: precision_at_1
975
+ value: 20.333000000000002
976
+ - type: precision_at_10
977
+ value: 4.9910000000000005
978
+ - type: precision_at_100
979
+ value: 0.8130000000000001
980
+ - type: precision_at_1000
981
+ value: 0.117
982
+ - type: precision_at_3
983
+ value: 11.029
984
+ - type: precision_at_5
985
+ value: 7.8740000000000006
986
+ - type: recall_at_1
987
+ value: 18.925
988
+ - type: recall_at_10
989
+ value: 43.311
990
+ - type: recall_at_100
991
+ value: 66.308
992
+ - type: recall_at_1000
993
+ value: 87.49
994
+ - type: recall_at_3
995
+ value: 29.596
996
+ - type: recall_at_5
997
+ value: 34.245
998
+ task:
999
+ type: Retrieval
1000
+ - dataset:
1001
+ config: default
1002
+ name: MTEB ClimateFEVER
1003
+ revision: None
1004
+ split: test
1005
+ type: climate-fever
1006
+ metrics:
1007
+ - type: map_at_1
1008
+ value: 13.714
1009
+ - type: map_at_10
1010
+ value: 23.194
1011
+ - type: map_at_100
1012
+ value: 24.976000000000003
1013
+ - type: map_at_1000
1014
+ value: 25.166
1015
+ - type: map_at_3
1016
+ value: 19.709
1017
+ - type: map_at_5
1018
+ value: 21.523999999999997
1019
+ - type: mrr_at_1
1020
+ value: 30.619000000000003
1021
+ - type: mrr_at_10
1022
+ value: 42.563
1023
+ - type: mrr_at_100
1024
+ value: 43.386
1025
+ - type: mrr_at_1000
1026
+ value: 43.423
1027
+ - type: mrr_at_3
1028
+ value: 39.555
1029
+ - type: mrr_at_5
1030
+ value: 41.268
1031
+ - type: ndcg_at_1
1032
+ value: 30.619000000000003
1033
+ - type: ndcg_at_10
1034
+ value: 31.836
1035
+ - type: ndcg_at_100
1036
+ value: 38.652
1037
+ - type: ndcg_at_1000
1038
+ value: 42.088
1039
+ - type: ndcg_at_3
1040
+ value: 26.733
1041
+ - type: ndcg_at_5
1042
+ value: 28.435
1043
+ - type: precision_at_1
1044
+ value: 30.619000000000003
1045
+ - type: precision_at_10
1046
+ value: 9.751999999999999
1047
+ - type: precision_at_100
1048
+ value: 1.71
1049
+ - type: precision_at_1000
1050
+ value: 0.23500000000000001
1051
+ - type: precision_at_3
1052
+ value: 19.935
1053
+ - type: precision_at_5
1054
+ value: 14.984
1055
+ - type: recall_at_1
1056
+ value: 13.714
1057
+ - type: recall_at_10
1058
+ value: 37.26
1059
+ - type: recall_at_100
1060
+ value: 60.546
1061
+ - type: recall_at_1000
1062
+ value: 79.899
1063
+ - type: recall_at_3
1064
+ value: 24.325
1065
+ - type: recall_at_5
1066
+ value: 29.725
1067
+ task:
1068
+ type: Retrieval
1069
+ - dataset:
1070
+ config: default
1071
+ name: MTEB DBPedia
1072
+ revision: None
1073
+ split: test
1074
+ type: dbpedia-entity
1075
+ metrics:
1076
+ - type: map_at_1
1077
+ value: 8.462
1078
+ - type: map_at_10
1079
+ value: 18.637
1080
+ - type: map_at_100
1081
+ value: 26.131999999999998
1082
+ - type: map_at_1000
1083
+ value: 27.607
1084
+ - type: map_at_3
1085
+ value: 13.333
1086
+ - type: map_at_5
1087
+ value: 15.654000000000002
1088
+ - type: mrr_at_1
1089
+ value: 66.25
1090
+ - type: mrr_at_10
1091
+ value: 74.32600000000001
1092
+ - type: mrr_at_100
1093
+ value: 74.60900000000001
1094
+ - type: mrr_at_1000
1095
+ value: 74.62
1096
+ - type: mrr_at_3
1097
+ value: 72.667
1098
+ - type: mrr_at_5
1099
+ value: 73.817
1100
+ - type: ndcg_at_1
1101
+ value: 53.87499999999999
1102
+ - type: ndcg_at_10
1103
+ value: 40.028999999999996
1104
+ - type: ndcg_at_100
1105
+ value: 44.199
1106
+ - type: ndcg_at_1000
1107
+ value: 51.629999999999995
1108
+ - type: ndcg_at_3
1109
+ value: 44.113
1110
+ - type: ndcg_at_5
1111
+ value: 41.731
1112
+ - type: precision_at_1
1113
+ value: 66.25
1114
+ - type: precision_at_10
1115
+ value: 31.900000000000002
1116
+ - type: precision_at_100
1117
+ value: 10.043000000000001
1118
+ - type: precision_at_1000
1119
+ value: 1.926
1120
+ - type: precision_at_3
1121
+ value: 47.417
1122
+ - type: precision_at_5
1123
+ value: 40.65
1124
+ - type: recall_at_1
1125
+ value: 8.462
1126
+ - type: recall_at_10
1127
+ value: 24.293
1128
+ - type: recall_at_100
1129
+ value: 50.146
1130
+ - type: recall_at_1000
1131
+ value: 74.034
1132
+ - type: recall_at_3
1133
+ value: 14.967
1134
+ - type: recall_at_5
1135
+ value: 18.682000000000002
1136
+ task:
1137
+ type: Retrieval
1138
+ - dataset:
1139
+ config: default
1140
+ name: MTEB EmotionClassification
1141
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
1142
+ split: test
1143
+ type: mteb/emotion
1144
+ metrics:
1145
+ - type: accuracy
1146
+ value: 47.84499999999999
1147
+ - type: f1
1148
+ value: 42.48106691979349
1149
+ task:
1150
+ type: Classification
1151
+ - dataset:
1152
+ config: default
1153
+ name: MTEB FEVER
1154
+ revision: None
1155
+ split: test
1156
+ type: fever
1157
+ metrics:
1158
+ - type: map_at_1
1159
+ value: 74.034
1160
+ - type: map_at_10
1161
+ value: 82.76
1162
+ - type: map_at_100
1163
+ value: 82.968
1164
+ - type: map_at_1000
1165
+ value: 82.98299999999999
1166
+ - type: map_at_3
1167
+ value: 81.768
1168
+ - type: map_at_5
1169
+ value: 82.418
1170
+ - type: mrr_at_1
1171
+ value: 80.048
1172
+ - type: mrr_at_10
1173
+ value: 87.64999999999999
1174
+ - type: mrr_at_100
1175
+ value: 87.712
1176
+ - type: mrr_at_1000
1177
+ value: 87.713
1178
+ - type: mrr_at_3
1179
+ value: 87.01100000000001
1180
+ - type: mrr_at_5
1181
+ value: 87.466
1182
+ - type: ndcg_at_1
1183
+ value: 80.048
1184
+ - type: ndcg_at_10
1185
+ value: 86.643
1186
+ - type: ndcg_at_100
1187
+ value: 87.361
1188
+ - type: ndcg_at_1000
1189
+ value: 87.606
1190
+ - type: ndcg_at_3
1191
+ value: 85.137
1192
+ - type: ndcg_at_5
1193
+ value: 86.016
1194
+ - type: precision_at_1
1195
+ value: 80.048
1196
+ - type: precision_at_10
1197
+ value: 10.372
1198
+ - type: precision_at_100
1199
+ value: 1.093
1200
+ - type: precision_at_1000
1201
+ value: 0.11299999999999999
1202
+ - type: precision_at_3
1203
+ value: 32.638
1204
+ - type: precision_at_5
1205
+ value: 20.177
1206
+ - type: recall_at_1
1207
+ value: 74.034
1208
+ - type: recall_at_10
1209
+ value: 93.769
1210
+ - type: recall_at_100
1211
+ value: 96.569
1212
+ - type: recall_at_1000
1213
+ value: 98.039
1214
+ - type: recall_at_3
1215
+ value: 89.581
1216
+ - type: recall_at_5
1217
+ value: 91.906
1218
+ task:
1219
+ type: Retrieval
1220
+ - dataset:
1221
+ config: default
1222
+ name: MTEB FiQA2018
1223
+ revision: None
1224
+ split: test
1225
+ type: fiqa
1226
+ metrics:
1227
+ - type: map_at_1
1228
+ value: 20.5
1229
+ - type: map_at_10
1230
+ value: 32.857
1231
+ - type: map_at_100
1232
+ value: 34.589
1233
+ - type: map_at_1000
1234
+ value: 34.778
1235
+ - type: map_at_3
1236
+ value: 29.160999999999998
1237
+ - type: map_at_5
1238
+ value: 31.033
1239
+ - type: mrr_at_1
1240
+ value: 40.123
1241
+ - type: mrr_at_10
1242
+ value: 48.776
1243
+ - type: mrr_at_100
1244
+ value: 49.495
1245
+ - type: mrr_at_1000
1246
+ value: 49.539
1247
+ - type: mrr_at_3
1248
+ value: 46.605000000000004
1249
+ - type: mrr_at_5
1250
+ value: 47.654
1251
+ - type: ndcg_at_1
1252
+ value: 40.123
1253
+ - type: ndcg_at_10
1254
+ value: 40.343
1255
+ - type: ndcg_at_100
1256
+ value: 46.56
1257
+ - type: ndcg_at_1000
1258
+ value: 49.777
1259
+ - type: ndcg_at_3
1260
+ value: 37.322
1261
+ - type: ndcg_at_5
1262
+ value: 37.791000000000004
1263
+ - type: precision_at_1
1264
+ value: 40.123
1265
+ - type: precision_at_10
1266
+ value: 11.08
1267
+ - type: precision_at_100
1268
+ value: 1.752
1269
+ - type: precision_at_1000
1270
+ value: 0.232
1271
+ - type: precision_at_3
1272
+ value: 24.897
1273
+ - type: precision_at_5
1274
+ value: 17.809
1275
+ - type: recall_at_1
1276
+ value: 20.5
1277
+ - type: recall_at_10
1278
+ value: 46.388
1279
+ - type: recall_at_100
1280
+ value: 69.552
1281
+ - type: recall_at_1000
1282
+ value: 89.011
1283
+ - type: recall_at_3
1284
+ value: 33.617999999999995
1285
+ - type: recall_at_5
1286
+ value: 38.211
1287
+ task:
1288
+ type: Retrieval
1289
+ - dataset:
1290
+ config: default
1291
+ name: MTEB HotpotQA
1292
+ revision: None
1293
+ split: test
1294
+ type: hotpotqa
1295
+ metrics:
1296
+ - type: map_at_1
1297
+ value: 39.135999999999996
1298
+ - type: map_at_10
1299
+ value: 61.673
1300
+ - type: map_at_100
1301
+ value: 62.562
1302
+ - type: map_at_1000
1303
+ value: 62.62
1304
+ - type: map_at_3
1305
+ value: 58.467999999999996
1306
+ - type: map_at_5
1307
+ value: 60.463
1308
+ - type: mrr_at_1
1309
+ value: 78.271
1310
+ - type: mrr_at_10
1311
+ value: 84.119
1312
+ - type: mrr_at_100
1313
+ value: 84.29299999999999
1314
+ - type: mrr_at_1000
1315
+ value: 84.299
1316
+ - type: mrr_at_3
1317
+ value: 83.18900000000001
1318
+ - type: mrr_at_5
1319
+ value: 83.786
1320
+ - type: ndcg_at_1
1321
+ value: 78.271
1322
+ - type: ndcg_at_10
1323
+ value: 69.935
1324
+ - type: ndcg_at_100
1325
+ value: 73.01299999999999
1326
+ - type: ndcg_at_1000
1327
+ value: 74.126
1328
+ - type: ndcg_at_3
1329
+ value: 65.388
1330
+ - type: ndcg_at_5
1331
+ value: 67.906
1332
+ - type: precision_at_1
1333
+ value: 78.271
1334
+ - type: precision_at_10
1335
+ value: 14.562
1336
+ - type: precision_at_100
1337
+ value: 1.6969999999999998
1338
+ - type: precision_at_1000
1339
+ value: 0.184
1340
+ - type: precision_at_3
1341
+ value: 41.841
1342
+ - type: precision_at_5
1343
+ value: 27.087
1344
+ - type: recall_at_1
1345
+ value: 39.135999999999996
1346
+ - type: recall_at_10
1347
+ value: 72.809
1348
+ - type: recall_at_100
1349
+ value: 84.86200000000001
1350
+ - type: recall_at_1000
1351
+ value: 92.208
1352
+ - type: recall_at_3
1353
+ value: 62.76199999999999
1354
+ - type: recall_at_5
1355
+ value: 67.718
1356
+ task:
1357
+ type: Retrieval
1358
+ - dataset:
1359
+ config: default
1360
+ name: MTEB ImdbClassification
1361
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1362
+ split: test
1363
+ type: mteb/imdb
1364
+ metrics:
1365
+ - type: accuracy
1366
+ value: 90.60600000000001
1367
+ - type: ap
1368
+ value: 86.6579587804335
1369
+ - type: f1
1370
+ value: 90.5938853929307
1371
+ task:
1372
+ type: Classification
1373
+ - dataset:
1374
+ config: default
1375
+ name: MTEB MSMARCO
1376
+ revision: None
1377
+ split: dev
1378
+ type: msmarco
1379
+ metrics:
1380
+ - type: map_at_1
1381
+ value: 21.852
1382
+ - type: map_at_10
1383
+ value: 33.982
1384
+ - type: map_at_100
1385
+ value: 35.116
1386
+ - type: map_at_1000
1387
+ value: 35.167
1388
+ - type: map_at_3
1389
+ value: 30.134
1390
+ - type: map_at_5
1391
+ value: 32.340999999999994
1392
+ - type: mrr_at_1
1393
+ value: 22.479
1394
+ - type: mrr_at_10
1395
+ value: 34.594
1396
+ - type: mrr_at_100
1397
+ value: 35.672
1398
+ - type: mrr_at_1000
1399
+ value: 35.716
1400
+ - type: mrr_at_3
1401
+ value: 30.84
1402
+ - type: mrr_at_5
1403
+ value: 32.998
1404
+ - type: ndcg_at_1
1405
+ value: 22.493
1406
+ - type: ndcg_at_10
1407
+ value: 40.833000000000006
1408
+ - type: ndcg_at_100
1409
+ value: 46.357
1410
+ - type: ndcg_at_1000
1411
+ value: 47.637
1412
+ - type: ndcg_at_3
1413
+ value: 32.995999999999995
1414
+ - type: ndcg_at_5
1415
+ value: 36.919000000000004
1416
+ - type: precision_at_1
1417
+ value: 22.493
1418
+ - type: precision_at_10
1419
+ value: 6.465999999999999
1420
+ - type: precision_at_100
1421
+ value: 0.9249999999999999
1422
+ - type: precision_at_1000
1423
+ value: 0.104
1424
+ - type: precision_at_3
1425
+ value: 14.030999999999999
1426
+ - type: precision_at_5
1427
+ value: 10.413
1428
+ - type: recall_at_1
1429
+ value: 21.852
1430
+ - type: recall_at_10
1431
+ value: 61.934999999999995
1432
+ - type: recall_at_100
1433
+ value: 87.611
1434
+ - type: recall_at_1000
1435
+ value: 97.441
1436
+ - type: recall_at_3
1437
+ value: 40.583999999999996
1438
+ - type: recall_at_5
1439
+ value: 49.992999999999995
1440
+ task:
1441
+ type: Retrieval
1442
+ - dataset:
1443
+ config: en
1444
+ name: MTEB MTOPDomainClassification (en)
1445
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1446
+ split: test
1447
+ type: mteb/mtop_domain
1448
+ metrics:
1449
+ - type: accuracy
1450
+ value: 93.36069311445507
1451
+ - type: f1
1452
+ value: 93.16456330371453
1453
+ task:
1454
+ type: Classification
1455
+ - dataset:
1456
+ config: en
1457
+ name: MTEB MTOPIntentClassification (en)
1458
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1459
+ split: test
1460
+ type: mteb/mtop_intent
1461
+ metrics:
1462
+ - type: accuracy
1463
+ value: 74.74692202462381
1464
+ - type: f1
1465
+ value: 58.17903579421599
1466
+ task:
1467
+ type: Classification
1468
+ - dataset:
1469
+ config: en
1470
+ name: MTEB MassiveIntentClassification (en)
1471
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1472
+ split: test
1473
+ type: mteb/amazon_massive_intent
1474
+ metrics:
1475
+ - type: accuracy
1476
+ value: 74.80833893745796
1477
+ - type: f1
1478
+ value: 72.70786592684664
1479
+ task:
1480
+ type: Classification
1481
+ - dataset:
1482
+ config: en
1483
+ name: MTEB MassiveScenarioClassification (en)
1484
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1485
+ split: test
1486
+ type: mteb/amazon_massive_scenario
1487
+ metrics:
1488
+ - type: accuracy
1489
+ value: 78.69872225958305
1490
+ - type: f1
1491
+ value: 78.61626934504731
1492
+ task:
1493
+ type: Classification
1494
+ - dataset:
1495
+ config: default
1496
+ name: MTEB MedrxivClusteringP2P
1497
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1498
+ split: test
1499
+ type: mteb/medrxiv-clustering-p2p
1500
+ metrics:
1501
+ - type: v_measure
1502
+ value: 33.058658628717694
1503
+ task:
1504
+ type: Clustering
1505
+ - dataset:
1506
+ config: default
1507
+ name: MTEB MedrxivClusteringS2S
1508
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1509
+ split: test
1510
+ type: mteb/medrxiv-clustering-s2s
1511
+ metrics:
1512
+ - type: v_measure
1513
+ value: 30.85561739360599
1514
+ task:
1515
+ type: Clustering
1516
+ - dataset:
1517
+ config: default
1518
+ name: MTEB MindSmallReranking
1519
+ revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1520
+ split: test
1521
+ type: mteb/mind_small
1522
+ metrics:
1523
+ - type: map
1524
+ value: 31.290259910144385
1525
+ - type: mrr
1526
+ value: 32.44223046102856
1527
+ task:
1528
+ type: Reranking
1529
+ - dataset:
1530
+ config: default
1531
+ name: MTEB NFCorpus
1532
+ revision: None
1533
+ split: test
1534
+ type: nfcorpus
1535
+ metrics:
1536
+ - type: map_at_1
1537
+ value: 5.288
1538
+ - type: map_at_10
1539
+ value: 12.267999999999999
1540
+ - type: map_at_100
1541
+ value: 15.557000000000002
1542
+ - type: map_at_1000
1543
+ value: 16.98
1544
+ - type: map_at_3
1545
+ value: 8.866
1546
+ - type: map_at_5
1547
+ value: 10.418
1548
+ - type: mrr_at_1
1549
+ value: 43.653
1550
+ - type: mrr_at_10
1551
+ value: 52.681
1552
+ - type: mrr_at_100
1553
+ value: 53.315999999999995
1554
+ - type: mrr_at_1000
1555
+ value: 53.357
1556
+ - type: mrr_at_3
1557
+ value: 51.393
1558
+ - type: mrr_at_5
1559
+ value: 51.903999999999996
1560
+ - type: ndcg_at_1
1561
+ value: 42.415000000000006
1562
+ - type: ndcg_at_10
1563
+ value: 34.305
1564
+ - type: ndcg_at_100
1565
+ value: 30.825999999999997
1566
+ - type: ndcg_at_1000
1567
+ value: 39.393
1568
+ - type: ndcg_at_3
1569
+ value: 39.931
1570
+ - type: ndcg_at_5
1571
+ value: 37.519999999999996
1572
+ - type: precision_at_1
1573
+ value: 43.653
1574
+ - type: precision_at_10
1575
+ value: 25.728
1576
+ - type: precision_at_100
1577
+ value: 7.932
1578
+ - type: precision_at_1000
1579
+ value: 2.07
1580
+ - type: precision_at_3
1581
+ value: 38.184000000000005
1582
+ - type: precision_at_5
1583
+ value: 32.879000000000005
1584
+ - type: recall_at_1
1585
+ value: 5.288
1586
+ - type: recall_at_10
1587
+ value: 16.195
1588
+ - type: recall_at_100
1589
+ value: 31.135
1590
+ - type: recall_at_1000
1591
+ value: 61.531000000000006
1592
+ - type: recall_at_3
1593
+ value: 10.313
1594
+ - type: recall_at_5
1595
+ value: 12.754999999999999
1596
+ task:
1597
+ type: Retrieval
1598
+ - dataset:
1599
+ config: default
1600
+ name: MTEB NQ
1601
+ revision: None
1602
+ split: test
1603
+ type: nq
1604
+ metrics:
1605
+ - type: map_at_1
1606
+ value: 28.216
1607
+ - type: map_at_10
1608
+ value: 42.588
1609
+ - type: map_at_100
1610
+ value: 43.702999999999996
1611
+ - type: map_at_1000
1612
+ value: 43.739
1613
+ - type: map_at_3
1614
+ value: 38.177
1615
+ - type: map_at_5
1616
+ value: 40.754000000000005
1617
+ - type: mrr_at_1
1618
+ value: 31.866
1619
+ - type: mrr_at_10
1620
+ value: 45.189
1621
+ - type: mrr_at_100
1622
+ value: 46.056000000000004
1623
+ - type: mrr_at_1000
1624
+ value: 46.081
1625
+ - type: mrr_at_3
1626
+ value: 41.526999999999994
1627
+ - type: mrr_at_5
1628
+ value: 43.704
1629
+ - type: ndcg_at_1
1630
+ value: 31.837
1631
+ - type: ndcg_at_10
1632
+ value: 50.178
1633
+ - type: ndcg_at_100
1634
+ value: 54.98800000000001
1635
+ - type: ndcg_at_1000
1636
+ value: 55.812
1637
+ - type: ndcg_at_3
1638
+ value: 41.853
1639
+ - type: ndcg_at_5
1640
+ value: 46.153
1641
+ - type: precision_at_1
1642
+ value: 31.837
1643
+ - type: precision_at_10
1644
+ value: 8.43
1645
+ - type: precision_at_100
1646
+ value: 1.1119999999999999
1647
+ - type: precision_at_1000
1648
+ value: 0.11900000000000001
1649
+ - type: precision_at_3
1650
+ value: 19.023
1651
+ - type: precision_at_5
1652
+ value: 13.911000000000001
1653
+ - type: recall_at_1
1654
+ value: 28.216
1655
+ - type: recall_at_10
1656
+ value: 70.8
1657
+ - type: recall_at_100
1658
+ value: 91.857
1659
+ - type: recall_at_1000
1660
+ value: 97.941
1661
+ - type: recall_at_3
1662
+ value: 49.196
1663
+ - type: recall_at_5
1664
+ value: 59.072
1665
+ task:
1666
+ type: Retrieval
1667
+ - dataset:
1668
+ config: default
1669
+ name: MTEB QuoraRetrieval
1670
+ revision: None
1671
+ split: test
1672
+ type: quora
1673
+ metrics:
1674
+ - type: map_at_1
1675
+ value: 71.22800000000001
1676
+ - type: map_at_10
1677
+ value: 85.115
1678
+ - type: map_at_100
1679
+ value: 85.72
1680
+ - type: map_at_1000
1681
+ value: 85.737
1682
+ - type: map_at_3
1683
+ value: 82.149
1684
+ - type: map_at_5
1685
+ value: 84.029
1686
+ - type: mrr_at_1
1687
+ value: 81.96
1688
+ - type: mrr_at_10
1689
+ value: 88.00200000000001
1690
+ - type: mrr_at_100
1691
+ value: 88.088
1692
+ - type: mrr_at_1000
1693
+ value: 88.089
1694
+ - type: mrr_at_3
1695
+ value: 87.055
1696
+ - type: mrr_at_5
1697
+ value: 87.715
1698
+ - type: ndcg_at_1
1699
+ value: 82.01
1700
+ - type: ndcg_at_10
1701
+ value: 88.78
1702
+ - type: ndcg_at_100
1703
+ value: 89.91
1704
+ - type: ndcg_at_1000
1705
+ value: 90.013
1706
+ - type: ndcg_at_3
1707
+ value: 85.957
1708
+ - type: ndcg_at_5
1709
+ value: 87.56
1710
+ - type: precision_at_1
1711
+ value: 82.01
1712
+ - type: precision_at_10
1713
+ value: 13.462
1714
+ - type: precision_at_100
1715
+ value: 1.528
1716
+ - type: precision_at_1000
1717
+ value: 0.157
1718
+ - type: precision_at_3
1719
+ value: 37.553
1720
+ - type: precision_at_5
1721
+ value: 24.732000000000003
1722
+ - type: recall_at_1
1723
+ value: 71.22800000000001
1724
+ - type: recall_at_10
1725
+ value: 95.69
1726
+ - type: recall_at_100
1727
+ value: 99.531
1728
+ - type: recall_at_1000
1729
+ value: 99.98
1730
+ - type: recall_at_3
1731
+ value: 87.632
1732
+ - type: recall_at_5
1733
+ value: 92.117
1734
+ task:
1735
+ type: Retrieval
1736
+ - dataset:
1737
+ config: default
1738
+ name: MTEB RedditClustering
1739
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
1740
+ split: test
1741
+ type: mteb/reddit-clustering
1742
+ metrics:
1743
+ - type: v_measure
1744
+ value: 52.31768034366916
1745
+ task:
1746
+ type: Clustering
1747
+ - dataset:
1748
+ config: default
1749
+ name: MTEB RedditClusteringP2P
1750
+ revision: 282350215ef01743dc01b456c7f5241fa8937f16
1751
+ split: test
1752
+ type: mteb/reddit-clustering-p2p
1753
+ metrics:
1754
+ - type: v_measure
1755
+ value: 60.640266772723606
1756
+ task:
1757
+ type: Clustering
1758
+ - dataset:
1759
+ config: default
1760
+ name: MTEB SCIDOCS
1761
+ revision: None
1762
+ split: test
1763
+ type: scidocs
1764
+ metrics:
1765
+ - type: map_at_1
1766
+ value: 4.7780000000000005
1767
+ - type: map_at_10
1768
+ value: 12.299
1769
+ - type: map_at_100
1770
+ value: 14.363000000000001
1771
+ - type: map_at_1000
1772
+ value: 14.71
1773
+ - type: map_at_3
1774
+ value: 8.738999999999999
1775
+ - type: map_at_5
1776
+ value: 10.397
1777
+ - type: mrr_at_1
1778
+ value: 23.599999999999998
1779
+ - type: mrr_at_10
1780
+ value: 34.845
1781
+ - type: mrr_at_100
1782
+ value: 35.916
1783
+ - type: mrr_at_1000
1784
+ value: 35.973
1785
+ - type: mrr_at_3
1786
+ value: 31.7
1787
+ - type: mrr_at_5
1788
+ value: 33.535
1789
+ - type: ndcg_at_1
1790
+ value: 23.599999999999998
1791
+ - type: ndcg_at_10
1792
+ value: 20.522000000000002
1793
+ - type: ndcg_at_100
1794
+ value: 28.737000000000002
1795
+ - type: ndcg_at_1000
1796
+ value: 34.596
1797
+ - type: ndcg_at_3
1798
+ value: 19.542
1799
+ - type: ndcg_at_5
1800
+ value: 16.958000000000002
1801
+ - type: precision_at_1
1802
+ value: 23.599999999999998
1803
+ - type: precision_at_10
1804
+ value: 10.67
1805
+ - type: precision_at_100
1806
+ value: 2.259
1807
+ - type: precision_at_1000
1808
+ value: 0.367
1809
+ - type: precision_at_3
1810
+ value: 18.333
1811
+ - type: precision_at_5
1812
+ value: 14.879999999999999
1813
+ - type: recall_at_1
1814
+ value: 4.7780000000000005
1815
+ - type: recall_at_10
1816
+ value: 21.617
1817
+ - type: recall_at_100
1818
+ value: 45.905
1819
+ - type: recall_at_1000
1820
+ value: 74.42
1821
+ - type: recall_at_3
1822
+ value: 11.148
1823
+ - type: recall_at_5
1824
+ value: 15.082999999999998
1825
+ task:
1826
+ type: Retrieval
1827
+ - dataset:
1828
+ config: default
1829
+ name: MTEB SICK-R
1830
+ revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
1831
+ split: test
1832
+ type: mteb/sickr-sts
1833
+ metrics:
1834
+ - type: cos_sim_pearson
1835
+ value: 83.22372750297885
1836
+ - type: cos_sim_spearman
1837
+ value: 79.40972617119405
1838
+ - type: euclidean_pearson
1839
+ value: 80.6101072020434
1840
+ - type: euclidean_spearman
1841
+ value: 79.53844217225202
1842
+ - type: manhattan_pearson
1843
+ value: 80.57265975286111
1844
+ - type: manhattan_spearman
1845
+ value: 79.46335611792958
1846
+ task:
1847
+ type: STS
1848
+ - dataset:
1849
+ config: default
1850
+ name: MTEB STS12
1851
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
1852
+ split: test
1853
+ type: mteb/sts12-sts
1854
+ metrics:
1855
+ - type: cos_sim_pearson
1856
+ value: 85.43713315520749
1857
+ - type: cos_sim_spearman
1858
+ value: 77.44128693329532
1859
+ - type: euclidean_pearson
1860
+ value: 81.63869928101123
1861
+ - type: euclidean_spearman
1862
+ value: 77.29512977961515
1863
+ - type: manhattan_pearson
1864
+ value: 81.63704185566183
1865
+ - type: manhattan_spearman
1866
+ value: 77.29909412738657
1867
+ task:
1868
+ type: STS
1869
+ - dataset:
1870
+ config: default
1871
+ name: MTEB STS13
1872
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
1873
+ split: test
1874
+ type: mteb/sts13-sts
1875
+ metrics:
1876
+ - type: cos_sim_pearson
1877
+ value: 81.59451537860527
1878
+ - type: cos_sim_spearman
1879
+ value: 82.97994638856723
1880
+ - type: euclidean_pearson
1881
+ value: 82.89478688288412
1882
+ - type: euclidean_spearman
1883
+ value: 83.58740751053104
1884
+ - type: manhattan_pearson
1885
+ value: 82.69140840941608
1886
+ - type: manhattan_spearman
1887
+ value: 83.33665956040555
1888
+ task:
1889
+ type: STS
1890
+ - dataset:
1891
+ config: default
1892
+ name: MTEB STS14
1893
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
1894
+ split: test
1895
+ type: mteb/sts14-sts
1896
+ metrics:
1897
+ - type: cos_sim_pearson
1898
+ value: 82.00756527711764
1899
+ - type: cos_sim_spearman
1900
+ value: 81.83560996841379
1901
+ - type: euclidean_pearson
1902
+ value: 82.07684151976518
1903
+ - type: euclidean_spearman
1904
+ value: 82.00913052060511
1905
+ - type: manhattan_pearson
1906
+ value: 82.05690778488794
1907
+ - type: manhattan_spearman
1908
+ value: 82.02260252019525
1909
+ task:
1910
+ type: STS
1911
+ - dataset:
1912
+ config: default
1913
+ name: MTEB STS15
1914
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
1915
+ split: test
1916
+ type: mteb/sts15-sts
1917
+ metrics:
1918
+ - type: cos_sim_pearson
1919
+ value: 86.13710262895447
1920
+ - type: cos_sim_spearman
1921
+ value: 87.26412811156248
1922
+ - type: euclidean_pearson
1923
+ value: 86.94151453230228
1924
+ - type: euclidean_spearman
1925
+ value: 87.5363796699571
1926
+ - type: manhattan_pearson
1927
+ value: 86.86989424083748
1928
+ - type: manhattan_spearman
1929
+ value: 87.47315940781353
1930
+ task:
1931
+ type: STS
1932
+ - dataset:
1933
+ config: default
1934
+ name: MTEB STS16
1935
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
1936
+ split: test
1937
+ type: mteb/sts16-sts
1938
+ metrics:
1939
+ - type: cos_sim_pearson
1940
+ value: 83.0230597603627
1941
+ - type: cos_sim_spearman
1942
+ value: 84.93344499318864
1943
+ - type: euclidean_pearson
1944
+ value: 84.23754743431141
1945
+ - type: euclidean_spearman
1946
+ value: 85.09707376597099
1947
+ - type: manhattan_pearson
1948
+ value: 84.04325160987763
1949
+ - type: manhattan_spearman
1950
+ value: 84.89353071339909
1951
+ task:
1952
+ type: STS
1953
+ - dataset:
1954
+ config: en-en
1955
+ name: MTEB STS17 (en-en)
1956
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
1957
+ split: test
1958
+ type: mteb/sts17-crosslingual-sts
1959
+ metrics:
1960
+ - type: cos_sim_pearson
1961
+ value: 86.75620824563921
1962
+ - type: cos_sim_spearman
1963
+ value: 87.15065513706398
1964
+ - type: euclidean_pearson
1965
+ value: 88.26281533633521
1966
+ - type: euclidean_spearman
1967
+ value: 87.51963738643983
1968
+ - type: manhattan_pearson
1969
+ value: 88.25599267618065
1970
+ - type: manhattan_spearman
1971
+ value: 87.58048736047483
1972
+ task:
1973
+ type: STS
1974
+ - dataset:
1975
+ config: en
1976
+ name: MTEB STS22 (en)
1977
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
1978
+ split: test
1979
+ type: mteb/sts22-crosslingual-sts
1980
+ metrics:
1981
+ - type: cos_sim_pearson
1982
+ value: 64.74645319195137
1983
+ - type: cos_sim_spearman
1984
+ value: 65.29996325037214
1985
+ - type: euclidean_pearson
1986
+ value: 67.04297794086443
1987
+ - type: euclidean_spearman
1988
+ value: 65.43841726694343
1989
+ - type: manhattan_pearson
1990
+ value: 67.39459955690904
1991
+ - type: manhattan_spearman
1992
+ value: 65.92864704413651
1993
+ task:
1994
+ type: STS
1995
+ - dataset:
1996
+ config: default
1997
+ name: MTEB STSBenchmark
1998
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
1999
+ split: test
2000
+ type: mteb/stsbenchmark-sts
2001
+ metrics:
2002
+ - type: cos_sim_pearson
2003
+ value: 84.31291020270801
2004
+ - type: cos_sim_spearman
2005
+ value: 85.86473738688068
2006
+ - type: euclidean_pearson
2007
+ value: 85.65537275064152
2008
+ - type: euclidean_spearman
2009
+ value: 86.13087454209642
2010
+ - type: manhattan_pearson
2011
+ value: 85.43946955047609
2012
+ - type: manhattan_spearman
2013
+ value: 85.91568175344916
2014
+ task:
2015
+ type: STS
2016
+ - dataset:
2017
+ config: default
2018
+ name: MTEB SciDocsRR
2019
+ revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2020
+ split: test
2021
+ type: mteb/scidocs-reranking
2022
+ metrics:
2023
+ - type: map
2024
+ value: 85.93798118350695
2025
+ - type: mrr
2026
+ value: 95.93536274908824
2027
+ task:
2028
+ type: Reranking
2029
+ - dataset:
2030
+ config: default
2031
+ name: MTEB SciFact
2032
+ revision: None
2033
+ split: test
2034
+ type: scifact
2035
+ metrics:
2036
+ - type: map_at_1
2037
+ value: 57.594
2038
+ - type: map_at_10
2039
+ value: 66.81899999999999
2040
+ - type: map_at_100
2041
+ value: 67.368
2042
+ - type: map_at_1000
2043
+ value: 67.4
2044
+ - type: map_at_3
2045
+ value: 64.061
2046
+ - type: map_at_5
2047
+ value: 65.47
2048
+ - type: mrr_at_1
2049
+ value: 60.667
2050
+ - type: mrr_at_10
2051
+ value: 68.219
2052
+ - type: mrr_at_100
2053
+ value: 68.655
2054
+ - type: mrr_at_1000
2055
+ value: 68.684
2056
+ - type: mrr_at_3
2057
+ value: 66.22200000000001
2058
+ - type: mrr_at_5
2059
+ value: 67.289
2060
+ - type: ndcg_at_1
2061
+ value: 60.667
2062
+ - type: ndcg_at_10
2063
+ value: 71.275
2064
+ - type: ndcg_at_100
2065
+ value: 73.642
2066
+ - type: ndcg_at_1000
2067
+ value: 74.373
2068
+ - type: ndcg_at_3
2069
+ value: 66.521
2070
+ - type: ndcg_at_5
2071
+ value: 68.581
2072
+ - type: precision_at_1
2073
+ value: 60.667
2074
+ - type: precision_at_10
2075
+ value: 9.433
2076
+ - type: precision_at_100
2077
+ value: 1.0699999999999998
2078
+ - type: precision_at_1000
2079
+ value: 0.11299999999999999
2080
+ - type: precision_at_3
2081
+ value: 25.556
2082
+ - type: precision_at_5
2083
+ value: 16.8
2084
+ - type: recall_at_1
2085
+ value: 57.594
2086
+ - type: recall_at_10
2087
+ value: 83.622
2088
+ - type: recall_at_100
2089
+ value: 94.167
2090
+ - type: recall_at_1000
2091
+ value: 99.667
2092
+ - type: recall_at_3
2093
+ value: 70.64399999999999
2094
+ - type: recall_at_5
2095
+ value: 75.983
2096
+ task:
2097
+ type: Retrieval
2098
+ - dataset:
2099
+ config: default
2100
+ name: MTEB SprintDuplicateQuestions
2101
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2102
+ split: test
2103
+ type: mteb/sprintduplicatequestions-pairclassification
2104
+ metrics:
2105
+ - type: cos_sim_accuracy
2106
+ value: 99.85841584158416
2107
+ - type: cos_sim_ap
2108
+ value: 96.66996142314342
2109
+ - type: cos_sim_f1
2110
+ value: 92.83208020050125
2111
+ - type: cos_sim_precision
2112
+ value: 93.06532663316584
2113
+ - type: cos_sim_recall
2114
+ value: 92.60000000000001
2115
+ - type: dot_accuracy
2116
+ value: 99.85841584158416
2117
+ - type: dot_ap
2118
+ value: 96.6775307676576
2119
+ - type: dot_f1
2120
+ value: 92.69289729177312
2121
+ - type: dot_precision
2122
+ value: 94.77533960292581
2123
+ - type: dot_recall
2124
+ value: 90.7
2125
+ - type: euclidean_accuracy
2126
+ value: 99.86138613861387
2127
+ - type: euclidean_ap
2128
+ value: 96.6338454403108
2129
+ - type: euclidean_f1
2130
+ value: 92.92214357937311
2131
+ - type: euclidean_precision
2132
+ value: 93.96728016359918
2133
+ - type: euclidean_recall
2134
+ value: 91.9
2135
+ - type: manhattan_accuracy
2136
+ value: 99.86237623762376
2137
+ - type: manhattan_ap
2138
+ value: 96.60370449645053
2139
+ - type: manhattan_f1
2140
+ value: 92.91177970423253
2141
+ - type: manhattan_precision
2142
+ value: 94.7970863683663
2143
+ - type: manhattan_recall
2144
+ value: 91.10000000000001
2145
+ - type: max_accuracy
2146
+ value: 99.86237623762376
2147
+ - type: max_ap
2148
+ value: 96.6775307676576
2149
+ - type: max_f1
2150
+ value: 92.92214357937311
2151
+ task:
2152
+ type: PairClassification
2153
+ - dataset:
2154
+ config: default
2155
+ name: MTEB StackExchangeClustering
2156
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2157
+ split: test
2158
+ type: mteb/stackexchange-clustering
2159
+ metrics:
2160
+ - type: v_measure
2161
+ value: 60.77977058695198
2162
+ task:
2163
+ type: Clustering
2164
+ - dataset:
2165
+ config: default
2166
+ name: MTEB StackExchangeClusteringP2P
2167
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2168
+ split: test
2169
+ type: mteb/stackexchange-clustering-p2p
2170
+ metrics:
2171
+ - type: v_measure
2172
+ value: 35.2725272535638
2173
+ task:
2174
+ type: Clustering
2175
+ - dataset:
2176
+ config: default
2177
+ name: MTEB StackOverflowDupQuestions
2178
+ revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2179
+ split: test
2180
+ type: mteb/stackoverflowdupquestions-reranking
2181
+ metrics:
2182
+ - type: map
2183
+ value: 53.64052466362125
2184
+ - type: mrr
2185
+ value: 54.533067014684654
2186
+ task:
2187
+ type: Reranking
2188
+ - dataset:
2189
+ config: default
2190
+ name: MTEB SummEval
2191
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2192
+ split: test
2193
+ type: mteb/summeval
2194
+ metrics:
2195
+ - type: cos_sim_pearson
2196
+ value: 30.677624219206578
2197
+ - type: cos_sim_spearman
2198
+ value: 30.121368518123447
2199
+ - type: dot_pearson
2200
+ value: 30.69870088041608
2201
+ - type: dot_spearman
2202
+ value: 29.61284927093751
2203
+ task:
2204
+ type: Summarization
2205
+ - dataset:
2206
+ config: default
2207
+ name: MTEB TRECCOVID
2208
+ revision: None
2209
+ split: test
2210
+ type: trec-covid
2211
+ metrics:
2212
+ - type: map_at_1
2213
+ value: 0.22
2214
+ - type: map_at_10
2215
+ value: 1.855
2216
+ - type: map_at_100
2217
+ value: 9.885
2218
+ - type: map_at_1000
2219
+ value: 23.416999999999998
2220
+ - type: map_at_3
2221
+ value: 0.637
2222
+ - type: map_at_5
2223
+ value: 1.024
2224
+ - type: mrr_at_1
2225
+ value: 88.0
2226
+ - type: mrr_at_10
2227
+ value: 93.067
2228
+ - type: mrr_at_100
2229
+ value: 93.067
2230
+ - type: mrr_at_1000
2231
+ value: 93.067
2232
+ - type: mrr_at_3
2233
+ value: 92.667
2234
+ - type: mrr_at_5
2235
+ value: 93.067
2236
+ - type: ndcg_at_1
2237
+ value: 82.0
2238
+ - type: ndcg_at_10
2239
+ value: 75.899
2240
+ - type: ndcg_at_100
2241
+ value: 55.115
2242
+ - type: ndcg_at_1000
2243
+ value: 48.368
2244
+ - type: ndcg_at_3
2245
+ value: 79.704
2246
+ - type: ndcg_at_5
2247
+ value: 78.39699999999999
2248
+ - type: precision_at_1
2249
+ value: 88.0
2250
+ - type: precision_at_10
2251
+ value: 79.60000000000001
2252
+ - type: precision_at_100
2253
+ value: 56.06
2254
+ - type: precision_at_1000
2255
+ value: 21.206
2256
+ - type: precision_at_3
2257
+ value: 84.667
2258
+ - type: precision_at_5
2259
+ value: 83.2
2260
+ - type: recall_at_1
2261
+ value: 0.22
2262
+ - type: recall_at_10
2263
+ value: 2.078
2264
+ - type: recall_at_100
2265
+ value: 13.297
2266
+ - type: recall_at_1000
2267
+ value: 44.979
2268
+ - type: recall_at_3
2269
+ value: 0.6689999999999999
2270
+ - type: recall_at_5
2271
+ value: 1.106
2272
+ task:
2273
+ type: Retrieval
2274
+ - dataset:
2275
+ config: default
2276
+ name: MTEB Touche2020
2277
+ revision: None
2278
+ split: test
2279
+ type: webis-touche2020
2280
+ metrics:
2281
+ - type: map_at_1
2282
+ value: 2.258
2283
+ - type: map_at_10
2284
+ value: 10.439
2285
+ - type: map_at_100
2286
+ value: 16.89
2287
+ - type: map_at_1000
2288
+ value: 18.407999999999998
2289
+ - type: map_at_3
2290
+ value: 5.668
2291
+ - type: map_at_5
2292
+ value: 7.718
2293
+ - type: mrr_at_1
2294
+ value: 32.653
2295
+ - type: mrr_at_10
2296
+ value: 51.159
2297
+ - type: mrr_at_100
2298
+ value: 51.714000000000006
2299
+ - type: mrr_at_1000
2300
+ value: 51.714000000000006
2301
+ - type: mrr_at_3
2302
+ value: 47.959
2303
+ - type: mrr_at_5
2304
+ value: 50.407999999999994
2305
+ - type: ndcg_at_1
2306
+ value: 29.592000000000002
2307
+ - type: ndcg_at_10
2308
+ value: 26.037
2309
+ - type: ndcg_at_100
2310
+ value: 37.924
2311
+ - type: ndcg_at_1000
2312
+ value: 49.126999999999995
2313
+ - type: ndcg_at_3
2314
+ value: 30.631999999999998
2315
+ - type: ndcg_at_5
2316
+ value: 28.571
2317
+ - type: precision_at_1
2318
+ value: 32.653
2319
+ - type: precision_at_10
2320
+ value: 22.857
2321
+ - type: precision_at_100
2322
+ value: 7.754999999999999
2323
+ - type: precision_at_1000
2324
+ value: 1.529
2325
+ - type: precision_at_3
2326
+ value: 34.014
2327
+ - type: precision_at_5
2328
+ value: 29.796
2329
+ - type: recall_at_1
2330
+ value: 2.258
2331
+ - type: recall_at_10
2332
+ value: 16.554
2333
+ - type: recall_at_100
2334
+ value: 48.439
2335
+ - type: recall_at_1000
2336
+ value: 82.80499999999999
2337
+ - type: recall_at_3
2338
+ value: 7.283
2339
+ - type: recall_at_5
2340
+ value: 10.732
2341
+ task:
2342
+ type: Retrieval
2343
+ - dataset:
2344
+ config: default
2345
+ name: MTEB ToxicConversationsClassification
2346
+ revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2347
+ split: test
2348
+ type: mteb/toxic_conversations_50k
2349
+ metrics:
2350
+ - type: accuracy
2351
+ value: 69.8858
2352
+ - type: ap
2353
+ value: 13.835684144362109
2354
+ - type: f1
2355
+ value: 53.803351693244586
2356
+ task:
2357
+ type: Classification
2358
+ - dataset:
2359
+ config: default
2360
+ name: MTEB TweetSentimentExtractionClassification
2361
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2362
+ split: test
2363
+ type: mteb/tweet_sentiment_extraction
2364
+ metrics:
2365
+ - type: accuracy
2366
+ value: 60.50650820599886
2367
+ - type: f1
2368
+ value: 60.84357825979259
2369
+ task:
2370
+ type: Classification
2371
+ - dataset:
2372
+ config: default
2373
+ name: MTEB TwentyNewsgroupsClustering
2374
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2375
+ split: test
2376
+ type: mteb/twentynewsgroups-clustering
2377
+ metrics:
2378
+ - type: v_measure
2379
+ value: 48.52131044852134
2380
+ task:
2381
+ type: Clustering
2382
+ - dataset:
2383
+ config: default
2384
+ name: MTEB TwitterSemEval2015
2385
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2386
+ split: test
2387
+ type: mteb/twittersemeval2015-pairclassification
2388
+ metrics:
2389
+ - type: cos_sim_accuracy
2390
+ value: 85.59337187816654
2391
+ - type: cos_sim_ap
2392
+ value: 73.23925826533437
2393
+ - type: cos_sim_f1
2394
+ value: 67.34693877551021
2395
+ - type: cos_sim_precision
2396
+ value: 62.40432237730752
2397
+ - type: cos_sim_recall
2398
+ value: 73.13984168865434
2399
+ - type: dot_accuracy
2400
+ value: 85.31322644096085
2401
+ - type: dot_ap
2402
+ value: 72.30723963807422
2403
+ - type: dot_f1
2404
+ value: 66.47051612112296
2405
+ - type: dot_precision
2406
+ value: 62.0792305930845
2407
+ - type: dot_recall
2408
+ value: 71.53034300791556
2409
+ - type: euclidean_accuracy
2410
+ value: 85.61125350181797
2411
+ - type: euclidean_ap
2412
+ value: 73.32843720487845
2413
+ - type: euclidean_f1
2414
+ value: 67.36549633745895
2415
+ - type: euclidean_precision
2416
+ value: 64.60755813953489
2417
+ - type: euclidean_recall
2418
+ value: 70.36939313984169
2419
+ - type: manhattan_accuracy
2420
+ value: 85.63509566668654
2421
+ - type: manhattan_ap
2422
+ value: 73.16658488311325
2423
+ - type: manhattan_f1
2424
+ value: 67.20597386434349
2425
+ - type: manhattan_precision
2426
+ value: 63.60424028268551
2427
+ - type: manhattan_recall
2428
+ value: 71.2401055408971
2429
+ - type: max_accuracy
2430
+ value: 85.63509566668654
2431
+ - type: max_ap
2432
+ value: 73.32843720487845
2433
+ - type: max_f1
2434
+ value: 67.36549633745895
2435
+ task:
2436
+ type: PairClassification
2437
+ - dataset:
2438
+ config: default
2439
+ name: MTEB TwitterURLCorpus
2440
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2441
+ split: test
2442
+ type: mteb/twitterurlcorpus-pairclassification
2443
+ metrics:
2444
+ - type: cos_sim_accuracy
2445
+ value: 88.33779640625606
2446
+ - type: cos_sim_ap
2447
+ value: 84.83868375898157
2448
+ - type: cos_sim_f1
2449
+ value: 77.16506154017773
2450
+ - type: cos_sim_precision
2451
+ value: 74.62064005753327
2452
+ - type: cos_sim_recall
2453
+ value: 79.88912842623961
2454
+ - type: dot_accuracy
2455
+ value: 88.02732176815307
2456
+ - type: dot_ap
2457
+ value: 83.95089283763002
2458
+ - type: dot_f1
2459
+ value: 76.29635101196631
2460
+ - type: dot_precision
2461
+ value: 73.31771720613288
2462
+ - type: dot_recall
2463
+ value: 79.52725592854944
2464
+ - type: euclidean_accuracy
2465
+ value: 88.44452206310397
2466
+ - type: euclidean_ap
2467
+ value: 84.98384576824827
2468
+ - type: euclidean_f1
2469
+ value: 77.29311047696697
2470
+ - type: euclidean_precision
2471
+ value: 74.51232583065381
2472
+ - type: euclidean_recall
2473
+ value: 80.28949799815214
2474
+ - type: manhattan_accuracy
2475
+ value: 88.47362906042613
2476
+ - type: manhattan_ap
2477
+ value: 84.91421462218432
2478
+ - type: manhattan_f1
2479
+ value: 77.05107637204792
2480
+ - type: manhattan_precision
2481
+ value: 74.74484256243214
2482
+ - type: manhattan_recall
2483
+ value: 79.50415768401602
2484
+ - type: max_accuracy
2485
+ value: 88.47362906042613
2486
+ - type: max_ap
2487
+ value: 84.98384576824827
2488
+ - type: max_f1
2489
+ value: 77.29311047696697
2490
+ task:
2491
+ type: PairClassification
2492
+ tags:
2493
+ - sentence-transformers
2494
+ - feature-extraction
2495
+ - sentence-similarity
2496
+ - transformers
2497
+ - mteb
2498
+ - onnx
2499
+ - teradata
2500
+
2501
+ ---
2502
+ # A Teradata Vantage compatible Embeddings Model
2503
+
2504
+ # BAAI/bge-small-en-v1.5
2505
+
2506
+ ## Overview of this Model
2507
+
2508
+ An embedding model that maps text (sentences/paragraphs) into a vector. The [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) model is well known for its effectiveness in capturing semantic meaning in text data. It is a state-of-the-art model trained on a large corpus and capable of generating high-quality text embeddings.
2509
+
2510
+ - 33.36M params (Sizes in ONNX format - "fp32": 127.03MB, "int8": 32.4MB, "uint8": 32.4MB)
2511
+ - 512 maximum input tokens
2512
+ - 384 dimensions of output vector
2513
+ - License: MIT. The released models can be used for commercial purposes free of charge.
2514
+ - Reference to Original Model: https://huggingface.co/BAAI/bge-small-en-v1.5
2515
+
2516
+
2517
+ ## Quickstart: Deploying this Model in Teradata Vantage
2518
+
2519
+ We have pre-converted the model into the ONNX format compatible with BYOM 6.0, eliminating the need for manual conversion.
2520
+
2521
+ **Note:** Ensure you have access to a Teradata Database with BYOM 6.0 installed.
2522
+
2523
+ For detailed information, refer to the ONNXEmbeddings documentation: TODO
2524
+
2525
+ To get started, download the pre-converted model directly from the Teradata HuggingFace repository.
2526
+
2527
+
2528
+ ```python
2529
+
2530
+ import teradataml as tdml
2531
+ import getpass
2532
+ from huggingface_hub import hf_hub_download
2533
+
2534
+ model_name = "bge-small-en-v1.5"
2535
+ number_dimensions_output = 384
2536
+ model_file_name = "model.onnx"
2537
+
2538
+ # Step 1: Download Model from Teradata HuggingFace Page
2539
+
2540
+ hf_hub_download(repo_id=f"Teradata/{model_name}", filename=f"onnx/{model_file_name}", local_dir="./")
2541
+ hf_hub_download(repo_id=f"Teradata/{model_name}", filename=f"tokenizer.json", local_dir="./")
2542
+
2543
+ # Step 2: Create Connection to Vantage
2544
+
2545
+ tdml.create_context(host = input('enter your hostname'),
2546
+ username=input('enter your username'),
2547
+ password = getpass.getpass("enter your password"))
2548
+
2549
+ # Step 3: Load Models into Vantage
2550
+ # a) Embedding model
2551
+ tdml.save_byom(model_id = model_name, # must be unique in the models table
2552
+ model_file = model_file_name,
2553
+ table_name = 'embeddings_models' )
2554
+ # b) Tokenizer
2555
+ tdml.save_byom(model_id = model_name, # must be unique in the models table
2556
+ model_file = 'tokenizer.json',
2557
+ table_name = 'embeddings_tokenizers')
2558
+
2559
+ # Step 4: Test ONNXEmbeddings Function
2560
+ # Note that ONNXEmbeddings expects the text payload column to be named 'txt'.
2561
+ # If your column has a different name, rename it in a subquery/CTE.
2562
+ input_table = "emails.emails"
2563
+ embeddings_query = f"""
2564
+ SELECT
2565
+ *
2566
+ from mldb.ONNXEmbeddings(
2567
+ on {input_table} as InputTable
2568
+ on (select * from embeddings_models where model_id = '{model_name}') as ModelTable DIMENSION
2569
+ on (select model as tokenizer from embeddings_tokenizers where model_id = '{model_name}') as TokenizerTable DIMENSION
2570
+ using
2571
+ Accumulate('id', 'txt')
2572
+ ModelOutputTensor('sentence_embedding')
2573
+ EnableMemoryCheck('false')
2574
+ OutputFormat('FLOAT32({number_dimensions_output})')
2575
+ OverwriteCachedModel('true')
2576
+ ) a
2577
+ """
2578
+ DF_embeddings = tdml.DataFrame.from_query(embeddings_query)
2579
+ DF_embeddings
2580
+ ```
2581
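+ If you want to reuse the embeddings later, one option is to persist the query result into a table. The snippet below is only a minimal sketch: the table name `email_embeddings` is a placeholder, and the pattern mirrors the CREATE TABLE AS statement used in [test_teradata.py](./test_teradata.py).
+
+ ```python
+ # Optional: persist the embeddings for downstream use.
+ # 'email_embeddings' is a placeholder table name - adjust it to your schema.
+ tdml.execute_sql(f"CREATE TABLE email_embeddings AS ({embeddings_query}) WITH DATA")
+ ```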
+
2582
+
2583
+
2584
+ ## What Can I Do with the Embeddings?
2585
+
2586
+ Teradata Vantage includes pre-built in-database functions to process embeddings further. Explore the following examples:
2587
+
2588
+ - **Semantic Clustering with TD_KMeans:** [Semantic Clustering Python Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/Semantic_Clustering_Python.ipynb)
2589
+ - **Semantic Distance with TD_VectorDistance:** [Semantic Similarity Python Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/Semantic_Similarity_Python.ipynb) (see the minimal sketch after this list)
2590
+ - **RAG-Based Application with TD_VectorDistance:** [RAG and Bedrock Query PDF Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/RAG_and_Bedrock_QueryPDF.ipynb)
2591
+
2592
+
2593
+ ## Deep Dive into Model Conversion to ONNX
2594
+
2595
+ **The steps below outline how we converted the open-source Hugging Face model into an ONNX file compatible with the in-database ONNXEmbeddings function.**
2596
+
2597
+ You do not need to perform these steps—they are provided solely for documentation and transparency. However, they may be helpful if you wish to convert another model to the required format.
2598
+
2599
+
2600
+ ### Part 1. Importing and Converting the Model using optimum
2601
+
2602
+ We start by importing the pre-trained [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) model from Hugging Face.
2603
+
2604
+ To enhance performance and ensure compatibility with various execution environments, we'll use the [Optimum](https://github.com/huggingface/optimum) utility to convert the model into the ONNX (Open Neural Network Exchange) format.
2605
+
2606
+ After conversion to ONNX, we fix the opset in the ONNX file to ensure compatibility with the ONNX runtime used in Teradata Vantage.
2607
+
2608
+ We generate ONNX files for several precisions: fp32, int8, and uint8.
2609
+
2610
+ You can find the detailed conversion steps in the file [convert.py](./convert.py).
2611
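+ As a condensed illustration (not the full script), the key steps in [convert.py](./convert.py) boil down to exporting an fp32 ONNX model with optimum and then deriving the quantized variants from it; file names follow [conversion_config.json](./conversion_config.json):
+
+ ```python
+ # Condensed sketch of convert.py: export fp32 with optimum, then quantize to int8/uint8.
+ from optimum.exporters.onnx import main_export
+ from onnxruntime.quantization import quantize_dynamic, QuantType
+
+ model_id = "BAAI/bge-small-en-v1.5"
+ opset = 16  # opset value as listed in conversion_config.json
+
+ # Export the fp32 feature-extraction model to ./model.onnx
+ main_export(model_name_or_path=model_id, output="./", opset=opset,
+             trust_remote_code=True, task="feature-extraction", dtype="fp32")
+
+ # Derive the quantized variants from the fp32 export
+ quantize_dynamic("model.onnx", "onnx/model_int8.onnx", weight_type=QuantType.QInt8)
+ quantize_dynamic("model.onnx", "onnx/model_uint8.onnx", weight_type=QuantType.QUInt8)
+ ```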
+
2612
+ ### Part 2. Running the model in Python with onnxruntime and comparing results
2613
+
2614
+ Once the fixes are applied, we test the correctness of the ONNX model by computing the cosine similarity between two texts with both the native SentenceTransformers model and the ONNX runtime, and comparing the results.
2615
+
2616
+ If the results match, the ONNX model produces the same embeddings as the native model, which validates its correctness and suitability for use in the database.
2617
+
2618
+
2619
+ ```python
2620
+ import onnxruntime as rt
2621
+
2622
+ from sentence_transformers.util import cos_sim
2623
+ from sentence_transformers import SentenceTransformer
2624
+
2625
+ import transformers
2626
+
2627
+
2628
+ sentences_1 = 'How is the weather today?'
2629
+ sentences_2 = 'What is the current weather like today?'
2630
+
2631
+ # Calculate ONNX result
2632
+ tokenizer = transformers.AutoTokenizer.from_pretrained("BAAI/bge-small-en-v1.5")
2633
+ predef_sess = rt.InferenceSession("onnx/model.onnx")
2634
+
2635
+ enc1 = tokenizer(sentences_1)
2636
+ embeddings_1_onnx = predef_sess.run(None, {"input_ids": [enc1.input_ids],
2637
+ "attention_mask": [enc1.attention_mask]})
2638
+
2639
+ enc2 = tokenizer(sentences_2)
2640
+ embeddings_2_onnx = predef_sess.run(None, {"input_ids": [enc2.input_ids],
2641
+ "attention_mask": [enc2.attention_mask]})
2642
+
2643
+
2644
+ # Calculate embeddings with SentenceTransformer
2645
+ model = SentenceTransformer("BAAI/bge-small-en-v1.5", trust_remote_code=True)
2646
+ embeddings_1_sentence_transformer = model.encode(sentences_1, normalize_embeddings=True, trust_remote_code=True)
2647
+ embeddings_2_sentence_transformer = model.encode(sentences_2, normalize_embeddings=True, trust_remote_code=True)
2648
+
2649
+ # Compare results
2650
+ print("Cosine similiarity for embeddings calculated with ONNX:" + str(cos_sim(embeddings_1_onnx[1][0], embeddings_2_onnx[1][0])))
2651
+ print("Cosine similiarity for embeddings calculated with SentenceTransformer:" + str(cos_sim(embeddings_1_sentence_transformer, embeddings_2_sentence_transformer)))
2652
+ ```
2653
+
2654
+ You can find the detailed ONNX vs. SentenceTransformer result comparison steps in the file [test_local.py](./test_local.py).
2655
+
config.json ADDED
@@ -0,0 +1,33 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "_attn_implementation_autoset": true,
3
+ "_name_or_path": "BAAI/bge-small-en-v1.5",
4
+ "architectures": [
5
+ "BertModel"
6
+ ],
7
+ "attention_probs_dropout_prob": 0.1,
8
+ "classifier_dropout": null,
9
+ "export_model_type": "transformer",
10
+ "hidden_act": "gelu",
11
+ "hidden_dropout_prob": 0.1,
12
+ "hidden_size": 384,
13
+ "id2label": {
14
+ "0": "LABEL_0"
15
+ },
16
+ "initializer_range": 0.02,
17
+ "intermediate_size": 1536,
18
+ "label2id": {
19
+ "LABEL_0": 0
20
+ },
21
+ "layer_norm_eps": 1e-12,
22
+ "max_position_embeddings": 512,
23
+ "model_type": "bert",
24
+ "num_attention_heads": 12,
25
+ "num_hidden_layers": 12,
26
+ "pad_token_id": 0,
27
+ "position_embedding_type": "absolute",
28
+ "torch_dtype": "float32",
29
+ "transformers_version": "4.47.1",
30
+ "type_vocab_size": 2,
31
+ "use_cache": true,
32
+ "vocab_size": 30522
33
+ }
conversion_config.json ADDED
@@ -0,0 +1,12 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "model_id": "BAAI/bge-small-en-v1.5",
3
+ "number_of_generated_embeddings": 384,
4
+ "precision_to_filename_map": {
5
+ "fp32": "onnx/model.onnx",
6
+ "fp16": "onnx/model_fp16.onnx",
7
+ "int8": "onnx/model_int8.onnx",
8
+ "uint8": "onnx/model_uint8.onnx"
9
+ },
10
+ "opset": 16,
11
+ "IR": 8
12
+ }
convert.py ADDED
@@ -0,0 +1,51 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ import json
3
+ import shutil
4
+
5
+ from optimum.exporters.onnx import main_export
6
+ import onnx
7
+ from onnxconverter_common import float16
8
+ import onnxruntime as rt
9
+ from onnxruntime.tools.onnx_model_utils import *
10
+ from onnxruntime.quantization import quantize_dynamic, QuantType
11
+
12
+ with open('conversion_config.json') as json_file:
13
+ conversion_config = json.load(json_file)
14
+
15
+
16
+ model_id = conversion_config["model_id"]
17
+ number_of_generated_embeddings = conversion_config["number_of_generated_embeddings"]
18
+ precision_to_filename_map = conversion_config["precision_to_filename_map"]
19
+ opset = conversion_config["opset"]
20
+ IR = conversion_config["IR"]
21
+
22
+
23
+ op = onnx.OperatorSetIdProto()
24
+ op.version = opset
25
+
26
+
27
+ if not os.path.exists("onnx"):
28
+ os.makedirs("onnx")
29
+
30
+ print("Exporting the main model version")
31
+
32
+ main_export(model_name_or_path=model_id, output="./", opset=opset, trust_remote_code=True, task="feature-extraction", dtype="fp32")
33
+
34
+ if "fp32" in precision_to_filename_map:
35
+ print("Exporting the fp32 onnx file...")
36
+
37
+ shutil.copyfile('model.onnx', precision_to_filename_map["fp32"])
38
+
39
+ print("Done\n\n")
40
+
41
+ if "int8" in precision_to_filename_map:
42
+ print("Quantizing fp32 model to int8...")
43
+ quantize_dynamic("model.onnx", precision_to_filename_map["int8"], weight_type=QuantType.QInt8)
44
+ print("Done\n\n")
45
+
46
+ if "uint8" in precision_to_filename_map:
47
+ print("Quantizing fp32 model to uint8...")
48
+ quantize_dynamic("model.onnx", precision_to_filename_map["uint8"], weight_type=QuantType.QUInt8)
49
+ print("Done\n\n")
50
+
51
+ os.remove("model.onnx")
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7a91b627bb26d1d070cfd63c4efc5121b7f5198aad7371647b70b778d42d3249
3
+ size 133195612
onnx/model_int8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a159b205e5e0fcf7ad4fc4c3e4dec947a260cf5df33fc2d8fceb119b84d26e00
3
+ size 33976425
onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f73cd6bfeb60c638311baf84937eb1e70b188335499cc3184cdf1e539b4721a8
3
+ size 33976453
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
test_local.py ADDED
@@ -0,0 +1,49 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import onnxruntime as rt
2
+
3
+ from sentence_transformers.util import cos_sim
4
+ from sentence_transformers import SentenceTransformer
5
+
6
+ import transformers
7
+
8
+ import gc
9
+ import json
10
+
11
+
12
+ with open('conversion_config.json') as json_file:
13
+ conversion_config = json.load(json_file)
14
+
15
+
16
+ model_id = conversion_config["model_id"]
17
+ number_of_generated_embeddings = conversion_config["number_of_generated_embeddings"]
18
+ precision_to_filename_map = conversion_config["precision_to_filename_map"]
19
+
20
+ sentences_1 = 'How is the weather today?'
21
+ sentences_2 = 'What is the current weather like today?'
22
+
23
+ print(f"Testing on cosine similiarity between sentences: \n'{sentences_1}'\n'{sentences_2}'\n\n\n")
24
+
25
+ tokenizer = transformers.AutoTokenizer.from_pretrained("./")
26
+ enc1 = tokenizer(sentences_1)
27
+ enc2 = tokenizer(sentences_2)
28
+
29
+ for precision, file_name in precision_to_filename_map.items():
30
+
31
+
32
+ onnx_session = rt.InferenceSession(file_name)
33
+ embeddings_1_onnx = onnx_session.run(None, {"input_ids": [enc1.input_ids],
34
+ "attention_mask": [enc1.attention_mask]})[1][0]
35
+
36
+ embeddings_2_onnx = onnx_session.run(None, {"input_ids": [enc2.input_ids],
37
+ "attention_mask": [enc2.attention_mask]})[1][0]
38
+
39
+ del onnx_session
40
+ gc.collect()
41
+ print(f'Cosine similarity for ONNX model with precision "{precision}" is {str(cos_sim(embeddings_1_onnx, embeddings_2_onnx))}')
42
+
43
+
44
+
45
+
46
+ model = SentenceTransformer(model_id, trust_remote_code=True)
47
+ embeddings_1_sentence_transformer = model.encode(sentences_1, normalize_embeddings=True, trust_remote_code=True)
48
+ embeddings_2_sentence_transformer = model.encode(sentences_2, normalize_embeddings=True, trust_remote_code=True)
49
+ print('Cosine similarity for the original sentence transformer model is '+str(cos_sim(embeddings_1_sentence_transformer, embeddings_2_sentence_transformer)))
test_teradata.py ADDED
@@ -0,0 +1,106 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import sys
2
+ import teradataml as tdml
3
+ from tabulate import tabulate
4
+
5
+ import json
6
+
7
+
8
+ with open('conversion_config.json') as json_file:
9
+ conversion_config = json.load(json_file)
10
+
11
+
12
+ model_id = conversion_config["model_id"]
13
+ number_of_generated_embeddings = conversion_config["number_of_generated_embeddings"]
14
+ precision_to_filename_map = conversion_config["precision_to_filename_map"]
15
+
16
+ host = sys.argv[1]
17
+ username = sys.argv[2]
18
+ password = sys.argv[3]
19
+
20
+ print("Setting up connection to teradata...")
21
+ tdml.create_context(host = host, username = username, password = password)
22
+ print("Done\n\n")
23
+
24
+
25
+ print("Deploying tokenizer...")
26
+ try:
27
+ tdml.db_drop_table('tokenizer_table')
28
+ except:
29
+ print("Can't drop tokenizers table - it's not existing")
30
+ tdml.save_byom('tokenizer',
31
+ 'tokenizer.json',
32
+ 'tokenizer_table')
33
+ print("Done\n\n")
34
+
35
+ print("Testing models...")
36
+ try:
37
+ tdml.db_drop_table('model_table')
38
+ except:
39
+ print("Can't drop models table - it's not existing")
40
+
41
+ for precision, file_name in precision_to_filename_map.items():
42
+ print(f"Deploying {precision} model...")
43
+ tdml.save_byom(precision,
44
+ file_name,
45
+ 'model_table')
46
+ print(f"Model {precision} is deployed\n")
47
+
48
+ print(f"Calculating embeddings with {precision} model...")
49
+ try:
50
+ tdml.db_drop_table('emails_embeddings_store')
51
+ except:
52
+ print("Can't drop embeddings table - it's not existing")
53
+
54
+ tdml.execute_sql(f"""
55
+ create volatile table emails_embeddings_store as (
56
+ select
57
+ *
58
+ from mldb.ONNXEmbeddings(
59
+ on emails.emails as InputTable
60
+ on (select * from model_table where model_id = '{precision}') as ModelTable DIMENSION
61
+ on (select model as tokenizer from tokenizer_table where model_id = 'tokenizer') as TokenizerTable DIMENSION
62
+
63
+ using
64
+ Accumulate('id', 'txt')
65
+ ModelOutputTensor('sentence_embedding')
66
+ EnableMemoryCheck('false')
67
+ OutputFormat('FLOAT32({number_of_generated_embeddings})')
68
+ OverwriteCachedModel('true')
69
+ ) a
70
+ ) with data on commit preserve rows
71
+
72
+ """)
73
+ print("Embeddings calculated")
74
+ print(f"Testing semantic search with cosine similiarity on the output of the model with precision '{precision}'...")
75
+ tdf_embeddings_store = tdml.DataFrame('emails_embeddings_store')
76
+ tdf_embeddings_store_tgt = tdf_embeddings_store[tdf_embeddings_store.id == 3]
77
+
78
+ tdf_embeddings_store_ref = tdf_embeddings_store[tdf_embeddings_store.id != 3]
79
+
80
+ cos_sim_pd = tdml.DataFrame.from_query(f"""
81
+ SELECT
82
+ dt.target_id,
83
+ dt.reference_id,
84
+ e_tgt.txt as target_txt,
85
+ e_ref.txt as reference_txt,
86
+ (1.0 - dt.distance) as similarity
87
+ FROM
88
+ TD_VECTORDISTANCE (
89
+ ON ({tdf_embeddings_store_tgt.show_query()}) AS TargetTable
90
+ ON ({tdf_embeddings_store_ref.show_query()}) AS ReferenceTable DIMENSION
91
+ USING
92
+ TargetIDColumn('id')
93
+ TargetFeatureColumns('[emb_0:emb_{number_of_generated_embeddings - 1}]')
94
+ RefIDColumn('id')
95
+ RefFeatureColumns('[emb_0:emb_{number_of_generated_embeddings - 1}]')
96
+ DistanceMeasure('cosine')
97
+ topk(3)
98
+ ) AS dt
99
+ JOIN emails.emails e_tgt on e_tgt.id = dt.target_id
100
+ JOIN emails.emails e_ref on e_ref.id = dt.reference_id;
101
+ """).to_pandas()
102
+ print(tabulate(cos_sim_pd, headers='keys', tablefmt='fancy_grid'))
103
+ print("Done\n\n")
104
+
105
+
106
+ tdml.remove_context()
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "extra_special_tokens": {},
49
+ "mask_token": "[MASK]",
50
+ "model_max_length": 512,
51
+ "never_split": null,
52
+ "pad_token": "[PAD]",
53
+ "sep_token": "[SEP]",
54
+ "strip_accents": null,
55
+ "tokenize_chinese_chars": true,
56
+ "tokenizer_class": "BertTokenizer",
57
+ "unk_token": "[UNK]"
58
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff